Difference between revisions of "ORM best practices"

Revision as of 23:10, 22 March 2010

This page documents a number of best practices when it comes to persisting RIM-based objects, and *O-CP/RP cell transitions.

ER database

This section discusses some of the best practices for those applications that map RIM objects to an E-R database structure.
Note to readers: the reader is assumed to be aware of the characteristics of the object data model, the entity-relationship (ER) data model and the entity-attribute-value (EAV) data model.

See ORM for references.

Specific References

http://www.slideshare.net/AShakir/rim-based-relational-database-design-tutorial-september-2008

Introduction

The purpose of this white paper is to describe a process for transforming the Health Level Seven (HL7) Reference Information Model (RIM) into a Logical Data Model (LDM) and a Physical Data Model (PDM). The HL7 RIM is a conceptual data model built and maintained by HL7 to be used as a source of the data content portion of HL7 standards. Because it is a conceptual data model, the RIM is not well suited to be used as the design of a relational database schema. The RIM contains abstract classes, complex datatypes, and generic business rules that need to be resolved and reconciled to form a LDM. LDM can be mapped to its PDM via Hibernate o-r mapping tool for the target databases MySQL and Oracle.

A series of refinement steps is required to transform the RIM into a LDM that would be useful as input to design of a database schema. The steps are designed to render a less abstract model with normalized data structures and realm specific business rules. The resulting LDM is well suited for use in database design activities. The LDM is technology independent and to some extent application independent. As application design begins the LDM will be used as input and further transformed into a PDM. The PDM will consider the technological implications of the database schema including items such as the capabilities of the target database management system (DBMS), tuned for a typical OLTP application with 300 concurrent end users logged in.

Scope

The classes and datatypes in the HL7 RIM are grouped into logical packages. There are six core packages in the RIM, other packages are used for administrative and model management purposes only. The Entities, Roles, and Acts packages contain the respective entity, role, and act generalization hierarchies and dependent classes. The Datatypes package contains all of the defined version 3.0 datatypes. The Structured Documents and Message Control packages contain classes that represent artifacts peculiar to HL7 messaging and clinical document architecture standards. For the purpose of our example model only a subset of classes of Acts, Datatype, Entities, and Roles packages will be considered within scope of the paper.

To keep this paper simple, let's select only the relevant attributes:

Resolving Generic Datatypes

These Data Types are included into the JavaSIG package: org.hl7.types.impl The relative java wrappers (WrappingBagAccessor, WrappingSetAccessor, WrappingListAccessor) are included into the package: org.hl7.hibernate

“IVL”, “EIVL” or “PIVL” Datatype

Entity.existence_time : IVL <TS>
- Entity.existence_start_time : TS
- Entity/existence_end_time : TS
Manufactured_material.stability_time : IVL <TS>
- Manufactured_material.stability_time_start : TS
- Manufactured_material.stability_time_stop : TS
Act.repeat_number : IVL<TS>
- Act.repeat_number_start : TS
- Act.repeat_number_stop : TS
Supply.expected_use_time : IVL<TS>
- Supply.expected_use_time_start : TS
- Supply.expected_use_time_stop : TS
SubstanceAdmnistration.rate_qty: IVL<PQ>
- SubstanceAdmnistration.rate_qty_start : PQ
- SubstanceAdmnistration.rate_qty_stop : PQ
SubstanceAdmnistration.dose_qty: IVL<PQ>
- SubstanceAdmnistration.dose_qty_start : PQ
- SubstanceAdmnistration.dose_qty_stop : PQ
Account.allowed_balance_qty: IVL<MO>
- Account.allowed_balance_qty_start: MO
- Account.allowed_balance_qty_stop: MO
RoleLink.effective_time: IVL<TS>
- RoleLink.effective_time_start: TS
- RoleLink.effective_time_stop: TS
Role.effective_time: IVL<TS>
- Role.effective_time_start: TS
- Role.effective_time_stop: TS
Participation.time:IVL<TS>
- Participation.time_start: TS
- Participation.time_stop: TS

“SET”, “DSET”, “QSET” or “List” Datatypes

Entity.id : SET<II> ==> Enity_Identifier.
Entity.status_cd : SET<CS> ==> Entity.status_cd : CS
Person.ethnic_group_cd : SET<CE> ==> Person.ethnic_group_cd : CE
Person.race_cd : SET<CE> ==> Person_Race.
Entity.qty : SET<PQ> ==> Entity.qty PQ
Person.religious_affiliation : SET<CE> ==> Person.religious_affiliation : CE
Act.id: SET<II> ==> Act_Identifier
Act.priority: SET<CE> ==> Act_Priority
Act.confidentiality_cd: SET<CE> ==> Act_Confidentiality
Act.reason_cd: SET<CE> ==> Act_Reason
PatientEncounter.special_courtesies_cd: SET<CE> ==> PatientEncounter_Courtesies
PatientEncounter.special_arrangement_cd: SET<CE> ==> PatientEncounter_Arrangement
Procedure.method_cd: SET<CE> ==> Procedure_method
Procedure.approach_site_cd: SET<CE> ==> Procedure_approach_site
Procedure.target_site_cd: SET<CE> ==> Procedure_target_site
Observation.interpretation_cd: SET<CE> ==> Observation_interpretation
Observation.method_cd: SET<CE> ==> Observation_method
Observation.target_site_cd: SET<CD> ==> Observation_target
SubstanceAdministration.approach_site_cd: SET<CD> ==> SubstanceAdministration_approach_site
SubstanceAdministration.dose_check_qty: SET<RTO> ==> SubstanceAdministration_dose_check
SubstanceAdministration.max_dose_qty: SET<RTO> ==> SubstanceAdministration_max_dose
SubstanceAdministration.method_cd: SET<CD> ==> SubstanceAdministration_method
InvoiceElement.modifier_cd: SET<CE> ==> InvoiceElement_modifier
DeviceTask.parameter_value: LIST<ANY> ==> DeviceTask_parameters
Role.id: SET<II> ==> Role_identifier
Role.confidentiality_cd: SET<CE> ==> Role_confidentiality
Role.position_number: LIST<INT> ==> Role_position
ManagedParticipation.id: SET<II> ==> ManagedParticipation_identifier
Document.bibliographic_designation_txt: SET<ED> ==> Document_Designation

Note: SET<CS> type for Entity.status_cd has been simplified to a single CS value (except Person.race_cd).

“BAG” Datatypes

Entity.nm : BAG<EN> ==> Enity_Name.
Entity.telecom : BAG<TEL> ==> Entity_Telecom_Address.
Organization.addr : BAG<AD> ==> Organization_Address.
Person.addr : BAG<AD> ==> Person_Address.
Role.name BAG<EN> ==> Role_name
Role.addr: BAG<AD> ==> Role_address

Add Association Attributes

Entity_identifier./identified_entity : ENTITY
Entity_name./named_entity : ENTITY
Entity_telcom_address./addressed_entity : ENTITY
Organization_address./addressed_organization : ORGANIZATION
Person_address./addressed_person : PERSON
Person_race./attributed_person : PERSON
Act_reason.act: Act
Act_Confidentiality: Act
Act_Priority: Act
Act_identifier: Act
PatientEncounter_Courtesies.pencounter: PatientEncounter
PatientEncounter_Arragement.pencounter: PatientEncounter
Procedure_method.procedure: Procedure
Procedure_approach_site.procedure: Procedure
Procedure_target_site.procedure: Procedure
Observation_interpretation.observation: Observation
Observation_target.observation: Observation
Observation_method.observation: Observation
SubstanceAdministration_approach_site.sub_admin: SubstanceAdministration
SubstanceAdministration_dose_check.sub_admin: SubstanceAdministration
SubstanceAdministration_method.sub_admin: SubstanceAdministration
SubstanceAdministration_max_dose.sub_admin: SubstanceAdministration
Role_identifier. identified_role : Role
Role_confidentiality.confidentiality_role : Role
Role_position. positioned_role : Role
Role_name.named_role: Role
Role_address.addressed_role: Role
Role_telcom.addressed_role: Role
ManagedParticipation.identified_mparticipation: ManagedParticipation
Document_Designation.document: Document

Delete Generalization Relationships

Entity § Living_subject
Entity § Organization
Entity § Place
Entity § Material
Entity § Group
Living_subject § Person
Living_subject § Non_Person_Living_subject
Material § Manufactured_material
Manufactured_material § Container
Manufactured_material § Device
Device § Imaging_modality
Act § PatientEncounter
Act § ControlAct
Act § Supply
Act § WorkingList
Act § Procedure
Act § Observation
Act § DeviceTask
Act § SubstanceAdministration
Act § FinancialContract
Act § Account
Act § FinancialTransaction
Act § InvoiceElement
Diet § Supply
Observation § PublicHealthCase
Observation § DiagnosticImage
Role § Employee
Role § Access
Role § QualifiedEntity
Role § LicensedEntity
Role § Patient
Participation § ManagedParticipation
ContextStructure § Document

Propagate Inherited Attributes

Propagate Inherited Associations

Simplify datatypes

Person.nm: EN ==> PN
Organization.nm: EN ==> ON
All attributes of type CE ==> CV

Datatype expansion

All classes with composed attributes are expanded in the composing parts. The types interested in this process are PN, ON, EN, AD, TEL, II, PQ.

EN

Every attribute of type EN is expanded as the following schema.

This schema models the compositional nature of the attribute EN expanding in a container relationship.

Tables that have a EN attribute, create a table of name tablename_Name_Part table then this table refers the Entity_Name table.

ON and EN

ON (Organization Name) and PN (Person Name) are specializations of EN that doesn’t add any attribute, so they are expanded in the same way as EN.

AD

AD represents the concept of address and defines it in terms of parts and subparts. It is expanded like in the in schema:

The Person entity then refer the Person_Person_Address.

TEL

The TEL datatypes represents a set of telephone reference, so it is modeled as a multipart relation that expands in a schema like the following, in this case is showed the relation applied to an Organization entity:

II

All attributes of type II are expanded in the three compositing sub attributes:

root: UID
extension: ST
authority: ST

PQ

This type of attribute specify a physical quantity, and has many sub attributes as shows in the following schema:

Every table with a PQ attribute has reference to the table PhysicalQuantity.

Datatype Promotion

In this phase the CV (Coded Value) and CS (Coded Simple Value) data types are expanded in the corresponding UML class schema, so the all attributes of this types implicitly refers to this inserted element.

LDM-2-PDM

Given a clean LDM, this guide assumes that Hibernate is used to generate the PDM. Hibernate offers 3 different O-R strategies.

We have chosen:

table per hierarchy for the 6 foundation classes (and relative sub-classes) and code values: Act, Entity, Role,Participation,ActRelationship & CodeValue.
table per concrete class for all other related classes
There is a group of 5 tables dedicated to Vocabolaries and DataTypes

 if you are going to use JavaSIG to persist RIM data into the PDM mind that you will have to adapt the wrappers.

HIBERNATE MAPPING FILE

PDM 4 MySQL

DATATYPE Mapping

REAL = DECIMANL(9,3)
ST = VARCHAR(5000)
UID = INT(11)
ED = VARCHAR(5000)
GTS = DATETIME
TS = TIMESTAMP
BL = BIT(1)
INT = INT(11)

PDM 4 ORACLE 11gR1

DATATYPE Mapping

REAL = NUMBER
ST = VARCHAR(5000)
UID = NUMBER
ED = VARCHAR(5000)
GTS = DATE
TS = DATE
BL = BOOLEAN
INT = NUMBER

WARNINGS

Object names cannot be longer than 30 characters, thus some table and index names have to be shorten.
"auto increment" capabilities are implemented via the creation of SEQUENCES, incremented by PRE-INSERT db triggers.
Mind that SORTing clause is managed differently by the 2 DB technologies:

SELECT * FROM table ORDER BY col1
is allowed in MySQL but forbidden in Oracle, which asks for a more precise statement:
SELECT col1, col2 FROM table ORDER BY col1

MySQL is CASE INSENSITIVE while Oracle is CASE SENSITIVE, that means re-writing in Oracle the WHERE clause and ORDER BY by making use of UPPER() function in order to make the indexed search retrieve data stored in mixed case.

set properly the characterset

CHARACTER SET AL32UTF8;
NATIONAL CHARACTER SET AL16UTF16;

ORACLE ENVIRONMENT

Oracle environment sul server Linux:
EDITOR=vi
export EDITOR
ORACLE_SID=phidb
export ORACLE_SID
ORACLE_BASE=/oracle/app
export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/11.1.0/db_1
export ORACLE_HOME
ORA_CRS_HOME=$ORACLE_BASE/product/11.1.0/crs_1
export ORA_CRS_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH
PATH=$ORACLE_BASE/admin/rman/bin:$ORACLE_HOME/bin:
$ORA_CRS_HOME/bin:/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/X11R6/bin:.
export PATH
umask 022

Place all DATAFILES in $ORACLE_BASE/oradata/phidb
Place ".bash_profile" and ".oracle_profile" into /home/oracle

# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
if [ -f ~/.oracle_profile ]; then . ~/.oracle_profile
fi

.oracle_profile
PS1="`/bin/hostname `-> "
export PS1
EDITOR=vi
export EDITOR
ORACLE_SID=phidb
export ORACLE_SID
ORACLE_BASE=/oracle/app
export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/11.1.0/db_1
export ORACLE_HOME
ORA_CRS_HOME=$ORACLE_BASE/product/11.1.0/crs_1
export ORA_CRS_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH PATH=$ORACLE_BASE/admin/rman/bin:$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/X11R6/bin:.
export PATH
umask 022

$ORACLE_BASE/admin has the following structure:
phidb
|-- adump
|-- ddl
| |-- arrdb.sql
| |-- dropall.sql
| |-- jbpm.sql
| |-- mkall.log
| |-- mkall.sql
| |-- mktablespaces.sql
| |-- mkusers.sql
| |-- qrtz.sql
| `-- rimpdm2.sql
|-- dpdump
| `-- dp.log
|-- pfile
| |-- crspfile.sql
| `-- init.ora
`-- scripts
|-- CreateDB.sql
|-- CreateDBCatalog.sql
|-- CreateDBFiles.sql
|-- JServer.sql
|-- apex.sql
|-- context.sql
|-- cwmlite.sql
|-- emRepository.sql
| -- init.ora
|-- interMedia.sql
|-- lockAccount.sql
|-- ordinst.sql
|-- owb.sql
|-- phidb.sh
|-- phidb.sql
|-- postDBCreation.sql
|-- spatial.sql
|-- ultraSearch.sql
|-- ultraSearchCfg.sql
`-- xdb_protocol.sql

XML database

This section discusses some of the best practices for those applications that map RIM objects to a XML database.
Note to readers: the reader is assumed to have basic knowledge of XML Databases.

It has the advantage that full fidelity (of the XML) is maintained. The use of an XML database makes sense if one only has to support CDA, in which case the XML version has to persisted, even if the data contained therein is stored in a structured fashion.
It has the advantage that one doesn't need any joins (depending on how big of an XML snippet one persists, could be the entire message/CDA message type, or it could just be a series of derived XML snippets, e.g. the CDA part which identifies the Patient).
It has the disadvantage that XQuery is a standard unfamiliar to most, one implementation had to export data to ER-based Datamarts - these Datamarts where queried using SQL to do statistical/clinical analysis of the data.

Best practices:

If one received data according to different Message Types, the clone names used for classes will be different. To use XQuery, one has to transform the XML (as received) into an alternative representation which is based on names of RIM classes. This allows for XQuerying across all data received, regardless of the clone names used in any particular model. If persistence isn't limited to one single message type (e.g. the CDA R-MIM for some applications) the use of the RIM ITS should be seen as a necessity.

@@ Line 335: / Line 335: @@
 *Object names cannot be longer than 30 characters, thus some table and index names have to be shorten.
 *"auto increment" capabilities are implemented via the creation of SEQUENCES, incremented by PRE-INSERT db triggers.
+*Mind that '''SORT'''ing clause is managed differently by the 2 DB technologies:
+ SELECT * FROM table ORDER BY col1
+ is allowed in MySQL but forbidden in Oracle, which asks for a more precise statement:
+ SELECT col1, col2 FROM table ORDER BY col1
+*MySQL is CASE INSENSITIVE while Oracle is CASE SENSITIVE, that means re-writing in Oracle the WHERE clause and ORDER BY by making use of UPPER() function in order to make the indexed search retrieve data stored in mixed case.
+*set properly the characterset
   CHARACTER SET AL32UTF8;
   NATIONAL CHARACTER SET AL16UTF16;

Difference between revisions of "ORM best practices"

Revision as of 23:10, 22 March 2010

ER database

XML database

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

groups

meetings

general

Tools