This wiki has undergone a migration to Confluence found Here
<meta name="googlebot" content="noindex">

ORM best practices

From HL7Wiki
Revision as of 08:10, 25 March 2015 by Rene spronk (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

This page documents a number of best practices when it comes to persisting RIM-based objects, and *O-CP/RP cell transitions.

ER database

This section discusses some of the best practices for those applications that map RIM objects to an E-R database structure.
Note to readers: the reader is assumed to be aware of the characteristics of the object data model, the entity-relationship (ER) data model and the entity-attribute-value (EAV) data model.


  • See ORM for references.

Specific References

Introduction

The purpose of this white paper is to describe a process for transforming the Health Level Seven (HL7) Reference Information Model (RIM) into a Logical Data Model (LDM) and a Physical Data Model (PDM). The HL7 RIM is a conceptual data model built and maintained by HL7 to be used as a source of the data content portion of HL7 standards. Because it is a conceptual data model, the RIM is not well suited to be used as the design of a relational database schema. The RIM contains abstract classes, complex datatypes, and generic business rules that need to be resolved and reconciled to form a LDM. LDM can be mapped to its PDM via Hibernate o-r mapping tool for the target databases MySQL and Oracle.

A series of refinement steps is required to transform the RIM into a LDM that would be useful as input to design of a database schema. The steps are designed to render a less abstract model with normalized data structures and realm specific business rules. The resulting LDM is well suited for use in database design activities. The LDM is technology independent and to some extent application independent. As application design begins the LDM will be used as input and further transformed into a PDM. The PDM will consider the technological implications of the database schema including items such as the capabilities of the target database management system (DBMS), tuned for a typical OLTP application with 300 concurrent end users logged in.

Scope

The classes and datatypes in the HL7 RIM are grouped into logical packages. There are six core packages in the RIM, other packages are used for administrative and model management purposes only. The Entities, Roles, and Acts packages contain the respective entity, role, and act generalization hierarchies and dependent classes. The Datatypes package contains all of the defined version 3.0 datatypes. The Structured Documents and Message Control packages contain classes that represent artifacts peculiar to HL7 messaging and clinical document architecture standards. For the purpose of our example model only a subset of classes of Acts, Datatype, Entities, and Roles packages will be considered within scope of the paper. Acimg1.GIF

To keep this paper simple, let's select only the relevant attributes:

Acimg2.GIF

Resolving Generic Datatypes

These Data Types are included into the JavaSIG package: org.hl7.types.impl The relative java wrappers (WrappingBagAccessor, WrappingSetAccessor, WrappingListAccessor) are included into the package: org.hl7.hibernate

“IVL”, “EIVL” or “PIVL” Datatype

  • Entity.existence_time : IVL <TS>
    • Entity.existence_start_time : TS
    • Entity/existence_end_time : TS
  • Manufactured_material.stability_time : IVL <TS>
    • Manufactured_material.stability_time_start : TS
    • Manufactured_material.stability_time_stop : TS
  • Act.repeat_number : IVL<TS>
    • Act.repeat_number_start : TS
    • Act.repeat_number_stop : TS
  • Supply.expected_use_time : IVL<TS>
    • Supply.expected_use_time_start : TS
    • Supply.expected_use_time_stop : TS
  • SubstanceAdmnistration.rate_qty: IVL<PQ>
    • SubstanceAdmnistration.rate_qty_start : PQ
    • SubstanceAdmnistration.rate_qty_stop : PQ
  • SubstanceAdmnistration.dose_qty: IVL<PQ>
    • SubstanceAdmnistration.dose_qty_start : PQ
    • SubstanceAdmnistration.dose_qty_stop : PQ
  • Account.allowed_balance_qty: IVL<MO>
    • Account.allowed_balance_qty_start: MO
    • Account.allowed_balance_qty_stop: MO
  • RoleLink.effective_time: IVL<TS>
    • RoleLink.effective_time_start: TS
    • RoleLink.effective_time_stop: TS
  • Role.effective_time: IVL<TS>
    • Role.effective_time_start: TS
    • Role.effective_time_stop: TS
  • Participation.time:IVL<TS>
    • Participation.time_start: TS
    • Participation.time_stop: TS

“SET”, “DSET”, “QSET” or “List” Datatypes

  • Entity.id : SET<II> ==> Enity_Identifier.
  • Entity.status_cd : SET<CS> ==> Entity.status_cd : CS
  • Person.ethnic_group_cd : SET<CE> ==> Person.ethnic_group_cd : CE
  • Person.race_cd : SET<CE> ==> Person_Race.
  • Entity.qty : SET<PQ> ==> Entity.qty PQ
  • Person.religious_affiliation : SET<CE> ==> Person.religious_affiliation : CE
  • Act.id: SET<II> ==> Act_Identifier
  • Act.priority: SET<CE> ==> Act_Priority
  • Act.confidentiality_cd: SET<CE> ==> Act_Confidentiality
  • Act.reason_cd: SET<CE> ==> Act_Reason
  • PatientEncounter.special_courtesies_cd: SET<CE> ==> PatientEncounter_Courtesies
  • PatientEncounter.special_arrangement_cd: SET<CE> ==> PatientEncounter_Arrangement
  • Procedure.method_cd: SET<CE> ==> Procedure_method
  • Procedure.approach_site_cd: SET<CE> ==> Procedure_approach_site
  • Procedure.target_site_cd: SET<CE> ==> Procedure_target_site
  • Observation.interpretation_cd: SET<CE> ==> Observation_interpretation
  • Observation.method_cd: SET<CE> ==> Observation_method
  • Observation.target_site_cd: SET<CD> ==> Observation_target
  • SubstanceAdministration.approach_site_cd: SET<CD> ==> SubstanceAdministration_approach_site
  • SubstanceAdministration.dose_check_qty: SET<RTO> ==> SubstanceAdministration_dose_check
  • SubstanceAdministration.max_dose_qty: SET<RTO> ==> SubstanceAdministration_max_dose
  • SubstanceAdministration.method_cd: SET<CD> ==> SubstanceAdministration_method
  • InvoiceElement.modifier_cd: SET<CE> ==> InvoiceElement_modifier
  • DeviceTask.parameter_value: LIST<ANY> ==> DeviceTask_parameters
  • Role.id: SET<II> ==> Role_identifier
  • Role.confidentiality_cd: SET<CE> ==> Role_confidentiality
  • Role.position_number: LIST<INT> ==> Role_position
  • ManagedParticipation.id: SET<II> ==> ManagedParticipation_identifier
  • Document.bibliographic_designation_txt: SET<ED> ==> Document_Designation


Note: SET<CS> type for Entity.status_cd has been simplified to a single CS value (except Person.race_cd).

“BAG” Datatypes

  • Entity.nm : BAG<EN> ==> Enity_Name.
  • Entity.telecom : BAG<TEL> ==> Entity_Telecom_Address.
  • Organization.addr : BAG<AD> ==> Organization_Address.
  • Person.addr : BAG<AD> ==> Person_Address.
  • Role.name BAG<EN> ==> Role_name
  • Role.addr: BAG<AD> ==> Role_address

Acimg3.GIF

Add Association Attributes

  • Entity_identifier./identified_entity : ENTITY
  • Entity_name./named_entity : ENTITY
  • Entity_telcom_address./addressed_entity : ENTITY
  • Organization_address./addressed_organization : ORGANIZATION
  • Person_address./addressed_person : PERSON
  • Person_race./attributed_person : PERSON
  • Act_reason.act: Act
  • Act_Confidentiality: Act
  • Act_Priority: Act
  • Act_identifier: Act
  • PatientEncounter_Courtesies.pencounter: PatientEncounter
  • PatientEncounter_Arragement.pencounter: PatientEncounter
  • Procedure_method.procedure: Procedure
  • Procedure_approach_site.procedure: Procedure
  • Procedure_target_site.procedure: Procedure
  • Observation_interpretation.observation: Observation
  • Observation_target.observation: Observation
  • Observation_method.observation: Observation
  • SubstanceAdministration_approach_site.sub_admin: SubstanceAdministration
  • SubstanceAdministration_dose_check.sub_admin: SubstanceAdministration
  • SubstanceAdministration_method.sub_admin: SubstanceAdministration
  • SubstanceAdministration_max_dose.sub_admin: SubstanceAdministration
  • Role_identifier. identified_role : Role
  • Role_confidentiality.confidentiality_role : Role
  • Role_position. positioned_role : Role
  • Role_name.named_role: Role
  • Role_address.addressed_role: Role
  • Role_telcom.addressed_role: Role
  • ManagedParticipation.identified_mparticipation: ManagedParticipation
  • Document_Designation.document: Document

Acimg4.GIF

Delete Generalization Relationships

  • Entity § Living_subject
  • Entity § Organization
  • Entity § Place
  • Entity § Material
  • Entity § Group
  • Living_subject § Person
  • Living_subject § Non_Person_Living_subject
  • Material § Manufactured_material
  • Manufactured_material § Container
  • Manufactured_material § Device
  • Device § Imaging_modality
  • Act § PatientEncounter
  • Act § ControlAct
  • Act § Supply
  • Act § WorkingList
  • Act § Procedure
  • Act § Observation
  • Act § DeviceTask
  • Act § SubstanceAdministration
  • Act § FinancialContract
  • Act § Account
  • Act § FinancialTransaction
  • Act § InvoiceElement
  • Diet § Supply
  • Observation § PublicHealthCase
  • Observation § DiagnosticImage
  • Role § Employee
  • Role § Access
  • Role § QualifiedEntity
  • Role § LicensedEntity
  • Role § Patient
  • Participation § ManagedParticipation
  • ContextStructure § Document

Acimg5.GIF

Propagate Inherited Attributes

Acimg6.GIF

Propagate Inherited Associations

Acimg7.GIF

Simplify datatypes

  • Person.nm: EN ==> PN
  • Organization.nm: EN ==> ON
  • All attributes of type CE ==> CV

Acimg8.GIF

Datatype expansion

Acimg9.GIF

All classes with composed attributes are expanded in the composing parts. The types interested in this process are PN, ON, EN, AD, TEL, II, PQ.


EN


Every attribute of type EN is expanded as the following schema.

This schema models the compositional nature of the attribute EN expanding in a container relationship.

Tables that have a EN attribute, create a table of name tablename_Name_Part table then this table refers the Entity_Name table.


Acimg10.GIF


ON and EN


ON (Organization Name) and PN (Person Name) are specializations of EN that doesn’t add any attribute, so they are expanded in the same way as EN.


AD


AD represents the concept of address and defines it in terms of parts and subparts. It is expanded like in the in schema:


Acimg11.GIF


The Person entity then refer the Person_Person_Address.


TEL


The TEL datatypes represents a set of telephone reference, so it is modeled as a multipart relation that expands in a schema like the following, in this case is showed the relation applied to an Organization entity:


Acimg12.GIF


II

All attributes of type II are expanded in the three compositing sub attributes:

  • root: UID
  • extension: ST
  • authority: ST


PQ


This type of attribute specify a physical quantity, and has many sub attributes as shows in the following schema:


Acimg13.GIF


Every table with a PQ attribute has reference to the table PhysicalQuantity.


Datatype Promotion

In this phase the CV (Coded Value) and CS (Coded Simple Value) data types are expanded in the corresponding UML class schema, so the all attributes of this types implicitly refers to this inserted element.

Acimg14.GIF


LDM-2-PDM

Given a clean LDM, this guide assumes that Hibernate is used to generate the PDM. Hibernate offers 3 different O-R strategies.

We have chosen:

  • table per hierarchy for the 6 foundation classes (and relative sub-classes) and code values: Act, Entity, Role,Participation,ActRelationship & CodeValue.
  • table per concrete class for all other related classes
  • There is a group of 5 tables dedicated to Vocabolaries and DataTypes

Dictionary.JPG


 if you are going to use JavaSIG to persist RIM data into the PDM mind that you will have to adapt the wrappers.


HIBERNATE MAPPING FILE


PDM 4 MySQL


DATATYPE Mapping


  • REAL = DECIMANL(9,3)
  • ST = VARCHAR(5000)
  • UID = INT(11)
  • ED = VARCHAR(5000)
  • GTS = DATETIME
  • TS = TIMESTAMP
  • BL = BIT(1)
  • INT = INT(11)


PDM 4 ORACLE 11gR1


DATATYPE Mapping


  • REAL = NUMBER
  • ST = VARCHAR(5000)
  • UID = NUMBER
  • ED = VARCHAR(5000)
  • GTS = DATE
  • TS = DATE
  • BL = BOOLEAN
  • INT = NUMBER


WARNINGS


  • Object names cannot be longer than 30 characters, thus some table and index names have to be shorten.
  • "auto increment" capabilities are implemented via the creation of SEQUENCES, incremented by PRE-INSERT db triggers.
  • Mind that SORTing clause is managed differently by the 2 DB technologies:
SELECT * FROM table ORDER BY col1
is allowed in MySQL but forbidden in Oracle, which asks for a more precise statement:
SELECT col1, col2 FROM table ORDER BY col1

  • MySQL is CASE INSENSITIVE while Oracle is CASE SENSITIVE, that means re-writing in Oracle the WHERE clause and ORDER BY by making use of UPPER() function in order to make the indexed search retrieve data stored in mixed case.
  • set properly the characterset
CHARACTER SET AL32UTF8;
NATIONAL CHARACTER SET AL16UTF16;

ORACLE ENVIRONMENT

Oracle environment sul server Linux:
EDITOR=vi
export EDITOR
ORACLE_SID=phidb
export ORACLE_SID
ORACLE_BASE=/oracle/app
export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/11.1.0/db_1
export ORACLE_HOME
ORA_CRS_HOME=$ORACLE_BASE/product/11.1.0/crs_1
export ORA_CRS_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH
PATH=$ORACLE_BASE/admin/rman/bin:$ORACLE_HOME/bin:
$ORA_CRS_HOME/bin:/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/X11R6/bin:.
export PATH
umask 022
Place all DATAFILES in $ORACLE_BASE/oradata/phidb
Place ".bash_profile" and ".oracle_profile" into /home/oracle
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
if [ -f ~/.oracle_profile ]; then . ~/.oracle_profile
fi
.oracle_profile
PS1="`/bin/hostname `-> "
export PS1
EDITOR=vi
export EDITOR
ORACLE_SID=phidb
export ORACLE_SID
ORACLE_BASE=/oracle/app
export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/11.1.0/db_1
export ORACLE_HOME
ORA_CRS_HOME=$ORACLE_BASE/product/11.1.0/crs_1
export ORA_CRS_HOME
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH PATH=$ORACLE_BASE/admin/rman/bin:$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/X11R6/bin:.
export PATH
umask 022
$ORACLE_BASE/admin has the following structure:
phidb
|-- adump
|-- ddl
| |-- arrdb.sql
| |-- dropall.sql
| |-- jbpm.sql
| |-- mkall.log
| |-- mkall.sql
| |-- mktablespaces.sql
| |-- mkusers.sql
| |-- qrtz.sql
| `-- rimpdm2.sql
|-- dpdump
| `-- dp.log
|-- pfile
| |-- crspfile.sql
| `-- init.ora
`-- scripts
|-- CreateDB.sql
|-- CreateDBCatalog.sql
|-- CreateDBFiles.sql
|-- JServer.sql
|-- apex.sql
|-- context.sql
|-- cwmlite.sql
|-- emRepository.sql
| -- init.ora
|-- interMedia.sql
|-- lockAccount.sql
|-- ordinst.sql
|-- owb.sql
|-- phidb.sh
|-- phidb.sql
|-- postDBCreation.sql
|-- spatial.sql
|-- ultraSearch.sql
|-- ultraSearchCfg.sql
`-- xdb_protocol.sql

XML database

This section discusses some of the best practices for those applications that map RIM objects to a XML database.
Note to readers: the reader is assumed to have basic knowledge of XML Databases.

  • It has the advantage that full fidelity (of the XML) is maintained. The use of an XML database makes sense if one only has to support CDA, in which case the XML version has to persisted, even if the data contained therein is stored in a structured fashion.
  • It has the advantage that one doesn't need any joins (depending on how big of an XML snippet one persists, could be the entire message/CDA message type, or it could just be a series of derived XML snippets, e.g. the CDA part which identifies the Patient).
  • It has the disadvantage that XQuery is a standard unfamiliar to most, one implementation had to export data to ER-based Datamarts - these Datamarts where queried using SQL to do statistical/clinical analysis of the data.

Best practices:

  • If one received data according to different Message Types, the clone names used for classes will be different. To use XQuery, one has to transform the XML (as received) into an alternative representation which is based on names of RIM classes. This allows for XQuerying across all data received, regardless of the clone names used in any particular model. If persistence isn't limited to one single message type (e.g. the CDA R-MIM for some applications) the use of the RIM ITS should be seen as a necessity.