
Software Implementation of CDA

From HL7Wiki
[[category:AID Whitepaper]][[Category:HowTo]]
'''This whitepaper is one of a [[:Category:AID Whitepaper|series of whitepapers]] created by the [[AID]] Work Group. The whitepaper is based on actual CDA implementation experiences and aims to document a best practice or an implementation pattern.'''
 
The contents of this whitepaper were approved by the AID WG on 2017-05-09 as a reflection of current best practice. This is a "living" document; it may be updated by any person at any point in time.
 
Short URL: [http://j.mp/gDwZKm http://j.mp/gDwZKm] - See also: [[CDA Implementation Tools]]
  
 
==Summary==
 
This paper addresses the creation of a software application that has to support the CDA R2 model. It discusses the application architecture and various approaches to code generation and persistence.
 
 
Although it is tempting to use XML techniques to support the creation, validation and parsing of CDA documents, this paper shows that this approach is associated with a high risk of non-conformant CDA instances. A model-driven class generator should be used if one wants to ensure compliance with the CDA standard as well as with the appropriate implementation guide and associated templates.
  
'''Note''': this paper assumes the application has to support one HL7 version 3 model (CDA) '''only'''. The use case in which one needs to support multiple version 3 SIMs is covered in these discussion pages: [[Schema based code generation]] and [[MIF based code generation]].
 
==Introduction==
 
The HL7 e-Document standard (Clinical Document Architecture or CDA) is part of the HL7 version 3 standard. The current release of that standard (Release 2) was published in 2005. CDA documents are used in a large number of projects, quite often in combination with HL7 version 3 messages or services. This article covers the development of software applications that have to support the CDA standard. The primary audience consists of application architects and software developers.
  
 
The implementation of the CDA standard and the validation of CDA-conformant XML instances are based on two types of specifications:
#The CDA class model, a refinement of the HL7 Reference Information Model (RIM). The class model is expressed in MIF (Model Interchange Format), the meta model format used by HL7 for all version 3 artefacts, or in derivations thereof such as UML or XML Schema. The CDA class model references HL7 version 3 data types and coding systems.
#Context-specific constraints (Templates) on the generic CDA model, as defined in a CDA implementation guide for a specific document type and a specific context (e.g. country or project). Templates can express constraints on the class model itself, on the use of data types, on the values defined by coding systems, or they can be expressions of business rules. An example of the latter category is a template which defines that the 'creation date' of “Natal report” documents SHALL be no more than 7 days after the birth of the child.
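The business-rule example in the list above can be sketched in a few lines of code. This is a minimal illustration only, not part of any implementation guide: the function names are hypothetical, the inputs are assumed to be HL7 v3 TS strings starting with YYYYMMDD, and rejecting a creation date that precedes the birth date is an added assumption.

```python
from datetime import date, datetime, timedelta


def parse_ts(value: str) -> date:
    """Parse the date part (YYYYMMDD) of an HL7 v3 TS string.

    Anything after the first eight characters (time, time zone) is ignored.
    """
    return datetime.strptime(value[:8], "%Y%m%d").date()


def natal_report_created_in_time(birth_ts: str, creation_ts: str,
                                 max_days: int = 7) -> bool:
    """Check the example rule: creation date no more than max_days after birth.

    Assumption: a creation date before the birth date is also rejected.
    """
    elapsed = parse_ts(creation_ts) - parse_ts(birth_ts)
    return timedelta(0) <= elapsed <= timedelta(days=max_days)
```

A rule of this kind concerns the relationship between two values in the document, which is why it is expressed as a template constraint (for example in OCL or Schematron) rather than in the class model or the XML schema.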
At this point in time Templates are defined in one of two forms:
*textual form as part of a CDA implementation guide; these can be (manually) transformed into software processable specifications such as OCL or Schematron. Many implementation guides are published jointly with Schematron-based versions of the templates.
*electronic form as supported by template design tools. The underlying electronic format is (as of yet) proprietary in the case of Lantana’s Trifolia (http://www.lantanagroup.com/newsroom/press-releases/trifolia-workbench-hl7-web-edition/), or is based on the HL7 Templates DSTU (HL7 Templates Standard: Specification and Use of Reusable Information Constraint Templates, Release 1) in the case of ART-DECOR (http://www.art-decor.org).
  
In 2014 HL7 published a standard for the expression of template definitions (the DSTU HL7 Templates Standard). This format is in the process of being adopted by Template Editors, Template Repositories, as well as Schematron and code generation tools. The ART-DECOR tool has already incorporated the DSTU: it can create, edit and manage templates, generate Schematron files from the definitions, and validate instances, and it acts as both a registry and a repository for templates and accompanying value sets. ART-DECOR also acts as a reference tool for the Templates DSTU. IHE’s testing and validation suite Gazelle (ObjectsChecker) has the ability to consume ART-DECOR templates in order to allow model driven validation of CDA instances.
  
The management aspect of templates is a major issue: a single CDA implementation guide may define hundreds of templates – which are quite often defined in terms of templates defined in other (more generic) CDA implementation guides. This issue is also illustrated by the creation of a US-Realm Consolidated CDA implementation guide (a.k.a. CCDA) - the number of templates (e.g. as defined by IHE, HL7, and HITSP) and the incompatibilities between them made it necessary to consolidate a number of template definitions.
  
 
===MIF and XML schema===
 
An HL7 MIF definition of the CDA class model is provided with the HL7 v3 standard. The CDA MIF file can be transformed into less "rich" expressions such as UML and XML schema. Parts of the requirements expressed by the MIF are lost during the transformation process.
  
CDA instances are based on XML and the standard requires that all CDA instances validate (at a minimum) against a published CDA XML schema. This is the main reason why a lot of CDA implementations are based on the CDA XML schema. The wide availability of XML tools is a definite advantage; there are disadvantages as well. The XML schema language is by far not rich enough to express all of the requirements present in the original CDA class model. A CDA document instance that validates against the XML schema is '''not''' guaranteed to be a valid CDA instance; to be a valid CDA instance the XML has to conform to the requirements that are expressed in the CDA class model.
 
   
 
   
Examples of the limited capabilities of the XML schema language to express the model requirements include the use of conditional XML attributes with an HL7 v3 data type: a CD data type should either use both the attributes {@code and @codeSystem}, or the attribute @nullFlavor. This requirement simply can't be expressed in XML schema. As a consequence, a CDA instance that only contains @code will be considered to be a valid document instance if validated against the CDA XML schema. Another example is the use of empty XML elements (<element/>); these are not allowed in any HL7 version 3 instance (see Footnote 1). This can't be specified in XML schema either. There are complex workarounds for some of the above limitations of the XML schema language; these however lead to large and unwieldy schema definitions.
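The two examples above can be made concrete with a small check that goes beyond schema validation. The following is a minimal sketch only: the function names are hypothetical, the set of element names treated as CD-typed is an assumption supplied by the caller, the rule is simplified to exactly the wording used above, and the sample code value is made up.

```python
import xml.etree.ElementTree as ET


def cd_is_valid(element) -> bool:
    """Simplified CD rule: (code and codeSystem) or nullFlavor."""
    attrs = element.attrib
    return ("code" in attrs and "codeSystem" in attrs) or "nullFlavor" in attrs


def is_empty(element) -> bool:
    """True for <element/>: no attributes, no child elements, no text."""
    return (not element.attrib and len(element) == 0
            and not (element.text or "").strip())


def find_problems(xml_text: str, cd_names=("code",)) -> list:
    """List rule violations that an XML schema validator would not report.

    cd_names: local names of elements assumed to carry a CD data type.
    """
    problems = []
    for element in ET.fromstring(xml_text).iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # strip the namespace
        if is_empty(element):
            problems.append(f"empty element <{local_name}/>")
        elif local_name in cd_names and not cd_is_valid(element):
            problems.append(f"<{local_name}> needs code+codeSystem or nullFlavor")
    return problems


# Hypothetical fragment: the code lacks @codeSystem and <value/> is empty.
SAMPLE = ('<observation xmlns="urn:hl7-org:v3">'
          '<code code="1234-5"/><value/></observation>')
```

Calling `find_problems(SAMPLE)` reports both issues in this fragment. A model-based implementation would derive checks like these from the class model instead of coding them by hand.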

Note: (September 2012) [http://www.w3.org/XML/Schema XML Schema 1.1], a yet to be finalized W3C specification, does support many of the desired features. It has yet to be determined whether or not most XML tools support version 1.1 - that would be a prerequisite for HL7 to start generating XML Schema 1.1.

Nictiz, the Dutch NHIN provider which specifies HL7 v3 artefacts for use in the Netherlands, has resorted to publishing a large set of Schematron files (mainly for data types and coding systems) to deal with the 'incomplete' validation as supported by XML schema. The limitations of XML schema are also illustrated by the "Common issues found in implementations of the HL7 Clinical Document Architecture (CDA)" paper (http://www.ringholm.de/docs/03020_en_HL7_CDA_common_issues_error.htm) written in 2008, and the "Model-based Analysis of HL7 CDA R2 Conformance and Requirements Coverage" paper (http://www.ejbi.org/img/ejbi/2015/2/Boufahja_en.pdf) written by IHE in 2015.
  
In order to fulfill all requirements as expressed by the CDA class model the starting point for all CDA implementations would have to be the CDA MIF. MIF however has the disadvantage that it is an HL7-specific format which is only supported by a limited number of tools.
  
 
==Software development approaches==
 
Current CDA implementations can be divided into two categories: one which uses XML technologies and tools, and another which is based on the CDA class model (MIF or UML).
  
 
===CDA implementation using XML techniques===
 
The main reason for using XML techniques when implementing CDA is the fact that CDA is based on the XML standard: CDA instances are XML documents and there is a published XML schema for CDA documents. Template definitions are generally made available in the form of Schematron files. Schematron is part of the XML family of specifications; a Schematron compiler is based on a generic XSLT engine. The choice to use XML techniques is therefore an obvious one.

Class generators are commonly used alongside other well known XML techniques such as XPath and DOM/SAX. JAXB is an example of a class generator: a tool which transforms XML schema to corresponding Java classes.

A number of online CDA validation tools are based on XML techniques (Schema and Schematron); examples include NIST (http://xreg2.nist.gov/cda-validation/validation.html), and Lantana (https://www.lantanagroup.com/validator/). These tools can be used to test the validity of CDA instances. An XML document that is considered to be valid by these tools is not necessarily a valid CDA instance, as XML-based tools are not capable of validating all aspects of a CDA.
  
===Model driven CDA implementation===
The basis for model driven implementations is the CDA class model as documented in the CDA MIF. Because CDA is essentially an information model without any behavioral aspects associated with it, one has the option of creating a very solid mapping from CDA MIF to UML, which in turn allows for the use of UML based tools.

The CDA MIF (or the UML equivalent thereof) can be used by class generators to create a set of classes (in e.g. Java or C#). At this point in time (November 2010) there are a few freely available class generators which one could consider when implementing CDA:
#MDHT (http://www.cdatools.org/), a CDA specific class generator. This tool generates Java classes based on a UML representation of the CDA class model and on an OCL representation of applicable templates. The tool also supports the management and definition of templates to deal with the multitude of template definitions in CDA implementation guides. MDHT allows for the definition of templates in a table-like structure. Templates can be defined to be additional constraints on other templates. Templates defined in this way are automatically transformed into formal OCL statements; these OCL statements are in turn used when generating the Java classes.
#IHE’s testing and validation suite Gazelle (ObjectsChecker) has the ability to consume ART-DECOR templates in order to allow model driven validation of CDA instances. Although aimed at validation, ObjectsChecker is also a code generator.
#MARC-HI Everest (http://everest.marc-hi.ca/), an HL7 version 3 (not just CDA) MIF-based class generator. In 2014 this tool embraced the HL7 Template Definition standard. A new add-on module (Sherpas) has been developed to compile the HL7 templates (e.g. as produced by ART-DECOR) and to generate code. Sherpas is available in the development branch of Everest and will be included in the 1.4 release of Everest (which was scheduled for the summer of 2015, but hadn't been published as of May 2016). The current Everest 1.3 release has only partial support for templates. See https://www.youtube.com/watch?v=p5oasVIQaNE&feature=youtu.be for a recorded presentation related to Everest and Sherpas.

There is an online validation tool based on MDHT (http://cdatools.org/validation/), and IHE hosts an online 2-step (schema as well as model based) validation tool (http://gazelle.ihe.net/EVSClient/cda/validator.seam?cid=9).
  
===Green CDA===
The [[GreenCDA Project]] added the concept of a simplified XML format that can be transformed to and from the full normative CDA XML format. This has the implementation advantage that simplified schemas exist for GreenCDA versions of a CDA document: an improvement over the generic normative CDA schema in terms of validation strength as well as suitability as a basis for code generation. Simplified schemas, optimized for human readability, generate readable code.

In principle each and every implementer could define their own GreenCDA format; GreenCDA describes a process and not a format. When the process of 'Greening' is applied to Entry-level CDA templates there can be a significant level of re-use between various CDA document types. Generally such an approach will only be possible if one has to implement multiple document types based on one common set of templates (e.g. in the context of US Meaningful Use requirements, or when implementing a set of IHE PCC implementation guides for CDA).
*When implemented in an XML-centric fashion the same caveats and limitations as described above (see section ''CDA implementation using XML techniques'') apply; the biggest one being that XML-based tools are simply not capable of validating all aspects of a CDA.
*However, one could also view the Green CDA format as a (simplified) class model, and use it for code generation. The normative full CDA format is then transformed to/from the Green CDA XML format, which in turn is generated/processed by the generated code.
  
 
==Persistence==
 
The CDA standard contains a number of requirements when it comes to the persistence of CDA documents. One of the requirements is that one should be able to reproduce the exact XML instance which was originally received. In a relational database this requires that one stores the CDA document as a blob; alternatively one could use a native XML database.

If one uses a class generator it is recommended that one uses an ORM (Object-Relational Mapper) toolkit to persist the data present in the CDA instance. The CDA document is 'shredded' into its constituent data components. If data is extracted from a CDA document one should ensure that the relationship between the data and its source document is persisted as well: the document (as a whole) could be replaced or nullified at a later point in time, in which case one should also have the ability to designate the extracted data as either nullified or replaced.
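The persistence requirements above can be sketched as a small relational layout. This is a minimal illustration under stated assumptions: it uses Python's built-in sqlite3 module with plain SQL rather than an ORM toolkit, and the table, column and function names are all hypothetical. The original document is kept byte for byte as a blob, every extracted item refers back to its source document, and the status of the document determines whether its items are still treated as active.

```python
import sqlite3

SCHEMA = """
CREATE TABLE document (
    id      TEXT PRIMARY KEY,
    raw_xml BLOB NOT NULL,                   -- the instance exactly as received
    status  TEXT NOT NULL DEFAULT 'active'   -- 'active', 'replaced' or 'nullified'
);
CREATE TABLE extracted_item (
    item_id     INTEGER PRIMARY KEY,
    document_id TEXT NOT NULL REFERENCES document(id),
    name        TEXT NOT NULL,
    value       TEXT
);
"""


def open_store():
    """Create an in-memory store; a real application would use a file or server."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_document(conn, doc_id, raw_xml: bytes, items: dict):
    """Persist the unmodified XML plus the data 'shredded' out of it."""
    conn.execute("INSERT INTO document (id, raw_xml) VALUES (?, ?)",
                 (doc_id, raw_xml))
    conn.executemany(
        "INSERT INTO extracted_item (document_id, name, value) VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in items.items()])
    conn.commit()


def set_document_status(conn, doc_id, status):
    """Mark a document as replaced or nullified; its items follow via the join."""
    conn.execute("UPDATE document SET status = ? WHERE id = ?", (status, doc_id))
    conn.commit()


def active_items(conn):
    """Return only the items whose source document is still active."""
    return conn.execute(
        "SELECT i.name, i.value FROM extracted_item i "
        "JOIN document d ON d.id = i.document_id "
        "WHERE d.status = 'active' ORDER BY i.item_id").fetchall()


def original_xml(conn, doc_id) -> bytes:
    """Return the stored instance exactly as it was received."""
    return conn.execute("SELECT raw_xml FROM document WHERE id = ?",
                        (doc_id,)).fetchone()[0]
```

Because the status lives on the document row, nullifying or replacing a document affects all data extracted from it without updating each item separately.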
  
==Processing==
CDA documents may be based on a specific implementation guide and a series of templates. These provide context which may be used when processing the contents of the document. Knowledge of the context (the underlying use case) allows one to reuse code.

In general it is advisable to use the available context for processing. As an alternative one could base the entire processing logic on the data itself, irrespective of the implementation guide and templates used.
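One way to use the available context is to dispatch on the templateId values declared at the top of the document, and to fall back to data-driven processing when no known template is found. The sketch below is only an illustration of that idea: the function names are hypothetical, as is any template identifier passed to it, and it inspects only the templateId elements directly under the root element.

```python
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}  # namespace of HL7 version 3 XML instances


def template_ids(xml_text: str) -> list:
    """Return the @root values of the document-level templateId elements."""
    root = ET.fromstring(xml_text)
    return [t.get("root") for t in root.findall("hl7:templateId", NS)]


def process(xml_text: str, handlers: dict, fallback):
    """Use a context-specific handler if one is registered, else the fallback.

    handlers maps a template identifier to a function taking the XML text;
    fallback implements processing based on the data itself.
    """
    for template_id in template_ids(xml_text):
        if template_id in handlers:
            return handlers[template_id](xml_text)
    return fallback(xml_text)
```

Handlers registered per template can share code across document types that are built on a common set of templates, which is the reuse referred to above.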
  
==Summary and Recommendations==
The diagram below shows the relationships between the various artefacts discussed in this whitepaper. A CDA document has to conform to the requirements as defined in a CDA Implementation Guide. It has to conform to both the formal CDA class model and the templates. The CDA class model can be expressed in either MIF, or in a derived format such as UML or XML schema. Templates can be expressed in Schematron, or in OCL, or in MIF with OCL annotations. The actual validation of CDA instances is based on the expressions of the CDA class model and the applicable templates.
[[Image:Cda implementation article.PNG|center]]
  
In practice the use of XML techniques leads to the creation of CDA documents that are not valid instances of the CDA standard. If one is forced to rely solely on XML techniques (and up to about 2011 there really wasn’t any other available option) one should pay particular attention to the HL7 version 3 data types and coding systems. Those are the areas that most often lead to issues because of the lack of expressivity/richness of the XML schema language.

Sources at two of the organizations responsible for CDA online validation tools confirm the above: if they were to develop such a tool from scratch they would not base it on XML techniques. The current online tools produce too many false positives - documents which are erroneously declared to be valid CDA instances. The management of templates in the form of a set of Schematron files is also reported to be problematic.

A software application will have to be based on the CDA class model if one wishes to ensure that one creates valid CDA instances. Applications that are based on the CDA XML schema can't guarantee that the documents are valid CDA instances. Both the MDHT and the Everest toolkits support templates; they are the best candidates for a model based implementation.

==Footnotes==
#On the use of empty XML elements (&lt;element/>): in certain rather exotic circumstances empty XML elements may occur in HL7 version 3 instances. For example: if a model were to have a mandatory participation linked to a Role which has no required/mandatory attributes, the Role could be present in the instance as an empty XML element. The CDA model doesn't contain any such requirements; a particularly exotic CDA template could have this type of requirement as well, resulting in an empty XML element. The statement by the authors of this white paper that "no v3 instance SHALL contain an empty XML element" leads to far fewer "false positives" and a very small number of "false negatives".

Latest revision as of 08:25, 10 May 2017

This whitepaper is one of a series of whitepapers created by the AID Work Group. The whitepaper is based on actual CDA implementation experiences and aims to document a best practice or an implementation pattern.

The contents of this whitepaper were approved by the AID WG on 2017-05-09 as a reflection of current best practice. This is a "living" document, it may be updated by any person at any point in time.

Short URL: http://j.mp/gDwZKm - See also: CDA Implementation Tools

Summary

This paper addresses the creation of a software application that has to support the CDA R2 model. It discusses the application architecture, and discusses various approaches with regards to code generation and persistence.

Although it is tempting to use XML techniques to support the creation, validation and parsing of CDA documents this paper shows that this approach is associated with a high risk of non conformant CDA instances. A model driven class code generator should be used if one wants to ensure compliance with the CDA standard as well as the appropriate implementation guide and associated templates.

Note: this paper assumes the application has to support one HL7 version 3 model (CDA) only. The use-case whereby one needs to support multiple version 3 SIMs is covered in these discussion pages: Schema based code generation and MIF based code generation.

Introduction

The HL7 e-Document standard (Clinical Document Architecture or CDA) is part of the HL7 version 3 standard. The current release of that standard (Release 2) was published in 2005. CDA documents are used in a large number of projects, quite often in combination with HL7 version 3 messages or services. This article covers the development of software applications that have to support the CDA standard. The primary audience consists of application architects and software developers.

The implementation of the CDA standard and the validation of CDA-conform XML instances is based on two types of specifications:

  1. The CDA class model, a refinement of the HL7 Reference Information Model (RIM). The class model is expressed in MIF (Model Interchange Format), the meta model format used by HL7 for all version 3 artefacts, or in derivations thereof such as UML or XML Schema. The CDA class model references HL7 version 3 data types and coding systems.
  2. Context-specific constraints (Templates) of the generic CDA model, as defined in a CDA implementation guide for a specific document type and a specific context (e.g. country or project). Templates could express constraints on the class model itself, on the use of data types, on the values defined by coding systems, or they could be expressions of business rules. An example of the latter category is a template which defines that the 'creation date' of “Natal report” documents SHALL be no more than 7 days after the birth of the child.
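
The business-rule example above can be sketched in a few lines of Python. This is only an illustration of what such a template constraint checks; the function name and the way the two dates are obtained are assumptions, and a real implementation would extract them from the CDA instance.

```python
from datetime import date, timedelta

# Limit taken from the "Natal report" example in the text.
MAX_DELAY = timedelta(days=7)


def creation_date_is_valid(creation_date: date, birth_date: date) -> bool:
    """Return True if the document was created no more than 7 days after birth.

    Hypothetical helper: the rule is read literally, so a creation date
    on or before the birth date also passes.
    """
    return creation_date - birth_date <= MAX_DELAY
```

A rule like this cannot be expressed in XML schema; it has to be checked by Schematron, OCL, or application code.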

At this point in time Templates are defined either in

  • textual form as part of a CDA implementation guide; these can be (manually) transformed into software processable specifications such as OCL or Schematron. Many implementation guides are being published jointly with Schematron-based versions of the templates.
  • electronic form as supported by template design tools. The underlying electronic format is (as of yet) proprietary in the case of Lantana’s Trifolia (http://www.lantanagroup.com/newsroom/press-releases/trifolia-workbench-hl7-web-edition/), or is based on the HL7 Templates DSTU (HL7 Templates Standard: Specification and Use of Reusable Information Constraint Templates, Release 1) in the case of ART-DECOR (http://www.art-decor.org).

Recently (2014) HL7 published a standard for the expression of template definitions (the DSTU HL7 Templates Standard). This format is in the process of being adopted by template editors, template repositories, and Schematron and code generation tools. The ART-DECOR tool has already incorporated the DSTU: it can create, edit and manage templates, generate Schematron files from the definitions, validate instances, and act as both a registry and a repository for templates and their accompanying value sets. ART-DECOR also acts as a reference tool for the Templates DSTU. IHE’s testing and validation suite Gazelle (ObjectsChecker) can consume ART-DECOR templates in order to allow model-driven validation of CDA instances.

The management aspect of templates is a major issue: a single CDA implementation guide may define hundreds of templates – which are quite often defined in terms of templates defined in other (more generic) CDA implementation guides. This issue is also illustrated by the creation of a US-Realm Consolidated CDA implementation guide (a.k.a. CCDA) - the number of templates (e.g. as defined by IHE, HL7, and HITSP) and the incompatibilities between them made it necessary to consolidate a number of template definitions.

MIF and XML schema

An HL7 MIF definition of the CDA class model is provided with the HL7 v3 standard. The CDA MIF file can be transformed into less "rich" expressions such as UML and XML schema. Parts of the requirements as expressed by the MIF are lost during the transformation process.

CDA instances are based on XML and the standard requires that all CDA instances validate (at a minimum) against a published CDA XML schema. This is the main reason why a lot of CDA implementations are based on the CDA XML schema. The wide availability of XML tools is a definite advantage; there are disadvantages as well. The XML schema language is not rich enough by far to express all of the requirements as present in the original CDA class model. A CDA document instance that validates against the XML schema is not guaranteed to be a valid CDA instance - to be a valid CDA instance one has to create XML that conforms to the requirements that are expressed in the CDA class model.

Examples of the limited capabilities of the XML schema language to express the model requirements include the use of conditional XML attributes with an HL7 v3 data type: a CD data type should use either both of the attributes @code and @codeSystem, or the attribute @nullFlavor. This requirement simply can't be expressed in XML schema. As a consequence, a CDA instance that only contains @code will be considered a valid document instance if validated against the CDA XML schema. Another example is the use of empty XML elements (<element/>); these are not allowed in any HL7 version 3 instance (see Footnote 1), but this can't be specified in XML schema. There are complex workarounds for some of the above limitations of the XML schema language; these however lead to large and unwieldy schema definitions.
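
The two checks described above are easy to express in application code, even though XML schema cannot express them. The following is a minimal sketch using Python's standard library; the function names are hypothetical, the sample fragment is not a complete CDA document, and the CD rule is a simplified reading of the text (any element carrying @nullFlavor is accepted).

```python
import xml.etree.ElementTree as ET


def cd_attributes_ok(elem: ET.Element) -> bool:
    """Simplified CD rule: @nullFlavor, or both @code and @codeSystem."""
    if "nullFlavor" in elem.attrib:
        return True
    return "code" in elem.attrib and "codeSystem" in elem.attrib


def find_empty_elements(root: ET.Element) -> list:
    """Return the tags of elements without attributes, children or text."""
    return [
        e.tag
        for e in root.iter()
        if not e.attrib and len(e) == 0 and not (e.text or "").strip()
    ]


# Illustrative fragment only; the code value is made up.
SAMPLE = """<observation xmlns="urn:hl7-org:v3">
  <code code="1234-5"/>
  <value nullFlavor="UNK"/>
  <text/>
</observation>"""

root = ET.fromstring(SAMPLE)
code_ok = cd_attributes_ok(root[0])    # @code without @codeSystem
value_ok = cd_attributes_ok(root[1])   # @nullFlavor only
empty = find_empty_elements(root)
```

Here the code element fails the CD check, the value element passes, and the text element is reported as empty, although the schema alone would not flag either problem.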

Note (September 2012): XML Schema 1.1, a yet-to-be-finalized W3C specification, does support many of the desired features. It has yet to be determined whether or not most XML tools support version 1.1; that would be a prerequisite for HL7 to start generating XML Schema 1.1.

Nictiz, the Dutch NHIN provider which specifies HL7 v3 artefacts for use in the Netherlands, has resorted to publishing a large set of Schematron files (mainly for data types and coding systems) to deal with the 'incomplete' validation as supported by XML schema. The limitations of XML schema are also illustrated by the "Common issues found in implementations of the HL7 Clinical Document Architecture (CDA)" paper (http://www.ringholm.de/docs/03020_en_HL7_CDA_common_issues_error.htm) written in 2008, and the "Model-based Analysis of HL7 CDA R2 Conformance and Requirements Coverage" paper (http://www.ejbi.org/img/ejbi/2015/2/Boufahja_en.pdf) written by IHE in 2015.

In order to fulfill all requirements as expressed by the CDA class model, the starting point for all CDA implementations would have to be the CDA MIF. MIF, however, has the disadvantage that it is an HL7-specific format which is only supported by a limited number of tools.

Software development approaches

The current implementations of CDA can be divided into two categories: a group which uses XML technologies and tools, and another group which is based on the CDA class model (MIF or UML).

CDA implementation using XML techniques

The main reason for using XML techniques when implementing CDA is the fact that CDA is based on the XML standard: CDA instances are XML documents and there is a published XML schema for CDA documents. Template definitions are generally made available in the form of Schematron files. Schematron is part of the XML family of specifications; a Schematron compiler is based on a generic XSLT engine. The choice to use XML techniques is therefore an obvious one.

Class generators are commonly used alongside other well-known XML techniques such as XPath and DOM/SAX. JAXB is an example of a class generator: a tool which transforms XML schema into corresponding Java classes.
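
As an illustration of the XPath-style approach mentioned above, the sketch below reads a title and a patient identifier from a CDA-like fragment using Python's standard library. The fragment is deliberately incomplete (it is not a valid CDA document), and the OID and identifier values are made up.

```python
import xml.etree.ElementTree as ET

# CDA instances use the HL7 v3 namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Illustrative fragment only; 2.999 is the example OID arc.
DOC = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <title>Discharge summary</title>
  <recordTarget>
    <patientRole>
      <id root="2.999.1" extension="12345"/>
    </patientRole>
  </recordTarget>
</ClinicalDocument>"""

root = ET.fromstring(DOC)

# Path expressions address elements directly, without generated classes.
title = root.findtext("hl7:title", namespaces=NS)
patient_ids = [
    (e.get("root"), e.get("extension"))
    for e in root.findall("hl7:recordTarget/hl7:patientRole/hl7:id", NS)
]
```

Code of this kind is quick to write, but it knows nothing about the CDA class model: nothing prevents it from reading or producing structures that the model forbids.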

A number of online CDA validation tools are based on XML techniques (Schema and Schematron); examples include NIST (http://xreg2.nist.gov/cda-validation/validation.html), and Lantana (https://www.lantanagroup.com/validator/). These tools can be used to test for validity of CDA instances. An XML document that is considered to be valid by these tools is not necessarily a valid CDA instance, as XML-based tools are not capable of validating all aspects of a CDA.

Model driven CDA implementation

The basis for model-driven implementations is the CDA class model as documented in the CDA MIF. Because CDA is essentially an information model without any behavioral aspects associated with it, one has the option of creating a very solid mapping from CDA MIF to UML, which in turn allows for the use of UML-based tools.

The CDA MIF (or the UML equivalent thereof) can be used by class generators to create a set of classes (in e.g. Java or C#). At this point in time (November 2010) there are a couple of freely available class generators which one could consider when implementing CDA:

  1. MDHT (http://www.cdatools.org/), a CDA specific class generator. This tool generates Java classes based on a UML representation of the CDA class model and on an OCL representation of applicable templates. The tool also supports the management and definition of templates to deal with the multitude of template definitions in CDA implementation guides. MDHT allows for the definition of templates in a table-like structure. Templates can be defined to be additional constraints on other templates. Templates defined in this way are automatically transformed into formal OCL statements; these OCL statements are in turn used when generating the Java classes.
  2. IHE’s testing and validation suite Gazelle (ObjectsChecker) has the ability to consume ART-DECOR templates in order to allow model driven validation of CDA instances. Although aimed at validation ObjectsChecker is also a code generator.
  3. MARC-HI Everest (http://everest.marc-hi.ca/), an HL7 version 3 (not just CDA) MIF-based class generator. This tool has recently (2014) embraced the HL7 Template Definition standard. A new add-on module (Sherpas) has been developed to compile the HL7 templates (e.g. as produced by ART-DECOR) and to generate code. Sherpas is available in the development branch of Everest and will be included in the 1.4 release of Everest (which was scheduled for the summer of 2015, but hasn't been published as of May 2016). The current Everest 1.3 release has only partial support for templates. See https://www.youtube.com/watch?v=p5oasVIQaNE&feature=youtu.be for a recorded presentation related to Everest and Sherpas.

There is an online validation tool based on MDHT (http://cdatools.org/validation/), and IHE hosts an online 2-step (schema as well as model-based) validation tool (http://gazelle.ihe.net/EVSClient/cda/validator.seam?cid=9).

Green CDA

The GreenCDA Project added the concept of a simplified XML format that can be transformed to and from the full normative CDA XML format. This has the implementation advantage that simplified schema exist for GreenCDA versions of a CDA document - an improvement over the generic normative CDA schema in terms of validation strength as well as its suitability to be used as a basis for code generation. Simplified schema, optimized for human readability, generate readable code.

In principle each and every implementer could define their own GreenCDA format: GreenCDA describes a process and not a format. When the process of 'Greening' is applied to Entry-level CDA templates there can be a significant level of re-use between various CDA document types. Generally such an approach will only be possible if one has to implement multiple document types based on one common set of templates (e.g. in the context of US Meaningful Use requirements, or when implementing a set of IHE PCC implementation guides for CDA).

  • When implemented in an XML-centric fashion the same caveats and limitations as described above (see section CDA implementation using XML techniques) apply; the biggest one being that XML-based tools are simply not capable of validating all aspects of a CDA.
  • However, one could also view the Green CDA format as a (simplified) class model, and use it for code generation. The normative full CDA format is then transformed to/from the Green CDA XML format, which in turn is generated/processed by the generated code.
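
The transformation between a simplified format and the full CDA format can be sketched as follows. The "green" element name bodyWeight is entirely hypothetical (each implementer defines their own format), and the generated observation is a bare illustration that omits everything else a real CDA entry would need.

```python
import xml.etree.ElementTree as ET

HL7 = "urn:hl7-org:v3"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def green_to_cda(green: ET.Element) -> ET.Element:
    """Expand a hypothetical <bodyWeight unit="kg">72.5</bodyWeight> element."""
    obs = ET.Element(f"{{{HL7}}}observation",
                     {"classCode": "OBS", "moodCode": "EVN"})
    # LOINC code for body weight, used here purely as an illustration.
    ET.SubElement(obs, f"{{{HL7}}}code",
                  {"code": "29463-7", "codeSystem": "2.16.840.1.113883.6.1"})
    ET.SubElement(obs, f"{{{HL7}}}value",
                  {f"{{{XSI}}}type": "PQ",
                   "value": green.text.strip(),
                   "unit": green.get("unit")})
    return obs


def cda_to_green(obs: ET.Element) -> ET.Element:
    """Collapse the observation back into the simplified element."""
    value = obs.find(f"{{{HL7}}}value")
    green = ET.Element("bodyWeight", {"unit": value.get("unit")})
    green.text = value.get("value")
    return green


green = ET.fromstring('<bodyWeight unit="kg">72.5</bodyWeight>')
full = green_to_cda(green)
round_trip = cda_to_green(full)
```

The fixed parts of the template (class code, mood code, observation code) live in the transformation, which is why the simplified format can stay small and readable.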

Persistence

The CDA standard contains a number of requirements when it comes to the persistence of CDA documents. One of the requirements is that one should be able to reproduce the exact same XML instance which was originally received. In a relational database this requires that one stores the CDA document as a blob; alternatively one could use a native XML database.

If one uses a class generator it is recommended that one uses an ORM (Object-Relational Mapper) toolkit to persist the data as present in the CDA instance. The CDA document is 'shredded' into its constituent data components. If data is extracted from a CDA document one should ensure that the relationship between the data and its source document is persisted as well: the document (as a whole) could be replaced or nullified at a later point in time, in which case one should also have the ability to designate the data as either nullified or replaced.
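
The two persistence requirements (keep the original instance byte for byte, and keep shredded data linked to its source document) can be sketched with plain SQL. The example below uses Python's built-in sqlite3 module rather than an ORM toolkit to stay self-contained; the table names, column names and status values are assumptions made for illustration.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a document and an observation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE document (
            id      TEXT PRIMARY KEY,
            raw_xml BLOB NOT NULL,                  -- original instance, unchanged
            status  TEXT NOT NULL DEFAULT 'active'  -- or 'nullified' / 'replaced'
        );
        CREATE TABLE observation (
            id          INTEGER PRIMARY KEY,
            document_id TEXT NOT NULL REFERENCES document(id),
            code        TEXT NOT NULL,
            value       TEXT NOT NULL
        );
    """)
    return conn


def store_document(conn, doc_id, raw_xml, observations):
    """Store the raw bytes plus the shredded (code, value) pairs."""
    conn.execute("INSERT INTO document (id, raw_xml) VALUES (?, ?)",
                 (doc_id, raw_xml))
    conn.executemany(
        "INSERT INTO observation (document_id, code, value) VALUES (?, ?, ?)",
        [(doc_id, code, value) for code, value in observations])


def nullify_document(conn, doc_id):
    """Mark the whole document as nullified; shredded data follows via the join."""
    conn.execute("UPDATE document SET status = 'nullified' WHERE id = ?",
                 (doc_id,))


def active_observations(conn):
    """Return only data whose source document is still active."""
    return conn.execute("""
        SELECT o.code, o.value
        FROM observation o JOIN document d ON d.id = o.document_id
        WHERE d.status = 'active'
        ORDER BY o.id
    """).fetchall()
```

Because every shredded row carries the identifier of its source document, nullifying or replacing the document immediately affects all data derived from it, without having to touch the individual rows.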

Processing

CDA documents may be based on a specific implementation guide and a series of templates. These provide context which may be used when processing the contents of the document. Knowledge of the context (the underlying use case) does allow one to reuse code.

In general it is advisable to use the available context for processing. As an alternative one could base the entire processing logic on the data itself, irrespective of the implementation guide and templates used.
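
Using the available context for processing typically means selecting the processing logic based on the template identifiers declared in the document. The sketch below dispatches on the root of the document-level templateId elements; the OID, the handler names and the fallback behavior are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}


def handle_natal_report(doc: ET.Element) -> str:
    """Hypothetical handler that relies on the implementation guide's context."""
    return "natal report handler"


def handle_generic(doc: ET.Element) -> str:
    """Fallback that bases its logic on the data itself."""
    return "generic handler"


# Hypothetical template OID (2.999 is the example OID arc).
HANDLERS = {"2.999.10.1": handle_natal_report}


def process(doc: ET.Element) -> str:
    """Pick a handler from the document-level templateId elements."""
    for template_id in doc.findall("hl7:templateId", NS):
        handler = HANDLERS.get(template_id.get("root"))
        if handler is not None:
            return handler(doc)
    return handle_generic(doc)


known = ET.fromstring(
    '<ClinicalDocument xmlns="urn:hl7-org:v3">'
    '<templateId root="2.999.10.1"/></ClinicalDocument>')
unknown = ET.fromstring(
    '<ClinicalDocument xmlns="urn:hl7-org:v3">'
    '<templateId root="2.999.99"/></ClinicalDocument>')
```

Documents that declare a known template are routed to context-specific code, while all other documents fall back to the generic, data-driven path.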

Summary and Recommendations

The diagram below shows the relationships between the various artefacts discussed in this whitepaper. A CDA document has to conform to the requirements as defined in a CDA Implementation Guide. It has to conform to both the formal CDA class model as well as the templates. The CDA class model can be expressed in either MIF, or in a derived format such as UML or XML schema. Templates can be expressed in Schematron, or in OCL, or in MIF with OCL annotations. The actual validation of CDA instances is based on the expressions of the CDA class model and the applicable templates.

Cda implementation article.PNG

Relying solely on XML techniques carries a high risk of creating CDA documents that are not valid instances of the CDA standard. If one is forced to rely solely on XML techniques (and up to about 2011 there really wasn’t any other available option) one should pay particular attention to the HL7 version 3 data types and coding systems. These are the areas that most often lead to issues because of the lack of expressivity/richness of the XML schema language.

Sources at two of the organizations responsible for CDA online validation tools confirm the above: if they were to develop such a tool from scratch they would not base it on XML techniques. The current online tools produce too many false positives - documents which are erroneously declared to be valid CDA instances. The management of templates in the form of a set of Schematron files is also reported to be problematic.

A software application will have to be based on the CDA class model if one wishes to ensure that one creates valid CDA instances. Applications that are based on the CDA XML schema can't guarantee that the documents are valid CDA instances. Both the MDHT and the Everest toolkits support templates; they are the best candidates for a model-based implementation.

Footnotes

  1. On the use of empty XML elements (<element/>): in certain rather exotic circumstances empty XML elements may occur in HL7 version 3 instances. For example: if a model were to have a mandatory participation linked to a Role which has no required/mandatory attributes, the Role could be present in the instance as an empty XML element. The CDA model doesn't contain any such requirements; a particularly exotic CDA template could nevertheless have this type of requirement, resulting in an empty XML element. The statement by the authors of this white paper that "no v3 instance SHALL contain an empty XML element" leads to far fewer "false positives" and a very small number of "false negatives".