μ ITS Requirements
This specification is a collection of requirements for creating simplified XML exchanges based on HL7 Version 3 modeling. These principles allow HL7 domain committees and realms to create a domain-specific markup language (DSML) that simplifies the exchange of semantically complete content models. The principles found in this specification assure the following:
- Domain-specific markup languages (DSMLs) are XML languages.
- DSMLs created using these principles will always have a mapping to a semantically interoperable representation of content that is consistent with the HL7 Reference Information Model.
- Instances using the DSML will be consistent with the restricted model (R-MIM) of the domain specified message.
- Most RIM-based representations of the domain specified message can be transformed into a DSML representation.
The hallmarks of a DSML conforming to these principles are:
- The DSML is specified by an XML schema representation that supports validation and binding to storage representations.
- An automated, platform-independent transform exists that produces a RIM-ITS representation of all possible (valid) content found in a DSML instance.
- An automated transform exists that produces a DSML representation of the RIM-ITS representation for a super-majority of expected use cases.
- The DSML is documented in detail, including the mappings to RIM-ITS representations.
An ideal DSML:
- Has completely reversible transforms between the DSML and RIM-ITS representations of domain content.
- Represents all possible domain content.
In an ideal world, a DSML generator will exist that produces a DSML meeting these requirements. The generator would take as its input a specification of the mapping between language elements of the DSML and their RIM-ITS representations, and produce as output:
- An annotated schema for the DSML.
- An automatically generated transform from the DSML to a RIM-ITS representation.
- An automatically generated transform from a RIM-ITS representation to the DSML.
XML Schema Representation
There are a number of different XML schema languages. The most commonly used is the one specified in the W3C XML Schema version 1.0 Parts 1 and 2. RelaxNG is another schema language that can be used to describe the structure of an XML document. The original XML specification also defines the Document Type Definition (DTD), which describes the legal structures of XML instances. ISO Schematron is a schema language that has been found to be especially convenient for validating instances. Other schema languages exist that meet the requirements listed below.
Schema languages can sometimes be used together. The datatypes-base.xsd schema found in the HL7 Version 3 Normative Edition uses both the W3C XML Schema language and ISO Schematron to enforce constraints on data types. There may be separate schemas for validation and data binding. Some schema languages are well suited for validation but not data binding (e.g., ISO Schematron), and vice versa.
The μ-ITS does not require the use of a specific schema representation; it only requires that:
- Documentation for the schema representation itself must be publicly available.
- Schema processors must be available across multiple platforms, preferably in a platform-independent language.
The μ-ITS does not specify how transformations are performed. It only requires that:
- The transformations from DSML to RIM-ITS and back be specified in a manner that is independent of hardware platform. They should be specified in an executable form, but may be specified in human-readable narrative.
- These transformations must not lose information that would change semantic interpretation of the content.
The W3C Extensible Stylesheet Language Transformations (XSLT) language is commonly used to perform transformations between XML languages. Other mechanisms of transformation are possible. Documentation on transformation languages must be publicly available, and transformation technologies must be available on multiple platforms.
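The requirement that transformations not lose semantically relevant information can be checked mechanically with a round trip. The following is a minimal sketch in Python, used here only for illustration since the μ-ITS does not mandate a transformation technology. The DSML element names (`heartRate`, `bodyWeight`), the codes, and the shape of the RIM-ITS-style `<observation>` are all hypothetical and are not taken from any HL7 specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: DSML element name -> code carried by the
# RIM-ITS-style <observation>. These names are invented for illustration.
MAPPING = {"heartRate": "HR", "bodyWeight": "WT"}
REVERSE = {code: name for name, code in MAPPING.items()}


def dsml_to_rim(dsml: ET.Element) -> ET.Element:
    """Expand a terse DSML element into a verbose RIM-ITS-style observation."""
    obs = ET.Element("observation", {"classCode": "OBS", "moodCode": "EVN"})
    ET.SubElement(obs, "code", {"code": MAPPING[dsml.tag]})
    # Copy every attribute so that nothing is silently dropped.
    ET.SubElement(obs, "value", dict(dsml.attrib))
    return obs


def rim_to_dsml(obs: ET.Element) -> ET.Element:
    """Collapse the observation back into its DSML element."""
    code = obs.find("code").get("code")
    value = obs.find("value")
    return ET.Element(REVERSE[code], dict(value.attrib))


def round_trips(dsml_xml: str) -> bool:
    """True if DSML -> RIM-ITS -> DSML preserves the element name and attributes."""
    original = ET.fromstring(dsml_xml)
    restored = rim_to_dsml(dsml_to_rim(original))
    return (original.tag, original.attrib) == (restored.tag, restored.attrib)


if __name__ == "__main__":
    print(round_trips("<heartRate value='72' unit='/min'/>"))
```

A check of this kind only covers the element name and attributes of a single flat element. A real DSML would also need to compare child content and would more likely express the transforms in XSLT.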
Every instance of a RIM graph modeled using a DSML must have a canonical form. There must be an automatic method of generating the canonical form from the DSML. These requirements ensure both the existence and the attainability of the canonical form. Because a canonical form can be obtained, transformations from the canonical form to a RIM-ITS instance can be written without having to deal with all possible variant representations of a concept. Because the canonical form is attainable through automation, transformations written against the canonical form can be used on any possible variant representation.
The following example shows why canonical forms are important: In the Clinical Document Architecture, a criterion for a precondition is instantiated in the XML using the <criterion> element. Here are four possible representations of that element:
<!-- Both attributes explicitly specified -->
<criterion moodCode='EVN.CRT' classCode='OBS'>
<!-- Default value assumed for classCode -->
<criterion moodCode='EVN.CRT'>
<!-- Fixed value assumed for moodCode -->
<criterion classCode='OBS'>
<!-- Default and fixed values used for class and mood respectively -->
<criterion>
An XSLT processor that does not use a validating parser on the input document will not be aware of the fixed or default values used in the CDA schema. An XSLT transform that needs to transform the <criterion> class would then need to be aware of all possible representations if it is to work on every variation that could be seen.
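The effect can be reproduced with any non-validating parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which does not read an XML schema and therefore never supplies fixed or default attribute values. The helper name `seen_attributes` is an illustrative assumption, and the variants are written as empty elements so that each snippet is well-formed on its own.

```python
import xml.etree.ElementTree as ET

# The four variants of <criterion>, written as empty elements.
VARIANTS = [
    "<criterion moodCode='EVN.CRT' classCode='OBS'/>",
    "<criterion moodCode='EVN.CRT'/>",
    "<criterion classCode='OBS'/>",
    "<criterion/>",
]


def seen_attributes(xml_text: str) -> dict:
    """Return the attributes a non-validating parser reports for the root element."""
    return dict(ET.fromstring(xml_text).attrib)


if __name__ == "__main__":
    for variant in VARIANTS:
        # Each variant yields a different attribute set, so downstream code
        # that tests for moodCode or classCode must handle all four cases.
        print(variant, "->", seen_attributes(variant))
```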
Any one of the four representations could be chosen as the canonical form. The most readily accessible representation is the one that always contains the moodCode and classCode attributes. The W3C XML standard describes a “standalone” XML document representation. This representation requires that the instance contain all attribute values that use the default or fixed values. Furthermore, there is an algorithm that supports the generation of the standalone form. If one were to make use of this algorithm, the canonical form of the <criterion> element would then be the element with all fixed and default attributes explicitly specified.
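As an illustration of that choice of canonical form, the sketch below fills in the missing attributes of a `<criterion>` element. It is a hand-written stand-in, not the standalone-conversion procedure from the XML specification: the default and fixed values are hard-coded in a hypothetical `CRITERION_DEFAULTS` table rather than read from the CDA schema, and the function name `canonical_criterion` is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Assumed schema knowledge, hard-coded for illustration only:
# classCode has a default of OBS and moodCode is fixed to EVN.CRT.
CRITERION_DEFAULTS = {"classCode": "OBS", "moodCode": "EVN.CRT"}


def canonical_criterion(xml_text: str) -> str:
    """Serialize a <criterion> element with all default and fixed attributes explicit."""
    elem = ET.fromstring(xml_text)
    for name, value in CRITERION_DEFAULTS.items():
        # Only add the attribute when the instance omitted it.
        elem.attrib.setdefault(name, value)
    # Re-insert attributes in sorted order so the serialization is deterministic.
    ordered = dict(sorted(elem.attrib.items()))
    elem.attrib.clear()
    elem.attrib.update(ordered)
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    for variant in (
        "<criterion moodCode='EVN.CRT' classCode='OBS'/>",
        "<criterion moodCode='EVN.CRT'/>",
        "<criterion classCode='OBS'/>",
        "<criterion/>",
    ):
        print(canonical_criterion(variant))
```

With this step applied first, a transform only has to handle the one form in which both attributes are present.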
Attributes of a Canonical Form
Below are the required attributes of the canonical form:
- A canonical form of an instance is unique.
- A canonical form is complete. A reader of the canonical form is not required to make assumptions about the content.
- White space typically used to ensure document readability is NOT present.
- Sequences of elements that can appear in any order have a defined canonical order (e.g., alphabetical by element name).
- Sequences of NMTOKENS, ENTITIES or IDREFS in an attribute that can appear in any order have a defined canonical order.
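Several of these attributes can be enforced by a small normalization pass. The following Python sketch is one possible approach under stated assumptions: the set of token-list attribute names (`TOKEN_LIST_ATTRS`) and the set of elements whose children may appear in any order (`unordered_parents`) would normally come from the schema, but are passed in by hand here, and the element names in the usage example are hypothetical. Filling in default and fixed values, which completeness also requires, is not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical: attributes the schema declares as NMTOKENS, ENTITIES or IDREFS.
TOKEN_LIST_ATTRS = {"flags"}


def canonicalize(elem: ET.Element, unordered_parents=frozenset()) -> ET.Element:
    """Normalize whitespace, token order and child order in place."""
    # Drop whitespace that exists only for readability. This assumes no
    # mixed content in which whitespace-only text is significant.
    if elem.text is not None and not elem.text.strip():
        elem.text = None
    if elem.tail is not None and not elem.tail.strip():
        elem.tail = None
    # Give order-insensitive token lists a defined (alphabetical) order.
    for name in TOKEN_LIST_ATTRS:
        if name in elem.attrib:
            elem.set(name, " ".join(sorted(elem.get(name).split())))
    for child in elem:
        canonicalize(child, unordered_parents)
    # Sort children alphabetically by element name where order is free.
    if elem.tag in unordered_parents:
        elem[:] = sorted(elem, key=lambda child: child.tag)
    return elem


if __name__ == "__main__":
    a = ET.fromstring(
        "<patient flags='b a'>\n  <name>Ann</name>\n  <birthTime value='1970'/>\n</patient>"
    )
    b = ET.fromstring(
        "<patient flags='a b'><birthTime value='1970'/><name>Ann</name></patient>"
    )
    free_order = frozenset({"patient"})
    # Two variants of the same content serialize identically after normalization.
    print(ET.tostring(canonicalize(a, free_order), encoding="unicode")
          == ET.tostring(canonicalize(b, free_order), encoding="unicode"))
```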