Vocabulary Maintenance Language
Introduction to the Original VML and to the 2014 Extension
This document describes a formal, XML-based language that can be used to create and maintain the HL7 Version 3 reference vocabulary. The immediate purpose of this language is to give vocabulary facilitators a consistent and rigorous mechanism for specifying HL7 vocabulary additions and updates. In 2014, this is the primary means of preparing vocabulary changes for inclusion in the HL7 Vocabulary. The files are processed from VML through the Access data base and on to MIF using the VML Processing Widget and RoseTree.
Using these tools, the content changes expressed in these proposals are applied to the repository of vocabulary content in an Access data base in two ways:
- Processing with a Java vocabulary update program first created in 2006-7, and
- Execution of pre-defined Access update queries that draw their data from source tables generated by XSLT transforms, which convert the "extension" properties defined below.
The formal representation of the vocabulary content is expressed in HL7 Model Interchange Format (MIF) files. This expression is undertaken by the RoseTree application working from the Access data base as its sole, primary source.
In this specification, a vocabulary maintenance description consists of a sequential list of parametrized function calls such as RegisterCodeSystem, AddCodes, CreateValueSet, etc. Further, the approach here is to divide the maintenance task into three parts:
- Code Systems – A code system contains a set of unique concept codes. Each concept code serves as a token to represent a useful category or class as viewed from a particular perspective. The definition and organization of the tokens within a code system represent assertions about the organization of the corresponding categories and classes within the real world. A code system may also carry information about the various ways that the categories or classes are identified in different situations and languages, as well as additional defining and identifying information that serves to clarify the intended meaning of the tokens.
- Value Sets – A value set represents a list of concept codes. Value sets are used to specify a set of possible values for one or more RIM-derived coded attributes.
- Concept Domains – A concept domain represents an abstract conceptual space that can be associated with RIM-derived coded attributes. A concept domain can be represented by one or more value sets, where each associated value set applies in a given context. Further, sub-sets of concept domains may, themselves, be represented as concept domains in a parent-child semantic hierarchy.
Each of the above parts is maintained separately: revisions to the code system(s) occur first, followed by changes to the value sets, followed by any necessary revisions to concept domain/value set associations.
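The ordering rule above can be pictured with a small sketch. It is illustrative only and is not part of VML or its tooling: the `Call` class, the `order_by_phase` helper, the assignment of each function name to a phase, and the name `BindValueSetToDomain` are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Phase order taken from the text: code systems, then value sets,
# then concept domain/value set associations.
PHASES = ("code system", "value set", "concept domain")

# ASSUMPTION: the phase assigned to each function is a guess made for
# illustration, and "BindValueSetToDomain" is a hypothetical name.
FUNCTION_PHASE = {
    "RegisterCodeSystem": "code system",
    "AddCodes": "code system",
    "CreateValueSet": "value set",
    "BindValueSetToDomain": "concept domain",
}


@dataclass
class Call:
    """One parametrized function call in a maintenance description."""
    function: str
    params: dict = field(default_factory=dict)


def order_by_phase(calls):
    """Return the calls grouped by phase.

    sorted() is stable, so calls keep their relative order within a phase.
    """
    return sorted(calls, key=lambda c: PHASES.index(FUNCTION_PHASE[c.function]))


if __name__ == "__main__":
    description = [
        Call("CreateValueSet", {"name": "ExampleValueSet"}),
        Call("RegisterCodeSystem", {"name": "ExampleCodeSystem"}),
        Call("AddCodes", {"codeSystem": "ExampleCodeSystem"}),
    ]
    for call in order_by_phase(description):
        print(call.function, call.params)
```

A stable sort is used so that the sequence the author wrote within each part is preserved, which matters if later calls depend on earlier ones.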
VML Extension in 2013-14
Over time, it became clear that a couple of functions were "missing" from the existing VML, specifically the ability to "update" or "remove" a property from a code. Because these functions were required, the corresponding changes to the Access tables were made as manually created updates, which, while quick, were prone to errors.
Beginning about 2009, the need arose to add properties to coded concepts, value sets, concept domains, etc. that could not be documented in the structure of the Access data base. Because the existing Java-based update tool is tightly bound to the data structures of the existing data base, and because no current volunteers were familiar with the code of the Java application, the "toolsmiths" were reluctant to change the data content of the existing tables. As an alternative, the extension properties were added through two sets of tables:
- A pair of tables that treat these properties as name/value pairs, and that identify the target of these properties by their type (such as "valueSet") and their fully qualified name. These tables are:
- VCS_property_definition, which defines the purpose and type of each of the properties represented in this fashion.
- VCS_object_property, which contains the name/value pairs and the formal identifier of the object to which they apply.
- For several years, managing these tables required manual updates to their content in Access.
- A pair of tables that document the version history of the value sets and code systems in the data base. Each time a particular code system or value set changes, its history records document the release-index in which the change occurred, the formal published release in which the change first appeared, who made the change, and a brief summary of the changes. These data will allow proper release and version management for these code systems and value sets when these updates are completed in the second half of 2014. The tables in which the version history is documented are:
- VOC_value_set_history, and
- VCS_code_system_history
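The name/value-pair approach described above can be pictured with the following minimal sketch. It uses an in-memory SQLite data base purely as a stand-in for Access. Only the two table names come from the text; every column name, the helper functions, and the sample property are assumptions made for illustration and do not reflect the actual table definitions.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for the two property tables."""
    conn = sqlite3.connect(":memory:")
    # ASSUMPTION: all column names below are hypothetical.
    conn.executescript(
        """
        CREATE TABLE VCS_property_definition (
            property_name TEXT PRIMARY KEY,
            value_type    TEXT NOT NULL,
            purpose       TEXT
        );
        CREATE TABLE VCS_object_property (
            object_type    TEXT NOT NULL,  -- e.g. 'valueSet'
            object_name    TEXT NOT NULL,  -- fully qualified name of the target
            property_name  TEXT NOT NULL,
            property_value TEXT,
            PRIMARY KEY (object_type, object_name, property_name)
        );
        """
    )
    return conn


def define_property(conn, name, value_type, purpose):
    """Record the purpose and type of a property."""
    conn.execute(
        "INSERT INTO VCS_property_definition VALUES (?, ?, ?)",
        (name, value_type, purpose),
    )


def set_property(conn, object_type, object_name, name, value):
    """Add a property to an object, or update it if it already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO VCS_object_property VALUES (?, ?, ?, ?)",
        (object_type, object_name, name, value),
    )


def remove_property(conn, object_type, object_name, name):
    """Remove a property from an object."""
    conn.execute(
        "DELETE FROM VCS_object_property "
        "WHERE object_type = ? AND object_name = ? AND property_name = ?",
        (object_type, object_name, name),
    )


def get_properties(conn, object_type, object_name):
    """Return the name/value pairs attached to one object as a dict."""
    rows = conn.execute(
        "SELECT property_name, property_value FROM VCS_object_property "
        "WHERE object_type = ? AND object_name = ?",
        (object_type, object_name),
    )
    return dict(rows.fetchall())
```

Because the target is identified only by its type and fully qualified name, new kinds of properties can be introduced by adding rows rather than by altering the structure of existing tables, which is the constraint the text describes.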
The VML Extension, begun in 2013 and completed (we hope) in 2014, works with the VML Processing Widget to automate the manual update processes for properties (done in 2013) and to automate the capture of value set and code system history. In both cases this is realized by adding new "process elements" to the VML. The "automation" with the VML Processing Widget is provided by a combination of:
- ANT scripts (run from the command line)
- XSLT Transforms to split the "extensions" from the original VML to create a Vocabulary Update Table that can be imported into Access; and
- Predefined update queries (programmed in Access Visual Basic For Applications (VBA)) that use the update table to change content in the appropriate vocabulary tables.
As each VML file is processed by ANT:
- The VML file is split to isolate the "extensions" from the "traditional" VML and to create the update table from the extensions;
- Access is activated to process the SQL-based update and is then shut down; and
- The isolated traditional VML is processed into the same Access data base using the Java update program to establish the "traditional" updates.
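The splitting step above is performed by XSLT transforms in the actual tooling. The following sketch only illustrates the idea in Python and should not be read as the real transform: the tag name `exampleExtension`, the sample document structure, and the helper names are hypothetical, since the real extension elements are defined by the VML schema.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a placeholder tag name; the real extension element names
# are defined by the VML schema and are not reproduced here.
EXTENSION_TAGS = frozenset({"exampleExtension"})


def split_vml(xml_text, extension_tags=EXTENSION_TAGS):
    """Split a document into its "traditional" part and its extensions.

    Returns (root, extensions): root is the parsed tree with every
    extension element removed, and extensions lists the removed elements
    in document order.
    """
    root = ET.fromstring(xml_text)
    extensions = []

    def strip(parent):
        # Iterate over a copy because children are removed while looping.
        for child in list(parent):
            if child.tag in extension_tags:
                parent.remove(child)
                extensions.append(child)
            else:
                strip(child)

    strip(root)
    return root, extensions


def to_update_rows(extensions):
    """Flatten extension elements into rows for a hypothetical update table."""
    return [{"element": e.tag, **e.attrib} for e in extensions]


if __name__ == "__main__":
    sample = (
        "<vocabulary>"
        "<codeSystem name='ExampleCodeSystem'>"
        "<exampleExtension target='ExampleCodeSystem' name='p' value='v'/>"
        "<addCodes/>"
        "</codeSystem>"
        "</vocabulary>"
    )
    traditional, found = split_vml(sample)
    print(ET.tostring(traditional, encoding="unicode"))
    print(to_update_rows(found))
```

In this picture, the rows would feed the Access update queries while the stripped tree would be handed to the Java update program, mirroring the two paths listed above.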
Recognizing Extension Elements
For those already familiar with VML, the majority of this document is "old news". To make the extension elements easier to recognize, the graphics that have been updated to include them carry a red or maroon star somewhere on the right of the diagram; the extension elements are outlined in red, and the textual headers are in red font.
- Capabilities listed as Not exposed in application are functions that are not (as of this writing) supported by the Harmonization Tooling application.
- Capabilities listed as Not fully exposed in application represent a set of capabilities, some but not all of which are implemented in the Harmonization Tooling application.
- Code System capabilities listed as Exposed under VS in application are functions that cannot be invoked directly, but that are present when processing changes to Value Sets whose Code System is established by a previous binding.
- The term Concept Domain was formally adopted by the HL7 Vocabulary Technical Committee as the name for an abstract conceptual space that may be represented by (bound to) a set of concepts found in one or more specific code systems. Previous to this adoption, the preferred name for the same abstract conceptual space was Vocabulary Domain. In editing this document, the term "vocabulary domain" has been replaced with "concept domain", except where the term is part of the XML schema. In order to avoid "breaking" software tools that were built to the previous version of the schema, the XML attribute and element names retain the phrase "vocabularyDomain".