Technical Correction of Value Sets - February 2009

Summary

During the week of February 8, 2009, several reports surfaced about value set "errors" in the published Vocabulary content. Investigation revealed four sets of problems, three of which were inadvertently introduced while processing vocabulary "cleanup" or harmonized updates, and a fourth that arose because an approved change affected legacy software in the RMIM Designer (in Visio).

The first two errors introduced concomitant errors in the vocabulary core schema (voc.xsd) which caused validation errors to appear incorrectly for otherwise valid message instances. The first of these two errors will require that a patch be issued for Normative Edition 2008.

Fortunately, retained documentation from harmonization and "cleanup" activities allowed full identification of the source of the errors and the steps needed to correct them. For "current" use, all errors were corrected in release 2.25.2 of the "RIM Repository". Current releases can be downloaded from Gforge. Further, the limitation of the RMIM Designer software was corrected with Release 4.4.0 of the RMIM Designer that is also available on Gforge.

A summary of these issues follows in this section, and more detail appears for selected problems in other sections. Needless to say, these occurrences will lead to changes in the way that vocabulary changes, once made, are validated.

Errors in "x_" Value Sets

This problem is analyzed in detail in the section Invalid "x_" Value Sets in Normative Edition 2008 below. It occurred as a result of an algorithm used to update vocabulary content during the "vocabulary cleanup" project in January 2008. It affected five "x_" value sets by dropping from their content other value sets that should have been "included" in them.

This is the only correction that is planned for the Normative Edition 2008 patch.

The affected value sets are:

  • x_ActMoodIntentEvent
  • x_ActOrderableOrBillable
  • x_EntityClassDocumentReceiving
  • x_EntityClassPersonOrOrgReceiving
  • x_RoleClassCredentialedEntity

Errors in "Address Use" Value Sets

This problem arose in the construction of the PostalAddressUse and TelecommunicationAddressUse value sets, as released at the end of 2008. In both cases, changes introduced to meet approvals from Harmonization in November 2008 failed to "include" the "GeneralAddressUse" value set in these two value sets. This prevented, for example, validation of an address with usage of "WP" (work place).
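
The effect of a missing inclusion can be illustrated with a minimal sketch. It models a value set as its own codes plus a list of included value sets. The dictionary layout, the `expand` helper, and every code other than "WP" are assumptions made for illustration; they are not the actual HL7 content or tooling.

```python
def expand(name, value_sets, _seen=None):
    """Return all codes in a value set, following its 'includes' recursively."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against circular inclusions
        return set()
    seen.add(name)
    vs = value_sets[name]
    codes = set(vs["codes"])
    for included in vs["includes"]:
        codes |= expand(included, value_sets, seen)
    return codes


# Hypothetical content: only "WP" is taken from the text above.
broken = {
    "GeneralAddressUse": {"codes": {"WP", "H"}, "includes": []},
    "PostalAddressUse": {"codes": {"PHYS", "PST"}, "includes": []},
}
# Same content, with the missing inclusion restored.
fixed = {
    **broken,
    "PostalAddressUse": {
        "codes": {"PHYS", "PST"},
        "includes": ["GeneralAddressUse"],
    },
}

wp_valid_before = "WP" in expand("PostalAddressUse", broken)
wp_valid_after = "WP" in expand("PostalAddressUse", fixed)
```

Under these assumptions, "WP" is rejected against the first definition and accepted against the second, which mirrors the validation failure described above.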

The error was discovered while preparing material for Ballot 2009May, and did not affect any publications. Therefore the only remediation is the release of corrected content on February 16, 2009.

(This document contains no further discussion of this problem.)

Re-setting Value Set OIDs to Original Values

During the course of resolving the preceding issues, and of investigating how to "resurrect" the value sets for the RMIM Designer (below), we undertook a programmatic comparison of (a) the OIDs assigned to each value set as they appear in recent releases with (b) the OIDs from releases before "vocabulary cleanup" began in January of 2008. This comparison showed that the process of implementing approved updates on twelve value sets caused them to be assigned new OIDs, and in six cases caused their descriptions to be dropped.

These changes are deemed Technical Corrections. Since all but one of them occurred in recent weeks, remediation is limited to the publication of corrected content on February 16, 2009. The affected value sets are listed in the table below. The table shows both the original, correct OID that has been reinstated and the later-assigned OID that has been replaced, along with a flag indicating whether the description was also reset.

"Reset" Value Sets

Value Set Name               Correct OID                   Replaced OID                  Description Reset
ActClassClinicalDocument     2.16.840.1.113883.1.11.13948  2.16.840.1.113883.1.11.20305  Yes
ActClassComposition          2.16.840.1.113883.1.11.19442  2.16.840.1.113883.1.11.20307  Yes
ActClassContainer            2.16.840.1.113883.1.11.19445  2.16.840.1.113883.1.11.20301  Yes
ActClassDocument             2.16.840.1.113883.1.11.18938  2.16.840.1.113883.1.11.20306  Yes
ActClassExposure             2.16.840.1.113883.1.11.19832  2.16.840.1.113883.1.11.19936  No
ActClassExtract              2.16.840.1.113883.1.11.19441  2.16.840.1.113883.1.11.20303  Yes
ActRelationshipSucceeds      2.16.840.1.113883.1.11.19765  2.16.840.1.113883.1.11.20015  No
AddressUse                   2.16.840.1.113883.1.11.190    2.16.840.1.113883.1.11.20314  No
ObservationInterpretation    2.16.840.1.113883.1.11.78     2.16.840.1.113883.1.11.20327  Yes
PostalAddressUse             2.16.840.1.113883.1.11.10637  2.16.840.1.113883.1.11.20318  No
TelecommunicationAddressUse  2.16.840.1.113883.1.11.201    2.16.840.1.113883.1.11.20319  No
UnitsOfMeasureCaseSensitive  2.16.840.1.113883.1.11.12839  2.16.840.1.113883.1.11.20313  No
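
A programmatic OID comparison of the kind described in this section might look like the following sketch. The `compare_oids` function and the two small snapshots are hypothetical and cover only three value sets. The ActCode entry is included only to show an unchanged value set, and the actual comparison was presumably run against the full repository content.

```python
def compare_oids(before, after):
    """Return {name: (original_oid, current_oid)} for value sets whose OID changed."""
    return {
        name: (before[name], after[name])
        for name in sorted(before.keys() & after.keys())
        if before[name] != after[name]
    }


# Illustrative snapshots keyed by value set name (not the full content).
before_cleanup = {
    "AddressUse": "2.16.840.1.113883.1.11.190",
    "ObservationInterpretation": "2.16.840.1.113883.1.11.78",
    "ActCode": "2.16.840.1.113883.1.11.13953",
}
recent_release = {
    "AddressUse": "2.16.840.1.113883.1.11.20314",
    "ObservationInterpretation": "2.16.840.1.113883.1.11.20327",
    "ActCode": "2.16.840.1.113883.1.11.13953",
}

changed = compare_oids(before_cleanup, recent_release)
for name, (original, replaced) in changed.items():
    print(f"{name}: reinstate {original} (replaces {replaced})")
```

Comparing by value set name assumes names were stable across the two releases. A value set that was renamed during cleanup would need to be matched some other way.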

Requirement to "resurrect" Value Sets for RMIM Designer

At the November 2008 Harmonization Meeting, a proposal was approved to delete a number of value sets (and any related bindings) because they did not represent appropriate content. For example, there were value sets defined for the RoleCode and ActCode content that:

  1. have no use as a constraint since the content is only partial at best and should always be constrained further, using concept domain constraints;
  2. are bound to the equivalent concept domain, but should not be, since they will never represent either a universal or a representative binding; and
  3. are constructed from unions of other value sets expressed in a hierarchy, which is counter to the design philosophy of HL7 Vocabulary as adopted in November 2007.

When this change was implemented, it had the unintended and unanticipated effect of breaking the RMIM Designer tool that HL7 uses to design static models for the HL7 ballot. More detail on this issue is contained in the Dependency of RMIM designer section below.

As noted in the section below, the corrective action for this was to reinstate (resurrect) eight value sets that had been deleted, and re-establish their "unclassified" bindings. In each case, the value sets have been deprecated for use other than to provide support for the RMIM Designer tool. As a consequence, they will not appear in future ballots, and are "hidden" by default when reviewing vocabulary content in the RoseTree tool.

The value sets that were resurrected are:

  • ActCode (2.16.840.1.113883.1.11.13953)
  • ActReason (2.16.840.1.113883.1.11.14878)
  • ActSite (2.16.840.1.113883.1.11.16537)
  • HumanActSite (2.16.840.1.113883.1.11.16538)
  • InjuryActSite (2.16.840.1.113883.1.11.19438)
  • ObservationValue (2.16.840.1.113883.1.11.16614)
  • QueryParameterValue (2.16.840.1.113883.1.11.19726)
  • RoleCode (2.16.840.1.113883.1.11.12206)

Invalid "x_" Value Sets in Normative Edition 2008

Background

On February 11, 2009, Keith Boone identified an error in value set "x_ActMoodIntentEvent" as it was represented in the vocabulary (and therefore in the vocabulary schema) published as part of Normative Edition 2008 (NE2008). This page is being built to document the problem, its analysis and intended resolution.

The simplest step, looking at the value set in any way (ballot, MIF2 file, data base, RoseTree), showed, as Keith had found, that this value set contained only one code, EVN (event), and that the intent codes that had previously been part of the value set were all missing.

The remainder of this document provides:

  • an analysis of how this happened, both to avoid a repetition and to identify other value sets that were similarly affected,
  • a list of the five value sets that were affected and how they were affected,
  • a summary of the impact of this error on packages such as Normative Edition 2008, and
  • the resolution of this issue in the form of new releases and patches to NE 2008.

Analysis

GW Beeler reviewed the history of the vocabulary content and found that the error was introduced between the release of design repository "hl7_rimrepos-2.18.2.zip" on December 7, 2007, and "hl7_rimrepos-2.18.3.zip" on February 4, 2008. (See releases on gForge)

In January and February 2008, there was a project to "clean up" the vocabulary representation in the Access data base. This project progressed through several stages (each with multiple steps).

The first stage created code system hierarchies in all code systems based upon the value set hierarchies that previously existed.

The second stage involved re-aligning the value sets of the CS RIM attributes (ActClass, ActMood, etc., but not ActCode, RoleCode and so forth). The task was to add or edit value sets that would align with each of the codes in the affected code system. This included "correcting" the value set constructors (elements that include value sets in other value sets) so that they matched the hierarchy in the code system. This is the stage that introduced the error Keith Boone found.

For each of these stages, we created source "tables" (in a data base) that "drove" the process, and programmed procedures and/or sets of queries to execute it. All of these have been retained, along with textual documentation of the whole process in the "MasterVocabularyQuery" data base.

Stage 2 logic

The detailed logic summary for the second stage process, written at the time of the changes in February of 2008, says:

Executes function "VS_ProcessVsChangesForStrucuralCs" which does the following:
  1. Opens record sets on tables: "zVS_VsChangesForStructCs" (the source), "VOC_value_set", "VOC_code_reference", and "VOC_value_set_constructor". (It also has a query for deleting selected rows from "VOC_value_set_property").
  2. Steps through the source data one row (concept code) at a time. For EACH code, it:
    • a) Determines whether there is an existing Value Set defined for that code. If there is, it:
      • i) Steps through all Voc_value_set_constructor records for which the existing VS is the "containing" VS. For each:
        • 1) if the contained VS is non-taxonomic, reset the containing VS to the "root" VS for this code system; if the contained VS is taxonomic, the constructor entry is deleted.
      • ii) Steps through all Voc_value_set_constructor records for which the existing VS is the "contained" VS and deletes these entries. (This could also be done with a simpler Delete query.)
      • iii) Deletes all Voc_Code_reference entries for which the existing VS is the "usedToBuildVs" (including the head code)
      • iv) Deletes all Voc_Value_set_property entries for which this VS is the "usedToBuildVs"
      • v) Determines whether the existing VS name matches the "style" for VS names of structural attributes. If it does not, it computes a corrected name and should (but does not yet) document that the old name is being "retired."
      • vi) Edits the Voc_value_set record to assert the correct name, definingExpression (null), allCodes(false), isTaxonomicSet (true), and basedOnCodeSystem for this VS
      • x) If there is no existing VS, the routine creates a new VS entry in Voc_value_set using the naming style and the concept's print name to arrive at a name for the VS.
      • xi) In either case, the value set id is stored in a "collection" keyed by the head code internalId (to allow later construction of the hierarchy).
    • b) Creates (or recreates) an entry in Voc_code_reference for the head code of the VS
  3. Steps through each row (concept code) of the source table a second time. It uses the parent head code internalId to look up the parent value set id in the collection created in step 2-a-xi. It then adds a row in Voc_value_set_constructor to show this relationship IF the child VS has children of its own, but not if it is a leaf term.
  4. Invokes "UpdateVocabularyVersionNumber" to add a new release version for these changes.

Logic Error

The problem is in step 2a-ii. The deletions documented here were fine for the "taxonomic" value sets because the constructors were added back in step 3. However, step 2a-ii should have excluded the cases where the "containing" value set (the "usedToBuildVs") was non-taxonomic (an "x_" value set), because these value sets do not align with the code hierarchy and were not added back as part of step 3. The algorithm had no such exclusion and therefore executed the deletions that caused the problem Keith Boone found.
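
The difference between the deletion as it ran and the deletion with the missing exclusion can be sketched as follows. This is a simplified model and not the original Access procedure: the `Constructor` tuple, the `delete_as_contained` function, its parameters, and the use of "ActMood" as the name of the root taxonomic value set are all assumptions made for illustration.

```python
from typing import NamedTuple


class Constructor(NamedTuple):
    containing: str  # value set that includes another value set
    contained: str   # value set being included


def delete_as_contained(constructors, target, taxonomic, exclude_non_taxonomic):
    """Model of step 2a-ii: drop rows in which `target` is the contained value set.

    With exclude_non_taxonomic=True, rows whose containing value set is
    non-taxonomic (an "x_" value set) are kept, because step 3 never
    rebuilds them from the code hierarchy.
    """
    kept = []
    for row in constructors:
        is_target = row.contained == target
        protected = exclude_non_taxonomic and row.containing not in taxonomic
        if is_target and not protected:
            continue  # this constructor row is deleted
        kept.append(row)
    return kept


# Hypothetical starting state for one concept code (INT in ActMood).
taxonomic = {"ActMood", "ActMoodIntent"}
rows = [
    Constructor("ActMood", "ActMoodIntent"),
    Constructor("x_ActMoodIntentEvent", "ActMoodIntent"),
]

as_run = delete_as_contained(rows, "ActMoodIntent", taxonomic, exclude_non_taxonomic=False)
corrected = delete_as_contained(rows, "ActMoodIntent", taxonomic, exclude_non_taxonomic=True)
```

In this model the run without the exclusion removes both rows, and only the taxonomic one would be recreated by step 3. The run with the exclusion leaves the "x_" inclusion in place.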

This analysis allowed the construction of queries to identify other value sets that might also have been affected. Specifically, a query was run on the DB from a year ago to show which "non-taxonomic" value sets included a taxonomic value set in their construction. This query revealed the list of value sets in the following section, and inspection of all five in the current vocabulary content verified that they were indeed affected in a similar way.
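
A query of this kind might resemble the sketch below, which uses an in-memory SQLite database as a stand-in for the Access data base. The table names and the "isTaxonomicSet" flag come from the logic summary above, but the key columns ("id", "containingVsId", "containedVsId") and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names other than isTaxonomicSet are assumptions.
    CREATE TABLE VOC_value_set (
        id INTEGER PRIMARY KEY, name TEXT, isTaxonomicSet INTEGER);
    CREATE TABLE VOC_value_set_constructor (
        containingVsId INTEGER, containedVsId INTEGER);
    INSERT INTO VOC_value_set VALUES
        (1, 'ActMood', 1),
        (2, 'ActMoodIntent', 1),
        (3, 'x_ActMoodIntentEvent', 0);
    INSERT INTO VOC_value_set_constructor VALUES (1, 2), (3, 2);
""")

# Non-taxonomic value sets that include at least one taxonomic value set.
query = """
    SELECT DISTINCT parent.name
    FROM VOC_value_set_constructor AS c
    JOIN VOC_value_set AS parent ON parent.id = c.containingVsId
    JOIN VOC_value_set AS child ON child.id = c.containedVsId
    WHERE parent.isTaxonomicSet = 0 AND child.isTaxonomicSet = 1
    ORDER BY parent.name
"""
at_risk = [row[0] for row in conn.execute(query)]
conn.close()
```

Run against the pre-cleanup content, a query shaped like this would list the value sets whose inclusions step 2a-ii could delete without step 3 restoring them.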

Resulting Affected Value Sets

In all, the error affected five "x_" value sets:

  • x_ActMoodIntentEvent - Deleted inclusion of the "ActMoodIntent" value set, which is based on the "INT" hierarchy in the ActMood code system.
  • x_ActOrderableOrBillable - Deleted inclusion of the "ActClassProcedure" and "ActClassObservation" value sets, which are based on the "PROC" and "OBS" hierarchies in the ActClass code system.
  • x_EntityClassDocumentReceiving - Deleted inclusion of the "EntityClassOrganization" value set, which is based on the "ORG" hierarchy in the EntityClass code system.
  • x_EntityClassPersonOrOrgReceiving - Deleted inclusion of the "EntityClassOrganization" value set, which is based on the "ORG" hierarchy in the EntityClass code system.
  • x_RoleClassCredentialedEntity - Deleted inclusion of the "LicensedEntityRole" value set (renamed to "RoleClassLicensedEntity" by step 2a-v above), which is based on the "LIC" hierarchy in the RoleClass code system.

Impact on Published Package

The impact of such an error on the content of a package, such as the Normative Edition, is, fortunately, limited. Because the value set remained defined and available for hyper-links, none of the static model designs (RMIMs), their documentation, or the documentation of other vocabulary artifacts were misrepresented. The specific elements that need to be patched are:

  • the Design Repository delivered in support of the package, as this is the "primary" source for RIM and Vocabulary content;
  • the Vocabulary MIF files (two forms) that are used by tools such as the V3 Generator to build schemas and views for static models;
  • the Vocabulary.xml file that is used by the V3 Generator as the basis for the vocabulary schema that is referenced by all static model and data type schemas;
  • the voc.xsd schema, created by the V3 Generator, that is referenced by all static model and data type schemas;
  • four html publication files that hold the definitions for value sets based on the ActClass, ActMood, RoleClass and EntityClass code systems, and display the affected value sets when browsing the content.

The most critical of these are the voc.xsd schema, which is used to build and validate Version 3 documents and messages, and the Vocabulary.xml file, which is used to re-generate sets of schemas.

Planned Corrective Actions

In order to address this error, two near-term steps are planned:

  1. Updated versions of the following are being prepared for immediate release:
    • a new release of the Design Repository, which includes a vocabulary core MIF file;
    • copies of the updated "vocabulary.xml" and "voc.xsd", which will be added to the Design Repository package for this release;
    • a new release of "CoreMif Instances" that includes the updated MIF2 file;
    • e-mail notification of these releases, including hyper-links to the new releases.
  2. A "patch" for Normative Edition 2008 that will include all affected files, packaged in the directories where the Normative Edition web and CD versions install them. (This will permit "unzipping" the archive in a way that will replace the originals from the NE2008 distributions.)

Additional patches will not be prepared, but member organizations that need a patch for a particular design repository release are invited to request one by contacting the V3 Publishing Work Group.

Vocabulary Dependency of HL7 Design Software

Background of RMIM Designer Dependency

The HL7 RMIM Designer, using programs embedded in Visio, draws its constraints from the Vocabulary Content represented in a particular design repository. These programs were first created in the 2001-2003 time frame, and draw their vocabulary content from the "traditional" representation of vocabulary rather than the MIF2 representation.

In the course of "cleaning up" the vocabulary content, HL7 has tried to reduce the number of value sets and bindings that were introduced into the vocabulary content earlier on, but which have no proper use as constraints. For example there are value sets defined for the RoleCode and ActCode content that:

  1. Have no use as a constraint since the content is only partial at best and should always be constrained further, using concept domain constraints;
  2. Are bound to the equivalent concept domain, but should not be, since they will never represent either a universal or a representative binding; and
  3. Are constructed from unions of other value sets expressed in a hierarchy, which is counter to the design philosophy of HL7 Vocabulary as adopted in November 2007.

As a consequence, the Harmonization Meeting in November 2008 approved a proposal to delete a number of these value sets that have no possible representational use, and this proposal was implemented in December 2008. As a direct consequence of this change, the "drop-down" boxes used to select vocabulary constraints in the Visio RMIM Designer failed. These boxes no longer offered any choices to the user.

Basis for RMIM Designer Dependency

The "traditional" representation of HL7 vocabulary content conflated the ideas of concept domains, code systems and value sets in such a way that the display of the content was predicated on there being a triplet of one concept domain, one value set and one code system at the "root" of each constraint hierarchy. Although the "traditional" representation has been maintained in the tools, this representation is also dependent upon the retention of this triplet relationship within the defined content.

Consideration of the rationale for removing the value sets (as listed above) shows that one of the objectives is to break this triplet relationship wherever it is inappropriate. Unfortunately, the removal of the value sets and bindings in December broke this foundation for a number of critical attributes.

Work is underway to migrate the RMIM Design function to new software that will draw its content from MIF2 files. For this reason, no effort had been expended to migrate the existing RMIM Designer to the MIF2 representation, although program interfaces (APIs) to software libraries (DLLs) that contain the MIF2 content were available. As a consequence of the failure of the RMIM Designer vocabulary constraint interface, Tooling Work Group participants concluded that the effort needed for such a migration was reasonable and the conversion was completed fairly rapidly.

Software Dependencies on "legacy" vocabulary representations

So long as the RMIM Designer used the "legacy" representation of the vocabulary content, it would have been necessary to retain this triplet relationship for certain key vocabulary content. Accordingly, the decision was made to upgrade the RMIM Designer in Visio to use the semantically correct MIF2 representation of the vocabulary content.

Release 4.4.0 of the RMIM Designer was released February 28, 2009, and contains this upgrade. Therefore the RMIM Designer no longer sets a requirement to retain the value sets that had been "resurrected" in order to preserve RMIM Designer function.

There is, however, one remaining requirement that limits the conversion of these value sets. This is a dependency in the V3 Generator, which uses the "legacy" representation of vocabulary as the basis for the "vocabulary schema" (voc.xsd). Removal of this dependency has been proposed, but not yet scheduled for resolution as part of ongoing support and development of the V3 Generator.

The value sets that were resurrected and deprecated are:

  • ActCode (2.16.840.1.113883.1.11.13953)
  • ActReason (2.16.840.1.113883.1.11.14878)
  • ActSite (2.16.840.1.113883.1.11.16537)
  • HumanActSite (2.16.840.1.113883.1.11.16538)
  • InjuryActSite (2.16.840.1.113883.1.11.19438)
  • ObservationValue (2.16.840.1.113883.1.11.16614)
  • QueryParameterValue (2.16.840.1.113883.1.11.19726)
  • RoleCode (2.16.840.1.113883.1.11.12206)