Vocabulary Authority Reference



See CDA R3 Formal Proposals for instructions on using this form. Failure to adhere to these instructions may result in delays. Editing of formal proposals is restricted to the submitter and SDTC co-chairs. Other changes will be undone. Comments can be captured in the associated discussion page.

Submitted by: Victor Brodsky
Revision date: November 19th, 2009
Submitted date: November 19th, 2009
Change request ID:


While CDA documents will often be sent from one entity to another, the author of the document has no way of making available intended internal definitions of included internal codes or entire value sets to all downstream recipients of the CDA document.


Provide CDA authors with the ability to reference a specific externally accessible source individually for each included code value, so that the CDA document recipient can query that source for more information about a specific code, including the definition of the code and its entire value set. This can be accomplished by adding codeSystemAuthority="URLorAnotherMeansOfDirectContact" and codeSystemAuthorityMethodOfCommunication="MethodName" attributes, which could be placed next to any instance of "code=" within a CDA document.
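As an illustration of how a recipient might read the proposed attributes, here is a minimal Python sketch. It assumes the two attributes appear unqualified next to "code=" exactly as proposed above; they are not part of any published CDA schema, and the code values, OID, URL and method name in the sample fragment are placeholders invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical CDA-like fragment. codeSystemAuthority and
# codeSystemAuthorityMethodOfCommunication are the attributes proposed here,
# not existing CDA attributes. All values below are placeholders.
SAMPLE = """
<observation xmlns="urn:hl7-org:v3">
  <code code="LOCAL-001" codeSystem="1.2.3.4.5"
        codeSystemAuthority="https://vocab.example.org/"
        codeSystemAuthorityMethodOfCommunication="ExampleMethod"/>
  <value code="LOCAL-002" codeSystem="1.2.3.4.5"/>
</observation>
"""


def find_code_authorities(xml_text):
    """Return one dict per element that carries a code attribute."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        if "code" in elem.attrib:
            found.append({
                "code": elem.get("code"),
                "codeSystem": elem.get("codeSystem"),
                # None when the author did not reference an authority
                "authority": elem.get("codeSystemAuthority"),
                "method": elem.get("codeSystemAuthorityMethodOfCommunication"),
            })
    return found


authorities = find_code_authorities(SAMPLE)
for entry in authorities:
    print(entry)
```

Codes without the attributes simply come back with no authority, so a recipient could fall back to whatever handling it uses today.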


Specifically, synoptic anatomic pathology reports can get very complicated as more effort is spent to make the data within them more granular. Often, the approach to structuring a synoptic pathology report will vary among institutions and individual pathologists. This produces a myriad of differing value sets for given codes, and that is assuming both parties agree on the definition of a specific data element and on what it is intended to convey. Providing document authors with the ability to reference their own (and/or an external authoritative) vocabulary server that contains the precise definitions and value sets they intended would let the authors control the content while letting the recipients retrieve the additional data (definitions, value sets) precisely as the author intended. The key is permitting the author to reference a public authoritative vocabulary server for some codes, while referencing their own externally accessible local vocabulary server for other codes in the same CDA document.

Here is a realistic example: A surgeon receives a pathology report at 3am stating that the donor liver biopsy shows moderate macrosteatosis. That may or may not be close to the cut-off for a "bad liver", which would mean the patient does not get this transplant. Ideally, the surgeon would be able to see the entire value set. What were the other choices? Low macrosteatosis? Was "severe steatosis" the next level up, or was it "moderate-to-severe"? The surgeon has to decide quickly whether to perform the surgery.

Another example: A researcher would like to study the incidence of moderate-to-severe steatosis across multiple reports. Was "moderate-to-severe" even a choice in front of the pathologist filling out the report at a given time in a given institution? Considering that the standards of medical practice, and with them the acceptable value sets, change over time, it would be nearly impossible to obtain accurate statistics over time without knowing that (as a hypothetical example) "moderate to severe steatosis" was dropped from the acceptable value set in April of 2005 for some institutions. In this case, a special query to the vocabulary server that the pathologist specified in the CDA instance for that specific code would reveal the historical value sets, using the command defined by the method of communication with that specific server.
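The kind of answer such a historical query could return can be sketched in Python. This is only an illustration of looking up the value set in effect on a given date; the version history below is invented to mirror the hypothetical April 2005 example above, and the function name and data layout are assumptions rather than any real vocabulary server's interface.

```python
from datetime import date

# Hypothetical version history for one code's value set: each entry is
# (effective date, allowed values). The dates and members are made up to
# mirror the hypothetical example in the text.
VERSIONS = [
    (date(2000, 1, 1), ["low", "moderate", "moderate-to-severe", "severe"]),
    (date(2005, 4, 1), ["low", "moderate", "severe"]),
]


def value_set_as_of(versions, when):
    """Return the value set in effect on `when`, or None if none existed yet."""
    current = None
    for effective, members in sorted(versions, key=lambda v: v[0]):
        if effective <= when:
            current = members  # later versions replace earlier ones
    return current


before = value_set_as_of(VERSIONS, date(2004, 6, 1))
after = value_set_as_of(VERSIONS, date(2006, 6, 1))
print("moderate-to-severe" in before, "moderate-to-severe" in after)
```

A researcher could use a lookup of this shape to decide which reports were even able to record "moderate-to-severe" before counting them.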

Additionally, in cases where a value set is, for example, the entire LOINC plus 200 local codes (or a subset of LOINC), referencing an externally accessible authority is the only option, since embedding the entire value set into the CDA document instance is impractical. An alternative approach would externalize the references to the vocabulary authority and place them into an HL7 message sent along with the CDA document, as opposed to allowing the references within the CDA document instance next to the actual code values. That approach sets these references up to be lost when the CDA document is passed along downstream. Lastly, since the CDA document instance may and will cross institutional and likely national boundaries, the codeSystemAuthority="URLorAnotherMeansOfDirectContact" value would need either to contain a direct actionable reference to an accessible entity (such as a URL) or, if it is an alias, to be readily resolvable to a direct reference by the external downstream recipient of the CDA document instance via clearly outlined uniform means.
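The distinction between a direct reference and an alias can be sketched as follows. This is a minimal Python illustration only: the proposal does not define how aliases would be published or resolved, so the alias table, the alias name and the URL here are hypothetical stand-ins for whatever "clearly outlined uniform means" would eventually be agreed upon.

```python
from urllib.parse import urlparse

# Hypothetical alias table standing in for a uniform resolution mechanism.
ALIASES = {"example-local-vocab": "https://vocab.example.org/"}


def resolve_authority(value, aliases=ALIASES):
    """Return a direct URL for a codeSystemAuthority value, or None."""
    parsed = urlparse(value)
    # A value that is already an http(s) URL is treated as directly actionable.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value
    # Anything else is treated as an alias and looked up.
    return aliases.get(value)


print(resolve_authority("https://vocab.example.org/"))
print(resolve_authority("example-local-vocab"))
print(resolve_authority("unknown-alias"))
```

An unresolvable alias returns None, which is the case the paragraph above warns about: the reference is present in the document but useless to the downstream recipient.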

Specifying the method of communication with the vocabulary server is important because there are multiple concurrent efforts to develop software that could potentially play the role of such a vocabulary server, including:

- LexGRID ( http://informatics.mayo.edu/LexGrid/index.php?page=2 )

- DAS Server ( http://www.sanger.ac.uk/Software/analysis/das/ )

- NCI Metathesaurus ( http://ncim.nci.nih.gov/ncimbrowser ) & the UMLS Knowledge Source Server ( https://login.nlm.nih.gov/cas/login?service=http://umlsks.nlm.nih.gov/uPortal/Login )

- PHIN VADS at the CDC (as suggested by Dr. Dolin) ( http://phinvads.cdc.gov/ )

- IHE/SVS - Sharing Value Set profile / implementation of subset of CTS2.0 ( http://www.ihe.net/Technical_Framework/upload/IHE_ITI_TF_Supplement_Sharing_Value_Sets_SVS_TI_Draft_2009-08-10.pdf )

- Open Biomedical Ontology Foundry ( http://www.obofoundry.org/ )

- TruData server at Weill Cornell Medical College ( http://loinc.org/adopters/weill-medical-college-of-cornell-university.html/ )

- The Med dictionary at the Columbia / Presbyterian Hospital ( http://med.dmi.columbia.edu/ )

...Even a slightly modified LDAP server employing the ISO 11179 standard could be used to respond to definition and value set queries, as described in a NASA presentation. ( http://trs-new.jpl.nasa.gov/dspace/bitstream/2014/16243/1/00-2199.pdf )

Since all of the above efforts are intended to be externally accessible, there is no reason to dictate one method of communication. The only need is to allow the author to specify what the method is, as there will be relatively few "vocabulary server types" out there. Allowing the author to specify the method of communication keeps CDA future-proof.
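On the recipient side, supporting a small number of "vocabulary server types" could look like a simple dispatch on the method name. The Python sketch below is an assumption-laden illustration: the method names "ExampleRest" and "ExampleLdap", the request shapes and the URL layout are all hypothetical, and nothing here reflects the actual interface of any server listed above. It only plans a lookup and performs no network access.

```python
# Hypothetical request builders, one per supported "vocabulary server type".
def build_rest_lookup(authority, code):
    return {"transport": "http-get",
            "target": authority.rstrip("/") + "/codes/" + code}


def build_ldap_lookup(authority, code):
    return {"transport": "ldap-search",
            "target": authority,
            "filter": "(cn=" + code + ")"}


# Maps a codeSystemAuthorityMethodOfCommunication value to its builder.
# New server types are supported by adding an entry, not by changing CDA.
HANDLERS = {
    "ExampleRest": build_rest_lookup,
    "ExampleLdap": build_ldap_lookup,
}


def plan_lookup(authority, method, code):
    """Describe the request a recipient would send for one code."""
    handler = HANDLERS.get(method)
    if handler is None:
        raise ValueError("unsupported method of communication: " + method)
    return handler(authority, code)


plan = plan_lookup("https://vocab.example.org/", "ExampleRest", "LOCAL-001")
print(plan)
```

Because the method is named in the document instance, a recipient that meets an unfamiliar method can at least report exactly which one it is missing.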


Institutions often develop internal vocabularies that remain isolated and inaccessible to outsiders, even if the outsiders receive documents from such an institution. Permitting organizations to reference their own externally accessible vocabulary servers (or to reference outside servers for some or all codes, such as a College of American Pathologists Cancer Checklist server, a LexGRID server, or a DAS server) provides a starting point for the next step: mapping all these vocabularies to each other without dealing with most of the bureaucracy of each institution just to access the scattered vocabulary fragments. An example of such mapping efforts can be found here: http://bioportal.bioontology.org/ . These same referenced vocabulary servers would then eventually become able to respond to mapping queries (as TruData currently can), and that would truly help health-care IT integration efforts.

Recommended Action Items

Add codeSystemAuthority="URLorAnotherMeansOfDirectContact" and codeSystemAuthorityMethodOfCommunication="MethodName" or a variation of these to the CDA R3.


March 23, 2010: Looks to be an issue that warrants Vocab WG discussion. May be a useful extension to the data types specification, but more likely is a useful extension to the set of metadata used to characterize a code system. Opposed: 0; Abstain: 0; In favor: 6.