This wiki has been migrated to Confluence.

FHIR Xml Page

From HL7Wiki
Revision as of 13:44, 26 September 2012 by Ewoutkramer (talk | contribs)

Mappings

I think all mappings should be maintained as distinct files - including RIM and R2 mappings. The RIM mapping file will simply be one of the required source files.

Rationale is that all RIM mappings are highly dependent on the English definitions for the properties. If the definition changes, then the mapping should probably be looked at again. A mapping should therefore consist of 4 columns:

1. Element path being mapped
2. Definition of that element path at time of mapping
3. Mapping expression
4. Comments about the mapping

Presume we'd have a convention of naming the mapping CSV files something like "[resourceName]-mapping-[targetSpec].csv" where targetSpec could be RIM, v2, etc.

We'll need a convention on how to name mappings to other common specs, as well as a convention on how to reference particular versions of specs.

The import validation process should spit out warnings if the current definition for the resource differs from the definition in the mapping file. Ideally it should flag the content in the HTML output too, perhaps with "?" in front or something. That will be the cue to the committees to go and look at the mapping and make sure it's correct, and when it is, copy the new definition into the mapping file. (Will have to ensure this is part of training.)
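The drift check described above can be sketched in a few lines. This is illustrative only, not part of any real build tool: the CSV layout follows the four-column proposal on this page, and the element paths, definitions, and file contents are invented for the example.

```python
import csv
import io

# Current definitions, keyed by element path (in practice these would be read
# from the resource source files). All values here are hypothetical.
current_definitions = {
    "Person.name": "A name associated with the person.",
    "Person.gender": "Administrative gender.",
}

# Sample contents of a hypothetical "person-mapping-RIM.csv", using the
# four proposed columns: path, definition-at-mapping-time, mapping, comments.
mapping_csv = """path,definition,map,comments
Person.name,A name associated with the person.,./name,direct mapping
Person.gender,Gender for administrative purposes.,./administrativeGenderCode,check vocab
"""

def check_mapping(csv_text, definitions):
    """Return (path, flagged) pairs; flagged rows get the '?' cue in HTML output."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stale = definitions.get(row["path"]) != row["definition"]
        if stale:
            print(f"WARNING: definition drift for {row['path']} - re-verify mapping")
        results.append((row["path"], stale))
    return results

flags = check_mapping(mapping_csv, current_definitions)
```

Here the second row would be flagged, because the definition captured at mapping time no longer matches the current definition.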

When we display mappings, I think we should just have a link that takes you to a separate page where the mappings for each element are listed. For some resources, we could have 20 different types of mappings. Having a simple list of links at the bottom of the resource would be easiest. That's probably appropriate even for v3, as a limited number of people will care to look at it.

Ideally, mappings should conform to the ITS Project's neutral mapping language.

Translations

At some point we're going to need to do translations. Not a super high priority, but we need to understand how we'll do it.

For resource content, language translation files should be named [resourceName]-translation-[languageCode].csv

Each translation file will contain the following columns:

1. Element path being translated
2. Type of element being translated (name, definition, aliases, notes, etc.)
3. Current element English value (at time of translation)
4. Translated text

The import process should spit out a warning if the value at time of translation doesn't match the current value, and flag the published text with "?" to indicate that the translation needs verifying and the translation file needs updating with the new text value.
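The "?"-flagging step could look like the following sketch. File name, column names, and all content are hypothetical, following the proposal above; only the mechanism (compare the English value captured at translation time against the current one, and prefix stale translations with "?") comes from this page.

```python
import csv
import io

# Current English values, keyed by (element path, element type).
# Invented for the example.
current_english = {
    ("Person.name", "definition"): "A name associated with the person.",
}

# Sample contents of a hypothetical "person-translation-de.csv".
translation_csv = """path,type,english,translated
Person.name,definition,A name for the person.,Ein Name der Person.
"""

def published_text(csv_text, english_values):
    """Return translated text per element, '?'-prefixed where the source drifted."""
    out = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["path"], row["type"])
        text = row["translated"]
        if english_values.get(key) != row["english"]:
            text = "?" + text  # flag: English text changed since translation
        out[key] = text
    return out

texts = published_text(translation_csv, current_english)
```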

For resource html files, presume we can get away with [resourceName]-[languageCode].htm

For the FHIR spec itself, presume affiliates can set up a parallel site with a slightly modified url (e.g. www.hl7.org/fhir/de/introduction.htm) and then just translate each of the .htm files that make up the site. We'd generate the translated resources to drop into those directories too.

Aliases

We'd said that each resource would include a list of aliases to capture other terms by which it was sometimes known. I don't see this as a column.

Ontology

In the past, Grahame has said that FHIR operates as an ontology. However, none of the web documentation discusses that or what it means. If we've discarded that idea, that's ok. (Though I think extensions, at minimum, work as some sort of terminology.) If we're retaining it, we should describe what it means, even if just an isolated section that only the die-hards will read. We also need to think about exactly what sort of relationships will exist in the ontology. At the moment, I think we just have "part of" and "references". Do we need any others?

  • LM: From conversation w/ Grahame: This will partly be handled via the RIM mappings. It's partly handled by the fact that all elements can be expressed in OWL and viewed as name-value pairs if you want. Further work on this later.


Narrative

Status Flag

Currently, three values are allowed:

  • generated: The contents of the narrative are entirely generated from the structured data in the resource.
  • extensions: The contents of the narrative are entirely generated from the structured data in the resource, and some of the structured data is contained in extensions
  • additional: The contents of the narrative contain additional information not found in the structured data

These values aren't sufficient. We need a value for manual ("additional" was sort of supposed to cover that, but you can't tell whether it actually means additional or not, and there's a difference between manual and additional). Also, there's allowance for empty placeholder text, so there should be a code for that case too. I therefore propose two additional codes:

  • manual: the contents of the narrative were manually authored or edited
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"

LM: I'd like the definitions revised before voting:

  • manual: the contents of the narrative were manually authored or edited and thus the relationship between discrete data and text cannot be confirmed
  • additional: The contents of the narrative are entirely generated and contain additional information not found in the structured data

(need to clearly disambiguate 'additional' from 'manual')

There's a slight issue in that "additional" and "extensions" are orthogonal. However, I'm not clear enough on the use of this information to know whether we need additional codes to handle the various combinations.

GG: clarified the definitions - they are not orthogonal

Voting on the following options then:

  • generated: The contents of the narrative are entirely generated from the structured data in the resource.
  • extensions: The contents of the narrative are entirely generated from the structured data in the resource, and some of that structured data is contained in extensions
  • additional: The contents of the narrative are entirely generated and contain additional information not found in the structured data and extensions
  • manual: the contents of the narrative were manually authored or edited and thus the relationship between discrete data and text cannot be confirmed
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"

RI: I think we are mixing two concepts here (and they should be separate):

  1. How the narrative was created
  2. The semantic relationship (equivalence) between the narrative and the structured data

I suggest two flags:

  • created (values of "auto" and "manual")
  • equivalence (values of "equal", "more", "less", "none")

I also suggest that created is optional and equivalence is mandatory.

GG: but if the narrative was created manually, how do you know what the equivalence is? (Also, no equivalence isn't the same as no text. I don't think it's quite equivalence). And your options don't cater for whether the narrative covers the extensions or not.

RI: Isn't it irrelevant "how" the narrative was created? The more important point is its semantic equivalence to the structured data. Instead of "none" we could have "notused". We know (without any flags) whether there are extensions, so that flag doesn't tell us anything new. If the narrative doesn't cover the extensions, then you would use "less".

GG: well, no, I think it's very relevant whether it's human authored or machine generated. Very important indeed. With regard to extensions, I think the logic of considering them was that if the narrative is generated, you can be confident you fully understand the resource; but if it was generated from data in the extensions as well, you need to understand those extensions to be fully confident. The problem with "less" is that it doesn't tell you which part is missing. And it's never fully equal, because there's always computable stuff that doesn't go in the narrative. But having written this paragraph, I'm not overly impressed by the original argument.

LM: I had been thinking similarly to RI initially, but came to the conclusion that the two concepts are too intrinsically tied. If something is manual, there is no way to know what it contains - at least not safely. So you have to assume it contains additional content that isn't encoded, and also that there may be important discrete data not rendered in the text. Equivalence can only be asserted where rendering has occurred. Let's go to use cases. The only reason we care about this flag at all is so applications know what they need to do with the text:

  1. You can ignore the text and display the discrete data by itself however you like because there's nothing in the text that isn't also in the discrete data. (Text must have been generated for this to be true.) There are three sub-cases:
    1. You only need to worry about resource content, as the text doesn't cover anything in extensions
    2. You can't throw away the text if you don't also understand the "must understand" extensions because some of the text content comes from them. (I presume non-must understand extensions aren't relevant?)
    3. There's nothing useful in the text at all
  2. You can throw away the text or the discrete data. They both convey exactly the same information
  3. If you wish, you can throw away the discrete data because there's nothing discrete that isn't fully exposed in the text, but there's stuff in the text that isn't in the discrete data. (Again, text must have been generated for this to be true.)
  4. You need to display both the text and discrete data that's relevant to your system because neither can be certain to be a complete representation of essential information. (Could be generated or manual)

To support this, I'd suggest the following:

  • generated-core: the text is generated solely from discrete data expressed in core resource elements, though it may not incorporate all discrete data
  • generated-extensions: the text is generated solely from discrete data expressed in core resource elements and extensions, though it may not incorporate all discrete data
  • generated-complete: the text is generated and represents exactly the set of discrete data expressed in the core resource elements and extensions
  • generated-additional: the text is generated and represents the set of discrete data expressed in the core resource elements and extensions as well as additional information not conveyed as discrete elements
  • supplemental: the text cannot be guaranteed to be either a subset or a superset of the discrete data. This includes manually entered text
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"

RI: So let's work through LM's list to see what you would do as a recipient system. That is, of the two options (structured vs. narrative), where is the "most" information that is clinically safe to render?

  • nothing - render the structured data
  • supplemental - render the structured data
    • LM: Renders structured data & text
  • generated-additional - render the narrative
  • generated-complete - Either is OK
  • generated-extensions - render the structured data
    • LM: Render the structured data if you understand all extensions, otherwise render structured data & narrative
  • generated-core - render the structured data

Is this correct? (Perhaps we are looking at this from the wrong direction... and we just need a "renderMe" flag on either the structured data or the narrative....)

  • LM: Not sure it's that simple. We can't tell apps to render because not all apps do. And as indicated above, sometimes it depends on what an application supports whether it needs to render or not.
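RI's list, amended with LM's annotations, can be written as receiver-side decision logic. This is illustrative pseudologic over LM's proposed codes, not part of the FHIR specification; "both" covers the cases where neither form alone is known to be complete.

```python
# Decide what a recipient system should render, per RI's worked list above
# (with LM's amendments for "supplemental" and "generated-extensions").
def what_to_render(status, understands_all_extensions=False):
    if status == "nothing":
        return "structured"      # no useful text at all
    if status == "supplemental":
        return "both"            # LM: render structured data & text
    if status == "generated-additional":
        return "narrative"       # text is a superset of the discrete data
    if status == "generated-complete":
        return "either"          # text and discrete data are equivalent
    if status == "generated-extensions":
        # LM: structured data suffices only if all extensions are understood
        return "structured" if understands_all_extensions else "both"
    if status == "generated-core":
        return "structured"
    raise ValueError(f"unknown narrative status: {status!r}")
```

Writing it out this way shows LM's point: for "generated-extensions" the answer depends on what the receiving application supports, so a simple "renderMe" flag set by the sender can't capture it.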

dataAbsentReason

During the thread on conformance expectations for data types, a question was asked that I don't think had a good answer at the time, around the use cases for the values of dataAbsentReason.

Here's my take on this:

 unknown 	The value is not known

This is a bit pro forma, really. It's an assertion that the value is missing because the value is unknown, as opposed to some other reason - i.e. one of the other options below. So in actual use it's defined as "not one of the other reasons". What difference does it make? Well, to a human it might make some difference as to whether or where they should bother trying to pursue the missing information. Functionally, it's an efficiency question (and this stuff actually matters, because care providers spend a lot of time hunting for missing information).

The next three are sub-types of unknown, where an additional reason for unknown is provided

   asked 	The source human does not know the value

So for a clinical user, this is useful information - there's no point trying to find this information anywhere but the source human - very usually the patient - and you might draw a blank there.

   temp 	There is reason to expect (from the workflow) that the value may become known

Again, this is useful - look, we don't know this information because we haven't got to that yet. We *might* in the future. So if this is an old record, you could try again in a more recent one. Or maybe you can infer that the workflow terminated prematurely for some reason.

   notasked	The workflow didn't lead to this value being known

This is a variation on the previous one - we didn't ask the patient. This might be due to conscious choice, or simply that the question didn't come up, but it differs from the previous one in that there's an assumption that we're done and not going to ask. If you can't differentiate between this case and the last one, it's simply unknown.

 masked 	The information is not available due to security, privacy or related reasons

Generally, if you're suppressing information for security/access control/privacy reasons, the last thing you do is tell people you've done that. Saying "we're not going to give you this person's psych records" is nearly as bad as handing them over. But medicine has its special cases, commonly referred to as emergency access, and this value exists to support that - i.e. we could provide this information if you invoked some special access. Personally, I'm suspicious of this item: I can't imagine this kind of message being conveyed at the data item level, only at the business exchange level. I'd like to drop it (prediction: Lloyd is going to explain just why this is absolutely critical and belongs in the core).

 unsupported 	The source system wasn't capable of supporting this element

This is included because the single most common reason that some information can't be provided is that the source system simply doesn't track this piece of data. A case in point here in Australia, a running topic, is "comments on a medication". There's a group of clinical users who think it's a critical safety feature to have a slot for general comments on a medication - to explain things that don't fit anywhere else: compliance notes, explanations of choice, etc. But most systems don't have this field, so people can't make a comment about the medication. How, then, can a receiver know whether the field might have had a value if one were possible? That's what this value is for. In a UI, you might display a hint like "the source system could not provide this information".

 astext 	The content of the data is represented as text (see below)

This is a FHIR-specific one - the value we have doesn't fit into the typed element that is available. In v3, we provide originalText for this, but in FHIR there's narrative - so this says: the value of this element is text, see the narrative. This needs exercising to see how well it works out. And it's a note to the system: perhaps in a UI, you might provide a hint that says "this information is only available by going here" or something.

 error 	Some system or workflow process error means that the information is not available

The poster child for this is in a lab, where the sample runs out or is dropped, so the result is not available due to error. However, there are a lot of other cases: "error" means that for some reason outside the normal expected flow, this result is not available. Typically there will be more information elsewhere, but this value helps data extraction, because the absence is marked on the piece of data itself - so you keep it, along with a reference to the source material.
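Since several of the rationales above end with a suggested UI hint, the whole set can be summarized as a lookup table. The hint wording below is invented for this sketch; only the codes and their meanings come from the discussion above.

```python
# Illustrative mapping from the dataAbsentReason codes discussed above to the
# kind of UI hint each one suggests. Hint text is hypothetical.
DATA_ABSENT_HINTS = {
    "unknown":     "Value not known (no more specific reason given).",
    "asked":       "The source (usually the patient) was asked and did not know.",
    "temp":        "Not known yet; the workflow may still produce a value.",
    "notasked":    "The workflow never led to this value being captured.",
    "masked":      "Withheld for security/privacy reasons (special access may apply).",
    "unsupported": "The source system could not provide this information.",
    "astext":      "Only available as text - see the resource narrative.",
    "error":       "Not available due to a system or workflow error.",
}

def hint_for(code):
    """Return a display hint for a dataAbsentReason code."""
    return DATA_ABSENT_HINTS.get(code, "Value absent for an unrecorded reason.")
```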

Note that the last two are CDA-like concepts that don't really make sense unless there's a narrative.