
Difference between revisions of "FHIR Xml Page"

From HL7Wiki
Revision as of 16:24, 1 June 2012

Mappings

I think all mappings should be maintained as distinct files - including RIM and R2 mappings. The RIM mapping file will simply be one of the required source files.

Rationale is that all RIM mappings are highly dependent on the English definitions for the properties. If the definition changes, then the mapping should probably be looked at again. A mapping should therefore consist of 4 columns:

  1. Element path being mapped
  2. Definition of that element path at time of mapping
  3. Mapping expression
  4. Comments about the mapping

Presume we'd have a convention of naming the mapping CSV files something like "[resourceName]-mapping-[targetSpec].csv" where targetSpec could be RIM, v2, etc.

We'll need a convention on how to name mappings to other common specs, as well as convention on how to reference particular versions of specs.

The import validation process should spit out warnings if the current definition for the resource differs from the definition in the mapping file. Ideally it should flag the content in the HTML output too, perhaps with "?" in front or something. That will be the cue to the committees to go and look at the mapping and make sure it's correct, and when it is, copy the new definition into the mapping file. (Will have to ensure this is part of training.)
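The staleness check described above could be sketched roughly as follows. This is a minimal illustration, not the actual FHIR build tooling: the function name `find_stale_mappings`, the CSV header names, the in-memory `current_definitions` dictionary, and the sample rows are all assumptions made for the example.

```python
import csv
import io


def find_stale_mappings(mapping_csv, current_definitions):
    """Return element paths whose definition changed since the mapping was written.

    mapping_csv: text of a four-column mapping file (hypothetical header names).
    current_definitions: dict of element path -> current English definition.
    """
    stale = []
    for row in csv.DictReader(io.StringIO(mapping_csv)):
        path = row["path"]
        current = current_definitions.get(path)
        # A missing element or a changed definition both call for a review.
        if current is None or current.strip() != row["definition"].strip():
            stale.append(path)
    return stale


# Invented content for a file such as "Person-mapping-RIM.csv".
example = (
    "path,definition,mapping,comments\n"
    "Person.name,A name for the person,Person.name,\n"
    "Person.dob,Date of birth,Person.birthTime,check precision\n"
)
current = {
    "Person.name": "A name for the person",
    "Person.dob": "The date and time of birth",
}
for path in find_stale_mappings(example, current):
    print(f"WARNING: definition of {path} changed; re-check mapping")
```

Comparing the stored definition against the current one is what lets the build warn without understanding the mapping expression itself.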

When we display mappings, I think we should just have a link that takes you to a separate page where the mappings for each element are listed. For some resources, we could have 20 different types of mappings. Having a simple list of links at the bottom of the resource would be easiest. That's probably appropriate even for v3, as a limited number of people will care to look at it.

Ideally, mappings should conform to the ITS Project's neutral mapping language.

Translations

At some point we're going to need to do translations. Not a super high priority, but we need to understand how we'll do it.

For resource content, language translation files should be named [resourceName]-translation-[languageCode].csv

Each translation file will contain the following columns:

  1. Element path being translated
  2. Type of element being translated (name, definition, aliases, notes, etc.)
  3. Current element English value (at time of translation)
  4. Translated text

The import process should spit out a warning if the value at time of translation doesn't match the current value, and should flag the published text with "?" to indicate that the translation needs to be verified and the translation file updated with the new text value.
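As a rough sketch of the naming convention and the "?" flag for translations, assuming the flag is simply prefixed to the published text: the helper names `translation_file_name` and `published_text` are hypothetical, and the German sample value is invented.

```python
def translation_file_name(resource_name, language_code):
    """Build a name following [resourceName]-translation-[languageCode].csv."""
    return f"{resource_name}-translation-{language_code}.csv"


def published_text(english_at_translation, current_english, translated):
    """Return the translated text, prefixed with "?" if it may be out of date."""
    if english_at_translation.strip() != current_english.strip():
        # The English source changed after translation: flag for verification.
        return "?" + translated
    return translated


print(translation_file_name("Person", "de"))
print(published_text("Date of birth", "Date of birth", "Geburtsdatum"))
print(published_text("Date of birth", "Date and time of birth", "Geburtsdatum"))
```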

For resource html files, presume we can get away with [resourceName]-[languageCode].htm

For the FHIR spec itself, presume affiliates can set up a parallel site with a slightly modified url (e.g. www.hl7.org/fhir/de/introduction.htm) and then just translate each of the .htm files that make up the site. We'd generate the translated resources to drop into those directories too.

Aliases

We'd said that each resource would include a list of aliases to capture other terms by which it was sometimes known. I don't see this as a column.

Ontology

In the past, Grahame has said that FHIR operates as an ontology. However, none of the web documentation discusses that or what it means. If we've discarded that idea, that's ok. (Though I think extensions, at minimum, work as some sort of terminology.) If we're retaining it, we should describe what it means, even if just an isolated section that only the die-hards will read. We also need to think about exactly what sort of relationships will exist in the ontology. At the moment, I think we just have "part of" and "references". Do we need any others?

Narrative

Status Flag

Currently, three values are allowed:

  • generated: The contents of the narrative are entirely generated from the structured data in the resource.
  • extensions: The contents of the narrative are entirely generated from the structured data in the resource, and some of the structured data is contained in extensions
  • additional: The contents of the narrative contain additional information not found in the structured data

These values aren't sufficient. We need a value for manual ("additional" was kind of supposed to cover that, but you can't tell whether it actually means additional or not, and there's a difference between manual and additional). Also, there's an allowance for empty placeholder text, and there should be a code for that case too. So I propose two additional codes:

  • manual: the contents of the narrative were manually authored or edited
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"

LM: I'd like the definitions revised before voting:

  • manual: the contents of the narrative were manually authored or edited and thus the relationship between discrete data and text cannot be confirmed
  • additional: The contents of the narrative are entirely generated and contain additional information not found in the structured data

(need to clearly disambiguate 'additional' from 'manual')

There's a slight issue in that "additional" and "extensions" are orthogonal. However, I'm not clear enough on the use of this information to know whether we need additional codes to handle the various combinations.

GG: clarified the definitions - they are not orthogonal

Voting on the following options then:

  • generated: The contents of the narrative are entirely generated from the structured data in the resource.
  • extensions: The contents of the narrative are entirely generated from the structured data in the resource, and some of that structured data is contained in extensions
  • additional: The contents of the narrative are entirely generated and contain additional information not found in the structured data and extensions
  • manual: the contents of the narrative were manually authored or edited and thus the relationship between discrete data and text cannot be confirmed
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"

RI: I think we are mixing two concepts here (and they should be separate):

  1. How the narrative was created
  2. The semantic relationship (equivalence) between the narrative and the structured data

I suggest two flags:

  • created (values of "auto" and "manual")
  • equivalence (values of "equal", "more", "less", "none")

I also suggest that created is optional and equivalence is mandatory
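RI's two-flag proposal could be modelled along these lines. This is only a sketch of the suggestion as stated, with `created` optional and `equivalence` mandatory; the class names are invented for illustration and are not part of any FHIR schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Created(Enum):
    AUTO = "auto"
    MANUAL = "manual"


class Equivalence(Enum):
    EQUAL = "equal"
    MORE = "more"
    LESS = "less"
    NONE = "none"


@dataclass
class NarrativeFlags:
    equivalence: Equivalence           # mandatory in the proposal
    created: Optional[Created] = None  # optional in the proposal


flags = NarrativeFlags(Equivalence.LESS)
print(flags.equivalence.value, flags.created)
```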

GG: but if the narrative was created manually, how do you know what the equivalence is? (Also, no equivalence isn't the same as no text. I don't think it's quite equivalence). And your options don't cater for whether the narrative covers the extensions or not.

RI: Isn't it irrelevant "how" the narrative was created? The more important point is its semantic equivalence to the structured data. Instead of "none" we can have "notused". We know (without any flags) if there are extensions, so the flag does not tell us anything new. If the narrative does not cover the extensions, then you would use "less".

GG: well, no, I think it's very relevant whether it's human authored or machine generated. Very important indeed. With regard to extensions, I think the logic of considering them was that if the narrative is generated, you can be confident you fully understand the resource. But if it was generated from data in the extensions as well, you need to understand them to be fully confident. The problem with "less" is that it doesn't tell you which less. And it's never full, because there's always computable stuff that doesn't go in the narrative. But having written this paragraph, I'm not overly impressed by the original argument.

LM: I had been thinking similarly to RI initially, but came to the conclusion that the two concepts were too intrinsically tied. If something is manual, there is no way to know what it contains, at least not safely. So you have to assume it contains additional content not encoded, and also have to assume that there may be important discrete data not rendered in the text. Equivalence can only be asserted in situations where rendering has occurred. Let's go to use-cases. The only reason we care about this flag at all is so applications know what they need to do with the text:

  1. You can ignore the text and display the discrete data by itself however you like because there's nothing in the text that isn't also in the discrete data. (Text must have been generated for this to be true.) There are three sub-cases:
    1. You only need to worry about resource content, as the text doesn't cover anything in extensions
    2. You can't throw away the text if you don't also understand the "must understand" extensions because some of the text content comes from them. (I presume extensions that aren't "must understand" aren't relevant?)
    3. There's nothing useful in the text at all
  2. You can throw away the text or the discrete data. They both convey exactly the same information
  3. If you wish, you can throw away the discrete data because there's nothing discrete that isn't fully exposed in the text, but there's stuff in the text that isn't in the discrete data. (Again, text must have been generated for this to be true.)
  4. You need to display both the text and discrete data that's relevant to your system because neither can be certain to be a complete representation of essential information. (Could be generated or manual)

To support this, I'd suggest the following:

  • generated-core: the text is generated solely from discrete data expressed in core resource elements, though it may not incorporate all discrete data
  • generated-extensions: the text is generated solely from discrete data expressed in core resource elements and extensions, though it may not incorporate all discrete data
  • generated-complete: the text is generated and represents exactly the set of discrete data expressed in the core resource elements and extensions
  • generated-additional: the text is generated and represents the set of discrete data expressed in the core resource elements and extensions as well as additional information not conveyed as discrete elements
  • supplemental: the text cannot be guaranteed to be either a subset or a superset of the discrete data. This includes manually entered text
  • nothing: the contents of the narrative contain something equivalent to "No human readable text provided for this resource"
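One way to read the proposed codes against the use-cases above is as a small decision table that tells an application what it may safely drop. The sketch below is an interpretation rather than something the discussion spells out: the pairing of each code with a use-case, and the function names `may_discard_text` and `may_discard_data`, are assumptions.

```python
ALL_CODES = (
    "generated-core", "generated-extensions", "generated-complete",
    "generated-additional", "supplemental", "nothing",
)


def may_discard_text(status, understands_extensions=False):
    """True if the application may ignore the narrative (use-cases 1 and 2)."""
    if status not in ALL_CODES:
        raise ValueError(f"unknown narrative status: {status}")
    if status in ("generated-core", "generated-complete", "nothing"):
        return True
    if status == "generated-extensions":
        # Text may draw on extensions, so it is only safe to drop
        # if the application understands those extensions.
        return understands_extensions
    # generated-additional and supplemental: text may carry extra content.
    return False


def may_discard_data(status):
    """True if the text fully exposes the discrete data (use-cases 2 and 3)."""
    if status not in ALL_CODES:
        raise ValueError(f"unknown narrative status: {status}")
    return status in ("generated-complete", "generated-additional")


for code in ALL_CODES:
    print(code, may_discard_text(code), may_discard_data(code))
```

Under this reading, "supplemental" is the only code for which neither the text nor the discrete data can be dropped, which matches use-case 4.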