Difference between revisions of "FHIR editing and versions (old)"

From HL7Wiki
Latest revision as of 16:39, 7 September 2012

Introduction

Try as we might, any question regarding versions, authorship, updates and identity sooner or later asks for a consistent view of what it is that we record, and what it means to record something. And as much as I like to be a pragmatic developer, I am afraid we will have to digress a bit into that murky subject of ontology. Lloyd had fun when Yeb started to ask how to express the fact that Mr. X recorded that Miss Y observed that some device measured Mr. A's body temperature to be 37 degrees Celsius. What here was a fact? Is there such a thing as a fact, or is anything we record just a snapshot of our world at some point in time? I think the current RIM has nice tools to handle most of these cases, but if you keep drilling deeper, it's never enough. Let's take the liberty to stop worrying at some point, and leave the rest to the ontologists, who have been enjoying this ever since the Greeks. Ontologists have the freedom to just pose nasty questions and leave it at that. Unfortunately, in the end, implementors will have to run their CREATE TABLEs.

On Modelling "reality"

We should also avoid too much navel-gazing when it comes to assigning properties to things. Does a person have an address, or should we record the fact that he lived somewhere at some point in time? Do I actually have a name, or is that just something that my dad registered for me at the town hall the morning of my birth? There is no definitive way to model reality as a software artifact. Here, it is appropriate to quote Thomas Beale from "The openEHR Modelling Guide":

"One crucial point to understand about modelling is that the semantics of all definitions in a model constitute statements about the informational (or behavioural) entities defined by the relevant classes, and no more. (...) Thus, any concept in a model, (...) should not be understood as being a description of a concept in the real world, but as a formal, abstract model of a concept as agreed by the modellers."

The openEHR and EN13606 Reference Models support this notion strongly, modelling information structures rather than the "real world". Real-world entities are instead mapped onto the reference model using archetypes. The RIM tries to stay closer to the domain model, but in doing so encourages people to think that the classes in the RIM are the equivalent of the concepts in the domain. There are real benefits in staying close to a domain model, such as easier comprehension of the modelled information by the clinicians involved, and a more natural mapping from the real world to its derived information, as amply described in Eric Evans' "Domain-Driven Design" book. However, we as modellers should keep in mind that we are producing classes that contain information we wish to retain about their real-world counterparts. This gives us the freedom to collapse a comprehensive set of concepts in reality to a single information property, much like terminologists collapse whole concepts to a single term. It should therefore no longer be a surprise that we can model a single real-world concept in two different ways using the RIM, or that one concept in the RIM covers two separate concepts in the real world. Anyone who has ever modelled questionnaires as deeply nested Observations understands that some data is in the end just that: data.

  • LM: Where things get fuzzy is when you create an electronic order or document where its entry and representation in the system is the only "concrete" representation of that real thing. However, even in that circumstance, the FHIR instance is still not the "real" thing - it's an encoded representation of the real thing. It's the data, not the reality. That data can drive real behaviors though, even if the data is wrong.

The real world, information and records

So far, I have tried to separate our notion of the things in the real world and the information we like to retain about these things in our systems, the "informational entities" as Thomas Beale calls them. But how does information about the world get into our systems? Who is "transcribing" reality into our predefined informational entities? Ideally, this involves a kind of fictional, objective observer, looking at a process going on from behind a glass wall, his all-seeing senses registering reality and translating it into data. He's objective in that he neither looks at nor draws conclusions from the read-outs of our devices, but instead records the interpretations of the clinician. He's solely responsible for the recording. The observer not only produces the informational entities, but also maintains a fair bit of information about himself and his process of recording. When did he record the information? Did he later on find mistakes in the recording that needed correction? Is the record still relevant or valid? Did the clinician agree with his recording of facts? This is information about the recording, not the record itself.

This distinction is relevant to us and I think the RIM did not make this point clear enough. Implementors got into problems when trying to figure out what it means to have multiple recordings or versions of an Act or Entity in the database and, for example, whose status Act.status is actually trying to record. Obviously, an Act can be "completed", but how can it be "new" or "created in error"? This is a clear case where information about the act ("completed") and information about the record of the act ("created in error") are confused.

Versions

The concept of an observer also guides us to define what a "version" of a record is in the context of FHIR. Generally speaking, a new version emerges when the observer makes a new record of the same observed fact or thing. The meaning of the simple sentence "the same fact" and thus the word "version" is connected to a distinction between two categories of real world things for which we record information:

  • LM: Something else to consider is that objects can change all by themselves based on pre-defined rules. E.g. a suspended drug scheduled to become active again, a prescription that transitions to "complete" when the patient would be presumed to have finished.

The first is the category of things that exist only for a small amount of time and are part of processes. These things occur at some moment, and are gone the next. Examples are measurements, opinions and procedures. We can record data about these "events", but if we want to go back to them later to gather more information, we need to rely on memory or more permanent artifacts like images or printouts. They are history almost immediately. Philosophers call these things "occurrents". If the event reoccurs (i.e. at the time of a new measurement), it is actually a new event and we'll get new information and a new record, possibly totally different from the recording of the old event. Conversely, if the observer makes a new transcription of an old event, we get a new _version_ of a record, which is necessarily just a correction to the existing record of facts that occurred in history.

The second category is for things that persist over time. You can look at them one moment and again at some later point in time. In between, these things might have changed, but that does not mean you are not still looking at the same thing. Examples are persons, organisations, patients and employees. Even if a person dies, his notion survives and so do his basic attributes. Another term for this category of things is "continuants". When we take a second look at such a thing, we create a new version of its record. This time, we are not correcting an old event; we are taking a fresh look at the object and thus create a new "snapshot" of its state at that time. Multiple versions thus feel more like successive "updates" than "corrections".
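To make the difference between the two categories concrete, here is a minimal sketch in Python. The names (`Kind`, `Record`, `add_version`) and the sample values are invented for illustration and are not part of FHIR or the RIM. It also simplifies: a continuant's record can of course be corrected too, the sketch only captures the default reading described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    OCCURRENT = "occurrent"    # events: measurements, opinions, procedures
    CONTINUANT = "continuant"  # persisting things: persons, organisations


@dataclass
class Record:
    """All versions recorded for one observed fact or thing (hypothetical type)."""
    kind: Kind
    versions: list = field(default_factory=list)

    def add_version(self, data: dict) -> str:
        """Store a new version and say how it relates to the previous one."""
        self.versions.append(data)
        if len(self.versions) == 1:
            return "initial"
        # A past event cannot change, so re-recording it can only correct the record.
        # A persisting thing may have changed, so re-recording it is a fresh snapshot.
        return "correction" if self.kind is Kind.OCCURRENT else "update"


temperature = Record(Kind.OCCURRENT)
temperature.add_version({"celsius": 73.0})          # mistyped reading
print(temperature.add_version({"celsius": 37.0}))   # correction

patient = Record(Kind.CONTINUANT)
patient.add_version({"city": "Amsterdam"})
print(patient.add_version({"city": "Utrecht"}))     # update
```

Note that a repeated measurement would not go through `add_version` at all: it is a new event and gets a new `Record`.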

Now, to guide us in deciding which attributes belong to the record of information and which to the recorded information, we will have to ask ourselves what would happen if there were multiple authors around transcribing the same fact or thing. Assuming full objectivity of the observers, attributes which turn out to be the same are part of the recorded data, but those which are specific to the observer (or his version of the record) become attributes of the record itself.

  • LM: Presume this corresponds to the isDocumentCharacteristic property?
  • EK: I guess it is, but I could not find clear documentation on this property, so I don't know for sure.
  • LM: Clear documentation? HL7? :> The gist is that "document characteristics" are specific to the record and are not the same across moods. Non-document characteristics define the characteristics of the intended event. So, for example, "author" and "id" are document characteristics. The author and identifier of an order are not intended to be the author and identifier of the eventual event. On the other hand, "performer" and "effectiveTime" are not document characteristics. The performer listed on the order is who is expected to perform the requested event and the effectiveTime on the order indicates when the desired event should occur.
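The multiple-observer test can be sketched as a small function. This is only an illustration under stated assumptions: `split_by_observer` is a made-up name, the sample values are invented, and whether the outcome always matches isDocumentCharacteristic is exactly the open question in the comments above. The attribute names are the ones LM mentions.

```python
from __future__ import annotations


def split_by_observer(transcriptions: list[dict]) -> tuple[set, set]:
    """Split attribute names into (recorded data, attributes of the record).

    An attribute counts as recorded data only if every observer wrote down
    the same value for it; anything that differs belongs to the record.
    """
    names = set().union(*(t.keys() for t in transcriptions))
    first = transcriptions[0]
    data = {
        name for name in names
        if all(name in t and t[name] == first.get(name) for t in transcriptions)
    }
    return data, names - data


# Two objective observers transcribe the same order (values are invented).
by_x = {"author": "mr. X", "id": "rec-1",
        "performer": "miss Y", "effectiveTime": "2012-09-07T10:00"}
by_z = {"author": "mr. Z", "id": "rec-2",
        "performer": "miss Y", "effectiveTime": "2012-09-07T10:00"}

data, record_attributes = split_by_observer([by_x, by_z])
print(sorted(data))               # ['effectiveTime', 'performer']
print(sorted(record_attributes))  # ['author', 'id']
```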

Storage and communication of records

There is yet a fourth category of data: attributes associated with the communication and/or storage of the records. This information is generated by the technical process of communication or storage and is out of scope for FHIR. Examples are "date received", "username of sender" or the collection of records received in the same transaction. Identifying these attributes is simple: imagine sending a record to another party. Data which is not influenced by this transmission is part of the record; everything else is part of the receiving system's data. Note that this data is not necessarily private to the system, as some of it might be needed to fulfill functional requirements like those specified in the EHR-SFM.
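The "imagine sending it" test can be sketched as follows. The names `StoredRecord`, `receive` and `send` are hypothetical, as are the sample values; the point is only that the record passes through unchanged while the storage attributes are regenerated by each receiving system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredRecord:
    record: dict             # the record itself: travels unchanged between systems
    date_received: datetime  # generated locally by the receiving system
    sender_username: str     # generated by the act of communication


def receive(record: dict, sender_username: str) -> StoredRecord:
    """Wrap an incoming record with this system's own storage attributes."""
    return StoredRecord(dict(record), datetime.now(timezone.utc), sender_username)


def send(stored: StoredRecord) -> dict:
    """Forward only the record; storage attributes stay behind."""
    return dict(stored.record)


original = {"type": "Prescription", "drug": "amoxicillin"}
at_system_a = receive(original, sender_username="ek")
at_system_b = receive(send(at_system_a), sender_username="system-a")

# The record survived two hops untouched; the envelope did not.
print(at_system_b.record == original)                               # True
print(at_system_a.sender_username == at_system_b.sender_username)   # False
```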

Practical consequences

FHIR will define, for each type of resource, which attributes are part of its recorded information. The resource is our "informational entity". We will informally call this set of information "the data". Any resource, however, is created by the same idealized process of recording, so FHIR defines a fixed set of attributes which belong to the record itself and are not resource-specific. We'll call this information "the metadata". Depending on the technology used to communicate a resource, the data may end up in "bodies" or "documents" while the metadata will be communicated in "headers". Indeed, these notions map easily to technologies like RESTful HTTP, ATOM RSS and hData. Users of openEHR and EN13606 will find that this separation is reflected in the distinction between archetyped data and RM classes like VERSION<T> or "Audit_info".
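As a rough sketch of how the data/metadata split could map onto RESTful HTTP: the classes and the `to_http` function below are assumptions made for illustration, not a FHIR API. `ETag` and `Last-Modified` are standard HTTP headers; `X-Version-Id` is a made-up header name, and the sample prescription is invented.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime


@dataclass
class Metadata:
    """Attributes of the record itself, the same for every resource type."""
    version_id: str
    version_number: int
    updated: datetime  # must be timezone-aware UTC for the HTTP date below


@dataclass
class Resource:
    metadata: Metadata
    data: dict  # resource-specific recorded information


def to_http(resource: Resource) -> tuple[dict, bytes]:
    """Return (headers, body): metadata goes in headers, data in the body."""
    meta = resource.metadata
    headers = {
        "Content-Type": "application/json",
        "ETag": f'"{meta.version_number}"',
        "Last-Modified": format_datetime(meta.updated, usegmt=True),
        "X-Version-Id": meta.version_id,  # hypothetical header name
    }
    return headers, json.dumps(resource.data).encode("utf-8")


prescription = Resource(
    Metadata("urn:example:version:2", 2,
             datetime(2012, 9, 7, 16, 39, tzinfo=timezone.utc)),
    {"drug": "amoxicillin", "status": "active"},
)
headers, body = to_http(prescription)
print(headers["ETag"], body.decode("utf-8"))
```

LM's worry further down applies directly to this shape: flat headers work for a single author and timestamp, but it is unclear how they would carry a chain such as "A said that B said that C did".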

Finally, we will discuss some notorious attributes and give them their place in the categories just described:

  • Identifier

Identifiers actually show up in all categories. For some real-world objects we can definitely record an informational id, like most "human" identifiers: passport numbers, order numbers, patient numbers, and so on. When we record multiple versions, each version will require its own identifier, identifying a specific "recording" of this information. This id will be communicated and is thus part of some "universal" identification of a version of a resource. Lastly, the storage system itself will probably have private, database-generated ids in the tables storing the versions.
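The three kinds of identifier could sit side by side like this. It is a sketch only: `StoredVersion`, `communicated` and the URN-style version ids are invented for the example, and the counter merely stands in for a database sequence.

```python
from dataclasses import dataclass, field
from itertools import count

_row_ids = count(1)  # stands in for a database-generated sequence


@dataclass
class StoredVersion:
    business_ids: dict  # data: ids of the real-world thing (patient number, ...)
    version_id: str     # metadata: id of this specific recording, communicated
    row_id: int = field(default_factory=lambda: next(_row_ids))  # private to storage


def communicated(version: StoredVersion) -> dict:
    """What another party sees: the first two kinds of id, never the row id."""
    return {"version_id": version.version_id,
            "identifiers": dict(version.business_ids)}


v1 = StoredVersion({"patient_number": "12345"}, "urn:example:patient-12345:v1")
v2 = StoredVersion({"patient_number": "12345"}, "urn:example:patient-12345:v2")

# Same real-world thing, two recordings, two private rows.
print(v1.business_ids == v2.business_ids)  # True
print(v1.version_id == v2.version_id)      # False
print(communicated(v2))
```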

  • Version number

This is specific to a record of information, and therefore part of the "metadata".

  • Created/Updated date

There are two sets of them: one is the date of creation/update of a specific version, and there is probably another in the receiving system, belonging to the date an actual database record was inserted. For FHIR, only the first of these two is relevant and so it is part of the "metadata".

    • LM: This is iffy. Prescription creation date seems like a piece of the record more than a piece of the metadata. Perhaps that's because, in this case, the record directly corresponds to the real-world object.
    • EK: Right. Indeed, here the record and the real-world object could align. But consider this: I could state that a Prescription is a product of a Prescription Act. This Act has its own record. Just like for example an imaging procedure, the Act leads to an artifact, which has actual meaning in the modelled domain, so is a continuant and has its own record. I guess "date taken" on an MRI-image would then be comparable to the creation date of the Prescription, so it feels like an aspect of the modelled reality, rather than part of the record...
    • LM: Actually, we'd say it's the subject of a ControlAct. All state changes and other trigger event-type things are ControlActs in the RIM. There's a special rule that says that for "activation" control acts (where you're changing the status of something to active), the author participation (including participation time) on the ControlAct is the same as that on the subject act being activated. Obviously that doesn't hold for other state changes such as suspend, resume, etc. All of these state changes themselves are "part of the record". Not sure I'm comfortable saying they're all metadata (they have authors, supervisors, data-entry people, data entry locations, responsible organizations, reasons, effective dates, occur in the context of encounters, may have associated detected issue overrides, etc.).
  • Status

There are at least two different kinds of status: one which can be observed on a process ("started") or thing ("alive"), and one which concerns the status of recorded information ("draft", "created in error"). These must not be collapsed into one attribute: the first is part of the data, the second part of the metadata.

  • LM: I think "new" can apply to the underlying action. A "new" prescription is one that has been conceived and is in the process of being activated. (e.g. The physician has started writing stuff on their pad, but not yet affixed their signature.) That's a real-world action. Agree that "created in error" probably shouldn't be part of the state machine but should be a separate attribute.
  • EK: Mmm....or is this the Act of Prescribing that is just underway?
  • LM: Sure, but the "Act of Prescribing" (the activation ControlAct) has a subject act that has information in it, and that record needs to have some sort of status.
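One way to keep the two kinds of status apart, along the lines of LM's suggestion that "created in error" be a separate attribute, is sketched below. The enum members and the `retract` helper are assumptions chosen for illustration ("final" in particular is not taken from the discussion), not a proposed state machine.

```python
from dataclasses import dataclass, replace
from enum import Enum


class ActStatus(Enum):
    """Observed on the real-world process or thing: part of the data."""
    NEW = "new"
    ACTIVE = "active"
    COMPLETED = "completed"


class RecordStatus(Enum):
    """Status of the recorded information: part of the metadata."""
    DRAFT = "draft"
    FINAL = "final"  # hypothetical value, added to complete the example
    CREATED_IN_ERROR = "created in error"


@dataclass(frozen=True)
class PrescriptionRecord:
    act_status: ActStatus
    record_status: RecordStatus


def retract(record: PrescriptionRecord) -> PrescriptionRecord:
    """Flag the recording as erroneous without touching the observed status."""
    return replace(record, record_status=RecordStatus.CREATED_IN_ERROR)


recorded = PrescriptionRecord(ActStatus.COMPLETED, RecordStatus.FINAL)
retracted = retract(recorded)
print(retracted.act_status.value, "/", retracted.record_status.value)
# completed / created in error
```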


Issues:

1. We need to discuss how we chain actions that modify a record. E.g. "cancel the request to suspend the prescription" or "A said that B said that C did".

  • EK: The first example looks to me like two events. When I used the RIM to build a prescription system, we kept the original ControlActs to keep track of this. So we had multiple records of these separate events. They would both have the original prescription as their subject, though, not the "suspend prescription", the net result being a cancelled prescription. These kinds of "decorated" acts, just like "A said that B said that C did", are something I would LOVE to avoid, unless we have a clear use case for this....
  • LM: I agree we avoid it without a use-case. But if we're going to start sticking audit stuff in HTTP headers or something, I'm worried it's going to break when the complex "A said B said C did" use-case does arise. I don't want us to adopt a solution that works for the simple case but won't scale to the complex case in those (limited) circumstances we decide the complex case needs to be supported.

2. How do we associate reasons/responsibility for changes with specific attributes rather than the whole record? E.g. "This phone number was changed because of an entry error" vs. "This phone number was changed because of a move".

  • EK: Some of your points should be covered by some yet-to-be-described way to generally handle "state" information, maybe.
  • LM: Sure. All I'm saying is we need to figure out how that will be handled. Essentially - "What will replace the ControlActRef in the HXIT datatype?" (because that's how we handle this on the v3 modelling side.)

Related materials