V2 lengths Open Discussion Document
In the v2.7 ballot, there were some comments about errors in min and max values. We have agreed to create a task force to consider this and related issues, such as how to better document and publish the min and max lengths.
This document was originally drafted by Frank in discussion with Grahame and has now become the working analysis page of the v2 lengths task force.
1 Type of lengths
Firstly, we clearly have the prospect of six different types of length:
|normative||n..m||An implementer shall take care of the length in the way it is specified - no excuse.||ID, DTM, DT, ..|
|informative||(n..m)||An implementer may take the presented length as a recommendation. He may decide to use either shorter or longer information. But the presented length has been shown to be a reasonable size in that specific context. The basis for the recommendation should be stated (in the field description).||IS, ST, TX, FT, ..|
|derived normative||empty||Same as "normative", but the length can be calculated from the underlying information, i.e. the length for a composite data type can be calculated from its components.
Note: It should be mentioned that this calculation is sometimes not that easy, esp. if conditions (co-occurrence rules) must be evaluated.
Derived information is not substantive to ballots.
|CNE, TS, PT, MSG|
|derived informative||empty||Same as "informative", but the length can be calculated.||CWE|
|depending normative||see normative||Same as "normative", but the length depends on other information. E.g. a text can be investigated if a code (in a sibling message constituent) is present.||-|
|depending informative||see informative||Same as "informative", but the length can be calculated from other information.||-|
- Frank - this isn't at all clear to me. Can you clarify what "depending xxx" is.--GrahameGrieve 16:42, 19 May 2008 (CDT)
- it's possible to get a permutational explosion of lengths when you mix normative and informative lengths on composite fields. Hence, don't report them. --GrahameGrieve 16:42, 19 May 2008 (CDT)
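The notation in the table above (bare "n..m" for normative, "(n..m)" for informative, an empty cell for derived lengths) can be read mechanically. The following is only a minimal sketch: the names LengthSpec and parse_length are hypothetical, and writing an infinite maximum as "n..*" is an assumption based on the "infinite (*)" wording used later in this document.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LengthSpec:
    minimum: int
    maximum: Optional[int]  # None stands for "infinite (*)" (assumed notation)
    normative: bool         # True for n..m, False for (n..m)


# Optional "(", digits, "..", digits or "*", optional ")"
_LENGTH_PATTERN = re.compile(r"^(\()?(\d+)\.\.(\d+|\*)(\))?$")


def parse_length(text: str) -> Optional[LengthSpec]:
    """Parse a length cell; an empty cell means the length is derived."""
    text = text.strip()
    if not text:
        return None  # derived normative / derived informative: nothing published
    match = _LENGTH_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a length specification: {text!r}")
    open_paren, low, high, close_paren = match.groups()
    if bool(open_paren) != bool(close_paren):
        raise ValueError(f"unbalanced parentheses in {text!r}")
    maximum = None if high == "*" else int(high)
    return LengthSpec(int(low), maximum, normative=not open_paren)
```

With this sketch, parse_length("4..24") yields a normative spec and parse_length("(1..20)") an informative one. The "depending" kinds are not modelled because their meaning is still under discussion.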
2 Appearance in the chapters
The length column within the attribute table and the component table will remain, but the cell will only be populated if the underlying data type is not composite.
The paragraph after the component table listing the total length will be deleted.
3 General Statements
- The min length normally starts with "1". ("0" is definitely wrong as an empty field is controlled by optionality.)
- Min length must be less than or equal to max length. In any case it should not be "infinite (*)".
- In some cases min length may be greater than "1". This is given if a text format is presented (e.g. DTM) or the underlying table values indicate a greater length (e.g. MSG).
- It should be checked whether fields/components using "TX" or "FT" should have an infinite length! (Note: we do not have that many: 24 TX data elements with 3 infinite and 7 components with 1 infinite.)
- Should ED also be checked? --Tony Julian 200805201053
- The length information from v2.6 is taken as the default recommendation for length, unless the data type indicates to make it normative.
- Normative length information may be supported by a more precise recommendation for a certain context of use. (Given the limited time remaining to prepare the N2 ballot for v2.7, the TCs may decide not to add recommendations before v2.8.)
- A small paragraph on top of each chapter should guide the reader to the sections explaining the enhanced syntax of length information.
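The first two statements above (min starts at 1, min must not exceed max, min must not be infinite) are simple enough to check automatically across all chapters. A minimal sketch, assuming a hypothetical helper check_length_range in which None stands for "infinite (*)":

```python
from typing import List, Optional


def check_length_range(minimum: Optional[int], maximum: Optional[int]) -> List[str]:
    """Return the violated general statements; None stands for infinite (*)."""
    problems: List[str] = []
    if minimum is None:
        problems.append("min length must not be infinite")
        return problems
    if minimum < 1:
        # An empty field is controlled by optionality, not by a min length of 0.
        problems.append("min length must be at least 1")
    if maximum is not None and minimum > maximum:
        problems.append("min length must not exceed max length")
    return problems
```

An empty result means the range passes these checks. It does not say whether the values themselves are sensible for the data type.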
4 Counting of special characters
Special characters using escape sequences are counted as "1" character.
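As an illustration of this counting rule, the sketch below collapses each escape sequence to a single placeholder before measuring. It assumes the default escape character (backslash) and treats everything from one backslash to the next as one sequence. The function name counted_length is hypothetical, and a hex escape that encodes several characters would still be counted as 1 here.

```python
import re

# Assumption: escape sequences look like \F\, \S\, \T\, \R\, \E\ and so on,
# i.e. a backslash, any non-backslash characters, and a closing backslash.
_ESCAPE_SEQUENCE = re.compile(r"\\[^\\]*\\")


def counted_length(value: str) -> int:
    """Length of a value where every escape sequence counts as one character."""
    return len(_ESCAPE_SEQUENCE.sub("?", value))
```

For example, the raw string "Smith \T\ Jones" occupies 15 characters on the wire but counts as 13 under this rule.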
5 Example 1: CWE
We propose to take CWE as our working example (because that was Grahame's test case when he reviewed the ballot).
What are the min/max lengths for CWE and its components?
|Seq.||Component||DT||length in ballot||correct length||comment|
|1.||Identifier||ST||(12..20)||(1..20)||The min length must be 1 because even the shortest codes must be transmittable.
It stays as informative, because the values of the underlying code system cannot be predicted. E.g. SNOMED CT with its post-coordination allows for arbitrary length.
|3.||Name of Coding System||ID||(20..20)||1..12||Where did 20..20 come from for #3, #6, #12? This is clearly wrong. The min length is 1 and the max length is 12.
|6.||Name of Alternate Coding System||ID||(20..20)||1..12|
|7.||Coding System Version ID||ST||(1..10)|
|8.||Alternate Coding System Version ID||ST||(1..10)|
|9.||Original Text :||ST||(1..199)|
|10.||Second Alternate Identifier||ST||(1..20)|
|11.||Second Alternate Text :||ST||(1..199)|
|12.||Second Name of Alternate Coding System||ID||(20..20)||1..12|
|13.||Second Alternate Coding System Version ID||ST||(1..10)|
|14.||Coding System OID||ST||(1..199)|
|15.||Value Set OID||ST||(1..199)|
|16.||Value Set Version ID||DTM||(4..24)||4..24||see example #2|
|17.||Alternate Coding System OID||ST||(1..199)|
|18.||Alternate Value Set OID||ST||(1..199)|
|19.||Alternate Value Set Version ID||DTM||(4..24)||4..24||see example #2|
|20.||Second Alternate Coding System OID||ST||(1..199)|
|21.||Second Alternate Value Set OID||ST||(1..199)|
|22.||Second Alternate Value Set Version ID||DTM||(4..24)||4..24||see example #2|
6 Note in chapter 2.A (pg.33) for CWE:
"Minimum length set to 25 to accommodate largest code found thus far in a field associated with the CWE data type, plus the length of the CWE.3 Coding System, plus separators.
The QRI-3 is associated with HL7 table 0393 which has a value 12 characters in length. The longest entry in table 0396 coding Systems is 11. It may be advisable to simply set a value of 20 for each of these components, thereby making the minimum length 42. "
This is a truly confusing note. The min length is 25 to accommodate the largest code? It has to be based on the shortest code, not the longest code. So, for example, this CWE: |U^^HL70353| is now illegal.
The min length has to be 2. Now this is something that cannot be calculated, since I had to read all the field definitions in order to determine what combination is valid. This is the shortest legal CWE we can think of: |^A|.
Recommendation: Delete this note to balloters!
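The arithmetic behind the objection can be made explicit. The sketch below measures the two CWE values quoted above against the minimum of 25 stated in the note. The helper names are hypothetical, and "^" is assumed as the component separator.

```python
from typing import List

BALLOT_MINIMUM = 25  # the minimum length stated in the chapter 2.A note


def cwe_components(value: str, separator: str = "^") -> List[str]:
    """Split a CWE value into its components (assumes '^' as separator)."""
    return value.split(separator)


def meets_minimum(value: str, minimum: int) -> bool:
    """True if the whole field value is at least `minimum` characters long."""
    return len(value) >= minimum


for example in ("U^^HL70353", "^A"):
    print(example, len(example), meets_minimum(example, BALLOT_MINIMUM))
```

Both values fall well short of 25 characters, while "^A" (empty CWE.1, text "A" in CWE.2) exactly meets the minimum of 2 argued for above.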
7 Example 2: DTM
DTM can be used as another example to demonstrate the use of informative information as a guideline to implementers. The first row presents the normative information. The second and third rows indicate alternative guidance depending on the use case:
|#||Component||DT||length in ballot||correct length||comment|
|1||4..24||Dates must have a min length of 4 according to their pattern.|
|2||(14..24)||In context of use where the timestamp probably represents a point in time a precision of seconds is recommended.|
|3||(8..24)||In context of use where the timestamp probably represents a day a precision of days is recommended.|
The first row (#1) and one of the last two rows (#2 or #3) can be combined.
Even if an additional recommendation for a certain context of use is presented, an implementer may decide not to follow the recommendation.
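How a normative length and an optional recommendation combine can be sketched as follows: the normative range 4..24 decides validity, while a recommendation such as (14..24) or (8..24) only produces a warning. The function check_dtm_length and its return shape are assumptions for illustration, not part of any proposal text.

```python
from typing import List, Optional, Tuple

NORMATIVE_DTM = (4, 24)  # normative length of DTM as discussed above


def check_dtm_length(
    value: str, recommended: Optional[Tuple[int, int]] = None
) -> Tuple[bool, List[str]]:
    """Return (valid, warnings): normative limits are hard, recommendations soft."""
    length = len(value)
    low, high = NORMATIVE_DTM
    valid = low <= length <= high
    warnings: List[str] = []
    if valid and recommended is not None:
        rec_low, rec_high = recommended
        if not rec_low <= length <= rec_high:
            warnings.append(
                f"length {length} is outside the recommended ({rec_low}..{rec_high})"
            )
    return valid, warnings
```

A year-only value such as "2008" is valid but triggers a warning against (14..24), which matches the statement that an implementer may decide not to follow the recommendation.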
8 Example Recommendations
The following may be a good example of recommendations for different min/max length:
|Data Element||Data Type||normative length||recommendation||Reason|
|MSH-7||DTM||4..24||(14..24)||The recommendation indicates that the timestamp of a message should be precise to seconds, not merely years. If an application would like to use milliseconds, this is fine.|
|CWE.16||DTM||4..24||(8..24)||This component is the version information for a code system. Most probably this is based on days. Variations may be highlighted in the field descriptions for a certain context of use.|
A problem is that if I truncate CWE.1 the results are disastrous. So I'd really like to know how long I have to make my persistent store. And from another thread, users want to put SNOMED CT expressions in CWE.1, so that would make it arbitrarily long. CWE.1 is probably a good case to look at.
Even if you have a (good) recommendation of - say - "1..100" this may be too short. Hence, you are not allowed to truncate this information in CWE.1.
But in CWE.2 you may truncate? Because it is just text. But you should definitely not truncate (1..200) at 10 or 20 characters. In catalogs, texts are designed for certain lengths. Take ICD-10 as an example. The difference from other texts may only be visible in the last characters. Maybe it is at position 50 or 90 or even higher:
- Chirurgischer Eingriff oder sonstige Maßnahme wegen fehlender Zustimmung des Patienten nicht durchgeführt
- Chirurgischer Eingriff oder sonstige Maßnahme wegen Kontraindikation nicht durchgeführt
Having said that, truncation for CWE.2 should not be allowed because there is no means to compare a truncated value with the original value, which is necessary for "coded" entries.
There are other fields/components where truncation is possible (for information like the name of a city, truncation is fine): you can truncate at length-1 and append a recognised truncation character. This is a time-honoured approach which should at least be mentioned when we get away from hard limits.
However this doesn't act as any solution to the conundrum of CWE.2 length. (let alone CWE.1)
But if you use CWE.2 without CWE.1, you do not have a clue whether another truncated string refers to the same concept or not.
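The truncation approach mentioned above (cut at length-1 and append a recognised truncation character) can be sketched as follows. The function name truncate_with_marker is hypothetical, and the default marker "#" is an assumption, since this document does not say which character is recognised.

```python
def truncate_with_marker(value: str, max_length: int, marker: str = "#") -> str:
    """Truncate to max_length, replacing the last kept position with a marker.

    Assumption: `marker` is a single character that receivers recognise as
    "this value was truncated"; '#' is only a placeholder default.
    """
    if len(marker) != 1:
        raise ValueError("marker must be a single character")
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(value) <= max_length:
        return value  # short enough, nothing to do
    return value[: max_length - 1] + marker


text_a = ("Chirurgischer Eingriff oder sonstige Maßnahme wegen "
          "fehlender Zustimmung des Patienten nicht durchgeführt")
text_b = ("Chirurgischer Eingriff oder sonstige Maßnahme wegen "
          "Kontraindikation nicht durchgeführt")

# Both ICD-10 texts collapse to the same string when cut at 50 characters.
print(truncate_with_marker(text_a, 50) == truncate_with_marker(text_b, 50))
```

The comparison at the end illustrates the concern raised above: the marker tells a receiver that something was cut, but not which of the two concepts was meant.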
10 Open Issues
How to mark a variation in length for components for a certain context of use? E.g. how to express