
VML Processing Widget

Revision as of 03:25, 24 June 2014

Summary

One day soon, this page will document the VML Processing Widget. This is an ANT-scripted process that:

  • consumes proposals for change expressed in the HL7 Vocabulary Maintenance Language (VML),
  • splits these proposals into two source streams targeted at the Access repository in which the HL7 Vocabulary content is stored:
    • an SQL update stream for updating files of properties assigned to vocabulary objects, and
    • an XML stream processed in Java to update the primary Access tables
    together, these streams update the vocabulary content stored in the Access repository,
  • uses a Java process to "clean up" the Access content,
  • invokes RoseTree to express the complete vocabulary content in HL7 Model Interchange Format (MIF), and
  • combines this MIF content from the Access repository with an externally maintained MIF extension file to arrive at a final, complete expression of HL7's vocabulary in MIF.

Background

Management of HL7 Vocabulary Content

In brief, the HL7 Vocabulary Content:

  • Is published and released as a formal expression of the content in Model Interchange Format (MIF) files. This format:
    • Documents the data that define each of the artifacts (Concept Domains, Code Systems, Value Sets, etc.);
    • Documents the relationships between the artifacts (such as bindings);
    • Provides a vehicle for using XML XPath logic to query and/or analyze the content;
    • Provides the source for transforms used to publish the content in HL7 Ballots and Normative Editions.
  • Is archived and persisted in two complementary forms:
    • an Access data base, designed over 13 years ago, that serves as the "repository" for this content; plus
    • “Extensions files,” represented in MIF format, that primarily document those Value Sets that are defined against external code systems.
  • Is updated through the Vocabulary_and_RIM_Harmonization_Process, the results of which drive updates to the Repository and/or extension files.

Prior to the development of this widget, the means for processing updates to the HL7 Vocabulary Content involved:

  • Creating update files expressed in the HL7 Vocabulary Maintenance Language (VML).
  • Posting the data in these files to the repository database using the Java-encoded procedures compiled as “VocMaint_and_dependencies.jar”. Weaknesses of this process include:
    • Some updates require manual changes (table/row updates) in Access, but only when VML cannot be used. This requirement arose primarily when it was necessary to update concept properties and “object properties” on Concept Domains and Value Sets.
    • The original VML schema encapsulates “descriptive markup” (html formatting of descriptions, such as the definition of a concept or the description of a value set's content and purpose) in a CDATA wrapper. This encapsulation precluded validation of the markup prior to its being posted to the data base, with the too-frequent requirement to create a subsequent correction update.
  • Once the content was posted to the database, the definitive “coremif” was assembled from the Access data by RoseTree, with the “Extensions” merged in by transform.

Capabilities of the New Widget

What does this Widget provide?

  • A revision of the VML Schema and its placement in the “mif namespace”, that:
    • Allows descriptive markup to be validated as the VML files are created
    • Extends the VML to include the ability to “update” and “delete” concept properties
    • Extends the VML to add the ability to “add”, “update”, and “remove/replace” object properties
    • Extends the VML to enable the creation of Value Sets using "extensional definition" against External Code Systems from VML

These extensions are not processed through the extant Java process. Rather, the Widget relies upon pre-defined queries and code (activated from the widget) that perform the necessary data base updates without interfering with the ability to use the original VML for its intended purpose. This is accomplished by:

  • Using XSLT transforms to convert and split “revised VML files” into:
    • “Standard” VML that can be posted to the data base using the current vocabulary maintenance Java programs, AND
    • A standard data import table (in pipe-delimited format) that defines the data and control codes needed to perform the “extension” queries,
  • Establishing queries and logic in the Access repository (using Visual Basic for Applications - VBA - in Access) to post these changes.
  • Controlling the process with scripts in the Java-based ANT scripting language that are driven from a Work List and that automate all non-editorial processes.

System Requirements and Installation

System requirements

  • Windows - 7 or 8, 32- or 64-bit: Although the core elements of the script SHOULD work in a Unix or Mac environment, they have not been tested other than in Windows, and other key elements will NOT work outside of Windows. Specifically:
    • RoseTree, a Windows application, is needed to complete the extraction from the repository and the expression in MIF
    • Access is required in order to post the VML "extension" content into Access.
  • Memory should exceed 2 GB (preferably 4 GB), as the RoseTree memory demands while converting to MIF are sizable.
  • 32-bit Java JRE in environment: Even if your installation is a 64-bit machine, there must be a 32-bit JRE installed in order for the Java VML posting process to connect to the data base. The 32-bit JRE can be installed side-by-side with a 64-bit JRE.
  • XML editor that will validate from schema: this is highly beneficial when creating VML source files for processing. The author uses XML Spy.
  • Microsoft Access: Access is required to process the SQL updates that implement the "VML Extensions". This can be either a 32-bit or 64-bit Office installation.
  • RoseTree III: This application performs the final extraction and MIF expression step. It can be downloaded from the executables in the RoseTree project on Gforge.
  • Text Editor: (your choice) for editing the "working list" that governs the "widget processes." The ANT script will activate whichever text editor is the default for the "*.txt" file type on your system.

Widget Installation

The Widget is distributed as a release of Vocabulary Maintenance Widget on HL7 Gforge, and is available as a hyperlink under "Quick Downloads" on the Gforge home page. The widget is distributed in a ZIP archive named like hl7_vocabulary-vmlwidget-m.n.o.zip, where m.n.o is the current release identifier.

Base Installation

  1. Download the archive
  2. Place the archive in a home directory of your choosing, but do not place it inside “Program Files”
    The archive includes its own “root” which will appear in the target directory as it is extracted.
  3. Extract the file, maintaining the directory structure; the root directory will appear named vmlwidget-m.n.o (as before, where m.n.o is the release identifier).
  4. In the root directory find the file 00.00_1st_INSTALL_ALL_Change_to_dot_bat_and_RUN_this
  5. Rename it by adding the “.bat” extension at the end.
  6. Run the renamed ".bat" file (by double-clicking).
  7. The installation will list the license terms and then pause for you to accept them. Accept the license agreement by responding YES.

Directories and Initial Content Post-installation

Post-installation Directory and File Structure

The initial installation results in a file and directory structure as seen in the "screen-shot" shown at right. A summary of these follows, and a more complete explanation of the directory structures is found on a separate VML Processing Widget - Directory Structure page.

Specific Sub-directories

The specific Sub-directories used by the tool include:

  • output - as its name implies, it contains further sub-directories into which the resulting vocabulary repositories and MIF files will be placed. It is worth noting that although the directories are empty (newly created) at the beginning, they are not cleared out by the Widget. Rather it writes new files in, perhaps over-writing old.
  • source - has sub-directories that represent
    • user-generated source, such as new VML files
    • intermediate results such as delimited SQL source tables, and "traditional" VML files, and
    • MIF Extension files that are source content to be updated for value sets defined against external code systems
  • support - whose sub-directories hold the ANT scripts, schema files for the VML content, XSLT transforms for converting and preparing files, etc.
  • working - whose database sub-directory holds both the initial source of truth repository for the prior vocabulary releases, and the new repository being built by the widget.
  • zip archives - these may be treated as directories, but here, they simply hold the source material for the various "batch" files that invoke the widget processes.

Batch (*.bat) Files

The function of this widget is invoked entirely by "running" one of the four "batch" files (ending in ".bat") whose names begin with an integer. (The exact function of these is covered in detail in a separate section of this manual.) Executing or running these files is done by: "double-clicking" on the file in Windows Explorer; "right-clicking" and selecting "Open" from the right-click drop down menu; or selecting the file in Windows Explorer by clicking on it, and then pressing "enter".

The two "batch" files (ending in ".bat") whose names begin with "Run" are required intermediate files that are "called" by one of the primary batch files noted above.

The remaining "batch" file (supp_00.00...ExtractSupportingBatFiles.bat) is a utility that will unzip and present additional "supplemental" batch files that are used in maintaining the widget, but are not needed for ordinary use.

Remaining Files

  • defined-environment.properties This file is a "configuration" file for the tool whose function and settings are discussed under section Configuring and Providing Sources for the Widget
  • installation_...log - is a log of the installation steps
  • InstallationGuide.txt is a quick-start guide, just like section Base Installation above.
  • LICENSE.txt Lists the terms you agreed to when you accepted the license.

Configuring and Providing Sources for the Widget

Whenever the VML Processing Widget is taken up for use, it is necessary to establish the parameters that define the content to be defined and the environment in which it will be used. This involves three considerations:

  1. Determining the Release Identifier and Release Date for the vocabulary release towards which the changes are targeted
  2. Collecting the Source of Truth Repository and the latest Vocabulary EXTENSION files from which to start the changes and placing these files in the appropriate directories, and
  3. Setting any necessary properties in the defined-environment.properties file.

Each of these topics is covered in a sub-section below.

Determining the Release Identifier and Release Date

Each set of vocabulary proposals to be implemented (by preparing and posting VML files) is processed within the context of a planned vocabulary "release". By custom, HL7 processes three vocabulary releases a year, each resulting from the Harmonization meetings scheduled between Working Group Meetings.

Using this tool suite, the Release Identifier is designated in a pattern of yyyyTn where yyyy is the calendar year, and n is the trimester number, with "1" starting at the beginning of the January Working Group Meeting and continuing to the start of the May Working Group Meeting when trimester "2" begins. If there is a need for extra releases between two releases, they will be designated as yyyyTnCm where yyyyTn is the identifier for the most recently completed trimester release, and m is the sequence number of the special (usually corrective) releases.
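
The identifier pattern described above can be checked mechanically. The following is a minimal Python sketch, not part of the Widget: the function name is hypothetical, and it assumes that the trimester number n is 1, 2 or 3 and that the correction sequence number m starts at 1.

```python
import re

# Pattern for yyyyTn and yyyyTnCm as described in the text.
# Assumptions (not stated by the Widget): n is 1-3, m is a positive integer.
RELEASE_IDENTIFIER = re.compile(
    r"^(?P<year>\d{4})T(?P<trimester>[1-3])(?:C(?P<correction>[1-9]\d*))?$"
)


def parse_release_identifier(text):
    """Split a Release Identifier into year, trimester and correction number."""
    match = RELEASE_IDENTIFIER.match(text)
    if match is None:
        raise ValueError(f"not a yyyyTn or yyyyTnCm identifier: {text!r}")
    correction = match.group("correction")
    return {
        "year": int(match.group("year")),
        "trimester": int(match.group("trimester")),
        # None marks a regular trimester release (no "Cm" suffix).
        "correction": int(correction) if correction else None,
    }
```

For example, parse_release_identifier("2014T2C1") describes the first extra release following the second trimester release of 2014.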

The Release Date is a secondary, but important parameter for any release. For value set versions and code system released versions that are dated, this is the "version date" that will be assigned. In the case of the planned trimester releases, this date will be the day before the ballot opening for the ballot held at the end of that trimester. For any extra releases, a "close date" will need to be determined and assigned prior to the initiation of processing for this release.

Collecting and Placing Initial Repository and MIF EXTENSION files

Since each vocabulary release builds upon its predecessor, the next critical step is to collect the source of truth for the vocabulary repository (in Access) and for the vocabulary EXTENSION file. Together these are the authoritative sources from which to begin.

Both of these files can be found by downloading the latest hl7_rimRepos....zip file from the rimRepos releases in the Design Repository project on Gforge. The full release file name will be something like hl7_rimRepos-2.44.7.zip (where the 2.44.7 reflects point releases for RIM release 2.44). The contents of the zip archive will contain a variety of files, but two among them are needed here:

  1. Vocabulary repository in Access named like rim_none_Voc1287_20140516_Repository20140610.mdb where the release numbers (like 1287_20140516) and dates (like 20140610) vary from release to release.
    Place a copy of this file in the Widget sub-directory working\database
    AND Rename the file SourceOfTruthRepository.mdb
    (renaming is optional, but if the file is not renamed the property file.source.of.truth.database will need to be set in the defined-environment.properties file)
  2. Vocabulary EXTENSION file in MIF is contained within a zip archive named like EXTENSION=UV=VO=1287-20140516.zip where the release number (like 1287-20140516) varies from release to release. Inside the archive is a single *.coremif file with the same name as the archive.
    Extract the coremif file and place a copy of it in the Widget sub-directory source\extensionCoremif
    AND Rename the file StartingEXTENSION.coremif
    (renaming is optional, but if the file is not renamed the property file.initial.extension.coremif will need to be set in the defined-environment.properties file)
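
The copy-and-rename steps above are normally done by hand in Windows Explorer. For readers who prefer to script them, the following Python sketch shows one way to do it. The function name and its parameters are hypothetical; only the target sub-directories and file names come from the steps above.

```python
import shutil
from pathlib import Path


def place_source_files(widget_root, repository_mdb, extension_coremif):
    """Copy the two source-of-truth files into the Widget under their default names.

    widget_root is the extracted vmlwidget directory; the other two arguments
    are the files taken from the rimRepos download (coremif already unzipped).
    """
    root = Path(widget_root)
    targets = {
        Path(repository_mdb): root / "working" / "database" / "SourceOfTruthRepository.mdb",
        Path(extension_coremif): root / "source" / "extensionCoremif" / "StartingEXTENSION.coremif",
    }
    for source, target in targets.items():
        # The installer normally creates these directories; this is a safeguard.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return list(targets.values())
```

Because the copies receive the default names, the two file-name properties in the configuration file can be left at their defaults.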

Setting properties in the defined-environment.properties file

The final preparatory step is to set several critical elements in the file defined-environment.properties in the root directory of the Widget. This file is, in effect, a configuration file for the Widget determining a number of critical properties. The following shows the opening lines of this file.

##############################
# Users Software Environment #
##############################

#env.hl7.tools.directory  - the root directory UNDER the 32 bit "program Files" directory, and 
#=======================    in which RoseTreeIII and other HL7 programs are installed
#Default: HL7

########################
# Vocabulary Release ID #
########################
#Default: releaseId=2014T2_2014-08-07

########################
# DATA BASE FILE NAMES #
########################

#file.source.of.truth.database - File name for source-of-truth data base 
#=============================
#Default: file.source.of.truth.database=SourceOfTruthRepository.mdb

As is customary in most property files, any row beginning with the "#" (hash mark) will be ignored, and thus is used for documentation. Blank lines are similarly ignored. True property lines start with the property name (like file.initial.extension.coremif) followed by an "=" (equal sign) and the value of the property.
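
The reading rules just described can be illustrated with a short Python sketch. This is a simplification for illustration only, not the parser ANT actually uses: it handles just the "#" comment, blank line and name=value cases named above, and the function name is hypothetical.

```python
def parse_properties(text):
    """Return the name=value pairs of a property file as a dict.

    Rows starting with "#" and blank rows are ignored, as described in the text.
    """
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if not separator:
            # No "=" on the row: skipped in this sketch.
            continue
        properties[name.strip()] = value.strip()
    return properties
```

Note that a row such as "#Default: releaseId=2014T2_2014-08-07" is documentation only; the property takes effect only on a row without the leading hash mark.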

In this particular file, if the two "source of truth" files were renamed as suggested in section Collecting and Placing Initial Repository and MIF EXTENSION files, the default values for file.source.of.truth.database and file.initial.extension.coremif are correct. Otherwise these properties will need to be set to the actual file names.

Thus, the only property that will almost certainly need to be changed here is the releaseId property. The value of this property is the Release Identifier (listed above) concatenated with the Release Date (expressed as an XML date yyyy-mm-dd) with an underscore ("_") separating the two elements. The default value is 2014T2_2014-08-07 which is correct for the summer trimester of 2014.
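
As an illustration of how the releaseId value is composed, here is a small Python sketch. Both function names are hypothetical; the format (identifier, underscore, yyyy-mm-dd date) is the one described above.

```python
from datetime import date


def build_release_id(release_identifier, release_date):
    """Join the Release Identifier and the Release Date with an underscore."""
    return f"{release_identifier}_{release_date.isoformat()}"


def split_release_id(release_id):
    """Split a releaseId value back into (identifier, date)."""
    identifier, _, date_text = release_id.partition("_")
    return identifier, date.fromisoformat(date_text)
```

With the default values, build_release_id("2014T2", date(2014, 8, 7)) gives 2014T2_2014-08-07, the string that follows "releaseId=" in the property file.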


Creating VML Files From Proposals

The task of creating VML files based on the Harmonization proposals is straightforward, albeit tedious and challenging when one discovers that the data in the proposal is incomplete.

Editing and Validating VML Files

The extended VML language is documented in detail elsewhere. The complete schema for the VML is distributed with the widget in the file: support/xsd/VocabularyRevisionMif.xsd.

It is strongly recommended that VML authors use an XML editor that can continually validate the file against the VML schema and can "prompt" for the XML elements and attributes that might be used at any point in the file. XML Spy is frequently used to this end, but other validating XML editors are equally usable.

VML Example Template

<editDescription/> element that acts as VML header

An example file that can be used as a template for building VML files is distributed here as file support\templates\VML-MIF starter template.vmif. This template serves two primary purposes:

  1. The root element includes a binding and link to the Extended VML schema to facilitate file validation and the presence of element and attribute prompts; and
  2. The first child element (<editDescription/> as at right) acts as the proposal header and lists the set of data, drawn from the proposal that SHOULD be incorporated in each VML file.
Header section from a Harmonization Proposal

The required header data for VML posting can be seen in the following example. The screen-shot at right shows the top elements from a Harmonization proposal. The elements marked with green highlighter are those that will be positioned within the actual VML file (as seen in the figure below).

Example VML header

Specifically, the elements and attributes of <editDescription/> come from:

  • attr:creationDate is the date on which the VML file is created
  • attr:primaryContact is the name of the first person in the proposal "Editor/Author" field
  • attr:proposalId is the "Recommendation ID" at upper right on the proposal header
  • attr:committee is the Work Group listed as "Sponsored by:" in the second box on the left of the proposal header
  • elem:proposalName holds the "PROPOSAL NAME" field of the proposal header
  • elem:description holds the text (optionally with html markup) that appears under "SUMMARY RECOMMENDATION" in the proposal

Working List to Manage Posting Process

The sequence of tasks that is undertaken by the Widget is determined by a working list file. This file is a simple ASCII text file that is stored as support\manifests\WorkingList.txt. (Under the covers, the working list is maintained in XML (documented elsewhere), but the entire user interface centers around the text version.) The following is an example Working List:

#> Manifest: 201406102313
# ResetDb

#> Start example second trimester proposals
#> ==============================
# VML-MIFstarterTemplate.vmif
# VML-MIFstep2Template.vmif
# VML-MIFcloseTemplate.vmif
#> end t2

#> Start example third trimester proposals
#> =============================
  VML-MIFstep3Template.vmif
  VML-MIFclose2014T3.vmif
#> end T3 

  MakeNested
#> ## The End ##

Rules for Working List Entries

  • The first row SHALL open with the string #> Manifest: followed by a space and a date time stamp
  • The last row SHOULD be #> ## The End ##
  • The remaining (intervening) rows are of one of four types:
    • blank row is recognized by its being blank or empty. These may be used to segregate or group content in the other rows;
    • annotation row is a row that starts with "#>". These may be used to document the working list;
    • file row starts with a File Pattern (defined below) that is followed by the file name for an extended VML file.
    • command row starts with a Command Pattern (defined below) that is followed by one of two command strings:
      • ResetDb - SHOULD be the first non-blank, non-annotation row; or
      • MakeNested - SHOULD be the last non-blank, non-annotation row
    These commands initiate processes to either reset the data base to the "source of truth", or to complete processing of the repository, including extracting coremif files.
  • With the exception of the preferred locations for the "command" lines, the other types can be in any sequence or number that the author prefers.
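
The two rules on the first and last rows lend themselves to a quick pre-check before a list is handed to the Widget. The following Python sketch is illustrative only and is not part of the Widget. The function name is hypothetical, and because the exact time stamp format is not specified here, the sketch only checks that something follows "#> Manifest: ".

```python
import re

# "#> Manifest:" followed by a space and at least one further character.
MANIFEST_ROW = re.compile(r"^#> Manifest: \S")
END_ROW = "#> ## The End ##"


def check_working_list(text):
    """Return (errors, warnings) for the SHALL and SHOULD rules on first/last rows."""
    rows = text.splitlines()
    errors, warnings = [], []
    if not rows or not MANIFEST_ROW.match(rows[0]):
        # SHALL rule: reported as an error.
        errors.append("first row must open with '#> Manifest: ' and a time stamp")
    if not rows or rows[-1].strip() != END_ROW:
        # SHOULD rule: reported as a warning only.
        warnings.append("last row should be '#> ## The End ##'")
    return errors, warnings
```

A list that satisfies both rules yields two empty lists.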

File Pattern

The file pattern that may be used to open a line of type file provides values for two Boolean properties, skipProcessing and tested. The default value (signified by the absence of an indicator in the pattern) for both properties is false.

Specifically, the pattern may include spaces and further may include:

# (if present) indicates that skipProcessing is "true", and if present, must be placed in the first non-space position on the line
$ (if present) indicates that tested is "true", and must be placed after # (if that is present) and before the file name

Thus, the following patterns are interpreted:

   fileName - Untested file, and do not skip processing
 $fileName - Tested file to be processed
# fileName - Untested file, and skip processing
# $ fileName - Tested, but skip processing.
Note that extra spaces, as in the last example, are optional at any place in the opening; they will be dropped when the line is “normalized”
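
The File Pattern can be expressed as a regular expression. The Python sketch below is one possible reading of the rules above, not the Widget's own implementation; the function name is hypothetical, and the sketch assumes the caller has already established that the row is a file row (annotation rows starting with "#>" are rejected).

```python
import re

# Optional "#" (skipProcessing), optional "$" (tested), then the file name;
# spaces are allowed around each indicator.
FILE_ROW = re.compile(r"^\s*(?P<skip>#)?\s*(?P<tested>\$)?\s*(?P<name>\S.*?)\s*$")


def parse_file_row(line):
    """Return the skipProcessing and tested flags and the file name of a file row."""
    if line.lstrip().startswith("#>"):
        raise ValueError("annotation row, not a file row")
    match = FILE_ROW.match(line)
    if match is None:
        raise ValueError("blank row, not a file row")
    return {
        "skipProcessing": match.group("skip") is not None,
        "tested": match.group("tested") is not None,
        "fileName": match.group("name"),
    }
```

Applied to the four patterns listed above, the sketch reports the same flag combinations, and "# $ fileName" yields the bare name "fileName" with both flags true.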

Command Pattern

The command pattern may be used to open a line of type command and provides a value for the Boolean property skipProcessing. The default value (signified by the absence of an indicator in the pattern) for this property is false.

Specifically, the pattern may include spaces and further may include:

# (if present) indicates that skipProcessing is "true", and if present, must be placed in the first non-space position on the line

Process Control by the Working List

As noted at the opening of this section, the sequence of tasks that is undertaken by the Widget is determined by the working list file. In simple terms, the automated process steps through the Working List and:

  • ignores all rows of type blank or annotation,
  • ignores all rows of type file or command where the skipProcessing property is "true", and
  • processes all other file or command rows.
Note: The "process" may include a user interaction acceptance, which allows the user to manually "skip" a row that had not been marked for "skipProcessing". However, the inverse is not true: a row that is marked with "skipProcessing" true will not be offered for user interaction acceptance.
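
The selection logic above can be sketched in a few lines of Python. This is an illustration of the rules as described on this page, not the ANT implementation. The function names are hypothetical, and the sketch assumes that command strings are matched exactly as written (ResetDb, MakeNested).

```python
COMMANDS = ("ResetDb", "MakeNested")


def classify_row(line):
    """Return (row_type, skip_processing, payload) for one Working List row."""
    text = line.strip()
    if not text:
        return ("blank", False, None)
    if text.startswith("#>"):
        return ("annotation", False, None)
    skip = text.startswith("#")
    if skip:
        text = text[1:].strip()
    if text in COMMANDS:
        return ("command", skip, text)
    # Anything else is treated as a file row; drop the optional "$" (tested) marker.
    if text.startswith("$"):
        text = text[1:].strip()
    return ("file", skip, text)


def rows_to_process(working_list_text):
    """List the (row_type, payload) pairs the automated process would act on."""
    selected = []
    for line in working_list_text.splitlines():
        row_type, skip, payload = classify_row(line)
        if row_type in ("file", "command") and not skip:
            selected.append((row_type, payload))
    return selected
```

For the example Working List shown earlier in this section, the sketch selects the two third-trimester files followed by the MakeNested command; every other row is blank, an annotation, or marked with a hash.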

Running the VML Widget

Primary VML Batch Commands

The VML Widget processing is initiated by one of four "batch" files (*.bat) installed in the root directory (see screen shot from Windows Explorer on right). The commands are executed from Windows Explorer by: "double-clicking" on the command; right-clicking on the command and selecting "Open" from the menu that appears; selecting the item and clicking Menu:File...Open; or selecting the item and pressing the "Enter" key.

The first of the four commands initializes the Working List, while the remainder process the working list commands. The distinction among the final three is whether or not they start with an "edit session" on the work list (2nd vs 3rd or 4th), and whether they ask the user to verify each step manually (2nd and 3rd vs 4th).

01...INITIALIZE_File_Manifest

This command creates a new Working List based upon the source files (*.vmif) in directory source\sourceMifVml. It creates a pro forma list that starts with the ResetDb command, lists each vmif file in alphabetic order, and closes with the MakeNested command.
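
To make the shape of the generated list concrete, the following Python sketch builds a pro forma Working List from a set of file names. It is an approximation for illustration, not the script the Widget runs. The function name is hypothetical, and the sketch assumes a time stamp of the form yyyymmddHHMM (inferred from the example list above), two-space indentation, and plain sorted() ordering as the "alphabetic order".

```python
from datetime import datetime


def build_working_list(vmif_names, stamp):
    """Build a pro forma Working List: ResetDb, the files in order, MakeNested."""
    rows = [f"#> Manifest: {stamp.strftime('%Y%m%d%H%M')}", "  ResetDb"]
    rows += [f"  {name}" for name in sorted(vmif_names)]
    rows += ["  MakeNested", "#> ## The End ##"]
    return "\n".join(rows) + "\n"
```

None of the rows carries a hash mark, so in this sketch every command and file is initially marked for processing.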

If there is a pre-existing Working List, the command-line execution will pause with the following message:

 [input] WorkingList file exists. Do you wish to MERGE content; DELETE existing file; or IGNORE the update? (MERGE, DELETE, [IGNORE])
Batch Command Closing Confirmation

The "MERGE" option is not supported (and will be rejected), therefore the only options are to "DELETE", which replaces the previous list with the newly generated list, or "IGNORE" which leaves the prior list in place. In either case, successful completion will show as a closing element like the screen-shot at right.

05...UPDATE_LIST_and_PROCESS

This command is probably the most frequently used. It invokes a multi-step interactive process to first edit the Working List, and then to interactively execute it.

Edit Working List

The first event is that the current Working List is opened in the user's default "txt" file editor. In Windows the default editor is NotePad, but you are free to designate any editor you choose. During the edit session, you can: add additional files, move rows (files) around, document the list with annotations, and most importantly, change the skipProcessing property for one or more files by removing hash marks or adding them. When your editing is done, save the updated list, and close the application.

These actions will initiate the next set of steps, which are identical to the steps involved in 10...INTERACTIVE_PROCESS below.

10...INTERACTIVE_PROCESS

The interactive process starts whenever the Working List is prepared, either by starting with step "05" above, or starting with this step ("10") and using a previously saved Working List.

The process goes down the list processing, in order, each command or file line that has not been marked "skipProcessing" by having the line open with a hash ("#"). At each line, it pauses, and in the command line stream poses a question to the user in one of the following forms:

For the ResetDb command

Display and prompt -

10...INTERACTIVE_WORK_LIST_PROCESS:
   [input] Reset repository to SourceOfTruth? (y/n) Default: [y] 

To execute this step, respond with the "y" and "return" keys. To manually skip this step, respond with the "n" and "return" keys.

Action: The script deletes the previous working repository and replaces it with a copy from the designated source-of-truth data base file.

For any file row

Display and prompt -

-processSingleMifVmlFile:
   [input] VML-MIFstarterTemplate.vmif to respository? (y/n) Default: [y]

Warning: Owing to a glitch in the ANT scripts, the script may fail to respond to a "y" or "n" command that follows a previous "y". If keying your response plus return causes no action in the command line log, simply repeat the command. It will work the second time.

Access data base prompt to update

Actions:

  1. Uses a transform to split the designated vmif file into: an SQL update file to be imported by the DB (if needed) and/or a "traditional" (pre-extension) VML file for processing by Java to the DB (if needed)
  2. If there is to be a SQL update, the script issues a command to open the repository (in Access) and pass a command-line string designating the actions to be performed. In this interactive mode, the data base displays a secondary "prompt" for confirmation that is like the screen-shot at right. The upper Yellow box lists the name of the SQL update file that will be processed. The lower yellow box labeled Process Vocab Update File is actually an "accept" button. Click on this box/button to continue processing. If you click close, the data base will not be updated.
  3. Once the SQL update is completed (if any) the script will pass the "traditional VML" file to the Java-based vocabulary updater to process.
    If there are any fatal errors in this process:
    1. the log file will display the error,
    2. the data base will roll-back any updates made as a result of this file, and
    3. the error from the data base will cause the ANT script to terminate immediately.

For the MakeNested command

Display and prompt -

   [input] Finish repository (Make Nested, etc.)? (y/n) Default: [y]

Actions:


At the successful completion of a file row, the log will display:

   [java] [INFO] =============== Successful load! ===============
   [echo] Done VML-MIFstarterTemplate.vml