Difference between revisions of "Proposed new FHIR IG build Process"
Revision as of 22:08, 16 July 2016
These are some notes about the proposed FHIR build process for publishing implementation guides
Summary
Building an IG is a 3 part process:
- Defining the resources that underpin the Implementation Guide resources (Implementation Guide, Conformance Resources, Knowledge statements, examples)
- Building the publishing process that will publish it (linked to where it will be published)
- Developing the structure and the narrative content that turns the resources into a useful implementation guide
The FHIR team provides an IG Publishing tool that takes the implementation guide resources and converts them to a set of 3 different types of files:
- generated resources ready for inclusion into the published guide (xml, json, ttl formats)
- a set of fragments ready to include in generated html files
- 6 standard zip files: definitions.[fmt].zip and examples.[fmt].zip - these are used by implementers for various purposes, so should be included in the final published version
The publishing process itself is a 3 step process:
- pre-process spreadsheets and logical models
- run the FHIR IG Publisher to validate all the resources & generate outputs and fragments
- run a web site build tool to generate the final IG from the fragments and page templates
Supported tools for the third part of the process:
- Jekyll
To add tools to this list, discuss your tool of choice with Grahame Grieve
Rules for Implementation Guides
The Implementation Guide author is free to lay out the content in whatever form they choose. However, there are some rules about the arrangement of the implementation guide that must be followed:
- the home page should be 'index.html' (this is not a technical requirement, but is a human convenience)
- for each resource in the implementation guide, whether a conformance resource or an example, the IG Publisher will produce
- [Type]-[id].html - the home page for the resource
- [Type]-[id].[fmt] - the resource for the specified format (pretty printed)
- [Type]-[id].canonical.[fmt] - the resource for the specified format in canonical format (not produced for ttl)
- redirects from /[Type]/[id] to one of the pages above
- [Type]-[id].[fmt].html - An HTML wrapper around the specified format, with a link to the native form
- the IG publisher will produce the files definitions.[fmt].zip. The community will expect that these are published along with the guide as these enable the conformance tooling to work with the guide. These should be referenced somewhere from the guide, but the tooling will just expect that they exist at [base]/definitions.[fmt].zip, irrespective of whether they are published
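As a rough illustration of the naming rules above, the following Python sketch lists the file names one would expect for a single resource. The function name `expected_files` is hypothetical and is not part of the IG Publisher. The sketch assumes that every format is enabled and leaves out the /[Type]/[id] redirects.

```python
FORMATS = ("xml", "json", "ttl")


def expected_files(res_type, res_id, formats=FORMATS):
    """List the per-resource file names described in the rules above (illustrative only)."""
    stem = f"{res_type}-{res_id}"
    files = [f"{stem}.html"]  # home page for the resource
    for fmt in formats:
        files.append(f"{stem}.{fmt}")  # pretty printed resource
        if fmt != "ttl":
            files.append(f"{stem}.canonical.{fmt}")  # canonical form, not produced for ttl
        files.append(f"{stem}.{fmt}.html")  # HTML wrapper around the format
    return files


for name in expected_files("ValueSet", "my-value-set"):
    print(name)
```

A list like this can be handy when checking that every link in hand-written pages points at a file the publisher will actually generate.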
Format Support
The Implementation Guide supports 3 formats, as defined in the FHIR specification:
- XML
- JSON
- Turtle (RDF)
Note: in this documentation, [fmt] is one of 'xml', 'json', and 'ttl'.
By default, the IG publisher will produce all 3 of these formats. It is recommended to produce all 3, though specific formats can be turned off - but you must produce at least one.
Canonical URL
The Implementation Guide must nominate a canonical URL. The canonical URL is used throughout all the resources (this is enforced by the publisher). The canonical URL should point to the current version of the specification, e.g. if someone enters the canonical URL into their browser, they will get the specification home page (or, if a FHIR client uses the URL as its server URL, a GET of [base]/[Type]/[id] will return the correct resource).
History
Each Implementation Guide is also responsible for maintaining its own published history. The standard way that this works for FHIR should be followed by the implementation guides:
- The current copy of the specification lives at [canonical]
- the published version history lives at [canonical]/directory.html (or similar name) and is manually maintained. It lists the current version, maybe the dev version location, and a history of past milestones
- the past milestones live at [canonical]/[id] where id is either a milestone name like "stu1" or a date (recommended format = YYYY)
- at least the home page of all the versions (current or historical) references the published version history in a prominent location
Note that the IG publisher is only able to produce a single snapshot of the IG - it cannot produce the entire history.
Styles
All the generated fragments are generated assuming that the standard FHIR styles in fhir.css apply to the specified classes in the generated content. To keep the content valid, the simplest way is to include fhir.css in all your html pages as a style reference. But you don't have to do this; if you have some other styling system, you can make arrangements for the correct css definitions - which are deliberately very static - to be made available in some other form. You can even replace the styles completely and use your own.
Note, however this warning: The FHIR team does not provide support for replacing any CSS styles. In order to assist with this, for every generated fragment, there is a matching .html file created in the qa directory. The FHIR team will respond to any issue relating to incorrect display of the content of the qa html files. If the content does not appear correctly in the IG, but does appear correctly in the qa files, then this is assumed to be a problem with the IG styling, and not the problem of the IG Publisher authors.
Using FHIR IG Publisher
Note: Before running the IG Publisher, you must have installed the build tool properly. See below.
This is a java jar called org.hl7.fhir.igpublisher.jar. You can get it from the downloads (hl7.org/fhir/downloads.html) for the version of FHIR you are using (or, if you build locally, from your own publish directory). The jar includes everything from the spec that is required to generate the implementation guide. Make sure you use the correct version of the IG publisher for your guide (check the versions in the log).
The publisher can be run as a GUI application, or run from the command line (or it can be hosted in a server; if you want to host it, talk to Grahame Grieve).
GUI Mode
To run the publisher as an application:
java -jar org.hl7.fhir.igpublisher.jar
This is the IG builder:
To use it, 'Choose' an implementation guide control file, and click 'Execute'. The implementation guide will be built, and then the IG publisher will watch for changes and do incremental rebuilds until you click 'Stop'.
Command Line Mode
To run in command line mode, run the IG Publisher like this:
java -jar org.hl7.fhir.igpublisher.jar -ig [source] (-tx [url]) (-watch)
parameters:
- -ig: a path or a url where the implementation guide control file is found. See below for documentation of that format
- -tx: (optional) Address to use for terminology server (default is http://fhir3.healthintersections.com.au)
- -watch (optional): if this is present, the publisher will not terminate; instead, it will stay running, and watch for changes to the IG or its contents and re-run when it sees changes. Note that changes to the spec or to dependent implementation guides (see below) are not picked up during watch mode
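For scripted builds, the command line above can be assembled programmatically. The following is a minimal sketch in Python that uses only the three parameters documented above. The helper name `publisher_command` and the control file name `my-ig.json` are hypothetical.

```python
import shlex


def publisher_command(ig, tx=None, watch=False, jar="org.hl7.fhir.igpublisher.jar"):
    """Build the argument list for the command line shown above."""
    args = ["java", "-jar", jar, "-ig", ig]
    if tx is not None:
        args += ["-tx", tx]  # optional terminology server address
    if watch:
        args.append("-watch")  # optional: keep running and rebuild on change
    return args


cmd = publisher_command("my-ig.json", watch=True)
print(shlex.join(cmd))
# To actually run the build: subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than one string avoids quoting problems when the control file path contains spaces.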
Operation of the IG Publisher
Once running, it:
- reads the control file
- reads the IG
- processes any spreadsheets, bundles, and logical models
- loads all the resources in the IG
- processes code systems, value sets, structure definitions, structure maps
- validates all the resources in the IG and produces an HTML file with any errors encountered
- for each resource in the IG, generates a set of files - renderings of the artifact for possible inclusion in the published IG, plus the outputs defined above
- generates summary output
- gets the tool (Jekyll) to generate the final output. The Jekyll source is in /pages (see below)
Control file
When the IG publisher is executed, it is pointed at a control file. This is a json file that contains all the information that the publisher needs to publish the implementation guide. It has this structure:
{
  "tool" : "jekyll", // the tool used for the 3rd step of the build. See tooling below
  "paths" : { // see paths below
    "resources" : "[resources]",
    "pages" : "[pages]",
    "temp" : "[temp]",
    "output" : "[output]",
    "txCache" : "[txCache]",
    "qa" : "[qa]",
    "specification" : "[specification]"
  },
  "canonicalBase": "[where this will be published - see above]",
  "dependencyList": [
    // a list of other implementation guides that this guide depends on
    // e.g. uses profiles, value sets, code systems etc
    // zero or more of this object:
    {
      "name" : "[name of the IG, for logging, and Jekyll variable name for location]",
      "location" : "[http address where the IG lives]",
      "source" : "[folder to get the definitions from if running ahead of publication at location (relative)]"
    }
  ],
  "defaults": {
    // this object contains the default publishing policy for different types.
    // Anything not mentioned defaults to true
    "Any": {
      // fragment options - see below. example:
      "xml" : false // don't produce xml example
    },
    "[Type]": {
      // fragment options - see below
    }
  },
  "source": "[ig]", // the name of the ig file to load
  "spreadsheets" : [
    "[filename]" // see using spreadsheets below
  ],
  "bundles" : [
    "[id]" // see using bundles below
  ],
  "resources": {
    "[Type]/[id]": {
      "source" : "[optional source file]",
      "base": "[destination page for things referring to this resource]",
      "defns": "[destination page, for structure definitions]",
      "version" : "[optional version]"
      // and fragment options - see below
    }
  }
}
The control file must be maintained by the editor of the implementation guide.
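As a starting point, a control file with the structure above could be generated rather than typed by hand. The Python sketch below fills in the documented properties with made-up values: the folder names, `http://example.org/...` addresses, `my-ig.xml` and the `ValueSet/my-value-set` entry are all placeholders. Which properties are mandatory is not stated here, and whether the publisher accepts the // comments shown in the template is also not stated, so the sketch emits plain JSON without comments.

```python
import json

# All values below are illustrative placeholders, not recommended settings.
control = {
    "tool": "jekyll",
    "paths": {
        "resources": "resources",
        "pages": "pages",
        "temp": "temp",
        "output": "output",
        "txCache": "txCache",
        "qa": "qa",
        "specification": "http://example.org/fhir-spec",
    },
    "canonicalBase": "http://example.org/fhir/my-ig",
    "dependencyList": [],
    "defaults": {"Any": {"ttl": False}},  # fragment option: skip Turtle output
    "source": "my-ig.xml",
    "resources": {
        "ValueSet/my-value-set": {"base": "ValueSet-my-value-set.html"},
    },
}

text = json.dumps(control, indent=2)
print(text)
```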
Paths
The IG publisher defines the following paths that can be configured in the control file:
- [resources]: the directory where all the input FHIR resources (usually conformance resources and examples) are found
- [pages]: the directory that contains all the jekyll source (not processed, just passed to Jekyll for processing)
- [temp]: a scratch directory that is used for the temporary source for jekyll processing (do not use this directory for anything else)
- [output]: where the final output from the tool (the complete IG) will be placed
- [qa]: a folder where the validation output (validation.html) will be produced, along with a page for each fragment (for css style checking)
- [txCache]: where the terminology service cache goes. see below.
All these folders are file paths that are relative to the control file. They are usually sub-folders, for version control/build convenience, but do not need to be. There is one more path:
- [specification]
This is a URL (http:// or file://) that points to the version of the specification on which this IG is based (use a version specific reference, not http://hl7.org/fhir itself, unless the IG is synced to the current build). It should also be the version on which the IG itself is based.
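The rule that folder paths are relative to the control file, while [specification] is a URL, can be sketched as follows. This is an illustration of the rule only, not the publisher's own code, and the function name `resolve_paths` is hypothetical.

```python
from pathlib import Path


def resolve_paths(control_file, paths):
    """Resolve folder paths relative to the control file; keep [specification] as a URL."""
    base = Path(control_file).parent
    resolved = {}
    for name, value in paths.items():
        if name == "specification":
            resolved[name] = value  # a URL, not a folder
        else:
            resolved[name] = base / value  # an absolute value replaces the base
    return resolved


resolved = resolve_paths(
    "/work/my-ig/control.json",
    {"output": "output", "qa": "qa", "specification": "http://example.org/fhir-spec"},
)
for name, value in resolved.items():
    print(name, value)
```

Because joining an absolute path onto the base yields the absolute path unchanged, folders that are not sub-folders of the control file's directory are handled as well.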
Terminology Service Cache
There are 2 workable choices for this value: not present, or in version control with the IG (e.g. a subdirectory that is stored in version control).
Note that using the terminology server - unless you run a local copy - is slow. The build tool keeps a local terminology cache that speeds up the build process (e.g. from hours to seconds). But the cache is aggressive, and must be managed, since there's no arrangement to flush the cache when the terminology service updates its content.
Choice 1 is not to specify this; the cache will be automatically created and maintained in the user's local directory. The problem with this is that all users have to delete and rebuild the cache manually (i.e. each of them has to do it, and wait for the slow build).
Choice 2 is to put it in a subdirectory, and version control the content. This can be tiresome because everyone has to commit the files. But only one person needs to maintain the content. This is better for CI builds too.
Loading Resources
Resources are loaded as follows:
- The IG resource is loaded (by literal filename)
- any spreadsheets are loaded (by literal filename) and converted to resources (for editor support, all the identities of the generated resources are noted in the validation output)
- any bundles are loaded (by Type/id)
- any resources referenced in the IG resource are loaded (by Type/Id)
Note: The IG resource must use sourceReference references by id. e.g.
<sourceReference> <reference value="ValueSet/my-value-set"/> </sourceReference>
A Type/Id reference is resolved to a local file by the control file, using the object /resources/"[Type]/[id]". If this object does not exist, the IG publication will fail. If this object exists, and has a "source" property, then the source property specifies the location of the file source relative to the IG file. If there is no "source" property, the IG publisher will look for [resources]/[Type]-[id].xml/json or [Type]-[id].xml/json (it's at editor discretion whether to store files using the simpler form, since this can cause conflicts between different resources with the same id, e.g. both Patient and Practitioner with the id of 'example').
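One reading of these resolution rules is sketched below in Python: an explicit "source" property wins, otherwise the [resources]/[Type]-[id] naming is tried for xml and then json. The function name `resolve_source` and the order in which the two extensions are tried are assumptions, and the simpler naming form mentioned above is left out. The `exists` parameter is there only so the logic can be exercised without real files.

```python
import os


def resolve_source(ref, control, ig_dir, resources_dir, exists=os.path.exists):
    """Map a Type/id reference to a file, following the rules described above."""
    entry = control.get("resources", {}).get(ref)
    if entry is None:
        # No /resources/"[Type]/[id]" object: the publication would fail.
        raise KeyError(f"no resources entry for {ref}")
    if "source" in entry:
        return os.path.join(ig_dir, entry["source"])  # relative to the IG file
    res_type, res_id = ref.split("/", 1)
    for ext in ("xml", "json"):
        candidate = os.path.join(resources_dir, f"{res_type}-{res_id}.{ext}")
        if exists(candidate):
            return candidate
    raise FileNotFoundError(f"no source file found for {ref}")


control = {"resources": {"Patient/example": {"source": "examples/patient.xml"}}}
print(resolve_source("Patient/example", control, "ig", "resources"))
```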
By default, the version of the input resource is assumed to be the same version as the IG Publisher itself. However, the IG publisher can also load resources from DSTU. Specify the version "1.0.2" for this. Note: this is only known to work reliably for StructureDefinition resources.
Using Bundles
There are 2 different ways to use Bundles. The first way is examples of type Bundle. These are treated like any other resource. The second is where the bundle is a collection that contains a set of resources that need to be processed individually by the IG Publisher. To specify one of these bundles, use the "bundles" property:
"bundles" : [ "[id]" ]
The bundles property is an array of string, where each entry is the id of a bundle. The bundle will be located using the standard resource location process, but once loaded, the bundle itself will be ignored, and the individual resources processed directly. Each resource in the bundle must have an entry in the resources section, and should be entered in the implementation guide.
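Since every resource inside such a bundle needs its own entry in the resources section, a small check can catch omissions before the build runs. The Python sketch below works on a Bundle already parsed from JSON. The helper names and the sample data are hypothetical, and the check is not something the publisher provides.

```python
def resource_keys_for_bundle(bundle):
    """Return the "[Type]/[id]" key for each resource inside a parsed JSON Bundle."""
    keys = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if "resourceType" in resource and "id" in resource:
            keys.append(f'{resource["resourceType"]}/{resource["id"]}')
    return keys


def missing_entries(bundle, control):
    """List bundled resources that have no entry in the control file's resources section."""
    declared = control.get("resources", {})
    return [key for key in resource_keys_for_bundle(bundle) if key not in declared]


bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "ValueSet", "id": "vs1"}},
        {"resource": {"resourceType": "CodeSystem", "id": "cs1"}},
    ],
}
control = {"resources": {"ValueSet/vs1": {}}}
print(missing_entries(bundle, control))
```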
Using Spreadsheets
For legacy reasons, it's possible to author resources using spreadsheets. This approach is deprecated. To get the IG publisher to process a spreadsheet:
"spreadsheets" : [ "[filename]" ]
This is an array of string, where each entry is the filename of the spreadsheet, relative to the control file. Each resource represented in the spreadsheet (profiles, value sets, search parameters) must have an entry in the resources section, and should be entered in the implementation guide.
There are some differences between the spreadsheets used in the build directly, and the spreadsheets used by the IG Publisher, and these are changes that must be made manually:
- Value Set references - use either
- http(s)://... a reference to a value set from outside the IG
- ValueSet/xxx a reference to a value set defined in the IG, and registered directly in the implementation guide resource
- xxx where:
- xxx is the name of a file found in the same directory as the spreadsheet
- The filename must not have .xml or .json on it, but a file with either .xml or .json appended to it must exist
- The file must be a ValueSet resource, with an id of xxx and the appropriate canonical URL
- Search Parameters
- You have to provide a fluent path expression directly ("Expression")
- you have to provide a description directly
- you have to specify the target types directly
- Types
- the build tool allows for the use of "SimpleQuantity" and other data type profiles (not that the build tool uses any other) as types, but they have to be invoked as profiles in the IG spreadsheets
Fragment Options
When deciding whether to produce a particular kind of fragment, the IG Publisher will look for a property of type boolean with the name given below. It will look in the following places, in order:
- on the resource entry for the resource in question
- on the defaults entry for the resource type in question
- on the defaults entry for "Any"
If it doesn't find anything, it will produce the fragment.
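The lookup order can be expressed compactly. The following Python sketch mirrors the three places listed above and falls back to producing the fragment. The function name `produce_fragment` and the sample control data are hypothetical.

```python
def produce_fragment(code, ref, control):
    """Decide whether a fragment is produced for the resource "[Type]/[id]"."""
    res_type = ref.split("/", 1)[0]
    defaults = control.get("defaults", {})
    places = [
        control.get("resources", {}).get(ref, {}),  # 1. the resource entry
        defaults.get(res_type, {}),                 # 2. defaults for the type
        defaults.get("Any", {}),                    # 3. defaults for "Any"
    ]
    for place in places:
        value = place.get(code)
        if isinstance(value, bool):
            return value
    return True  # nothing found: the fragment is produced


control = {
    "defaults": {"Any": {"xml": False}, "ValueSet": {"xml": True}},
    "resources": {"ValueSet/vs1": {"ttl": False}},
}
print(produce_fragment("xml", "ValueSet/vs1", control))
print(produce_fragment("xml", "Patient/example", control))
```

In this sample the type-level default for ValueSet overrides the "Any" default, so XML is produced for the value set but not for the patient example.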
Fragment Codes
- xml: XML version of the resource (all resource types)
- json: JSON version of the resource (all resource types)
- ttl: Turtle version of the resource (all resource types)
- xml-html: html representation of XML version of the resource (all resource types)
- json-html: html representation of JSON version of the resource (all resource types)
- ttl-html: html representation of Turtle version of the resource (all resource types)
- html: narrative of resource as html
- summary: An html summary for the resource (all conformance resources)
- content: An HTML representation of the content in resource (code system, concept map, structure map)
- xref: A list of all the places where the resource is used (all conformance resources)
- cld: An HTML representation of the content in resource (value set)
- expansion: The expansion of the value set (Value set)
- shex: ShEx statement for the structure (Structure Definition)
- sch: schematron statement for the structure (Structure Definition)
- json-schema: JSON Schema statement for the structure (Structure Definition)
- header: Description of the identification of the structure (Structure Definition)
- diff: Logical Table of the diff (Structure Definition)
- snapshot: Logical Table of the snapshot (Structure Definition)
- template-xml: XML template for the snapshot (Structure Definition)
- template-json: JSON template for the snapshot (Structure Definition)
- template-ttl: Turtle template for the snapshot (Structure Definition)
- uml: UML diagram for the structure (Structure Definition)
- tx: Terminology Notes for the structure (Structure Definition)
- inv: invariant summary for the structure (Structure Definition)
- dict: Detailed Element Definitions (Structure Definition)
- maps: Presentation of the mappings (Structure Definition)
Instance Wrapper
In addition to the fragment control flags, there are the instance wrapper templates. These are the files used to generate the files for the base page for each resource, and the format specific pages ([Type]-[id].html and [Type]-[id].[fmt].html respectively). These are "template-base" and "template-fmt" respectively.
These are string values that name a specific file that is the template for the resource. The content of the file is a jekyll source file (e.g. html or markdown etc). They have the same resolution pathway as the fragment flags documented above. If there is no template specified for a resource (or the value is null), then no file will be produced. The assumption in this case is that the IG author will generate the correct output file directly through a static jekyll page (giving the author total control over the content, but requiring them to have a maintained file for each resource).
When the template files are used, they are pre-processed and then copied to the correct place for the xml/json/ttl wrapper for each resource. When copied, the following strings will be replaced:
- {{[title]}} - a description of the content of the resource (typical use: <h2>{{[title]}}</h2>)
- {{[name]}} - the path for the source fragment to include (proper use: {% include {{[name]}}.xhtml %})
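The substitution step amounts to a plain string replacement of the two markers. The Python sketch below shows the idea on a two-line template. The function name `fill_template` and the sample title and fragment name are made up, and the publisher's actual pre-processing may do more than this.

```python
def fill_template(template, title, name):
    """Replace the {{[title]}} and {{[name]}} markers in a wrapper template."""
    return template.replace("{{[title]}}", title).replace("{{[name]}}", name)


template = "<h2>{{[title]}}</h2>\n{% include {{[name]}}.xhtml %}"
page = fill_template(template, "My Value Set", "ValueSet-my-value-set-xml-html")
print(page)
```

The markers use square brackets inside the braces, so they do not collide with ordinary Liquid variables that Jekyll resolves later in the build.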
Build Tools
Jekyll
Installation: (see [Jekyll on Windows] for windows users)
in config file:
"tool":"jekyll"
Troubleshooting
Notes about troubleshooting:
- before doing any troubleshooting, make sure you are running the latest IG publisher for the version of FHIR you are using
- if the Jekyll part of the build fails, it fails completely, and the old output is left in place
- If you're going to ask Grahame for help, run the build from the UI version, and then click on the 'Debug Summary' button. This generates a large amount of text and puts it on the clipboard so you can send it in an email to grahame@hl7.org