This wiki has undergone a migration to Confluence found Here
Difference between revisions of "201809 FHIR Storage and Analytics"


Latest revision as of 03:13, 21 September 2018




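  • Zulip Analytics stream - https://chat.fhir.org/#narrow/stream/73-analytics-on.20FHIR
  • Zulip Storage stream - https://chat.fhir.org/#narrow/stream/134-storage-for.20FHIR

Previous tracks:

  • 2018-05 - http://wiki.hl7.org/index.php?title=201805_FHIR_Storage_and_Analytics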


Track Name

FHIR Storage & Analytics (https://github.com/fhir-fuel/fhir-storage-and-analytics-track)

Track Overview

With this track we want to bring together people who are interested in implementing and using a FHIR persistence layer, building FHIR servers on top of it, and analysing FHIR data.

We hope this group can produce useful feedback for the FHIR community and potentially propose new additions to the standard, such as SQL on FHIR or FHIRPath for databases.

Another goal is to collect solutions and best practices for storing and querying FHIR data in popular databases.


Please fill out: TODO [TODO Track Registration Spreadsheet]

Submitting WG/Project/Implementer Group

We want to create a Storage & Analytics (S&A) working group or a FHIR Foundation Project.


Justification

Initial set of questions to be answered or discussed:

  • How can we organise our FHIR Foundation Project?
  • How to store FHIR data?
  • What should the database schema design be?
  • SQL (as a common query language) on FHIR
  • FHIR data representation for storage
  • FHIRPath subset for databases
  • How to approach FHIR search?
  • Best practices for popular databases (PostgreSQL, Spark, MongoDB, BigQuery, etc.)?

Proposed Track Lead

Nikolai Ryzhikov

Participants

  • Ryan Brush (Cerner)
  • Patrik Sundberg (Google)
  • Kenny Blanchette (Athena Health)
  • Nikolai Ryzhikov & Mike Lapshin (Health Samurai)
  • ADD YOURSELF!

Roles

  • Database Provider: provides access to a database with FHIR data
  • Database User: designs queries

Discussions

SQL on FHIR

Initial notes for creating a standard representation of FHIR that is directly usable in scalable SQL-based systems.

Original document by Ryan Brush - https://docs.google.com/document/d/1IPyI-4GYlF400vmZvrtjDYL6jauHdjeLfjGJMo6yWtA/edit#heading=h.ch8y6lqurlaj

GitHub repo - https://github.com/rbrush/sql-on-fhir/blob/master/sql-on-fhir.md
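
To make the idea concrete, here is a minimal sketch of querying FHIR data with plain SQL. It is not the representation proposed in the documents linked above: the `patient` table, its columns, and the `to_row` and `born_before` helpers are assumptions made up for this illustration, and SQLite stands in for a scalable SQL engine.

```python
import sqlite3

# Hypothetical sample resources; only standard FHIR Patient fields are used.
PATIENTS = [
    {"resourceType": "Patient", "id": "p1", "gender": "female",
     "birthDate": "1980-04-02",
     "name": [{"family": "Smith", "given": ["Anna"]}]},
    {"resourceType": "Patient", "id": "p2", "gender": "male",
     "birthDate": "1975-11-30",
     "name": [{"family": "Jones", "given": ["Bob", "Lee"]}]},
]


def to_row(patient):
    """Flatten one Patient into a tuple for the illustrative `patient` table."""
    # Only the first name entry is kept; a real schema would need a policy here.
    name = (patient.get("name") or [{}])[0]
    return (
        patient["id"],
        name.get("family"),
        " ".join(name.get("given", [])),
        patient.get("gender"),
        patient.get("birthDate"),
    )


def load(conn, patients):
    """Create the assumed table layout and insert the flattened rows."""
    conn.execute(
        "CREATE TABLE patient (id TEXT PRIMARY KEY, family TEXT, "
        "given TEXT, gender TEXT, birth_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?, ?, ?, ?)",
        [to_row(p) for p in patients],
    )


def born_before(conn, date):
    """Return (id, family) of patients born before an ISO date string."""
    # ISO 8601 dates of equal precision sort correctly as plain text.
    cur = conn.execute(
        "SELECT id, family FROM patient WHERE birth_date < ? ORDER BY id",
        (date,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, PATIENTS)
print(born_before(conn, "1980-01-01"))
```

Flattening loses repeated elements such as additional names, which is one of the trade-offs a standard SQL representation of FHIR would have to settle.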

FHIR data representation in database

JSON and JSON-like formats such as YAML, EDN, Avro, and Protobuf are very popular with the programming avant-garde, and some modern databases now understand JSON as a first-class data structure. That is why we think a good JSON design is essential for FHIR adoption.

Original GitHub issues - https://github.com/fhir-fuel/fhir-fuel.github.io/issues
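
As an illustration of the document-oriented approach, the sketch below keeps each resource as a JSON document next to the two keys needed to find it. The `resource` table and the `create_store`, `save`, and `read` helpers are hypothetical names chosen for this example; a database with native JSON support would use a JSON column type instead of the text column used here.

```python
import json
import sqlite3


def create_store(conn):
    """Create a single table holding every resource as a JSON document."""
    conn.execute(
        "CREATE TABLE resource ("
        "resource_type TEXT NOT NULL, "
        "id TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "PRIMARY KEY (resource_type, id))"
    )


def save(conn, resource):
    """Insert or overwrite a resource, keyed by its type and id."""
    conn.execute(
        "INSERT OR REPLACE INTO resource (resource_type, id, body) "
        "VALUES (?, ?, ?)",
        (resource["resourceType"], resource["id"], json.dumps(resource)),
    )


def read(conn, resource_type, resource_id):
    """Return the stored resource as a dict, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM resource WHERE resource_type = ? AND id = ?",
        (resource_type, resource_id),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
create_store(conn)
patient = {
    "resourceType": "Patient",
    "id": "p1",
    "name": [{"family": "Smith", "given": ["Anna"]}],
}
save(conn, patient)
restored = read(conn, "Patient", "p1")
print(restored == patient)
```

Because the document is stored unchanged, no conversion is needed when it is served back as FHIR; the cost is that searching inside the document depends on the JSON features of the chosen database.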

FHIRPath subset for databases

It is worth discussing a subset of FHIRPath that can be implemented in databases.

Issue to discuss - https://github.com/fhir-fuel/fhir-fuel.github.io/issues/10
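
One candidate for such a subset is simple dotted path navigation. The sketch below is an assumption about what a minimal subset could look like, not a result of the discussion and not a full FHIRPath implementation: the hypothetical `eval_path` function supports only element names separated by dots, with no functions, filters, or operators. It follows the FHIRPath convention that every step yields a flat collection.

```python
def eval_path(resource, path):
    """Evaluate a dotted path such as 'Patient.name.given' on a resource dict.

    Returns a flat list of matching values; an empty list means no match.
    """
    parts = path.split(".")
    # A leading segment equal to the resource type only names the context.
    if parts and parts[0] == resource.get("resourceType"):
        parts = parts[1:]

    items = [resource]
    for part in parts:
        next_items = []
        for item in items:
            # Primitives and objects without the element contribute nothing.
            if not isinstance(item, dict) or part not in item:
                continue
            value = item[part]
            if isinstance(value, list):
                next_items.extend(value)  # repeated elements are flattened
            else:
                next_items.append(value)
        items = next_items
    return items


PATIENT = {
    "resourceType": "Patient",
    "id": "p2",
    "birthDate": "1975-11-30",
    "name": [
        {"family": "Jones", "given": ["Bob", "Lee"]},
        {"family": "Jones-Smith", "given": ["Robert"]},
    ],
}

print(eval_path(PATIENT, "Patient.name.given"))
print(eval_path(PATIENT, "Patient.photo.url"))
```

Paths of this shape map naturally onto the JSON path operators that several databases provide, which is what makes them a plausible starting point for a database-friendly subset.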

Scenarios

Scenario: Bulk Loader

Implement and demonstrate a bulk loader script for your specific database.

  • Provide a bulk loader for your specific database that easily and efficiently loads data from a FHIR server through the Bulk API into the database.
  • Loader for MITRE Synthea (https://github.com/synthetichealth/synthea), where the source is the file system output that Synthea generates.
  • Loader for FHIR examples: a zipped directory of resources and bundles in JSON.

Discussion at https://github.com/fhir-fuel/fhir-storage-and-analytics-track/issues/6
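
A loader along these lines can be sketched as follows. The sketch assumes that bulk exports arrive as newline-delimited JSON (one resource per line) and that Synthea-style output arrives as Bundle resources; the `resource` table and the `iter_ndjson`, `iter_bundle`, `flush`, and `bulk_load` names are hypothetical, and SQLite stands in for your specific database.

```python
import io
import json
import sqlite3

CREATE_SQL = (
    "CREATE TABLE IF NOT EXISTS resource ("
    "resource_type TEXT NOT NULL, id TEXT NOT NULL, body TEXT NOT NULL, "
    "PRIMARY KEY (resource_type, id))"
)
INSERT_SQL = "INSERT OR REPLACE INTO resource VALUES (?, ?, ?)"


def iter_ndjson(stream):
    """Yield one resource per non-empty line of an NDJSON text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_bundle(bundle):
    """Yield the resources contained in the entries of a FHIR Bundle."""
    for entry in bundle.get("entry", []):
        resource = entry.get("resource")
        if resource is not None:
            yield resource


def flush(conn, batch):
    """Write one batch in a single transaction and empty the batch list."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(INSERT_SQL, batch)
    written = len(batch)
    batch.clear()
    return written


def bulk_load(conn, resources, batch_size=1000):
    """Load an iterable of resources in batches; return the number written."""
    conn.execute(CREATE_SQL)
    total = 0
    batch = []
    for resource in resources:
        batch.append(
            (resource["resourceType"], resource["id"], json.dumps(resource))
        )
        if len(batch) >= batch_size:
            total += flush(conn, batch)
    if batch:
        total += flush(conn, batch)
    return total


# Tiny in-memory stand-ins for a bulk export file and a Synthea bundle.
ndjson = io.StringIO(
    '{"resourceType": "Patient", "id": "p1"}\n'
    "\n"
    '{"resourceType": "Patient", "id": "p2"}\n'
)
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [{"resource": {"resourceType": "Encounter", "id": "e1"}}],
}

conn = sqlite3.connect(":memory:")
loaded = bulk_load(conn, iter_ndjson(ndjson), batch_size=1)
loaded += bulk_load(conn, iter_bundle(bundle))
print(loaded)
```

Reading the input as a stream and writing in batches keeps memory use flat regardless of export size; a production loader would more likely use the bulk import mechanism of the target database than row-by-row inserts.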