Difference between revisions of "Database with native ISO datatypes"

Revision as of 12:40, 14 February 2009

Summary

The ISO (or: Datatypes R2) data types can be implemented as 'native datatypes' at the database level. This may be an attractive option for RIMBAA applications.

We're not talking about "simple" UDT's. Those can be declared using SQL and are based on the standard SQL types; e.g. you could define a TelephoneNumber type that is a "VARCHAR(30) NOT NULL". We're talking about CLR User Defined Types (see [1]). This is also possible in some other databases, e.g. Postgres with C++.

Levels

Level 1: dumb, II table
Level 2: UDT without functionality
Level 3: UDT with functionality, e.g. CD with SNOMED, II with OID check, or PQ with UCUM

You could have a Level 1 database and a Level 3 Object Model, for example. It would be nice to add this to the Technology Matrix pages. Level 3 would be very interesting if you plan to do research on the clinical data. But if you just use the database to store forms (DCM's), then maybe Level 1 or Level 2 is sufficient.
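To make the difference between Level 2 and Level 3 concrete, here is a minimal sketch in Python. It is not a database UDT; it only illustrates "without functionality" versus "with functionality" for an II. The class names PlainII and CheckedII are hypothetical, and the check assumes the root is a dotted-decimal OID, which is a simplification.

```python
import re
from dataclasses import dataclass

# Dotted-decimal OID: first arc 0-2, further arcs without leading zeros.
# Assumption for this sketch: only the OID form of an II root is checked.
_OID_PATTERN = re.compile(r"[0-2](\.(0|[1-9][0-9]*))+")


@dataclass(frozen=True)
class PlainII:
    """Level 2 style: the type only groups fields and has no behaviour."""
    root: str
    extension: str = ""


@dataclass(frozen=True)
class CheckedII(PlainII):
    """Level 3 style: the type rejects roots that are not well-formed OIDs."""

    def __post_init__(self):
        if not _OID_PATTERN.fullmatch(self.root):
            raise ValueError(f"root is not a dotted-decimal OID: {self.root!r}")
```

PlainII accepts any string as root, whereas CheckedII raises a ValueError on construction when the root is malformed. A Level 3 UDT would perform the same kind of check inside the database.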

Discussion

It would be great to be able to write an ad hoc query without having to bother with the details of, e.g., a PQ.

Advantage: can express constraints at the database level
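As a rough illustration of a constraint living at the database level, the sketch below uses Python's built-in sqlite3 module with a plain CHECK constraint. This is not a UDT, and the table name, column names and list of allowed units are all hypothetical; the point is only that the database itself refuses invalid data, whatever application wrote it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observation (
        id INTEGER PRIMARY KEY,
        pq_value REAL NOT NULL,
        -- Hypothetical unit whitelist; a real PQ type would validate against UCUM.
        pq_unit TEXT NOT NULL CHECK (pq_unit IN ('g', 'mg', 'kg'))
    )
    """
)


def try_insert(value, unit):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        conn.execute(
            "INSERT INTO observation (pq_value, pq_unit) VALUES (?, ?)",
            (value, unit),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised by sqlite3 when a CHECK or NOT NULL constraint fails.
        return False
```

With a UDT the same rule would be part of the type, so it would not need to be repeated on every table that stores a PQ.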

Performance is an issue.
- SET operations will be much faster (optimized indexing)

There might be a problem with current ORM implementations, which cannot currently handle UDT's.
- The serialization format used by Hibernate needs to be changed: use the UDT datatype literal format instead.

SQLServer implementation

Grahame tried to implement UDT's in SQL-Server, but he did not like it because
- no support for inheritance/substitutability
- the data is stored serialized as strings. --Michael 12:38, 14 February 2009 (UTC) Is this really true? I serialized to a binary stream.
- very difficult to address nested content (lists/sets)
- Dennis Nguyen: I looked at CLR User Defined Types as suggested, and attempted a prototype with IsoANY and IsoBL. Unfortunately, the result doesn't look very good. The biggest hurdle is that SQL Server doesn't support polymorphism. In SQL Server 2005, I created a table with two columns, IsoANY and IsoBL. But inserting an IsoBL value into the IsoANY column either causes an error (when using Transact-SQL) or removes the boolean value of the IsoBL (via the table view in SQL Server Management Studio), effectively turning it into a basic IsoANY. As I've been told, this can be a significant problem for ISO datatypes, as IsoANY and IsoQTY are used as placeholders for a lot of things.
Ravi Luthra (Axolotl): They are not really native in the DB; they just have hooks to the datatype's definition, and SQL Server and .NET's data access layer provide a facade at runtime. It's the same level of abstraction you already get with Hibernate (or NHibernate if you are using .NET), just a form of ORM implemented in the wrong place, because it would only be supported by one DB rather than all.
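The loss of the boolean value described above can be modelled in a few lines of Python. This is only a toy model of the outcome Dennis reports, not of how SQL Server works internally; the function store_in_column and the field names null_flavor and value are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class IsoANY:
    null_flavor: Optional[str] = None


@dataclass
class IsoBL(IsoANY):
    value: Optional[bool] = None


def store_in_column(column_type, datum):
    """Toy storage step: keep only the fields the declared column type knows.

    Without polymorphism the column cannot remember the datum's real type,
    so anything added by a subtype is dropped.
    """
    if not isinstance(datum, column_type):
        raise TypeError(f"{type(datum).__name__} is not a {column_type.__name__}")
    kept = {f.name: getattr(datum, f.name) for f in fields(column_type)}
    return column_type(**kept)
```

Storing IsoBL(value=True) in an IsoANY column returns a plain IsoANY with no value attribute, which is why a placeholder column typed as IsoANY or IsoQTY is of little use without substitutability.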

PostgreSQL implementation

Yeb Havinga and Willem Dijkstra implemented UDT's in the PostgreSQL database management system, after a feasibility test with a few types in September '09 that turned out positive.
We have not yet performed comparative tests, though we expect the PostgreSQL implementation to outperform implementations in the application layer.
They've studied the use of Hibernate as the ORM on top of the 'database with Datatypes R2 UDTs'. For the object representation itself the Java SIG code is used. In order to use the datatype definitions it is sufficient to create a Hibernate mapping which maps the Java SIG data types to our Postgres literal representation. This doesn't just involve changing the Hibernate configuration; one also has to develop a mapping in Java.
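The mapping described above is a two-way conversion between an object and a text literal that the database type can parse. The sketch below shows that shape in Python rather than Java, purely for illustration. The "value unit" literal format is an assumption; the actual literal format of the PostgreSQL types is not given here, and the names PQ, pq_to_literal and pq_from_literal are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PQ:
    value: Decimal
    unit: str


def pq_to_literal(pq):
    """Object to literal, e.g. for use as a query parameter (assumed format)."""
    return f"{pq.value} {pq.unit}"


def pq_from_literal(text):
    """Literal to object, e.g. when reading a column value back."""
    value_text, unit = text.split(" ", 1)
    return PQ(Decimal(value_text), unit)
```

In Hibernate the equivalent pair of conversions would live in a custom mapping class written in Java, which is the extra development work mentioned above.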

NCI

John Koisch (NCI): For what it is worth, the NCI's adoption of the ISO datatypes has been focused on them as a wire format standard with implications about persistence, and not as a persistence model per se. That is, our service interfaces expose ISO datatypes with the contractual understanding that what ultimately resides in the persistence layer MUST be able to map to those. We arrived at this strategy in part because we encountered issues as well with modeling the ISO's in RDBMS. Depending on this group, we may modify that strategy in certain circumstances, but we are just as comfortable having the ISO's being the wire format of choice as we are having it be a feature of our persistence layers.
[15/09/2008 21:24:44] Grahame Grieve says: NCI is also working on this. They are struggling with the recursion of translations. Recursion is something they otherwise work hard to keep out of their persistent stores. There's recognised patterns for handling that, so I don't have much to say about that. NCI have also struggled with how to do nullFlavor. This is because they don't want to do V3 as a primary construct, rather than inherently because of persistence issues. So I would think this is less of a problem here.

Skype chat

(extract to be made for relevant content)

[15/09/2008 22:48:22] Grahame Grieve says: Javasig have their idiosyncratic data types working with hibernate. I don't know what trickery was required, but it seems as if there was some required
[15/09/2008 22:48:44] Grahame Grieve says: the ISO data types would map directly to hibernate - they are normal UML class models with attribute
[16/09/2008 03:05:52] Michael van der Zel says: Yes, but some datatypes have logic behind them. For example the PQ with UCUM. It "understands" that 1g is the same as 1000mg.
[16/09/2008 03:06:15] Grahame Grieve says: this has nothing to do with persistence - that's just an operation you add
[16/09/2008 03:07:15] Michael van der Zel says: Exactly. This discussion is not only about persisting, but in general about v3 (ISO) datatypes in a database.
[16/09/2008 03:07:54] Grahame Grieve says: well, the database won't generally understand the semantics like PQ.canonical unless you specifically choose to persist them.
[16/09/2008 03:08:21] Grahame Grieve says: I can't speak for javaSIG, but the ISO data types won't persist PQ.canonical unless you make special arrangements
[16/09/2008 03:08:48] Grahame Grieve says: unless you do, you can't use canonical values in sql joins
[16/09/2008 18:55:05] Peter Hendler says: Gunther has hand written most of our datatypes and has hand mapped them with Hibernate.
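The point made in the chat about PQ.canonical and SQL joins can be sketched with Python's built-in sqlite3 module. The table layout, the canonical_mg column and the conversion table are assumptions made for this example; a real implementation would derive canonical values from UCUM instead of a hand-written dictionary.

```python
import sqlite3

# Hypothetical conversion factors to milligrams (integers avoid rounding issues).
TO_MILLIGRAMS = {"mg": 1, "g": 1000, "kg": 1_000_000}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dose (label TEXT, value INTEGER, unit TEXT, canonical_mg INTEGER)"
)


def insert_dose(label, value, unit):
    """Persist the PQ as entered plus its canonical value, computed on write."""
    conn.execute(
        "INSERT INTO dose VALUES (?, ?, ?, ?)",
        (label, value, unit, value * TO_MILLIGRAMS[unit]),
    )


insert_dose("a", 1, "g")
insert_dose("b", 1000, "mg")
insert_dose("c", 2, "g")

# Self-join on the raw columns: 1 g and 1000 mg do not match.
raw_matches = conn.execute(
    "SELECT x.label, y.label FROM dose x JOIN dose y "
    "ON x.value = y.value AND x.unit = y.unit AND x.label < y.label"
).fetchall()

# Self-join on the persisted canonical value: the two rows are recognised as equal.
canonical_matches = conn.execute(
    "SELECT x.label, y.label FROM dose x JOIN dose y "
    "ON x.canonical_mg = y.canonical_mg AND x.label < y.label "
    "ORDER BY x.label, y.label"
).fetchall()
```

Only the join on canonical_mg pairs rows "a" and "b". A Level 3 UDT could offer the same comparison without the extra column, provided the database can evaluate the type's equality operation.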

@@ Line 3: / Line 3: @@
 The ISO (or: Datatypes R2) data types can be implemented as 'native datatypes' at the database level. This may be an attractive option for [[RIMBAA]] applications.
 *We're not talking about "simple" UDT's. Those can be declared using SQL and are based on the standard SQL types; e.g. you could define a TelephoneNumber type that is a "VARCHAR(30) NOT NULL". We're talking about CLR User Defined Types (see [http://msdn.microsoft.com/en-us/library/ms131120.aspx]). This is also possible in some other databases, e.g. Postgres with C++.
+===Levels===
+Level 1: dumb, II table<br/>
+Level 2: UDT without functionality<br/>
+Level 3: UDT with functionality, e.g. CD with SNOMED, II with OID check, or PQ with UCUM<br/>
+<br/>
+You could have a Level 1 database and have a Level 3 Object Model, for example.
+It would be nice to add this to the Technology Matrix pages.
+Level 3 would be very interesting if you plan to do research on the clinical data. But if you just use the database to store forms (DCM's), then maybe Level 1 or Level 2 is sufficient.
 ==Discussion==
