NCI Requirements

Functional Characteristics ....... [SS #1]

Retrieval of uniquely specified concept ....... [SS # 1.1]

Code->Preferred Name ....... [SS # 1.1.1]

User specifies a Concept Code; server returns the concept Preferred Name.

Mayo Notes

This requirement is a specialization of a general access requirement:

f(code system, code, [version], [language], [{property}]) -> {property/language/value}
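
A minimal sketch of the general access function, assuming a hypothetical in-memory store. The names `PropertyValue`, `STORE`, `DEFAULT_VERSION`, `get_properties` and `preferred_name`, and the sample code system and codes, are illustrative assumptions and are not taken from the requirements or from any existing API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PropertyValue:
    name: str       # property name, e.g. "preferredName"
    language: str
    value: str


# Hypothetical in-memory store keyed by (code system, version, code).
STORE = {
    ("DemoCS", "1.0", "C001"): [
        PropertyValue("preferredName", "en", "Heart"),
        PropertyValue("preferredName", "fr", "Coeur"),
        PropertyValue("definition", "en", "Muscular organ that pumps blood."),
    ],
}
# Hypothetical default version used when the caller omits one.
DEFAULT_VERSION = {"DemoCS": "1.0"}


def get_properties(code_system: str, code: str,
                   version: Optional[str] = None,
                   language: Optional[str] = None,
                   properties: Optional[Iterable[str]] = None) -> List[PropertyValue]:
    """f(code system, code, [version], [language], [{property}])."""
    version = version or DEFAULT_VERSION[code_system]
    entries = STORE.get((code_system, version, code))
    if entries is None:
        raise KeyError(f"unknown concept {code!r} in {code_system} {version}")
    wanted = None if properties is None else set(properties)
    return [e for e in entries
            if (language is None or e.language == language)
            and (wanted is None or e.name in wanted)]


def preferred_name(code_system, code, language="en"):
    """Code -> Preferred Name as a specialization of get_properties."""
    hits = get_properties(code_system, code, language=language,
                          properties=["preferredName"])
    return hits[0].value if hits else None
```

Under these assumptions, Code -> Preferred Name is the general call with the property set fixed to a single name and a language chosen by the caller.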

Preferred Name -> Code ....... [SS # 1.1.2]

User specifies concept Preferred Name; server returns Concept Code(s).

Mayo Notes

This requirement is an (over) specialization of a general access requirement:

f({code system}, [property, [language], [value]]) -> {code system, code, version}
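
The reverse lookup can be sketched the same way. This is an illustration only: `ROWS` and `find_codes` are hypothetical names, and the sample data is invented to show why the result is a set of codes rather than a single code.

```python
# Hypothetical rows: (code system, version, code, property, language, value)
ROWS = [
    ("DemoCS", "1.0", "C001", "preferredName", "en", "Cold"),
    ("DemoCS", "1.0", "C002", "preferredName", "en", "Cold"),
    ("DemoCS", "1.0", "C002", "synonym", "en", "Common cold"),
    ("OtherCS", "2.1", "X9", "preferredName", "en", "Cold"),
]


def find_codes(code_systems, prop=None, language=None, value=None):
    """f({code system}, [property, [language], [value]]) -> {code system, code, version}.

    Omitted arguments act as wildcards. Matching here is exact string
    equality; a real service would plug in its lexical matching instead.
    """
    hits = set()
    for cs, version, code, p, lang, val in ROWS:
        if cs not in code_systems:
            continue
        if prop is not None and p != prop:
            continue
        if language is not None and lang != language:
            continue
        if value is not None and val != value:
            continue
        hits.add((cs, code, version))
    return sorted(hits)
```

In this sample, the name "Cold" is not unique within one code system, so Preferred Name -> Code has to return every matching code.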

Code/Name -> Attributes ....... [SS # 1.1.3]

User specifies concept Preferred Name or Concept Code plus (optional) desired attributes; server returns selected attributes (and associated values) if concept exists, error return otherwise.

Code->Attributes

Mayo Notes

This requirement is a specialization of a general access requirement:

f(code system, code, [version], [language], [{property}]) -> {property/language/value}

Preferred Name->Attributes

Mayo Notes

This requirement assumes that the underlying model has a unique "preferred name" per concept - something that may not be true because of (a) language dependency and (b) lack of name uniqueness. A more generic function would retrieve the concept code(s) from one or more code systems that have the specified string matches in these specified properties / languages / preference settings

Synonym -> Preferred Names ....... [SS # 1.1.4(a)]

User specifies a synonym; server returns the concept's (concepts') Preferred Name(s).

Mayo Notes

This is an (over) simplification because "synonyms" have various flavors, are language specific, and the targeted preferred name is language specific as well.

Concept->Relationships ....... [SS # 1.1.4(b)]

User specifies a concept; server returns all, or a specified subset of, relationships and attributes for that concept.

Mayo Notes

This is yet another variation on a theme - code or preferred name maps to attributes or relationships

Retrieve concepts by version or date ....... [SS # 1.1.5]

Retrieve any version of a concept stored in a vocabulary, either by version or by date

Retrieve concept by version

Given a concept identifier, this returns the concept "neighborhood" as it existed in a particular version.

Mayo Notes

Retrieve concept by date

Given a concept identifier, this returns the concept "neighborhood" as it existed at a particular point in time.
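
One way to relate the two retrieval modes is to resolve a date to the version in force on that date and then reuse the by-version path. The sketch below assumes a hypothetical release history (`RELEASES`) and helper name (`version_as_of`); neither comes from the requirements.

```python
import bisect
from datetime import date

# Hypothetical release history, sorted by release date.
RELEASES = [
    ("1.0", date(2004, 1, 15)),
    ("1.1", date(2004, 6, 1)),
    ("2.0", date(2005, 2, 10)),
]


def version_as_of(when: date) -> str:
    """Return the latest version released on or before the given date."""
    release_dates = [released for _, released in RELEASES]
    index = bisect.bisect_right(release_dates, when)
    if index == 0:
        raise LookupError(f"no version was released on or before {when}")
    return RELEASES[index - 1][0]
```

With this mapping, "retrieve by date" becomes `version_as_of(when)` followed by the ordinary by-version lookup.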

Retrieve most recent version of concept ....... [SS # 1.1.6]

Automatically retrieve the most recent version of a concept stored in a vocabulary no matter how many changes have been made.

Mayo Notes

Need clarification on this requirement, as it isn't clear what it is trying to say. Note that the LQS spec differentiated between "production" and "latest" release tags. Perhaps the LQS approach to this would make more sense

No separate versioning API ....... [SS # 1.1.7]

Access data from any version of vocabulary using the same API as is used for the current version

Mayo Notes

This may not be as desirable a trait as it may seem, as all API signatures will have to be able to support versioning throughout. Needs discussion.

Paged returns ....... [SS # 1.1.8]

Search results shall be returned in a manner that permits the user to review the entire set of qualifying concepts. For example, consider Google's method of presenting a selectable sequence of pages, each containing a reasonable (or settable) number of concepts (e.g., 50). The user reviews one page and then has the option to select another, so that all concepts are viewable.

Mayo Notes

This was done in LQS and a similar mechanism can be done here.

Note that APIs that support the notion of paging have to support some notion of session context.
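
A minimal sketch of paging backed by session context, assuming the server keeps the full result list for the life of the session. The class name `SearchSession` and its members are hypothetical; the default page size of 50 is taken from the example in the requirement.

```python
import math


class SearchSession:
    """Holds one search's results so the caller can fetch any page later."""

    def __init__(self, results, page_size=50):
        if page_size < 1:
            raise ValueError("page_size must be at least 1")
        self._results = list(results)
        self.page_size = page_size

    @property
    def page_count(self):
        return math.ceil(len(self._results) / self.page_size)

    def page(self, number):
        """Return the 1-based page `number`; page 1 of an empty result is []."""
        if not 1 <= number <= max(self.page_count, 1):
            raise IndexError(f"page {number} is out of range")
        start = (number - 1) * self.page_size
        return self._results[start:start + self.page_size]
```

The trade-off is the one the note points at: the server has to hold state between calls, so sessions need an identifier and some expiry policy.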

Limit number of return values ....... [SS # 1.6.5]

The server shall support an API request to limit the number of returned values from a search.

Identification of concepts fitting search criteria ....... [SS # 1.2]

Approximate string match ....... [SS # 1.2.1]

User specifies approximate match values; server returns summary information (see Glossary) for concepts resulting from applying lexical search techniques to: Preferred Name, Concept Code, Synonym(s) (if present).

Mayo Notes

May want to revisit concept summary to include code system, version and language of preferred name.

Would like to know the use case for the semantic type in the concept summary - this is quite domain specific

What of other attributes (definitions, notes, etc)

Concept code - should this *really* be included in this call?

Attribute presence ....... [SS # 1.2.2]

User specifies an attribute; server returns all concepts for which the attribute is present.

Mayo Notes

Attribute is very broadly defined, so this is an interesting requirement.

We would like to see the use case for this to see whether it might be more readily satisfied with simpler inquiries in set-based combinations.

There are issues involving language and namespace separation (is the namespace of roles, attributes and builtin types mutually exclusive?)

Q: With the exception of flags such as isPrimitive, which could equally be rendered as isPrimitive (true), what is the value of the presence field?

Possible A: Defaults. There is a big difference between isPrimitive:true/false and isPrimitive present.
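
The distinction between presence and value can be illustrated with a small sketch. The concept data and the function names `with_attribute` and `with_value` are invented for the example.

```python
# Hypothetical concepts; C003 carries no isPrimitive flag at all,
# so a default (whatever the model defines) would apply to it.
CONCEPTS = {
    "C001": {"isPrimitive": True},
    "C002": {"isPrimitive": False},
    "C003": {},
}


def with_attribute(name):
    """Concepts on which the attribute is present, whatever its value."""
    return sorted(c for c, attrs in CONCEPTS.items() if name in attrs)


def with_value(name, value):
    """Concepts on which the attribute is present with the given value."""
    return sorted(c for c, attrs in CONCEPTS.items()
                  if name in attrs and attrs[name] == value)
```

Here `with_attribute("isPrimitive")` includes C002 even though its value is false, while C003 is found by neither query.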

Role Presence

Return concepts that participate as the source or target of the specified role or roles. Some scenarios may require that the API allow the caller to specify source/target. This overlaps with DAG traversal scenarios. Note that a role cannot be present w/o a value.

MetaAttribute Presence

This is the presence of things like concept codes, status flags and the like. We don't plan to implement this feature unless a convincing use case can be made for it.

Property Presence

ASSUMING that properties must have values, this should be covered by some sort of wild card function on strings. Interesting modeling question, however. IF you substitute the name for the concept (which is philosophically valid), you could really treat properties and roles as interchangeable.

Valued attribute presence ....... [SS # 1.2.3]

User specifies a concept attribute; server returns all concepts having that attribute with any filler value.

Mayo Notes

Are there any non-valued attributes? If so, can they be treated as the equivalent of attribute w/ value of "true" or "null"?

Role Value Presence

Note that this is equivalent to role presence, w/ the same source/target caveats.

MetaAttribute Value Presence

Property Value Presence

Note the issue about the subtle distinction between presence and value presence

Specific attribute value match ....... [SS # 1.2.4]

User specifies a concept attribute and filler value; server returns all concepts having that attribute and that filler value, using string matching (including wildcarding) for the value.
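
A sketch of attribute value matching with wildcards, using shell-style patterns from Python's standard `fnmatch` module. The requirement does not say which wildcard syntax is intended, so the pattern language, the sample data and the name `match_attribute` are all assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical concepts: code -> attribute -> list of filler values.
CONCEPTS = {
    "C10": {"preferredName": ["Myocardial Infarction"],
            "synonym": ["Heart attack", "MI"]},
    "C11": {"preferredName": ["Cerebral Infarction"],
            "synonym": ["Stroke"]},
    "C12": {"preferredName": ["Fracture"]},
}


def match_attribute(attribute, pattern, ignore_case=True):
    """Return codes whose `attribute` has a value matching `pattern`.

    `*` matches any run of characters and `?` matches one character.
    """
    if ignore_case:
        pattern = pattern.lower()
    hits = []
    for code, attrs in CONCEPTS.items():
        for value in attrs.get(attribute, []):
            candidate = value.lower() if ignore_case else value
            if fnmatchcase(candidate, pattern):
                hits.append(code)
                break
    return sorted(hits)
```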

Role value match

Not certain what the intent is when it comes to role matching. Are target roles against concept codes? Strings? Just code matches?

MetaAttribute match

As with the other MetaAttribute thingies, we aren't inclined to treat metaattributes as first-class properties or roles. MetaAttributes are used in the API itself.

Property value match

This is a more generic case of approximate string match. It should include the ability to select language, lexical matching techniques, etc. It should also support data type specific matching (numeric, date, etc.) and, somewhere, an and/or/not type of syntax. Maybe it is time to switch to XSLT syntax here?

Search Domain Constraint ....... [SS # 1.2.5]

Optionally constrain the domain to be searched

Mayo Notes

Not certain of the definition of "domain". If this refers to "content domain" as defined in the glossary, then this is subsumed by the role value match and . Otherwise, no clue

Lexical matching techniques ....... [SS # 1.2.7]

During any of the foregoing searches that specify lexical matching, the server shall apply lexical techniques as provided by Apelon's Metaphrase software, spelling correction, stemming, ignore case and word order, etc.

Mayo Notes

Apelon's Metaphrase is a proprietary tool and, as a consequence, cannot be used as a model or guide for this work. Recommend removing this part of the description.

The behavior of this needs to be further defined.

Basic Lexical Match will be the minimum
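
Since the behavior still needs to be defined, the following is only one possible reading of a basic lexical match: compare whole words while ignoring case and word order. The function names `tokens` and `word_match` are hypothetical, and stemming and spelling correction are deliberately left out.

```python
import re


def tokens(text):
    """Lower-cased alphanumeric words of `text`, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_match(query, term):
    """True when every word of `query` occurs in `term`, in any order."""
    wanted = tokens(query)
    return bool(wanted) and wanted <= tokens(term)
```

Under this reading "infarction MYOCARDIAL" matches "Myocardial Infarction", but the fragment "infarct" does not, because no stemming is applied.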

Other matching techniques ....... [SS # 1.2.8]

server provides flexible matching algorithms for concept search, including: phrase match, word match, multiple word match, pattern match, search modifiers (consider or ignore case; defined word breaks, Soundex, match similar meanings)

Mayo Notes

This list needs to be limited and precisely defined.

Specify search status ....... [SS # 1.2.9]

consider status (in current use vs. obsolete, etc.; restrict search to a particular domain, e.g., HL7 terms; consider relationships to other concepts).

Mayo Notes

This is a metaattribute that could either be treated as a value above or be treated specially.

This requirement is broad enough that it needs to be further refined.

Consider status

Current vs. Obsolete

Consider partial domain

This needs clarification. Is a "partial domain" a "content domain" as defined in the glossary, a code system as it appears in the example, or something else entirely?

Consider relationships to other concepts

Isn't this covered earlier? What is special about this?

Persist search criteria ....... [SS # 1.2.10]

Save search criteria for future searches

DAG Traversal ....... [SS # 1.3]

Many of the following requirements are more applicable to client applications; Mayo should allocate requirements as appropriate, and indicate how the application can accomplish each function efficiently.

Graph Traversal ....... [SS # 1.3.1]

Graph traversal - navigating concept relationships via a directed acyclic graph (DAG). This typically could return the parents or children of a concept.

Traverse Graph via Role links ....... [SS # 1.3.4]

The server shall support reasoning based on DL vocabulary contents by stepping from concept to concept via specified roles linking the concepts.

DAG Walking ....... [SS # 1.3.10]

The user walks the DAG by specifying a new center for the preceding function; the server delivers the newly exposed concepts and their relationships to the selected concept.

Query by relationship domain ....... [SS # 1.6.1]

User specifies a relationship domain; server returns all concepts with relationships that have the specified domain.

Enumerate relationship types for concept ....... [SS # 1.6.3]

User uniquely identifies a concept; server returns a list of all relationship types applicable to that concept, including domain and range information.

Enumerate relationships by range ....... [SS # 1.6.4]

User specifies a relationship range; server returns all concepts with relationships that have the specified range.

Enumerate Concepts by Relationship ....... [SS # 1.3.2]

User specifies a relationship (other than subsumption); server returns all concepts for which the relationship is present.

Enumerate Source Concepts for rel + target ....... [SS # 1.3.3]

User specifies a relationship and a range target concept; server returns all concepts having the specified relationship with the specified target concept.

Create named vocabulary subsets ....... [SS # 1.3.5]

Allow an application to construct a “tree-browser” widget by supporting the creation and maintenance of a specialized named subset of a vocabulary:

Mayo Notes

Being a named subset, this gets dangerously close to the authoring area. What is the scope of named subsets? Are they distributed? Updated?

It is our understanding that the primary purpose of these named subsets is performance.

Specify root of named tree ....... [SS # 1.3.6]

Allow the user to designate any concept as the root of a named tree.

Return node neighborhood in named tree ....... [SS # 1.3.7]

The user specifies any node in a named tree; the system returns all subsumption descendants of the specified node.

Enumerate concepts by relationship(s) in named tree ....... [SS # 1.3.8]

The user specifies a node in the existing tree and (optionally) one or more relationships; the server returns all concepts linked to the specified concepts by at least one of the designated relationships. (If no relationships are designated the server returns all concepts linked to the specified concept by any relationship). For each concept, the server also returns an indication of which relationship provided the link.

Create dag given concept, relationships and hops ....... [SS # 1.3.9]

Support the creation of a directed acyclic graph (DAG) data structure: the user specifies a concept and (optionally) a set of relationships and (optionally) a number of hops; the server returns all concepts, starting with the specific concept and extending to concepts within the stated number of hops from the specified concept via the specified set of roles; the server also indicates which relationship was used to reach each concept returned.

Mayo Notes

We plan to implement this to the extent that a query can return as a DAG. We may limit the hops to 1 or infinite. The notion of "Create", however, is troubling - is this intended to be a subset of ?
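
The query form of this requirement can be sketched as a breadth-first walk over role links that records which relationship reached each concept. The edge data and the name `neighborhood` are hypothetical; `hops=None` stands for an unlimited walk.

```python
from collections import deque

# Hypothetical edges: (source, relationship, target)
EDGES = [
    ("Heart", "isA", "Organ"),
    ("Organ", "isA", "AnatomicStructure"),
    ("Heart", "partOf", "CardiovascularSystem"),
    ("CardiovascularSystem", "isA", "BodySystem"),
]


def neighborhood(start, relationships=None, hops=None):
    """Return {concept: (relationship, predecessor)} reachable from `start`.

    `relationships=None` follows every relationship; the start concept
    maps to None because no link was needed to reach it.
    """
    reached = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if hops is not None and depth >= hops:
            continue  # do not expand beyond the requested number of hops
        for source, relationship, target in EDGES:
            if source != node or target in reached:
                continue
            if relationships is not None and relationship not in relationships:
                continue
            reached[target] = (relationship, source)
            queue.append((target, depth + 1))
    return reached
```

Because the walk is breadth-first, each concept is recorded with the link on a shortest path from the start, which is one reasonable answer to "which relationship was used to reach each concept".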

Relationship inquiry ....... [SS # 1.3.11]

Relationship Inquiry - user supplies concept A, concept B, possible a-->b relationship; server responds TRUE or FALSE based on vocabulary contents.

Resource discovery: Server and vocabulary metadata access ....... [SS # 1.4]

Whenever the user is to specify vocabulary in the following requirements, "vocabulary" is understood to include the optional specification of version. If version is not specified the most recent available version is to be used.

Available Vocabularies and versions ....... [SS # 1.4.1]

On request, server returns a list of vocabularies available and the available versions of each (version number and date of release).

Enumerate Kinds ....... [SS # 1.4.2]

User specifies a vocabulary; the server returns list of vocabulary's principal content domains, subdivisions of the root concept. (The NCI Thesaurus refers to these as Kinds.)

Mayo Notes

Kinds is a very APELON specific construct. We believe that this can be implemented with the existing graph traversal operations (give me all children of the root node that are also a kind of "KIND" or something similar)

Enumerate Relationships ....... [SS # 1.4.3]

User specifies a vocabulary; the server returns the list of relationships available in that vocabulary, and, if applicable, the domain and range for each.

Mayo Notes

We need to extend the API to support domain and range directly. At the moment, these are both first class roles (hasDomain and hasRange).

Enumerate Associations ....... [SS # 1.4.9]

Association Discovery -- user wants to discover all associations (relationships) in a vocabulary and the behavioral characteristics of each.

NCI

"behavioral characteristics" is not defined in the LQS spec. It may be that the need is adequately represented in the preceding requirements.

Mayo Gripe

Behavioral characteristics are INDEED defined in the LQS spec, perhaps under different wording, but there is a whole chapter dedicated to it. This is something that is needed in all terminologies.

As this has been one of the major issues in terminology development for years, it is patently offensive to say that LQS doesn't speak to this!

Enumerate attributes ....... [SS # 1.4.4]

User specifies a vocabulary; the server returns the list of attributes available in that vocabulary, with data element descriptors (e.g., syntax) for the values associated with each.

Mayo Notes

Perhaps this is a miswording - the glossary defines pretty much everything as an attribute, but I expect that the intent is property in this case?

Map attributes to types (synonym, etc) ....... [SS # 1.4.5]

User specifies a vocabulary; the server returns the attribute names used in that vocabulary for such standard attributes as synonym, definition, semantic type, etc., (as available).

Mayo Notes

At the moment, this is supported via the "supportedXXX" construct, although further review needs to be done before we can say absolutely that we can do this.

Q: is "map" passive or active in this case - do we report what has been done or does it say we should do it?

Enumerate Possible Values ....... [SS # 1.4.6]

Possible Value Enumeration - user provides value domain, coding scheme; usage context; system returns list of valid values

Mayo Notes

This is a caDSR function, not an EVS function

Return Pick List ....... [SS # 1.4.7]

Pick List Generation - user provides value domain, coding scheme, usage context; system provides pick list (set of ordered pairs: code and corresponding term).

Mayo Notes

This is closely coupled with the caDSR. It may be a vocab function as well, but needs to be postponed to a later phase

Enumerate all concepts ....... [SS # 1.4.8]

user wants to discover all concept codes supported by a particular coding scheme, and a particular version of that coding scheme

Versioning and authority enumeration ....... [SS # 1.4.10]

Allow the client application to determine whether the vocabulary is complete, up-to-date, authoritative; provide information such as version number; release date; authorized source, limitations on availability

Mayo Notes

"Complete" version number, release date, source and copyright will all be available. Things like "authoritative", etc. are clearly out of scope

Describe supported search criteria ....... [SS # 1.4.11]

Retrieve details of search criteria supported for this terminology

Mayo Notes

Will require a terminology of search criteria

Describe search techniques ....... [SS # 1.4.12]

Provide information regarding the search techniques and mechanisms supported by the server.

Combinatorial access - combinations of other modes ....... [SS # 1.5]

DL search constraint ....... [SS # 1.2.6]

During any of the foregoing searches, the user shall be able to specify applicable description logic semantic restrictions (such as negation, valid values, value high-low limits, etc.) during the search.

Mayo Notes

Don't agree with the name to begin with, as what it is describing appears to be at least partially lexical rather than semantic.

Not certain what the scope of "foregoing" is, but some of the preceding searches appear to support "valid values" already.

Hi-low limits are data type specific to properties and data types.

Lumping all of these things together

Combinatorial access ....... [SS # 1.5.1]

Combinatorial access - The various types of data access may also need to be combined, e.g., list the immediate parents of all concepts that have a term that contains the word "infarction."

NCI

No direct API call is available, but it is straightforward for a client application to check if a concept returned by the server contains a particular keyword.
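
A client-side composition of the "infarction" example might look like the sketch below. `TERMS`, `PARENTS`, `concepts_with_word` and `immediate_parents` are hypothetical stand-ins for whatever term search and parent lookup the server exposes.

```python
# Hypothetical server content: terms per concept, and immediate parents.
TERMS = {
    "C10": ["Myocardial Infarction", "Heart attack"],
    "C11": ["Cerebral Infarction"],
    "C12": ["Fracture"],
}
PARENTS = {"C10": ["C20"], "C11": ["C21", "C22"], "C12": ["C23"]}


def concepts_with_word(word):
    """Concepts having at least one term that contains `word` as a whole word."""
    wanted = word.lower()
    return {code for code, terms in TERMS.items()
            if any(wanted in term.lower().split() for term in terms)}


def immediate_parents(concepts):
    """Union of the immediate parents of the given concepts."""
    return {parent for code in concepts for parent in PARENTS.get(code, [])}
```

The combined query is then `immediate_parents(concepts_with_word("infarction"))`, at the cost of one extra round trip per step when the two calls are remote.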

Named Subsets ....... [SS # 1.6.7]

The server shall allow the user to designate named subsets of the vocabulary by applying one or more of the following methods:

Combination of searches ....... [SS # 1.6.7.1]

applying a combination of the searching and browsing techniques described above;

Extending subset by subsumption ....... [SS # 1.6.7.2]

extending a named subset to include (subsumption) ancestors and/or descendants of one or more concept(s) in the named subset;

Extending subsets by roles ....... [SS # 1.6.7.3]

extending a named subset to include all concepts related to any concept in the named subset by one or more specified relationship(s);

Setbuilder operations ....... [SS # 1.6.7.4]

forming the union, intersection, complement, etc., of existing subsets.

Issue

There are two parts to this whole thing - the first is the notion of combinatorial query building and the second is the ability to name and save it.
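
The two parts can be kept separate in code as well. In this sketch the set operations return unnamed results, and only `define` gives a result a name and saves it. All names and the sample vocabulary are hypothetical, and the complement is taken relative to the whole vocabulary, which is an assumption.

```python
# Hypothetical vocabulary and registry of saved (named) subsets.
VOCABULARY = frozenset({"C1", "C2", "C3", "C4", "C5"})
SUBSETS = {}


def define(name, members):
    """Name and save a set of concept codes; reject codes not in the vocabulary."""
    members = frozenset(members)
    unknown = members - VOCABULARY
    if unknown:
        raise ValueError(f"not in vocabulary: {sorted(unknown)}")
    SUBSETS[name] = members
    return members


def union(a, b):
    return SUBSETS[a] | SUBSETS[b]


def intersection(a, b):
    return SUBSETS[a] & SUBSETS[b]


def complement(a):
    """Everything in the vocabulary that is not in subset `a`."""
    return VOCABULARY - SUBSETS[a]
```

For example, `define("active_cardiac", SUBSETS["cardiac"] - SUBSETS["obsolete"])` builds a combination first and persists it under a name second.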

Miscellaneous content inquiries and access ....... [SS # 1.6]

Specify return order ....... [SS # 1.6.6]

The server shall allow the user to specify the order in which qualifying concepts are to be returned from a search: lexical goodness of fit, alphabetical by Preferred Name, etc.

Field validation ....... [SS # 1.6.9]

Field Validation - user provides value domain, coding scheme; usage context, proposed code; system returns TRUE/FALSE validity indicator

Retrieve all concepts in a particular vocabulary domain ....... [SS # 1.6.18]

Retrieve all concepts or terms in a particular vocabulary domain

Mayo Notes

This is a caDSR function, not an EVS function

Test membership in vocabulary domain ....... [SS # 1.6.19]

User specifies a concept code; server returns TRUE or FALSE to indicate whether that concept exists in the vocabulary.

Mayo Notes

This is a caDSR function, not an EVS function

Test for current/obsolete concept ....... [SS # 1.6.20]

Allow user to determine whether a term is current or obsolete; allow user to determine whether a term belongs to a specific subset of the vocabulary, e.g., HL7 terms.

Print subsets ....... [SS # 1.6.26]

Print selected subsets from screens

Ancestor query ....... [SS # 1.6.a]

Determine whether a code was created by splitting an ancestral concept, and if so, identify the ancestral concept

NCI Note

implied by NCI's concept history requirement.

Session Management

Session default vocabularies ....... [SS # 1.6.16]

For servers offering multiple vocabularies, give the user the choice of specifying the vocabulary for each request or connecting to a vocabulary for an entire session.

Set of vocabularies for search ....... [SS # 1.6.17]

For servers offering multiple vocabularies, give the user the choice of specifying one or more vocabularies upon which search is performed.

(new) Session default language

(new) Session default context(s)

(new) Session default page size

(new) Session default timeout

(new) Other session settings
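
The session settings listed above could be carried in a single context object whose values are used whenever a request does not override them. The class name, field names and default values below are assumptions for illustration, not agreed settings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Session:
    default_vocabulary: Optional[str] = None   # used when a request names none
    search_vocabularies: Tuple[str, ...] = ()  # vocabularies searched together
    language: str = "en"
    page_size: int = 50
    timeout_seconds: int = 300

    def vocabulary_for(self, requested: Optional[str] = None) -> str:
        """A per-request vocabulary wins; otherwise fall back to the session default."""
        vocabulary = requested or self.default_vocabulary
        if vocabulary is None:
            raise ValueError("no vocabulary given and no session default set")
        return vocabulary
```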

Vocabulary Acquisition and Persistence ....... [SS # 1.7]

Import RRF/OWL content ....... [SS # 1.7.1]

Import all vocabularies compliant with either the Rich Release Format (RRF) or Web Ontology Language - Description Logic (OWL-DL) syntax and semantics.

Insert RRF content

Will insert RRF content fields documented and distributed by NCI. Will not insert arbitrary UMLS RRF in this go around.

RRF Minimum content ....... [SS # 1.7.6]

Minimum semantic content for RRF vocabularies: Concept Preferred Name; Unique Concept Code; Semantic Type.

Insert OWL content

Will insert OWL content as currently defined in the NCI release. Will not import arbitrary OWL content at this go-around.

OWL Minimum content ....... [SS # 1.7.7]

Minimum semantic content for OWL-DL vocabularies: Concept Preferred Name; Unique Concept Code; Subsumption-based hierarchy.

Additional OWL Role ....... [SS # 1.7.8]

Support additional relationships among concepts (aka Roles) [applies to OWL-DL-compliant vocabularies].

Other OWL syntax and semantics ....... [SS # 1.7.9]

Support other syntax and semantics per the OWL specification.

Mayo Notes

Import will be restricted to the portion of OWL used at NCI on this go around

Import OBO ....... [SS # 1.7.2]

Import vocabularies compliant with the OBO format

Import eVOC ....... [SS # 1.7.3]

Import vocabularies compliant with the eVOCs format. Note: Outside of scope for this release - Mayo doesn't know enough about eVOC to commit at this point.

Import RRF History ....... [SS # 1.7.4]

Import and persist history information for RRF vocabularies

Mayo Notes

History will be imported as an opaque blob - Mayo will not interpret semantics on the first run

Import OWL History ....... [SS # 1.7.5]

Import and persist history information for OWL vocabularies

Mayo Notes

History will be imported as an opaque blob - Mayo will not interpret semantics on the first run

OWL-DL - history of any concept ....... [SS # 1.7.14]

For OWL-DL vocabularies, import, persist, and provide on request the complete editing history of any uniquely specified concept.

OWL DL Classification ....... [SS # 1.7.10]

OWL-DL vocabulary may have been classified in its native maintenance environment. If so, retain and distinguish among the following:

Immediate stated subsumption ....... [SS # 1.7.10.1]

immediate subsumption parentage as stated by the editor.

Inferred subsumption parentage ....... [SS # 1.7.10.2]

revised subsumption parentage as generated by classification in the native environment.

Other stated relationships ....... [SS # 1.7.10.3]

other relationships as stated by the editor.

Other inferred relationships from inheritance ....... [SS # 1.7.10.4]

other relationships inferred by inheritance processing in native environment.

Preserve stated/inferred status of subsumption ....... [SS # 1.7.10.12]

Preserve the stated/inferred status of subsumption and other relationships (DL vocabularies only).

Other inferred relationships from classification ....... [SS # 1.7.10.5]

other relationships inferred by classification in the native environment.

Mayo Notes

Mayo will maintain this information to the extent that it is recorded in the OWL document itself. (There will be no automatic mechanism or before/after testing to determine these classes)

Persist imported vocabularies ....... [SS # 1.7.11]

Persist imported vocabularies indefinitely in an internal store.

Persistence restriction ....... [SS # 1.7.13]

Persistence capability need only support the functions specifically identified; the server need not provide edit/update capabilities.

Arbitrary concept codes ....... [SS # 1.7.15]

Allow concepts to have coded values with unusual names, such as ' (single quote), “ (double quote), etc. as required by some code systems. A concept name may consist of one or more arbitrary characters.

Mayo Notes

Mayo is able to do much of this requirement, but, unless instructed otherwise, doesn't believe that this is important to include in the current release

HL7 RIM Support ....... [SS # 1.7.16]

Support the HL7 RIM model

Mayo Notes

Outside of scope. The RIM model, to the extent that it is supported, needs to go into the

Nested Value Sets ....... [SS # 1.7.17]

Allow nested value sets

Arbitrary metadata ....... [SS # 1.7.18]

Allow the storage of arbitrary metadata on all vocabulary objects, including concepts, value sets, code systems, maps, and the vocabulary itself.

Create or delete version labels ....... [SS # 1.7.19]

Create or delete version labels which encapsulate all versioned changes up to a given point in time.

Roll back vocabulary changes ....... [SS # 1.7.20]

Roll back changes in a vocabulary to a previous date or previous marked version.

Support W3C Document ....... [SS # 1.7.21]

Support the interoperability principles for shared ontologies expressed in W3C's OWL Web Ontology Language Use Cases and Requirements (http://www.w3.org/TR/webont-req)

Support Semantic Web Document ....... [SS # 1.7.22]

Content will use, and software will support, Semantic Web sharing techniques and standards (http://www.asis.org/Bulletin/Apr-03/MillerSwick.pdf): URI; RDF, RDFS, XML; OWL.

Submit revision suggestions ....... [SS # 1.7.23]

User can suggest/submit through the repository that a new concept be added to a vocabulary

Tab Delimited Import ....... [SS # 1.7.b]

Import vocabulary from Tab delimited files, such as those created by Excel.

NCI Note

out of scope; import is limited to four formats: RRF, OWL, OBO, eVOCs

Ontylog Import ....... [SS # 1.7.f]

Import vocabulary from Authoring/Editing Tools (e.g., Apelon DTS), using a configurable system to handle format variations.

NCI Note

out of scope; import is limited to four formats: RRF, OWL, OBO, eVOCs

Preserve complete history including local ....... [SS # 1.7.g]

Preserve the complete version history of how vocabulary objects change, whether updates arise from data import or local authoring.

NCI Note

history requirement subsumed by NCI concept history capability; local authoring requirement is out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing.

Local authoring through secure API ....... [SS # 1.7.i]

Support local authoring through a secure API, enabling updates of arbitrary vocabulary objects while preserving the master version history.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

Create and modify concepts locally ....... [SS # 1.7.j]

Create and modify concepts locally, with changes appearing in current but not previous versions.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

Join concept ....... [SS # 1.7.l]

Join a concept, enabling two concepts to be treated as one in current but not previous versions.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

Split concept ....... [SS # 1.7.m]

Split a concept, enabling the two new concepts to be used independently. Current and future versions are affected while previous versions are not.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

Retire concept ....... [SS # 1.7.n]

Retire a concept, ensuring it still exists in previous versions but is no longer visible in future versions.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

(new) HL7 V3 Vocabulary Import

(new) LOINC Import

(new) MGED Import

Content Extraction ....... [SS # 1.8]

Subset extraction ....... [SS # 1.8.1]

The server supports the ability to export a specified subset of a specified vocabulary. Export formats include at least RRF and OWL-DL compliant XML.

Mayo Notes

Need clarification on this requirement - what formats? How are subsets specified?

RRF Subsets

OWL DL Subsets

XML Format ....... [SS # 1.8.3]

Extract vocabulary content into an easily processed XML format

Mayo Notes

"Easily processed" XML format is an odd requirement. If there is an easier format than OWL, perhaps this should replace it on the input.

In any case, Mayo will commit to being able to produce OWL XML for a subset of the content. Memory restrictions, etc. would prevent the whole thing.

Code Translation ....... [SS # 1.8.2]

Code Translation - user specifies concept code and specifications for formatting the returned value; system provides appropriate content formatted as requested.

Tab Delimited format ....... [SS # 1.8.4]

Extract content into a Tab delimited format compatible for import into Excel for processing
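
As an illustration only, the following Python sketch shows one way a tab-delimited extract could be produced so that it opens cleanly in Excel. The function name `export_tab_delimited`, the column names, and the sample concepts are hypothetical and are not taken from the requirements; missing attributes are written as empty cells by assumption.

```python
import csv
import io

def export_tab_delimited(concepts, columns):
    """Render concepts (a list of dicts) as tab-delimited text with a header row."""
    buffer = io.StringIO()
    # The csv module quotes any value that itself contains a tab or newline.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for concept in concepts:
        # Assumption: an attribute the concept lacks becomes an empty cell.
        writer.writerow([concept.get(column, "") for column in columns])
    return buffer.getvalue()

# Hypothetical sample content.
sample = [
    {"code": "C0001", "preferred_name": "Example concept A", "status": "active"},
    {"code": "C0002", "preferred_name": "Example concept B"},
]
text = export_tab_delimited(sample, ["code", "preferred_name", "status"])
```

Writing `text` to a file with a `.tsv` or `.txt` extension should be enough for a spreadsheet import; the choice of columns would be part of the export specification.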

Present via Web Portal ....... [SS # 1.8.5]

Present vocabulary content via a web portal, for easy viewing of vocabulary across the enterprise.

Present via Web Service ....... [SS # 1.8.6]

Present vocabulary content via a web service, for easy use of vocabulary in local applications.

Retrieve Concept History - RRF ....... [SS # 1.8.7]

Return on request stored concept history information for specified concepts in RRF-format vocabularies

Mayo Notes

Mayo will store and return concept history as blobs, but plans to do no interpretation in this implementation.

If the history is supplied, in RRF format, it can be retrieved as the same.

Retrieve Concept History - OWL ....... [SS # 1.8.8]

Return on request stored concept history information for specified concepts in OWL-format vocabularies.

Mayo Notes

Mayo will store and return concept history as blobs, but plans to do no interpretation in this implementation.

If the history is supplied, in OWL format, it can be retrieved as the same.

Version changes ....... [SS # 3.7.2]

Report on changes from one version of a vocabulary to the next.

Mayo Notes

This will be implemented only to the extent that the change information is included in the OWL or RRF input. There will be no automatic differencing or reporting beyond what is supplied in the input.

Web authoring ....... [SS # 1.8.d]

Present an authoring interface in the web browser, for easy handling of common cases: create a few codes, modify the phrasing of a few codes, modify metadata, create value sets, code systems, etc.

NCI Note

out of scope; server shall import; persist; retrieve and export only. No provision for local authoring/editing. In NCI model, the authoring/editing environment will be provided by Protégé/OWL.

Classification ....... [SS # 1.9]

Classification

The vocabulary may have been classified in its host environment and imported in such a way as to preserve the results of the classification (see the requirements under Vocabulary Acquisition and Persistence above). This group of requirements addresses the need to allow the user of the vocabulary server:

Ignore prior classification results ....... [SS # 1.9.1.1]

to ignore prior classification results and perform de novo classifications on the vocabulary or a selected subset of it.

Supplement prior classification results ....... [SS # 1.9.1.1]

to supplement the results of the earlier classification and role inheritance with an additional classification performed in the vocabulary server environment.

NCI Note

The requirement to support classification in the server environment is withdrawn; instead, the server shall support a general purpose module plug-in mechanism to permit additional functionality to be developed in an open software environment. Classification is an example of a service that may be performed in such an environment. See requirement .

Accept or reject prior results ....... [SS # 1.9.2]

Within a specified vocabulary subset accept or disregard the results of prior classifications and/or inheritance of roles.

Reprocess with different classifier ....... [SS # 1.9.3]

Provide the ability to reprocess the resulting subset using the classification tool provided (this may be an integral function of the server or may be provided by a plug-in extension - see architectural constraints below).

Code Refinement ....... [SS # 1.6.2]

Code Refinement - user supplies concept code or ID plus other terms; system returns best match more specific concept.

Code Transformation ....... [SS # 1.6.10]

Code Transformation - user provides concept code, source coding scheme, target coding scheme and optional value domain; server transforms the code from source code to target code

TMJ: What is the value domain in this context?

Vocabulary to Vocabulary Mapping ....... [SS # 1.6.22]

User specifies unique concept identifier in a specific vocabulary; server returns unique identifier in another terminology of a concept with the same meaning.

Mayo Note

The implementation will support direct code->code mapping as specified using the UMLS CUIs and/or reference links from the NCI Thesaurus.
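
A minimal sketch of code->code mapping through a shared concept identifier, written in Python. It assumes the mapping data is available as (CUI, vocabulary, code) rows; the function name `map_code`, the row layout, and all identifiers in `ROWS` are made up for illustration and do not describe the actual implementation.

```python
def map_code(rows, source_vocab, source_code, target_vocab):
    """Return the target-vocabulary codes that share a CUI with the source code."""
    # Step 1: collect every CUI attached to the source code.
    cuis = {cui for cui, vocab, code in rows
            if vocab == source_vocab and code == source_code}
    # Step 2: collect the target-vocabulary codes attached to any of those CUIs.
    return sorted({code for cui, vocab, code in rows
                   if cui in cuis and vocab == target_vocab})

# Hypothetical rows: (CUI, vocabulary, code).
ROWS = [
    ("CUI-0001", "VOCAB_A", "A100"),
    ("CUI-0001", "VOCAB_B", "B7"),
    ("CUI-0002", "VOCAB_A", "A200"),
    ("CUI-0002", "VOCAB_B", "B8"),
    ("CUI-0002", "VOCAB_B", "B9"),
]

one_to_one = map_code(ROWS, "VOCAB_A", "A100", "VOCAB_B")
one_to_many = map_code(ROWS, "VOCAB_A", "A200", "VOCAB_B")
no_match = map_code(ROWS, "VOCAB_A", "A999", "VOCAB_B")
```

The function returns a list because a single source code may correspond to zero, one, or several target codes; an empty list is the case that the best-match requirements below would have to handle.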

Code Mapping ....... [SS # 1.6.11]

Code Mapping - same as previous case, except server determines an exact equivalent does not exist and returns best match.

Closest match mapping ....... [SS # 1.6.23]

Where applicable, retrieve from vocabulary A the closest matching concept from vocabulary B.

Composition ....... [SS # 1.6.12]

Composition -- user specifies two linked concepts and the relationship between them; server returns a single concept that represents the triplet. Example: inflammation has-location liver --> hepatitis.
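
The triplet lookup can be sketched in Python as below, using the inflammation/liver example from the requirement. The `DEFINITIONS` table, the relationship label `has_location`, and both function names are assumptions for illustration; a real server would derive such definitions from the loaded terminology rather than from a hard-coded table. The reverse lookup corresponds to the decomposition requirement.

```python
# Hypothetical table of pre-coordinated definitions:
# (source concept, relationship, target concept) -> single concept.
DEFINITIONS = {
    ("inflammation", "has_location", "liver"): "hepatitis",
}

def compose(source, relationship, target):
    """Return the single concept representing the triplet, or None if unknown."""
    return DEFINITIONS.get((source, relationship, target))

def decompose(concept):
    """Return the (source, relationship, target) triplet for a concept, or None."""
    for triplet, name in DEFINITIONS.items():
        if name == concept:
            return triplet
    return None

composed = compose("inflammation", "has_location", "liver")
missing = compose("inflammation", "has_location", "kidney")
triplet = decompose("hepatitis")
```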

Composition ....... [SS # 1.6.25]

Compose a post-coordinated expression into a pre-coordinated concept

Decomposition ....... [SS # 1.6.13]

Decomposition -- user specifies a single complex concept and the server decomposes it into two distinct concepts and the relationship.

Decomposition ....... [SS # 1.6.24]

Decompose a concept into a post-coordinated expression

Normalization ....... [SS # 1.6.14]

Normalization -- user specifies a set of hierarchical relationships and concepts

Canonicalization ....... [SS # 1.6.15]

Canonicalization -- the server returns another hierarchy that represents the canonical form of the set of concepts.

Subsumption testing ....... [SS # 1.6.21]

Allow user to determine whether a concept is a descendant of another concept.
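
A minimal sketch of such a test in Python, assuming the hierarchy is held as a mapping from each concept to its direct parents (a DAG, so a concept may have several). The function name, the `PARENTS` data, and the choice that a concept is not its own descendant are assumptions for illustration.

```python
def is_descendant(parents, concept, ancestor):
    """Return True if `ancestor` is reachable from `concept` by following parent links."""
    seen = set()
    stack = list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            # Track visited nodes so shared ancestors are expanded only once.
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return False

# Hypothetical hierarchy: concept -> direct parents.
PARENTS = {
    "hepatitis": ["liver disease", "inflammatory disease"],
    "liver disease": ["disease"],
    "inflammatory disease": ["disease"],
    "disease": [],
}

direct = is_descendant(PARENTS, "hepatitis", "liver disease")
indirect = is_descendant(PARENTS, "hepatitis", "disease")
reverse = is_descendant(PARENTS, "disease", "hepatitis")
```

The traversal visits each ancestor at most once, so the cost is bounded by the number of ancestors of the concept being tested.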

Performance Characteristics ....... [SS # 2]

Static performance

Static response time ....... [SS # 2.1.1]

Static performance - response time to a single request, absent competing system user load.

Linear search times ....... [SS # 2.1.2]

Search performance should vary linearly with the number of concepts

Classification performance ....... [SS # 2.1.3]

Performance for classification that increases not more than n^1.5 where n is the number of concepts [This is a place-holder. We have to find out what results are attainable by well-designed algorithms.]

NCI Note

subject to the availability of open source classification tools

Mystery requirement ....... [SS # 2.1.a]

Create a version into which new updates go, with disk space and time used independent of the size of the vocabulary store to which the version is being added.

NCI Note

needs clarification for inclusion;

Dynamic performance

Predictability ....... [SS # 2.2.1.1]

Does the system behave in a predictable fashion when subjected to different loads?

Guaranteed Response times ....... [SS # 2.2.1.2]

Can it be configured to guarantee a minimum average and worst case response time?

Reasonable hardware ....... [SS # 2.2.1.3]

Can this configuration be accomplished with a reasonable amount of hardware and minimal or no software changes?

Large databases ....... [SS # 2.2.2]

Support, with reasonable performance, collections containing in excess of 100,000 code elements. This is a modest requirement: NCI vocabularies now have approximately 500,000 elements.

Large vocabulary support ....... [SS # 2.2.a]

Support collections of vocabulary too large to fit in the server's main memory.

NCI Note

covered in persistence requirement.

Architectural Characteristics ....... [SS # 3]

Robust Architecture for high availability

Available ....... [SS # 3.1.1]

Available - the ability for services to sustain hardware failure and recover from software failure without crashing.

Availability ....... [SS # 3.1.2]

Provide an appropriately high level of system availability - like a production web or e-mail server critical to an enterprise's operation; but not as high as an air-traffic control system or support for a shuttle mission. Comment: need to be more specific; or requirement won't be testable. Response: testing would be difficult in any event; probably best evaluated during design review.

CDC Tools ....... [SS # 3.1.3]

Support the use of CDC standard Microsoft tools, and the Java language.

Information Cache ....... [SS # 3.1.4]

Cache the information prepared in response to requests for concept tree structures to expedite additional retrievals requiring the same information (see the requirements of section 1.3). Referred to as "tree caching".

Mayo Notes

This is an implementation detail that is dictated strictly by performance.
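
For illustration, tree caching can be sketched in Python with the standard `functools.lru_cache` decorator. The `CHILDREN` table, the `subtree` function, and the call counter are hypothetical; the sketch only shows that a repeated request for the same tree is answered without rebuilding it.

```python
from functools import lru_cache

# Hypothetical hierarchy: concept code -> direct children.
CHILDREN = {"root": ("a", "b"), "a": ("c",), "b": (), "c": ()}

# Counts how many times a subtree is actually built (for demonstration only).
CALLS = {"count": 0}

@lru_cache(maxsize=1024)
def subtree(code):
    """Build the nested (code, children) structure rooted at `code`."""
    CALLS["count"] += 1
    return (code, tuple(subtree(child) for child in CHILDREN.get(code, ())))

first = subtree("root")          # builds four nodes
builds_after_first = CALLS["count"]
second = subtree("root")         # served from the cache
inner = subtree("a")             # also cached, as part of the first build
builds_after_repeat = CALLS["count"]
```

Under this approach the cache would need to be discarded whenever a new vocabulary version is loaded, for example with `subtree.cache_clear()`.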

Any SQL DB ....... [SS # 3.1.5]

Server shall support any SQL database, including, but not limited to, MySQL, Oracle, SQL Server.

Mayo Notes

Why was this rejected? In any case, we can do this, but would be happy to keep it off of the requirements list.

Federated ....... [SS # 3.1.c]

Server shall support vocabularies on remote machines, including vocabularies distributed across multiple remote machines.

Mayo Note

How come this was rejected, but remains?

NCI Note

requirements for distributing content and functionality across the grid are more complex than these; needs further thought.

Version insert ....... [SS # 3.1.e]

Should be able to load a new vocabulary version without disturbing the current running server instance. Once version is loaded, should then be able to switch the server to point to the new version

Mayo Notes

We think the response to this is out of place. This appears to be a legitimate architectural requirement and, if it can be done by multiple servers, this solution should be documented!

NCI Note

Would be preferable to do this with no downtime required. Expect to be able to switch the service to the new database with less than 10 minutes downtime. It is acceptable if caches (if any) are built after the server is available.

Not a server requirement; can be accomplished by hosting the service on multiple servers (also used for backup and redundancy).

Access and integrity controls

Secure ....... [SS # 3.2.1]

Secure - protection of services from alteration or disruption and restriction of access to authorized users.

Selective modification ....... [SS # 3.2.e]

Allow users the right to access or modify some, but not all code sets

NCI Note

covered by an NCI requirement together with application software

Prevent unauthorized modification ....... [SS # 3.2.d]

Prevent unauthorized modification to collections of vocabulary

NCI Note

covered by an NCI requirement together with application software

Mayo Note

The tooling will be secure to the extent that the underlying databases, etc. are secure, as the tooling provides no direct modification APIs besides the import functions in the first release.

Restricted access ....... [SS # 3.2.2]

Server shall employ a suitable mechanism to limit access to restricted vocabularies to those authorized to use them. The mechanism is to be defined. It may be sufficient to display the text of any applicable license agreement and secure the user's agreement that he or she is authorized to access the vocabulary.

Sensitive content ....... [SS # 3.2.c]

Prevent unauthorized access to collections of vocabulary marked as sensitive.

NCI Note

covered by an NCI requirement together with application software

Mayo Notes

The tooling will provide the ability to record a copyright notice for any vocabulary. It will be the responsibility of the applications, however, to acquire and display the copyright before proceeding.

Authorization key ....... [SS # 3.2.3]

Methods providing access to content shall require an authorizing key; compare key contents to stored metadata associated with controlled resources. Allocate to application software the creation and maintenance of the keys, the assignment of keys to vocabulary metadata and to individual users, etc.

Limit use of keys ....... [SS # 3.2.f]

Specify user limitations by use of keys, which may be composed of multiple items, for ease of administration. For instance “NHSN, Atlanta” describes an NHSN worker with Atlanta Jurisdiction. A user possessing that key would be able to access NHSN data, Atlanta data, and “Atlanta, NHSN” data.

NCI Note

covered by an NCI requirement together with application software
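
The multi-item key example above (“NHSN, Atlanta”) can be read as a subset test: access is granted when every item tagged on the data also appears in the user's key. The Python sketch below follows that reading. The comma-separated key syntax, the function names, and the rule that untagged data is open to everyone are assumptions, not part of the requirement.

```python
def parse_key(text):
    """Split a comma-separated key such as 'NHSN, Atlanta' into a set of items."""
    return frozenset(item.strip() for item in text.split(",") if item.strip())

def can_access(user_key, resource_tags):
    """Grant access when every tag on the resource is present in the user's key."""
    # Assumption: a resource with no tags is accessible to any user.
    return parse_key(resource_tags) <= parse_key(user_key)

user = "NHSN, Atlanta"
nhsn_only = can_access(user, "NHSN")
atlanta_only = can_access(user, "Atlanta")
both_reordered = can_access(user, "Atlanta, NHSN")
other_city = can_access(user, "NHSN, Boston")
```

Treating keys as unordered sets is what makes “NHSN, Atlanta” and “Atlanta, NHSN” equivalent, as the example in the requirement implies.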

Attach tags to vocabulary ....... [SS # 3.2.g]

Allow the tags that specify which keys are required for access to be attached to the vocabulary data itself, for ease of administration.

NCI Note

covered by an NCI requirement together with application software

Maintain consistency among distributed versions

Distributed ....... [SS # 3.3.1]

Distributed - capacity to rapidly synchronize changes introduced into a master server to local copies along a chain of distribution. Updates must not destroy unique local content co-residing on a terminology directory server.

Alert remote servers that new versions are available ....... [SS # 3.3.a]

Send alerts to remote servers, alerting them that new versions are available.

Mayo Note

While we will be happy to leave this out of the requirements, we DO believe that it is core to an effective terminology service.

Application SW is not the answer.

NCI Note

not a server requirement; accomplished by application software

Receive updates from a master server ....... [SS # 3.3.b]

Receive updates from a master server, and apply these updates automatically, after an optional review of the changes to be applied.

Mayo Notes

While we will be happy to leave this out of the requirements, we DO believe that it is core to an effective terminology service. The fact that it needs further design isn't a reason to omit it.

NCI Note

issue of master server not decided; architecture for distribution of content and functionality requires more thought.

Preserve local changes ....... [SS # 3.3.c]

Preserve local changes to a single concept or a collection of concepts, while accepting updates to that concept or collection of concepts.

Mayo Note

This is worth serious consideration.

NCI Note

issue of master server not decided; architecture for distribution of content and functionality requires more thought.

Allow overriding of local values ....... [SS # 3.3.d]

Allow local values to vocabulary entities to be overridden where the master server's update specifically conflicts with the local update. (For instance a master server modification to locally modified metadata on a concept.)

NCI Note

issue of master server not decided; architecture for distribution of content and functionality requires more thought.

Federated integration of multiple terminologies

Federated ....... [SS # 3.4.1]

Federated - capacity to maintain cross linkages between/among components of a single large terminology or related terminologies with cross-referenced content. Similarly, the capacity to cross terminology boundaries to create composite terms from different sources, e.g., disease and anatomy.

Crossing value sets ....... [SS # 3.4.a]

Support value sets created from one or more code systems, without requiring special structures.

Mayo Note

Value sets, per se, are out of the EVS scope and are a function of the caDSR.

NCI Note

requires clarification for insertion as a requirement

Cross system reference ....... [SS # 3.4.b]

Codes in one code system may refer to codes in another code system.

Mayo Notes

We strongly disagree. Cross code system references need to be done dynamically, NOT in the content creation environment - this is especially true when external (non-NCI) code systems are being referenced

NCI Note

mapping function accomplished in content creation environment, not terminology server.

Process embedded links ....... [SS # 1.6.8]

The server supports the ability to identify and appropriately process embedded linkages: web URIs, and links to other concepts in the same namespace or other namespaces in the server's persistence store. Note: clashes with Cross System Reference.

Open Architecture ....... [SS # 3.5]

Programming language, OS and platform independent services ....... [SS # 3.5.1]

Programming language, operating system and platform independence- services that are unconstrained by operating system, hardware, or database system.

Mayo Notes

Note: This means that we will *support* platform independent implementations, not require them.

Open software only ....... [SS # 3.5.2]

Open software only: no proprietary products (formats, interfaces, modules, specifications, etc.) may be employed.

Mayo Note

This means that we will *support* open software, not restrict the implementation to only software that is open

API specification ....... [SS # 3.5.3]

Server to provide an API that surfaces all server functions to the application level. Need clarification on "all server functions". All query functions should be accessible to an anonymous user. Update functions (if any) should be accessible to a secure user or a secure site (e.g., the user must be logged into the server where the service is running, or must pass an admin username and password to the service to update it).

Mayo Notes

Not sure where folks are going with this one. The API will expose everything that the API exposes, full stop. There *will* be functions (say, for instance, insert a new ontology) that may not be API driven

API implementation ....... [SS # 3.5.4]

API permits access from a variety of wrapper techniques, to include at least: SOAP, XML, JAVA, Web forms, …

Persistence layer ....... [SS # 3.5.5]

Persistence layer is accessible via open, object based interface methods.

Mayo

Not sure what this intends. The persistence layer is wrapped by the API and *shouldn't* be exposed at a level that may cause it to be directly embedded in other applications

Extendibility ....... [SS # 3.5.6]

Server provides an open architecture method for extending its basic functionality, plug-ins. The architecture must support additional plug-ins while ensuring that existing base functionality and other plug-ins are safeguarded from interference from new extensions.

Mayo Notes

We will do our best on this one, although it could be subject to many interpretations. We are looking for suggestions and input.

Open content ....... [SS # 3.5.7]

Content must be open

Mayo

This is out of scope. We will serve whatever we are given and have no control or say over whether it is "open" or not.

Standard registry ....... [SS # 3.6.8]

Server software and ontology content must be listed in standard registry suitable for Semantic Web

Mayo

What "standard registry" did the authors have in mind?

Multilingual support ....... [SS # 3.6]

Multi character sets ....... [SS # 3.6.1]

Modern terminologies are becoming multilingual, and the back-end services need to be able to support multiple character sets. Unicode UTF-8

Mayo Notes

We intend to support UTF-8 for the short term, although other character sets can probably be introduced if needed.

(new) Multi language support

(new) Multi language support for all textual elements (excluding metadata elements such as status, etc.)

Tools and documentation

Mature infrastructure ....... [SS # 3.7.1]

Availability of tools and documentation - Whatever mechanism the back-end service takes, it must be based on an infrastructure that is mature, well documented, and widely available. The infrastructure also needs to have a firm base of utilities, tools and support mechanisms already in place.

Vocabulary Profile ....... [SS # 3.7.3]

Profile a vocabulary: report on the number of concepts; max tree depth; average tree depth
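
As a sketch of what such a profile might compute, the Python function below walks a hierarchy and reports the three figures named above. It assumes the vocabulary is a tree with a single root (each concept has one parent), that the root has depth 0, and that the average is taken over all concepts; the function name and sample data are hypothetical.

```python
def profile(children, root):
    """Report concept count, maximum depth, and average depth for a tree."""
    depths = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        # Assumption: tree-shaped input, so each node is reached exactly once.
        depths[node] = depth
        for child in children.get(node, ()):
            stack.append((child, depth + 1))
    return {
        "concepts": len(depths),
        "max_depth": max(depths.values()),
        "average_depth": sum(depths.values()) / len(depths),
    }

# Hypothetical hierarchy: concept -> direct children.
CHILDREN = {"root": ["a", "b"], "a": ["c"], "b": [], "c": []}
report = profile(CHILDREN, "root")
```

For a vocabulary with multiple parents per concept (a DAG), the notion of depth would have to be defined first, for example as the shortest or longest path from a root.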

Integrity and consistency checks ....... [SS # 3.7.4]

Perform integrity and internal consistency checks on the persisted vocabularies.

Report generation ....... [SS # 3.7.5]

Report generation, including custom

Mayo Notes

Too general, should be the function of a vocabulary

Import / replace versions ....... [SS # 3.7.6]

specify whether to replace a current version of a vocabulary on import, or to carry both versions simultaneously

Version number and metadata ....... [SS # 3.7.7]

the ability to designate a version number and other metadata for each imported vocabulary: the name of the vocabulary; some universal vocabulary identifier (URI);

Rename vocabulary ....... [SS # 3.7.8]

the ability to rename an existing vocabulary

Alter vocabulary metadata ....... [SS # 3.7.9]

the ability to alter metadata stored with a vocabulary, to change a version number.

Reliability ....... [SS # 3.8]

No false negatives ....... [SS # 3.8.1]

A content search must not yield false negatives;

Mayo Notes

Not certain what the intent is here. Currently, there are only a couple of functions that return a negative, period.

When it comes to search functions, it should be sufficient to be able to test them without having to add this additional requirement.

Verification and validation ....... [SS # 3.8.2]

The system should provide verification and validation to ensure that all syntax allowed by the specified formats is supported for import and searching; this includes the ability to collect, generate, store and execute test cases; to save test results; to compare new test results against saved results of earlier tests, etc.

Need clarification on the statement "all instances of syntax allowed by the specified formats are supported for import and searching;" Response -- this refers to the requirement that the server be able to import and persist RRF, OWL, OBO and eVOCs formats. All syntactic features of these formats must be fully supported, able to be imported, persisted, retrieved and exported without loss of content or semantic relationships. The specific requirement addresses tools providing the ability to test a version of the server software to verify that content has not been lost or distorted.

Mayo Notes

This will be implemented to the extent that the import functions are implemented.

We are not committing to "All syntactic features of these formats must be fully supported, able to be imported, persisted, retrieved and exported without loss of content or semantic relationships." - just the features of the formats that are required for phase 1.

LexBIG - Req

Contents

Functional Characteristics ....... [SS #1]

Retrieval of uniquely specified concept ....... [SS # 1.1]

Code->Preferred Name ....... [SS # 1.1.1]

Preferred Name -> Code ....... [SS # 1.1.2]

Code/Name -> Attributes ....... [SS # 1.1.3]

Code->Attributes

Preferred Name->Attributes

Synonym -> Preferred Names ....... [SS # 1.1.4(a)]

Concept->Relationships ....... [SS # 1.1.4(b)]

Retrieve concepts by version or date ....... [SS # 1.1.5]

Retrieve concept by version

Retrieve concept by date

Retrieve most recent version of concept ....... [SS # 1.1.6]

No separate versioning API ....... [SS # 1.1.7]

Paged returns ....... [SS # 1.1.8]

Limit number of return values ....... [SS # 1.6.5]

Identification of concepts fitting search criteria ....... [SS # 1.2]

Approximate string match ....... [SS # 1.2.1]

Attribute presence ....... [SS # 1.2.2]

Role Presence

MetaAttribute Presence

Property Presence

Valued attribute presence ....... [SS # 1.2.3]

Role Value Presence

MetaAttribute Value Presence

Property Value Presence

Specific attribute value match ....... [SS # 1.2.4]

Role value match

MetaAttribute match

Property value match

Search Domain Constraint ....... [SS # 1.2.5]

Lexical matching techniques ....... [SS # 1.2.7]

Other matching techniques ....... [SS # 1.2.8]

Specify search status ....... [SS # 1.2.9]

Consider status

Consider partial domain

Consider relationships to other concepts

Persist search criteria ....... [SS # 1.2.10]

DAG Traversal ....... [SS # 1.3]

Graph Traversal ....... [SS # 1.3.1]

Traverse Graph via Role links ....... [SS # 1.3.4]

DAG Walking ....... [SS # 1.3.10]

Query by relationship domain ....... [SS # 1.6.1]

Enumerate relationship types for concept ....... [SS # 1.6.3]

Enumerate relationships by range ....... [SS # 1.6.4]

Enumerate Concepts by Relationship ....... [SS # 1.3.2]

Enumeration Source Concepts for rel + target ....... [SS # 1.3.3]

Create named vocabulary subsets ....... [SS # 1.3.5]

Specify root of named tree ....... [SS # 1.3.6]

Return node neighborhood in named tree ....... [SS # 1.3.7]

Enumerate concepts by relationship(s) in named tree ....... [SS # 1.3.8]

Create dag given concept, relationships and hops ....... [SS # 1.3.9]

Relationship inquiry ....... [SS # 1.3.11]

Resource discovery: Server and vocabulary metadata access ....... [SS # 1.5]

Available Vocabularies and versions ....... [SS # 1.4.1]

Enumerate Kinds ....... [SS # 1.4.2]

Enumerate Relationships ....... [SS # 1.4.3]

Enumerate Associations ....... [SS # 1.4.9]

Enumerate attributes ....... [SS # 1.4.4]

Map attributes to types (synonym, etc) ....... [SS # 1.4.5]

Enumerate Possible Values ....... [SS # 1.4.6]

Return Pick List ....... [SS # 1.4.7]

Enumerate all concepts ....... [SS # 1.4.8]

Versioning and authority enumeration ....... [SS # 1.4.10]

Describe supported search criteria ....... [SS # 1.4.11]

Describe search techniques ....... [SS # 1.4.12]

Combinatorial access - combinations of other modes ....... [SS # 1.5]

DL search constraint ....... [SS # 1.2.6]

Combinatorial access ....... [SS # 1.5.1]

Named Subsets ....... [SS # 1.6.7]

Combination of searches ....... [SS # 1.6.7.1]

Extending subset by subsumption ....... [SS # 1.6.7.2]

Extending subsets by roles ....... [SS # 1.6.7.3]

Setbuilder operations ....... [SS # 1.6.7.4]

Miscellaneous content inquiries and access ....... [SS # 1.6]

Specify return order ....... [SS # 1.6.6]

Field validation ....... [SS # 1.6.9]

Retrieve all concepts in a particular vocabulary domain ....... [SS # 1.6.18]