Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
10.1.1.41.7910
1. Linking a Medical Vocabulary to a Clinical Data Model using Abstract Syntax Notation 1
Stanley M. Huff1,2, Roberto A. Rocha2,3, Harold R. Solbrig3,
M. Wade Barnes4, Steve P. Schrank3, Matt Smith5
1
Intermountain Health Care, Salt Lake City, UT, USA
2
Department of Medical Informatics, University of Utah, Salt Lake City, UT, USA
3
3M Health Information Systems, Murray, UT, USA
4
Department of Computer Sciences, University of Texas, Austin, TX, USA
5
Microsoft, Redmond, WA, USA
We have created a clinical data model using Abstract Syntax Representation and Integration Language (GRAIL) [8, 9],
Notation 1 (ASN.1). The clinical model is constructed from a templates, and frames [10-12].
small number of simple data types that are built into data
ASN.1 [13, 14] is an international standard for describing
structures of progressively greater complexity. Important
abstract syntax. It was initially created as a language
intermediate types include Attributes, Observations, and
independent mechanism for representing the structure of data
Events. The highest level elements in the model are messages
passed in computer-to-computer interfaces. It has subsequently
that are used for inter-process communication within a clinical
been used as a platform-independent means of distributing
information system. Vocabulary is incorporated into the model
databases [15] and as the storage structure for a national
using BaseCoded, a primitive data type that allows vocabulary
genetics database [16]. ASN.1 consists of two parts: an abstract
concepts and semantic relationships to be referenced using
syntax notation and a set of encoding rules. The abstract syntax
standard ASN.1 notation. ASN.1 subtyping language was
notation describes the logical form and structure of data while
useful in preventing unbounded proliferation of object classes
the encoding rules describe how to make a physical instance of
in the model, and in general, ASN.1 was found to be a flexible
data suitable for passing in an interface or for storing in a
and robust notation for representing a model of clinical
database. There are several encoding rule variations in the
information.
standard targeted for different situations and uses. We
currently use Basic Encoding Rules (BER) within our
Introduction applications. The fact that any logical model expressed in
ASN.1 can be readily converted into a storable form is one of
The representation of medical concepts in computer systems the strengths of ASN.1. A second strength is that compilers
is one of the continuing challenges of medical informatics. and other tools that support the use of ASN.1 are available as
Rector et al [1] have characterized the problem by identifying software in the public domain or as tool kits from commercial
two components that contribute to the formal representation of sources.
medical concepts: vocabulary elements and an information
model. The vocabulary (lexicon) is a set of words or terms used We have previously used ASN.1 (and its predecessor X.409-
to denote medical concepts, like the names of drugs, signs, X.410 [17]) in computer-to-computer interfaces [18]. We were
symptoms, diagnoses, and devices. The information model impressed with the expressive power of ASN.1 and with the
represents a grammar, or a set of rules that state how ease of implementing and maintaining the interfaces. We
vocabulary elements can be combined to make a representation wanted to explore the use of ASN.1 to define and implement an
of medical information. An information model that is based electronic medical record (also known as a lifetime patient data
semantic types is called a semantic data model (SDM). Several repository). We describe our approach and experiences below.
notations have been proposed for representing SDMs in
medicine including, conceptual graphs [2-5], Model for
representation of Semantics in Medicine (MOSE) [6],
Structured Meta Knowledge (SMK) [7], the GALEN
2. The ASN.1 Clinical Data Model Primitive data types
A sample of primitive data types is defined in Figure 1. To
General approach and ASN.1 basics create a “protective buffer” between ASN.1 types and our user
defined clinical types, all Universal types used in the model are
Our information model for clinical data was created by renamed as user types. For instance, the user defined type
defining simple data types that were then combined to create OctetString (a data type consisting of a sequence of 8-bit bytes)
progressively more complex data types in a layered approach. is a simple renaming of the Universal type OCTET STRING.
Each of the six layers in the model adds flexibility and Similarly, EmbeddedPDV is a renaming of the Universal type
functionality. The six layers consist of: primitive data types, EMBEDDED PDV. The EMBEDDED PDV type allows for an
attributes, attributes with error handling, observations, events, external type (one defined outside of ASN.1) to be encapsulated
and clinical events. Each of these layers will be described in inside an ASN.1 data structure. Thus, by defining Image,
detail in the following sections. Sound, Video, and Document as EmbeddedPDV types,
Since the model is expressed in ASN.1, we will describe a few externally defined types of these items can exist without conflict
ASN.1 basics before proceeding. With a few of the basics in an ASN.1 instance.
understood, we will describe each part of the model, adding ULong2 has been defined as an INTEGER, but it has been
descriptions of more complex ASN.1 syntax as it is used in the constrained to have a value from 0 to 264-1. In ASN.1, this is
examples and Figures. called a subtype constraint. In other words, the new type is
ASN.1 syntax is used to create descriptions of the structure of derived from the parent type, but the new type represents a
data (metadata definitions). The definitions are then used in proper subset on the domain of the parent type. In our example,
programs to guide the creation of actual instances of the data Ulong2 is an INTEGER, but its value has been constrained to
using the BER. Each item that is encoded in a BER string is those that can be represented in a 64 bit integer variable type.
described by three parts: an identifier, the length of the value of Subtype constraints are always enclosed in parentheses. A
the item, and the actual value of the item. The identifier of an different type of subtype constraint is shown for Decimal. It is
item in an encoded string is derived algorithmically from its defined as IA5String (a particular character set), but
ASN.1 definition. Because a length specification is included constrained so that the only values that can be in the string are
with each field in the encoded string, ASN.1 has almost no numeric characters and a period (decimal point). Thus, the
inherent size restrictions. Decimal type is a character string representation of a decimal
number.
ASN.1 has many of the features of an object oriented data
definition language. It allows unlimited depth in hierarchical OctetString ::= OCTET STRING
data structures, and also allows unlimited reuse of types,
BERString ::= OctetString
including recursive type definitions. ASN.1 does not describe
the behaviors (methods) of an object, but only describes the EmbeddedPDV ::= EMBEDDED PDV
structure of the object. Ulong2
64
::= INTEGER (0..2 -1)
All object definitions (data types) are built by reference to Decimal ::= IA5String (FROM(“0123456789” | “.”))
other objects. There are currently over 25 predefined types Ncid ::= ULong2
(called Universal Types) in the ASN.1 standard, including RelSFormNumId ::= ULong2
INTEGER, BOOLEAN, REAL, NULL, etc. All user-defined
types must ultimately be defined by reference to one or more of DateTime ::= Ulong2
the Universal types. Most of the Universal type names consist Text ::= CHOICE {
of all upper case letters. Names for user defined Types must english [0] BasicString,
spanish [1] LatinOne,
begin with a capital letter, but should use mixed case thereafter. portuguese [2] LatinOne,
A new type is defined from an existing type using the french [3] LatinOne,
assignment characters “::=”. For example, a new integer type german [4] LatinOne,
italian [5] LatinOne }
based on the Universal INTEGER type could be defined as
follows: Image ::= EmbeddedPDV
NewInteger ::= INTEGER Sound ::= EmbeddedPDV
Video ::= EmbeddedPDV
With this background we will now proceed to the first
definitions used in the clinical data model and describe other Document ::= EmbeddedPDV
ASN.1 features as they occur in the model. Figure 1: Definition of some primitive Types by reference to
universal ASN.1 Types.
3. Figure 1 also shows a somewhat more complex definition for subtract, and compare exact chronological times. DateTime has
Text. Text is defined by reference to the Universal type two possible resolutions. If the resolution is to the millisecond,
CHOICE. An ASN.1 CHOICE is similar to the union type in then dates between 1800 and 2517 are supported. If the
the C programming language. It means that when an actual resolution is seconds or greater, dates between 0001 and
instance of the type is created, the type may take one of several 183738 are supported. Having the two different resolutions
alternative forms. In the case of Text, the text could be means that we can store very precise times for current events
English, Spanish, Portuguese, French, German, or Italian. This like electrical activity on an EKG tracing, while being able to
is an “exclusive OR” relationship; an instance of type Text must cover a very broad time range for less precisely known dates,
be one and only one of the possible choices. You will notice like the birth dates and death dates of a patient’s ancestors.
that the set of possible choices is enclosed in curly braces, and
We would next like to describe the definition of BaseCoded,
that there are three parts to each line item in the choice. The
which is the elementary type that creates the link between the
first token in the first line is “english.” In ASN.1, “english” is
information model, as expressed using ASN.1, and the coded
called an element name. It is similar to a variable name in the
medical vocabulary. For the definition of BaseCoded to be
C programming language. It must be unique within the
understood, we need to briefly review the structure of our
CHOICE construct, and it is a unique textual identifier of the
vocabulary, which we have described in detail in a previous
first item in the choice. The second token in the first item is
publication [19].
“[0],” and is called an ASN.1 tag. ASN.1 tags are only used by
the algorithm that creates an item identifier while encoding an As shown in Figure 2, the vocabulary is organized around
instance of the type. In the context of a BER encoded type, the concepts, where a concept is “a unit of thought constituted
tag is the part of the definition which ensures that the identifier through abstraction on the basis of characteristics common to a
created is unique. Since the ASN.1 tag is meaningful only to set of objects” [20]. In our vocabulary, each concept
the computer algorithm, it can be ignored when reading an (representing a unique object, identity, or meaning) is assigned
ASN.1 definition. a numeric concept identifier (Ncid). The Ncid is a globally
unique identifier of a concept. As shown in Figure 1, an Ncid
The final token of the first CHOICE line item is
is defined as a Ulong2, meaning that there can be 264
“BasicString.” BasicString is the ASN.1 type of the first
(approximately 1.6 x 1019) concepts in the vocabulary.
CHOICE item. As can be seen, the other items in the CHOICE
have the type of “LatinOne.” BasicString actually corresponds
to an ASN.1 definition that specifies the set of characters that
Related Surface Forms
can legally be used in the English language. The primitive
definition of the BasicString character set is not shown in the
example. The type “LatinOne” also refers to a specific set of Concept Pen-Vee K
characters that can be used for representation of words in the RelSFormNumId 456
other languages indicated in the CHOICE. To summarize, an
item of type Text could be a set of characters in one of six penicillin V potassium V-Cillin K
possible languages, where English text would be represented Ncid 123 RelSFormNumId 295
using the character set defined by the type BasicString, and text
in any of the other five languages would use the character set Veetids
defined by the type LatinOne. This example is syntactically RelSFormNumId 746
correct, but it is obviously incomplete in terms of the content
represented.
Also shown in Figure 1 is the definition of DateTime, which Figure 2: The relationship between concepts and related
represents an exact chronological time. As a Ulong2, surface forms
DateTime is a 64 bit quantity, but this definition obscures the
fact that DateTime actually has six subparts. The six parts are: Each concept can have one or more representations (also
resolution, number of days from starting date, time since called surface forms). A representation is one or more words
midnight, time offset to GMT, daylight savings time flag, and that a person would use to refer to the concept. As shown in
time zone indicator. Some of these quantities are slightly Figure 2, the concept “penicillin V potassium” (Ncid 123) has
redundant with one another, but this was an intentional overlap at least three representations: “Pen-Vee K,” “V-Cillin K,” and
that facilitates ease of use. DateTime is a user defined “Useful “Veetids.” In a manner similar to the concept identifiers, each
Type,” meaning that the internal parts and behaviors of representation is also assigned a unique numeric identifier,
DateTime are not directly manipulated by the user, but are called the related surface form numeric identifier
implemented in the software modules that set, get, add, (RelSFormNumId). In the example, the representation “Pen-
Vee K” has been assigned the RelSFormNumId of 456. Thus,
4. concepts are the fundamental building blocks of the vocabulary, Besides BaseCoded, three other coded types are defined in
and each concept is uniquely identified by an Ncid. Figure 3: CodedWRSform, CodedWOSform, and Coded. The
Furthermore, each concept can have one or more associated purpose in defining these types is to create subtypes of
representations (surface forms), and each representation has a BaseCoded that have specialized uses. BaseCoded itself is
unique RelSFormNumId. Both Ncids and RelSFormNumIds actually a virtual type, meaning that it is used to derive other
are of type Ulong2. With this background, we can now move to types, but it is not instantiated as part of the clinical database.
the definition of BaseCoded, which is shown in Figure 3. Clinical records actually use one of the three derived types,
depending on the situation. In some situations it may be
BaseCoded is the data type used in the information model to
mandatory that the database capture the concept representation
refer to items in the vocabulary. It is defined as a SET,
that was seen by the user. For instance, in a drug ordering
meaning that it is a data type that is made up of a group of
application that allows ordering by brand name, if the user
other data types. The items in the SET are enclosed in curly
ordered “Pen-Vee K,” it is important to capture not only the
braces, and elements can be marked as OPTIONAL as
ncid of the drug being ordered, but also the RelSFormNumId.
appropriate. Any item in a set not marked optional is required.
By storing both the Ncid and the RelSFormNumId we know not
The BaseCoded set consists of three elements: ncid, text, and only what drug was ordered but we also know exactly what
relSFormNumId. The syntax of how the elements in the SET textual representation of the drug name the user saw in the
are defined is the same as previously described for items in a order entry application. The CodedWRSform type (which
CHOICE construct. Thus, the ncid element is said to be of type means a coded data type with required surface form) was
Ncid, the text element is said to be of type Text, and the created for just this situation. At other times, it would be good
relSFormNumId element is said to be of type RelSFormNumId. to record the surface form that the user saw, but the surface
Each of these three types was previously defined in Figure 1. form may not always be available. In this situation, the type
CodedWOSform (meaning a coded data type with optional
BaseCoded ::= SET { surface form) would be used. Finally, there are situations where
ncid [1] Ncid OPTIONAL, only capturing the concept is needed. In these cases, the Coded
text [3] Text OPTIONAL,
relSFormNumId [4] RelSFormNumId OPTIONAL }
type (meaning a coded data type without a surface form) would
be used.
CodedWRSform ::= BaseCoded (WCS {
ncid, The three new coded data types are derived from BaseCoded
relSFormNumId,
using ASN.1 subtype notation. As before, the subtype
text OPTIONAL })
constraint is contained in parentheses. When constraining a
CodedWOSform ::= BaseCoded (WCS { SET, the WITH COMPONENTS syntax is used to indicate
ncid,
relSFormNumId OPTIONAL, which elements of the parent type are included in the new child
text OPTIONAL }) type. For the sake of brevity, we have substituted “WCS” for
Coded ::= BaseCoded ( WCS{
WITH COMPONENTS in all Figures. Elements within the
ncid }) subtype constraint are referenced by their element name, and
the tag and type of the element are not repeated in the subtype
Figure 3: Definition of the Coded data types by subtype because it would be redundant with the tag and type definition
constraint of the BaseCoded type. in the parent type. A component within a subtype constraint is
required unless it is followed by the keyword OPTIONAL.
The three parts of the BaseCoded set refer to the
Thus, in the definition of CodedWRSform, ncid and
corresponding parts of the vocabulary described in the
relSFormNumId are required elements, while text is optional.
preceding paragraph. A BaseCoded item contains the ncid,
In the definition of CodedWOSform, ncid is required while
text, and relSFormNumId of a single concept from the
both relSFormNumId and text are optional. For the definition
vocabulary. Thus, to create an instance of BaseCoded that
of Coded, ncid is required, and no other elements are allowed.
referred to the concept “penicillin V potassium” as described in
the previous example, ncid would have a value of 123, text
would have a value of “Pen-Vee K,” and relSFormNumId
would have a value of 456. Because the text of a concept has a
one-to-one correspondence with the relSFormNumId, the text
element of a BaseCoded is redundant. “Text” is included as an
optional part of the definition of BaseCoded as a way to
decrease the time needed to decode and display coded data
types.
5. Value ::= CHOICE { relationship between the Attribute and its value. For instance,
decimal [0] Decimal, if you had an Attribute called "systolic blood pressure" and the
long1 [1] Long1, patient's systolic blood pressure measurement was "less than
long2 [3] Long2,
dateTime [6] DateTime, 60," the value of "systolic blood pressure" would be set to "60,"
coded [7] CODED, and numericOp would be set to “<” (less than). Likewise, the
codedWOSform [8] CODEDWOSFORM, units and precision fields allow units of measure and precision
codedWRSform [9] CODEDWRSFORM,
to be specified for any measured continuous variable. The
text [10] Text,
document [11] Document, probability fields allow for each Attribute to have a user stated
sound [12] Sound, probability (userStatedProb) as in the phrase "there is a 60%
image [14] Image, chance of malignancy" or a calculated probability
video [16] Video }
(machineProb), generated from a computerized instantiation
Attribute ::= SET { process.
value [0] Value,
negation [1] LCodedWOSformAtt(Negation) OPTIONAL, The definition of Attribute introduces a new shorthand ASN.1
numericOp [2] LCodedWOSformAtt(NumOperator) OPTIONAL,
notation that we have not used previously.
uncertainty [3] LCodedWOSformAtt(Uncertainty) OPTIONAL,
machineProb [4] Probability OPTIONAL, “LCodedWOSformAtt(Units)” is shorthand which expands to a
userStatedProb [5] Probability OPTIONAL, definition of a constrained subtype of LAttribute. The
modifier [6] LCodedWOSformAtt(AttModifier) OPTIONAL, derivation of LAttribute is described later (Figure 7) The
units [7] LCodedWOSformAtt(Units) OPTIONAL,
precision [10] Precision OPTIONAL }
shorthand expression means that “units” is a loosely coded
attribute whose value must come from the domain of Units.
Figure 4: The definition of Attribute, by reference to Units is a set of vocabulary elements containing units-of-
previously defined Types. measure like, “minute,” “second,” “kilogram,” and “liter.”
Though the intermediate definitions are not shown, the
Attributes LCodedWOSformAtt type is derived from Attribute. Thus,
Attribute indirectly references itself as part of its definition.
The primitive ASN.1 types, as defined in Figures 1 and 3, are This is the first example of the recursive use of ASN.1
used to define a more complex data type called Attribute, as definitions. For the sake of brevity, the definitions of
shown in Figure 4. Attributes are the building blocks of the Probability, and Precision are also not shown.
information model, and they are used to store the various
characteristics of the clinical events found in the medical One of the most important implicit attributes is “negation.”
record. The ASN.1 CHOICE construct is used in the definition Negation is used in conjunction with coded Attributes to
of “Value” to represent the fact that the “value” of an Attribute express that a given finding is NOT present. For example, it
can be any one of the primitive types. For instance, patient might be important clinically to state that the appearance of a
identifiers may be alphanumeric character strings, blood trauma victim’s urine was “not bloody.” In this case, the
pressures might be integers, a hepatitis surface antigen test attribute being evaluated would be urine color, value would be
could be a titer, a microscopic urine cell count could be a range set to “bloody” and negation would be set to “not.” Use of
of cells per high power field, and the date and time of birth negation allows pertinent negative findings to be represented
would be a date/time element. without explicit creation of finding codes that include the words
“not” or “no,” or “none.”
Many attributes within the medical domain have values that
are expressed using words or text rather than numbers. The implicit attributes could have been handled explicitly as
Attributes like patient names, symptom names, drug names, “ordinary” attributes, but their frequent use and general
disease names, and anatomic locations are examples of textual applicability favored making them an integral (implicit) part of
data. It is worth noting, however, that the possible values for Attribute.
an attribute like "drug name" come from a finite set of possible One problem with the definition of Attribute, as shown in
choices which means that it is possible to represent these terms Figure 4, is that not all of the implicit attributes are appropriate
by one of the Coded types shown in Figure 3, rather than by for all value types that could be chosen from the CHOICE list
their full text representation. This ability to encode the value of for Value. For example, it would be inappropriate to use
textual attributes has important implications in the physical precision to describe a coded term, and it is unclear what
database. negation would mean if attached to an image. To clarify
In addition to “value,” a set of optional characteristics (called exactly which implicit attributes can be used with which value
implicit attributes in a previous publication [11]) are also types, specialized subtypes of Attribute have been created. By
defined for the Attribute type. The implicit attributes (Figure 4: design, Attribute is another virtual type, i.e. one that is used to
tags 1 to 10) are used to store complementary details about the derive other types, but not used directly to create instances of
6. data in the patient database. Only the specialized subtypes of DateTimeIntervalSet ::= SET {
Attribute are used to create data instances. startTime [0] BaseDateTimeAttribute,
endTime [1] BaseDateTimeAttribute }
Three examples of specialized Attribute subtypes are shown in
TiterSet ::= SET {
Figure 5. BaseCodedAttribute has been defined as a subtype of lower [0] BaseDecimalAttribute,
Attribute, where value has been restricted to be of type coded. upper [1] BaseDecimalAttribute }
The implicit attributes of negation, uncertainty, machine IntegerRangeSet ::= SET {
probability, user stated probability, and modifier can be lower [0] BaseLong2Attribute,
optionally present. Numeric operator, units, and precision have upper [1] BaseLong2Attribute }
been deleted as implicit attributes in the definition of
Figure 6: The definition of an interval of chronological time
BaseCodedAttribute. For BaseDateTimeAttribute, numeric
using the specialized BaseDateTimeAttribute.
operator, negation, uncertainty and precision are included as
optional elements, while machine probability, user stated Figure 6 also shows the definition of TiterSet and
probability, modifier, and units have been deleted. The IntegerRangeSet. TiterSet defines a complex type which allows
definition of BaseDecimalAttribute was created in a similar the recording of clinical laboratory observations that are
manner. Specialized subtypes of Attribute were made for all of measured by serial dilution methods. It uses two component
the primitive types, but only these three examples are shown. elements, “lower” and “upper,” to represent the separate parts
of a titer expression. Each of the components in TiterSet is of
BaseCodedAttribute ::= Attribute ( WCS { type BaseDecimalAttribute. Similarly, IntegerRangeSet is a
value (WCS {coded}),
negation OPTIONAL, complex type which would typically be used to record
uncertainty OPTIONAL, microscopic slide observations where the number of items per
machineProb OPTIONAL, high power field varies within a bounded range. The
userStatedProb OPTIONAL,
modifier OPTIONAL })
components of IntegerRangeSet are of type
BaseLong2Attribute.
BaseDateTimeAttribute ::= Attribute ( WCS {
value ( WCS {dateTime}),
Attributes with error handling
numericOp OPTIONAL,
negation OPTIONAL,
uncertainty OPTIONAL, Attribute represents a very flexible data type that is the
precision OPTIONAL }) foundation for many clinically useful types. However, at the
BaseDecimalAttribute ::= Attribute ( WCS { level of abstraction provided by Attribute, some practical
value ( WCS {decimal}), considerations about how processes will instantiate these types
numericOp OPTIONAL, becomes important. One of these considerations is error
precision OPTIONAL })
handling. For example, if we have defined a clinical
Figure 5: The creation of specialized attributes by subtype measurement like hematocrit as a numeric quantity (say as a
constraint of Attribute. BaseDecimalAttribute type), and an interface from a clinical
application sends a value for a hematocrit of “not done” in a
The specialized subtypes of Attribute can be used to create laboratory interface message, an exception condition exists.
more complex data types that have more than one component, The information model has defined the clinical variable as a
as shown in Figure 6. DateTimeIntervalSet is a new type that is numeric type, but the source system has sent a value that is of
used to represent an interval of chronological time. It contains type text. In order to handle these kinds of exception
two components, a start time and end time. Each of the conditions, a new type called LAttribute was defined, as shown
components is typed as a BaseDateTimeAttribute, meaning that in Figure 7.
each end of the interval can include uncertainty and
LAttribute (which stands for Loosely-typed Attribute) is
imprecision as part of the specification. This new type
defined as a choice between BaseAttribute or ErrorType.
represents a very flexible representation of chronological time
Furthermore, BaseAttribute is a choice containing all of the
that is often needed in recording a patient’s signs, symptoms,
specialized versions of Attribute, and ErrorType is a choice
and diseases.
between four possible types that could be used to represent
errors. The intended use of this structure is as follows. Data
coming into the system is expected to be represented as one of
the types available in the BaseAttribute choice. If the incoming
data is of the type expected, then the data is instantiated as that
type. However, if the incoming data is not of the expected type,
the data is instantiated as one of the error types. Referring back
to the hematocrit example, if hematocrit was typed as
7. BaseDecimalAttribute, and a value of “42.8” was sent by a BaseObservation ::= SET {
clinical laboratory system, the value would be instantiated obsValue [0] LAttribute,
inside of the decimalAtt element of BaseAttribute. If the value commonMods [1] ObservationSequence OPTIONAL,
obsMods [2] ObservationSequence OPTIONAL,
of the incoming hematocrit was “not done,” then the value comments [3] Comments(ObsComment) OPTIONAL,
would be instantiated as errorText inside of ErrorType. Using actionsInfo [8] EventActionSequence OPTIONAL,
LAttribute in this way provides for exception handling when semanticLinks [9] SemanticLinks OPTIONAL }
incoming data is not of the expected type. This ultimately ObservationSequence ::= SEQUENCE OF ObservationObject
means that clinical data in the system is strongly typed to the
ObservationObject ::= INSTANCE-OF-SUBTYPE(
greatest extent possible, given the limitations of a sending SUBTYPES OF (BaseObservation))
application.
Figure 8: The definition of BaseObservation by reference to
LAttribute ::= CHOICE { LAttribute.
type [0] BaseAttribute,
error [1] ErrorType } The next element, commonMods, is a sequence of common
BaseAttribute ::= CHOICE { modifiers of a class of observations. For laboratory
decimalAtt [0] BaseDecimalAttribute, measurements, typical kinds of common modifiers include
long1Att [1] BaseLong1Attribute, abnormal flags, delta check flags, and reference ranges. The
long2Att [3] BaseLong2Attribute,
dateTimeAtt [6] BaseDateTimeAttribute,
exact set of possible common modifiers for a given observation
codedAtt [7] BaseCodedAttribute, are not specified in the definition of BaseObservation, but are
codedWOSformAtt [8] BaseCodedWOSformAttribute, specified when subtypes of BaseObservation are created.
codedWRSformAtt [9] BaseCodedWRSformAttribute, ObsMods is also a sequence of modifiers of the observation, but
textAtt [10] BaseTextAttribute,
documentAtt [11] BaseDocumentAttribute, it contains modifiers that are appropriate for a specific
soundAtt [12] BaseSoundAttribute, observation, rather than common to a general class of
imageAtt [14] BaseImageAttribute, observations. For example, in evaluating a patient for rales,
videoAtt [16] BaseVideoAttribute,
timeIntAtt [1004] DateTimeIntervalSet }
typical observation modifiers would be quality and severity.
Examples that further illustrate the use of commonMods and
ErrorType ::= CHOICE { obsMods are shown later (Figures 19, 21, and 23).
errorCoded [0] CodedWOSform(LooseTypeErrorCode),
errorText [1] Text, As shown in Figure 8, both commonMods and obsMods are of
errorByte [2] OctetString,
errorBit [3] BitString } type ObservationSequence. ObservationSequence is then
defined as a SEQUENCE OF an ObservationObject.
Figure 7: The definition of LAttribute (a loosely typed SEQUENCE OF is a standard ASN.1 type, and signifies an
attribute) by reference to the specialized attribute definitions. ordered set, where all of the elements in the set are of the same
ASN.1 type. ObservationObject is defined as an INSTANCE-
Observations OF-SUBTYPE.
LAttribute becomes the building block for the next level of INSTANCE-OF-SUBTYPE is a new type that plays a special
abstract data type in the clinical information model. Clinical role in the information model. There were at least three
data is usually accompanied by other information that motives for the creation of INSTANCE-OF-SUBTYPE. First,
establishes the context and circumstances surrounding the we saw a need to manage the structural evolution of ASN.1
collection of the data, as well as possible modifiers of the data. subtypes over time. Second, we saw the need to handle
It is useful to keep this set of information together as an information at different levels of abstraction, i.e. to create a
information object. To accommodate this need, degree of polymorphism in the ASN.1 types. Third, we wanted
BaseObservation was created, as shown in Figure 8. an ASN.1 type that would allow us to create subtype definitions
BaseObservation was patterned after the OBX Segment from a known type after ASN.1 compile time.
(Observation Results Segment) as defined in the HL7
The definition of INSTANCE-OF-SUBTYPE is shown in
specification [21]. The principle component of
Figure 9. The “type” component contains the actual data of an
BaseObservation is obsValue, which is of type LAttribute, and
instance of the subtype. The structure of the data inside of
represents the value of whatever was observed or evaluated.
“type” must conform to the ASN.1 supertype (symbolically
ObsValue is modified by three associated data elements,
named TypeReference) from which the specific subtype was
commonMods, obsMods, and comments. For the sake of
derived. It is required that the definition of TypeReference be
brevity, the Comments type is not further defined here, but it
present in the same ASN.1 module where a subtype of
represents a sequence of either coded or text comments
INSTANCE-OF-SUBTYPE is defined. This ensures that the
associated with the observation.
8. information to code and decode the ASN.1 type is present in value contained in “type” must be a subtype of BaseType. Note
executable applications. that it is possible to decode the value of the subtype knowing
only the definition of BaseType. The Ncid and Version of the
INSTANCE-OF-SUBTYPE ::= SEQUENCE { type are only needed when an instance is to be validated against
id [0] Ncid, its subtype definition.
version [1] Version,
type [2] TypeReference }
NewSpecificSubtype ::= INSTANCE-OF-SUBTYPE (
Figure 9: The definition of Instance of Concept. SUBTYPES OF (BaseType))
BaseType ::= TypeReference
The “id” element of INSTANCE-OF-SUBTYPE contains the
name (identifer) of the specific subtype that is instantiated in Figure 10: The syntax for defining a constrained subtype of
the “type” element. The id element is of type Ncid. We require Instance of Concept.
that all ASN.1 types and subtypes be defined as concepts in the
data dictionary. Furthermore, each concept which represents an One further conceptual problem remains. We have defined an
ASN.1 type includes validation information as part of its abstract observation (BaseObservation) as a set of elements
definition. Since each concept is assigned an Ncid, the Ncid (obsValue, commonMods, obsMods, etc.), and we have defined
serves as a unique identifier of the subtype contained in an ObservationSequence as a SEQUENCE OF ObservationObject,
INSTANCE-OF-SUBTYPE. where ObservationObject is an INSTANCE-OF-SUBTYPE,
which must be a subtype of BaseObservation. (Note that
The “version” element is include because the validation BaseObservation is another case where we have recursively
information associated with a subtype may change over time. used a type definition within itself.) Given these definitions,
The “version” element represents the version number of the ASN.1 provides the ability, using WITH COMPONENT
subtype validation information that was current at the time a (abbreviated here as WC), to constrain ObservationSequence to
subtype was instantiated. Thus, the combination of the “id” and be of specific types as shown in Figure 11.
“version” elements provide a unique key to access validation
information associated with a subtype, while the “type” field LaboratoryModifiers1 ::= ObservationSequence (WC {
contains the actual data as defined by TypeReference. Data WCS ( AbnormalFlag | ReferenceRange ) })
dictionary services are supplied so that subtype validation AbnormalFlag ::= BaseObservation ( WCS { ... })
information can be used by an application at runtime.
ReferenceRange ::= BaseObservation ( WCS { ... })
The kind of polymorphism that we were trying to achieve
Figure 11: The syntax for defining a constrained subtype of
using INSTANCE-OF-SUBTYPE can be illustrated by an
Instance of Concept.
example. An electrolyte panel that consists of four
measurements (sodium, potassium, chloride, and bicarbonate) Unfortunately, this use of ASN.1 notation constrains
can be thought of as a sequence of laboratory observations ObservationSequence to be a sequence consisting of zero or
because they can all be represented using a common data more AbnormalFlags and/or zero or more ReferenceRanges.
structure. That is, they can be represented as subtypes of The desired result is a sequence of elements that are subtypes of
BaseObservation. However, each measurement can also be BaseObservation, but consisting of exactly one AbnormalFlag
thought of as a distinct object class. As a class, serum sodium and exactly one ReferenceRange. We have not been able to
measurements have a different reference range and different achieve this result within existing ASN.1 notation and have had
clinical meaning than serum potassium measurements. It may to resort to an extension, WITH TYPES.
be desirable for a data entry program to validate that a serum
sodium value came from the range that was appropriate for The WITH TYPES notation provides the ability to
serum sodiums, and that serum potassium values came from the positionally constrain each individual element within the
range that was appropriate for serum potassiums. So in SEQUENCE OF type. Using the same example as in Figure
different situations, there is a need to treat an electrolyte panel 11, the WITH TYPES notation can be used to correctly
as a sequence of the same structural type (BaseObservation), or constrain ObservationSequence as shown in Figure 12. This
alternatively, as a sequence of four specific ASN.1 types. notation indicates that ObservationSequence is a sequence of
BaseObservations and must consist of exactly two
The syntax for creating a constrained subtype of INSTANCE- BaseObservations; one occurrence must be of type
OF-SUBTYPE is shown in Figure 10. This syntax constrains AbnormalFlag and one occurrence must be of type
the contents of the id and type fields of the INSTANCE-OF- ReferenceRange.
SUBTYPE to be subtypes of any other specified and known
ASN.1 type. This definition means that NewSpecificSubtype is The WITH TYPES and SUBTYPES OF notation are the only
derived from INSTANCE-OF-SUBTYPE where the encoded “extensions” to the semantics of the 1994 ASN.1 specification
9. that we have made. All of the notation described in this The SemanticLinks component of BaseObservation enables
document with the exception of WITH TYPES and SUBTYPES separate instances of objects and events to be linked via a
OF is either taken directly from the ASN.1 specification or is named relationship. For example, SemanticLinks could be used
“shorthand” which can be readily mapped into the to connect a positive throat culture result with a prescription for
corresponding ASN.1 notation if needed. The SUBTYPES OF penicillin, using the named relationship “treated-by” to make it
notation is an extension using ideas from the specification of clear that the prescription was written as the treatment for the
the standard ASN.1 type INSTANCE-OF. WITH TYPES has positive throat culture. The purpose and use of semantic links
no precedent in standard ASN.1. was also described in a previous publication [11].
The structure of SemanticLinks is shown in Figure 13.
LaboratoryModifiers2 ::= ObservationSequence (
WITH TYPES { SemanticLinks is a sequence of SemanticLink, meaning there
AbnormalFlag, can be zero to many semantic links for each observation.
ReferenceRange }) SemanticLink contains three components. The “relationship”
AbnormalFlag ::= BaseObservation (WCS { ... }) element is the coded name of the relationship that links the two
ReferenceRange ::= BaseObservation (WCS { ... })
instances. Possible relationships include: caused-by, associated-
with, treated-by, contains, etc. The second element,
Figure 12: The syntax for defining a constrained subtype of objectPointer, contains the unique key (equivalent to a unique
Instance of Concept using the WITH TYPES syntax. object identifier) to the linked observation or event. The final
element, linkInfo, is of type EventActionSequence, and
Neither of these exceptions is viewed as an impediment to the provides attribution information about who, what, when, and
use of standard ASN.1 compilers. For example, the simple where the link was created. EventActionSequence is also the
removal of the WITH TYPES portion of the specification underlying type of actionsInfo in BaseObservation, and is the
provides valid ASN.1 which accurately describes the content of common data type used for recording attribution information
the data. With the removal of the WITH TYPES syntax, for observations, semantic links, and events.
LaboratoryModifiers2 would simply be a sequence of
BaseObservations. By removing WITH TYPES, what is SemanticLinks ::= SEQUENCE OF SemanticLink
omitted is validation information that specifies that the
SemanticLink ::= SET {
sequence should contain two, and only two, specific subtypes of relationship [0] CodedWOSformAtt(LinkRelationship),
BaseObservation (AbnormalFlag and ReferenceRange). objectPointer [1] ObjectPointer,
Without this information, the user of a generic ASN.1 compiler linkInfo [2] EventActionSequence OPTIONAL }
would be able to encode a LaboratoryModifiers2 instance which
Figure 13: The definition of SemanticLinks.
was not actually valid. However, since we require that any data
entered into the system must pass a service level validation step The definition of EventActionSequence is shown in Figure
regardless of the source, invalid encodings would still be 14. EventActionSequence is a sequence of EventActionObject.
detected. Meanwhile, any valid instance of EventActionObject is an INSTANCE-OF-SUBTYPE where the
LaboratoryModifiers2 could be decoded using its more general concept instances must be subtypes of EventActionInfo. The
definition as a sequence of BaseObservations. Ncid inside of the INSTANCE-OF-SUBTYPE (see Figure 9),
A similar situation exists for the SUBTYPES OF syntax. It is indicates an action taken by a user or application. Typical
essentially a way of specifying an ASN.1 template where actions would include add, modify, delete, sign, review, and
TypeReference can be replaced by a known ASN.1 type at the place on hold. The EventActionInfo set contains the time of the
time a subtype definition is created. This is a convenient action, the reason for the action, and any free text or coded
notation for the derivation of subtypes. However, this syntax comments about the action. AssociatedActions refers
could be replace by a policy that required each subtype derived recursively back to the definition of EventActionSequence, and
from INSTANCE-OF-SUBTYPE to become its own type. allows actions to be taken on other actions. One common use
Macro substitution in a precompile step would accomplish the would be to allow counter signatures on signatures.
same thing. Either of these approaches would allow a standard The final element in EventActionInfo is sourceInfo, which
ASN.1 compiler to decode a BER encoded string created by our ultimately becomes a sequence of possible SourceInfo types.
system. The source of the information could be a person, a system, a
Having explained the substructure of ObservationSequence device, etc. Further details of the components of
and its component type INSTANCE-OF-SUBTYPE, we will PersonSourceInfo, SystemSourceInfo, etc., are not shown but
now describe the remaining elements within the definition of include person names, identifiers, model numbers, and
BaseObservation. application names. The list of possible information sources will
likely increase as new applications and interfaces are added to
the system.
10. We have now described all of the elements that are included BaseEvent ::= SET {
in BaseObservation. It is the ASN.1 type that contains a single sharedContext [0] ObservationSequence,
observation about a patient, with its attendant modifiers and eventContent [1] EventContent,
comments [2] Comments(BaseEventComment) OPTIONAL,
attribution information. BaseObservation, in turn, becomes the actionsInfo [7] EventActionSequence OPTIONAL,
building block for Events, which are the next level of patient semanticLinks [8] SemanticLinks OPTIONAL }
data organization in our system. EventContent ::= CHOICE {
observations [0] ObservationSequence,
EventActionSequence ::= SEQUENCE OF EventActionObject events [1] EventSequence }
EventActionObject ::= INSTANCE-OF-SUBTYPE( EventSequence ::= SEQUENCE OF EventObject
SUBTYPES OF (EventActionInfo) )
EventObject ::= INSTANCE-OF-SUBTYPE(
EventActionInfo ::= SET { SUBTYPES OF (BaseEvent))
time [1] LDateTimeAtt OPTIONAL,
associatedActions [2] EventActionSequence OPTIONAL, Figure 15: The definition of BaseEvent by reference to
reasons [3] Reasons OPTIONAL, BaseObservation.
comments [4] Comments(EvntActComment) OPTIONAL,
sourceInfo [5] SourceInfoSequence }
The first element in BaseEvent is sharedContext, which
SourceInfoSequence ::= SEQUENCE OF SourceObject ultimately resolves to a sequence of BaseObservations. The
SourceObject ::= INSTANCE-OF-SUBTYPE( purpose of sharedContext is to record context and attribution
SUBTYPES OF (SourceInfo)) information that is common to all of the observations that will
SourceInfo ::= CHOICE { be recorded in eventContent. For the cardiovascular exam,
person [0] PersonSourceInfo, sharedContext would contain who the examiner was, the date
system [1] SystemSourceInfo, and time of the examination, and where the examination was
decision [2] LogicProcessSourceInfo,
institution [3] ExternalInstitution,
performed. For the clinical chemistry example, sharedContext
device [4] DeviceInfo, would contain descriptions of the sample, who collected the
physicalLocation [5] PhysicalLocation } sample, and when it was collected.
Figure 14: The definition of EventActionSequence. EventContent is the real information container in BaseEvent.
It ultimately resolves to either a sequence of BaseObservations,
Events or to a sequence of BaseEvents. This is a recursive definition,
allowing events inside of events to whatever depth is necessary
In many situations in clinical medicine, observations are not to capture a complex clinical situation. The other elements in
made in isolation but are collected in a common context. For BaseEvent (comments, actionsInfo, and semanticLinks) have
example, observations on heart sounds, heart murmurs, heart the same use and structure as they did in BaseObservation,
rate, systolic and diastolic blood pressures might all be collected except that they are now modifiers of the event, rather than a
as part of a cardiovascular examination. In the laboratory, a single observation.
serum sodium, potassium, carbonate, and chloride are often
done on the same blood specimen as a collection of clinical Patient Events
chemistry observations. Previous publications have described
the use of events in building patient information systems [11, With the definition of BaseEvent, the next step in the clinical
22, 23]. The BaseEvent type, as shown in Figure 15, was type hierarchy is the association of one or more events with a
designed to allow efficient recording of observations that share patient, and then the association of patient events with
a common clinical context. Many of the parts of BaseEvent are transactions and messages. Figure 16 presents the definition of
reused from BaseObservation. “PatientEvent,” where one or more events (derived from
BaseEvent) are associated with a patient. The substructure of
Patient is not shown, but contains a unique numeric identifier
of the patient. PatientEvent then becomes the building block
for PatEventSequence, which is then included in Action.
Action represents a logical set of actions that will be taken
against the patient database. Besides the sequence of patient
events, Action contains an operation (delete, update, insert)
which indicates what action is to be taken relative to the patient
database, a set of reasons for the action, and errors associated
with the action. Action is the primary component of
PatDataTrans, a patient data transaction which consists of a
sequence of actions.
11. PatDataTrans ::= SEQUENCE OF Action in the information model. It became apparent that it would be
useful to be able to enclose or encapsulate one ASN.1 type
Action ::= SET {
patEvents [0] PatEventSequence, inside of another ASN.1 type. The difference between
operation [1] Operation, INSTANCE-OF-SUBTYPE and INSTANCE-OF-CONCEPT is
reason [2] ActionReasons OPTIONAL, that INSTANCE-OF-CONCEPT offers an anonymous choice
errors [3] Errors OPTIONAL }
between one or more types that are listed inside of the “id”
Operation ::= CHOICE { element. An anonymous choice means that the concepts that
deleteOp [0] Null,
are part of the choice do not have to be defined in the ASN.1
updateOp [1] Null,
insertOp [2] Null } module where the INSTANCE-OF-CONCEPT type is used.
This is the opposite of INSTANCE-OF-SUBTYPE, where the
PatEventSequence ::= SEQUENCE OF PatientEvent
contained type must be defined in the same ASN.1 module
PatientEvent ::= SET { where it is used.
patient [0] Patient,
events [1] EventSequence } The value of the anonymous choice is that applications only
Figure 16: The definition of PatientEvent by reference to have to be aware of the definition of the contained types which
EventObject. they will manipulate or decode. Assume, for example, that a
message routing application is able to route a message based on
The final step in the clinical information model is to include the information contained in MessageInfo. By using the
the patient data transactions in a message. Within the system, INSTANCE-OF-CONCEPT construct as shown in Figure 17,
all process-to-process communication or computer-to-computer the definitions of PatDataTrans and HDDTrans (and all of the
interfaces are based on a single message definition, as shown in types that they reference) do not need to be included in the
Figure 17. HemsMessage (health enterprise management routing application. However, an application that was storing
system message) contains three components. Version patient data in the patient database would need the definition of
represents the version number of the HemsMessage definition PatDataTrans (and the types which it references) in order to
that was current at the time a message was created. decode and validate that the patient events it contains conform
MessageInfo contains global information about security and to their ASN.1 definitions.
access information, message flags, errors associated with the Essentially, the INSTANCE-OF-CONCEPT type provides
message, and the context (attribution information) of the “cut points” in the ASN.1 definitions that allow applications to
message. TransactionObject ultimately contains PatDataTrans include only those types that they directly manipulate or decode.
(the patient transaction) or HDDTrans (a transaction to query At runtime, applications are also able to decode only those
for information in the health care data dictionary), but portions of a BER-encoded ASN.1 string in which they have
TransactionObject utilizes a new construct called INSTANCE- interest.
OF-CONCEPT.
The definition of INSTANCE-OF-CONCEPT is shown in
HemsMessage ::= SET { Figure 18. The “id” element contains the Ncid of the ASN.1
version [0] Version, type that is contained as a BER encoded string in “value.”
messageInfo [1] MessageInfo,
trans [3] TransactionObject }
Version is the version number of the type definition that was
current at the time an instance of the INSTANCE-OF-
MessageInfo ::= SET { CONCEPT type was created. The “&Value” notation expresses
accessSecurityInfo [3] AccessSecurityInfo,
messageFlags [5] MessageFlags OPTIONAL, the anonymous choice. It means that the definition of the type
errors [8] Errors OPTIONAL, referred to by Ncid (and encoded in the “value” element) does
context [9] UserContext OPTIONAL } not have to reside in the same module where INSTANCE-OF-
TransactionObject ::= INSTANCE-OF-CONCEPT ( WCS { CONCEPT is used.
...,
id ({PatDataTrans} | {HDDTrans} ) }) INSTANCE-OF-CONCEPT ::= SEQUENCE {
id [0] Ncid,
Figure 17: The definition of a HemsMessage (health version [1] Version,
enterprise management system message), which ultimately value [2] &Value }
contains a patient data transaction (PatDataTrans).
Figure 18: The definition of INSTANCE-OF-CONCEPT.
INSTANCE-OF-CONCEPT has some similarity in purpose
and structure to INSTANCE-OF-SUBTYPE. As we We have now completed the description of all of the
implemented the early releases of our clinical database, it underlying data types used in our clinical information model.
became obvious that information hiding and encapsulation were We would now like to show by example how these structures
necessary in order to effectively manage modularity and change are used.
12. As illustrated above, BaseObservation is the building block of another body site by a “BodySiteConnector”. BodySite and
BaseEvent. However, the complex information structures BodySiteConnector are both derived from BaseObservation.
necessary to describe clinical events are often shared across BodySiteConnector is a simple observation consisting of a
multiple types of events. For instance, most clinical findings coded attribute whose values are limited to the domain of “body
reported by the patient or detected by the provider require a site connectors.” BodySite is a complex observation in which
precise description of their anatomic locations. The an anatomical site (“BodyAnatomy”) is modified or
representation of these common pieces of information is complemented by a series of optional characteristics, including
possible with the creation of subtypes of BaseObservation, “side” (right, left, etc.), “laterality” (lateral, medial, etc.),
making these subtypes the building blocks of more complex “depth” (anterior, posterior, etc.), among others. These various
observations and events. characteristics are themselves subtypes of BaseObservation,
taking advantage of the recursive characteristic of the
Figure 19 illustrates the definition of “LocationDescription”
information model.
as a constrained subtype of BaseObservation. The first element
in the definition is “codedObsVal(Presence).” Note that Figure 20 presents the definition of a patient exam event
codedObsVal is a shorthand syntax for describing a coded derived from BaseEvent, and it includes the
observation restricted to codes in a specific domain. It is LocationDescription observation as an optional context. Figure
actually a macro that is used to hide some of the complexity of 20 introduces a new element of ASN.1 subtype notation. The
the ASN.1 subtype syntax from people creating or reviewing the use of ellipsis (“...”) in a subtype definition means that all the
model. The expansion of the macro is not shown. The elements defined in the parent type are included in the subtype
“codedObsVal(Presence)” element has a purpose similar to the definition unchanged, unless otherwise specified. For
purpose previously described for Negation in the definition of PatientExamEvent, it means that all of the elements defined in
Attribute (Figure 4). The value of Presence can be either BaseEvent are applicable to PatientExamEvent, except for
“Present” or “Absent (Not Present),” which allows negation of a “sharedContext” which is explicitly included, but as a
LocationDescription as expressed in the phrase “The rash was constrained subtype that is defined further. PatientExamEvent
not present on the palms of the hands.” is used to represent the information from the physical
examination where more than one observation is associated
LocationDescription ::= BaseObservation ( WCS { with the same body location, such as systolic and diastolic blood
codedObsVal(Presence), pressures measured in the right brachial artery.
obsMods ( WITH TYPES {
BodySite,
BodySiteConnector OPTIONAL }) }) PatientExamEvent ::= BaseEvent ( WCS {
...,
BodySiteConnector ::= BaseObservation ( WCS { sharedContext ( WITH TYPES {
codedObsVal(BodySiteConnectors) }) EventDateTime,
ObjectStatus OPTIONAL,
BodySite ::= BaseObservation ( WCS {
LocationDescription OPTIONAL }) })
codedObsVal(Presence),
obsMods ( WITH TYPES {
BodyAnatomy,
Figure 20: The definition of PatientExamEvent by subtype
BodySide OPTIONAL, constraint of BaseEvent
Laterality OPTIONAL,
BodyDepth OPTIONAL, Clinical observations are frequently associated with a series of
BodyRelativeDepth OPTIONAL, commonly used modifiers. Some of these modifiers are
BodyLevel OPTIONAL,
BodyRelativeLevel OPTIONAL, particularly important when numeric values are reported, such
RelativeSideOrLaterality OPTIONAL }) }) as clinical laboratory measurements. Figure 21 shows the
BodyAnatomy ::= BaseObservation ( WCS { derivation of “PatientObservation” from BaseObservation,
codedObsVal(BodyAnatomySite) }) where common modifiers for numeric result values are
BodySide ::= BaseObservation ( WCS {
explicitly included (reference range, abnormal flag, status, delta
codedObsVal(BodySides) }) flag, reporting flag, and location).
Laterality ::= BaseObservation ( WCS {
codedObsVal(Lateralities) })
Figure 19: The definition of a complex observation (body
location description) by recursive subtype constraint of
BaseObservation.
The rest of the definition of LocationDescription basically
consists of a recursive sequence of a “BodySite” linked to
13. PatientObservation ::= BaseObservation ( WCS { PulmonaryExam ::= PatientExamEvent ( WCS {
..., ...,
commonMods ( WITH TYPES { sharedContext ( WITH TYPES {
ReferenceRange OPTIONAL, ...,
AbnormalFlag OPTIONAL, LocationDescription ABSENT }),
ObjectStatus OPTIONAL, eventContent (WCS {
DeltaFlag OPTIONAL, observations ( WITH TYPES {
ReportFlag OPTIONAL, Rhonchi,
LocationDescription OPTIONAL }) }) Rales }) }) })
Figure 21: The definition of PatientObservation by subtype Rhonchi ::= PatientObservation ( WCS {
...,
constraint of BaseObservation. codedObsVal(Presence),
commonMods ( WITH TYPES {
Figure 22 illustrates how PatientExamEvent and ...,
PatientObservation can be combined to define a more specific ReferenceRange ABSENT,
LocationDescription }),
type of patient exam, in this case a blood pressure battery obsMods ( WITH TYPES {
(“BPBattery”). In BPBattery the observation component is Pitch OPTIONAL,
made explicit by restricting it to systolic (SysBP) and diastolic Severity OPTIONAL }) })
blood pressure (DiaBP) observations. In addition, both SysBP Rales ::= PatientObservation ( WCS {
and DiaBP are to be resulted using units of “mmHg.” Again, ...,
decUObsVal is shorthand for an observation that has been codedObsVal(Presence),
commonMods ( WITH TYPES {
constrained to be a decimal value measured in the specific units ...,
of “mmHg.” ReferenceRange ABSENT,
LocationDescription }),
Finally, Figure 23 illustrates a more complex type of obsMods ( WITH TYPES {
PatientExamEvent, named “PulmonaryExam.” In this Quality OPTIONAL,
example, LocationDescription is excluded for use at the event Severity OPTIONAL }) })
level (by use of the keyword ABSENT), but it is required to be Figure 23: The definition of PulmonaryExam (a set of
present with each observation. This allows the location of each pulmonary examination findings) by subtype constraint of
observation to be described individually. PatientExamEvent, and by subtype constraint of
Figure 23 suggests that PulmonaryExam corresponds to PatientObservation.
possible findings encountered during an examination of the
In the same way, if “rales” are detected, the “quality” and
lungs, such as “Rhonchi” and “Rales.” The PulmonaryExam
“severity” of the rales can be stored. Notice that the location
event is instantiated by acknowledging the presence or not of
description is a required modifier for both Rhonchi and Rales,
each observation. If a given observation is present, the
i.e., if either rhonchi or rales are present, their anatomical
characteristics of this observation can also be recorded. For
locations must be determined. Likewise, the common modifier
instance, if “rhonchi” is present, the “pitch” and “severity” can
ReferenceRange does not apply in either case and is marked as
be optionally represented.
absent. These two examples are fairly simple, but show the
BPBattery ::= PatientExamEvent ( WCS {
process whereby very complex clinical data models can be built
..., from BaseObservation and BaseEvent.
eventContent (WCS {
observations ( WITH TYPES {
SysBP, Practical Experience
DiaBP }) }) })
SysBP ::= PatientObservation ( WCS { We have implemented an electronic medical record system
decUObsVal(MillimetersOfMercury) })
using ASN.1. The exact model used was an earlier version of
DiaBP ::= PatientObservation ( WCS { the clinical model presented in this paper. The patient
decUObsVal(MillimetersOfMercury) }) repository was implemented using a client-server architecture
Figure 22: The definition of BPBattery (a blood pressure and using a relational database as the primary data store. We
battery) by subtype constraint of PatientExamEvent, and by previously published a description of the system architecture
subtype constraint of PatientObservation. [24]. In the first design of the data repository, patient events
were to be stored as packed PTXT strings. (PTXT is the
hierarchical encoding system used in the HELP system [25]. In
the actual implementation of the system, the packed PTXT
strings were replaced by BER encoded ASN.1 strings. The
14. BER strings are stored in a variable length character field in the return a list of all classes to which the concept belongs.
relational database. In addition to the patient event table, there Because the RelSFormNumId is present, the application can
are a number of associated relational tables that expose partially also display the exact representation that was seen by the user at
normalized views of the patient data to standard SQL query data entry time. This rich set of behaviors evolves from the
tools. The associated tables also act as indexes to the ASN.1 combination of a simple ASN.1 structure, a set of vocabulary
objects contained in the patient event table. The ASN.1 strings tables, and a set of library routines that implement behaviors
are created and modified using a standard library of routines based on the structure.
written in C and C++. Application developers can interact with
A second issue is the alternative approach that ASN.1
the ASN.1 objects using C, C++, or Microsoft OLE interfaces.
subtyping offers in the creation of new object types. In standard
Three clinical systems have recently begun live operations
object-oriented modeling, specialized types are created by
using this architecture.
inheriting the attributes and behaviors of the parent type and
then adding additional attributes. This approach could be
Discussion implemented using CHOICE constructs in ASN.1. Our
approach has been to create a parent type that includes all of the
We have found ASN.1 to be a very flexible and robust attributes that can exist in any of the children, and then create
language for representing and implementing a clinical the children using the extensive subtype constraint language of
information model. Our approach has been to develop a small ASN.1. The different approaches offer tradeoffs in
number of primitive types that are then used to construct implementation. Because of the large number of object types
composite data types of progressively greater complexity. This that exist in clinical medicine, the standard approach can lead
strategy has uncovered some interesting issues related to ASN.1 to an unwieldy number of object classes. For example, it is easy
and also object oriented modeling in general. to think of a hematocrit measurement as an object class, since
there could be many hematocrits associated with a given
One issue is that ASN.1 is not a true object oriented language. patient, there are expectations about the range of values that it
That is, ASN.1 only defines the data structures related to an can take, it has a specific unit of measure, etc. However, if
object, not its methods or behaviors. However, the ASN.1 data everything that is at the same level of abstraction in clinical
structure is very flexible and can be used as the storage medicine as hematocrit becomes an object class, there would be
structure for persistent data related to the object. thousands of object classes.
In our case, the true clinical objects were implemented using By contrast, a single ASN.1 type could be implemented, i.e.
Microsoft OLE Automation technology, but the internal storage BaseObservation, and then thousands of subtypes would be
structure used ASN.1. Methods and behaviors related to the created for objects at the level of hematocrit. The subtype
ASN.1 objects were implemented in programs written in C and information is only used in the general implementation when a
C++. The goal was to encapsulate behaviors at the appropriate new instance of hematocrit or one of the other subtypes is
level, and then add more complex behaviors as the objects created. Validation is simple because it follows directly from
become more sophisticated. For instance, there are low-level ASN.1 subtype notation. After validation, the subtype can be
routines that subtract and compare two DateTime items. At the manipulated simply as its supertype, which in this case is
level of Attribute, these routines are still utilized to compare BaseObservation. The ASN.1 subtyping approach has provided
two DateTimeAttributes, but the higher level functions must an important simplification in the implementation of our
also handle the open-ended uncertainty allowed at the level of patient data repository.
Attribute. The lower-level functions are the building blocks for
higher-level behaviors. Another particular advantage of ASN.1 is that it integrates
coded data types (and hence vocabulary) with other primitive
Coded data, in particular, benefits from behaviors types in a consistent modeling framework. Coded data
implemented around a simple data structure. At most, a coded elements are just one more primitive type in the system. Coded
data element in our system can consist of 3 elements: an Ncid, a data has its own set of behaviors, but it lives side by side with
RelSFormNumId, and Text. However, a set of programs, using numbers, dates, times, and images. Rather than using one kind
information stored in the vocabulary tables of the system, can of infrastructure to manage coded data and another to manage
implement very useful behaviors on the coded data types. the other types of patient data, there is a single consistent tool
These routines allow an application to display coded data in used for all patient data. At the level of Attribute, the model
different languages, to translate between languages, and to explicitly shows how uncertainty and negation are handled. At
display a representation that is determined by other parameters the level of BaseObservation, the explicit relationship of
of the runtime environment. By reference to the semantic modifiers to the observation is shown. Eventually the clinical
network in the vocabulary tables, the routines can determine information model builds to the level of patient events, patient
whether a given concept is a member of a particular class, or transactions, and inter-process messages, all of which are
15. expressed using a common language. Because of the tight One of our objectives in having a formally defined model was
connection between ASN.1 syntax and ASN.1 encoding, the to support application development. In our setting, applications
storage structure of any element is known as soon as its abstract include data entry programs, data review programs, clinical
syntax is defined. alert modules, decision logic, and database queries to support
clinical research and quality assurance. The clinical model is
A final advantage of ASN.1 is that it is platform independent.
structured so that the model can be browsed by application
Data encoded using ASN.1 is easy to pass from client to server
developers. By entering a keyword it is possible to traverse the
or across computer-to-computer interfaces, because of the byte-
model from a term, to a coded attribute, to an observation, and
oriented nature of a BER-encoded string.
then to events that contain that observation. For example, by
There are limitations to the use of ASN.1. As previously entering the term penicillin, one could find that the concept
mentioned, it is not a true object modeling language. It does penicillin can be taken as the value of the coded attribute drug,
not include the definition of methods and behaviors as part of and that the drug attribute participates in pharmacy order
type definitions. events, medication administration events, drug allergy events,
and antibiotic susceptibility results. The definitions for these
A second limitation, at least in our implementation, is that
events can then be shown to the developer who can choose the
our type definitions cannot support a generative grammar. [9,
events of interest for her application. The information model is
26]. The subtype definitions, at the granularity that we have
in place to support this kind of browsing, but we are just now
created them, could be instantiated in nonsensical ways. This
creating the tools that will support this kind of user interface to
was intentional. The work of making all of the definitions
the model.
explicit before the system could be used was a practical
impediment to implementation. By leaving some vagueness in It would be interesting to do a comparison of not only the
the definitions, we risk storing data that is ill structured or expressive power of various modeling notations, but also the
nonsensical. On the other hand, we have been able to degree to which they simplify implementation of a system.
implement systems quickly, and we can continue to refine the Potential candidates for a comparison would include ASN.1,
subtype definitions as time permits. IDL (Interface Definition Language), Conceptual Graphs, UML
(Unified Modeling Language), and GRAIL. An impediment to
such a comparison would be defining a set of measurable
Conclusion
characteristics on which to base the comparison.
One discovery that we made in the development of our Finally, we would like to investigate other formally defined
clinical information model was only tangentially related to the clinical information models to see how we might incorporate
use of ASN.1. We found that there were a number of issues their information into our model. We recognize that we have
about medical information representation that could not be only just begun the process of defining all of the necessary
discussed intelligently without reference to a formal notation. ASN.1 types and subtypes that will be needed to support a
At some level the words normally used to communicate abstract comprehensive clinical information system. We would
ideas are not useful unless they can be rigorously defined by welcome the opportunity to collaborate with others who are
reference to objects represented in a formal syntax. This is developing compatible models in a formal notation.
certainly the case when discussing how medical vocabularies
interact with a clinical information model and a semantic References
network. Until an information model is formally defined, it is
difficult to draw consistent boundaries about what role each
1. Rector AL, Nowlan WA, Kay S, Goble CA, Howkins TJ. A
element plays in the total picture of medical information
Framework for Modelling the Electronic Medical Record.
representation.
Methods of Information in Medicine 1993;32(2):109-19.
There are a number of issues in which we are interested, but
2. Sowa JF. Toward the Expressive Power of Natural
have not had time to develop or pursue. We can see relevance
Language. In: Sowa JF, ed. Principles of Semantic
of ASN.1 subtyping as a mechanism to represent the
Networks. San Mateo, CA: Morgan Kaufmann Publishers,
decomposition of multi-word medical expressions into more
Inc., 1991:157-89.
atomic elements, but have not finalized the model that would
implement a solution. The approach that we envision is to 3. Campbell KE, Das AK, Musen MA. A Logical Foundation
progressively define more specific subtypes of our existing data for Representation of Clinical Data. Journal of the
types. For example, LocationDescription (Figure 19), could be American Medical Informatics Association 1994;1(3):218-
used to show the association between the expression “right 32.
upper arm” and its decomposition as BodyAnatomySite =
“arm,” BodyLevel = “upper,” and BodySide = “right.”