1. Semantic Models for CDISC Based
Standards and Metadata Management
Presented at CDISC Interchange Europe, Stockholm, 19 April
2012, by
Kerstin Forsberg, R&D, AstraZeneca
Frederik Malfait, IMOS Consulting and Hoffmann-La Roche
© CDISC 2012 1
2. Key Message
• Things converge to create new and unique
opportunities.
The coverage and maturity of existing CDISC standards.
The establishment of these standards within the
industry.
The use of these standards as a foundation for metadata
driven systems.
The upcoming role of semantic web standards and
linked data principles.
• See also presentation and blog post from last
year’s conference: Linking Clinical Data Standards
3. Two real-world uses of semantic web
standards and linked data principles
4. Today’s Situation
• “Not if and when, but how” to best adopt CDISC
based data standards is becoming the leading
question.
• We see a variety of CDISC standards at different
levels of maturity, not linked together and
published in different formats.
• Sponsors are faced with challenges on all levels:
architecture, process, and application.
5. An Emerging Insight
• The CDISC standards are all about the meaning of
what is studied in the biological and clinical reality
(often referred to as concepts).
• How these concepts are represented as data
elements from protocol to submission, and beyond.
• We are dealing with semantics and metadata for
biomedical and clinical research knowledge and
data.
• “Put semantic into the semantic”
Use semantic web standards
and linked data principles.
6. RDF Triples
• Resource Description Framework (RDF)
A general model of how any piece of data, and
representations of knowledge, can be expressed
as so called triples.
subject     predicate      object (or value)
Stockholm   type           place
Stockholm   capital        Sweden
Stockholm   subject        Port cities in Sweden
Stockholm   areaCode       “+46-8”
Stockholm   primaryTopic   “http://en.wikipedia.org/wiki/Stockholm”
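
As a minimal sketch, a few of the triples on this slide can be held as plain Python tuples and queried by pattern. The `match` helper is a hypothetical name introduced here for illustration, not part of any RDF library; a real system would more likely use an RDF toolkit and full URIs instead of bare strings.

```python
# Each triple is a (subject, predicate, object) tuple, taken from this slide.
triples = [
    ("Stockholm", "type", "place"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


print(match(triples, predicate="capital"))
# prints [('Stockholm', 'capital', 'Sweden')]
```

Leaving every argument as `None` returns all triples, which is the simplest way to dump the whole data set.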
7. RDF Triples
• Triples can be aggregated into graphs with subjects
and objects as nodes, and predicates as arcs.
[Figure: graph with Stockholm as the central node, linked by the arcs type to City, capital to Sweden, subject to Port cities in Sweden, areaCode to “+46-8”, and primaryTopic to “http://en.wikipedia.org/wiki/Stockholm”.]
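
The aggregation step can be sketched in Python by indexing triples per subject, so that each node lists its outgoing arcs. `to_graph` is a hypothetical helper name used only for this illustration.

```python
from collections import defaultdict

# Triples from the slide: subjects and objects become nodes, predicates become arcs.
triples = [
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
]


def to_graph(triples):
    """Index triples as node -> list of (arc label, target node)."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


graph = to_graph(triples)
for predicate, obj in graph["Stockholm"]:
    print(f"Stockholm --{predicate}--> {obj}")
```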
8. RDF Triples
• Graphs of triples can be extended across different
sources and for different purposes.
[Figure: the Stockholm graph extended with additional nodes (Country, Gothenburg, CDISC, CDISC Interchange EU 2012) and additional type and subject arcs.]
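
One way to picture extending a graph across sources: if each source is a set of triples, merging is a set union, and a query can then draw on both. This is a sketch under stated assumptions. The split into `source_a` and `source_b`, and the arcs placed in `source_b`, are illustrative guesses and are not read off the slide.

```python
# First source: the Stockholm triples from the earlier slides.
source_a = {
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
}

# Second source: assumed arcs, for illustration only.
source_b = {
    ("Sweden", "type", "Country"),
    ("Gothenburg", "subject", "Port cities in Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),  # also present in source_a
}

# Identical triples collapse, so the union holds 5 triples, not 6.
merged = source_a | source_b

port_cities = sorted(
    s for (s, p, o) in merged
    if p == "subject" and o == "Port cities in Sweden"
)
print(port_cities)
# prints ['Gothenburg', 'Stockholm']
```

Merging only works this simply when both sources use the same identifier for the same thing, which is the role URIs play in RDF.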
9. RDF Triples
• RDF Schema and the RDF based Web Ontology
Language (OWL) add a typing mechanism to
classify subjects and objects into hierarchies.
[Figure: a class hierarchy built from subClass arcs (Thing, Place, Organization, Event, Adm.Area, Business Event, City, Country) placed above the extended Stockholm graph, with type arcs linking nodes such as Stockholm and Sweden to their classes.]
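
The typing mechanism can be sketched as follows: given subClass links, every superclass of a node's asserted type can be derived. The exact parent of each class below is an assumption modeled on the class names in the figure, and the sketch allows only one parent per class, which RDF Schema and OWL do not require.

```python
# Assumed subClass links (child -> parent), using the class names from the figure.
sub_class_of = {
    "Place": "Thing",
    "Organization": "Thing",
    "Event": "Thing",
    "Adm.Area": "Place",
    "City": "Adm.Area",
    "Country": "Adm.Area",
    "Business Event": "Event",
}

# Asserted type arcs for two instances.
instance_type = {"Stockholm": "City", "Sweden": "Country"}


def all_types(instance):
    """Return the asserted type followed by every inherited superclass."""
    types = []
    current = instance_type.get(instance)
    while current is not None:
        types.append(current)
        current = sub_class_of.get(current)
    return types


print(all_types("Stockholm"))
# prints ['City', 'Adm.Area', 'Place', 'Thing']
```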
10. RDF Triples
• Google, Bing (Microsoft) and Yahoo use OWL to
publish a joint vocabulary.
[Figure: class hierarchy built from subClass arcs over the classes Thing, Place, Organization, Event, Adm.Area, Business Event, City, and Country.]
Example: http://schema.org/City
11. RDF Triples
• NCI uses OWL to publish the NCI Thesaurus (the
source for CDISC’s controlled terminologies) in RDF/XML format.
[Figure: the concept Hemoglobin Measurement, linked by subClass and Has NCIHD Parent arcs to Hematology Laboratory Test Procedure, and by Concept in Subset arcs to CDISC Laboratory Test Name Terminology and CDISC Laboratory Test Terminology.]
definition: “A quantitative measurement of the amount of
hemoglobin present in a sample.”
Source: NCI Thesaurus
http://ncicb.nci.nih.gov/download/evsportal.jsp
12. Linked Open Data Cloud
http://lod-cloud.net/
Richard Cyganiak and Anja Jentzsch
13. Real world use
• Two examples of how sponsors have started to
use semantic web standards and apply linked data
principles.
AstraZeneca:
• Integrative Informatics (i2) program, establishing the
components that let a Linked Data cloud grow across
AstraZeneca R&D.
Roche:
• Implementing an internally built metadata repository (MDR).
14. Roche Biomedical MDR
Schema Architecture
[Figure: schema architecture with three areas (CDISC Standards, Metadata Management, Knowledge Management) and a legend distinguishing Production from Partial / Future components.]
15. Roche Biomedical MDR
Content
• External content
SDTM 1.2, SDTMIG 3.1.2
NCI Thesaurus, CDISC Controlled Terminology
• Integrated Data Standards, Roche and Genentech
Safety and every Roche TA, ~ 2000 data elements
Data Collection and Data Tabulation
• Value level metadata
Lab measurements, Unit conversions, Questionnaires
• Looking at metadata for
SDTM Conformance Checking, Biomarker (HGNC), …
16. Roche Biomedical MDR
Information Architecture
[Figure: layered information architecture with a legend distinguishing Production, Partial, and Future components. Layers, top to bottom:]
• Transformation Models
• Study & Project Level Metadata
• Roche Global Data Standards
• CDISC Data Standards: PRM, CDASH, SDTM, ADaM, Define
• Biomedical Domain Model: BRIDG, SHARE, NCI Thesaurus, Data Element Concepts
The layers span five stages: Study Design, Data Collection, Data Tabulation, Data Analysis, Regulatory Submission.
17. Roche Biomedical MDR
System Architecture
[Figure: system architecture with four components: Content Management, Content Publishing, Metadata Repository, and Single Point of Access.]
18. Roche Biomedical MDR
Value Proposition
• Current
Integrated knowledge, metadata, and data standards
management
System independent information asset
Single point of access
• Future
Leverage the SOA interface to create a framework for
integrated metadata driven workflow
Integrate MDR and Component Based Authoring
capabilities (study design, protocol, CSR)
19. Key Message
• We now see all of these things converge to create
new and unique opportunities.
The coverage and maturity of existing CDISC standards.
The establishment of these standards within the industry
at large.
The use of these standards as a foundation for metadata
driven systems.
The upcoming role of semantic web standards and
linked data principles.
44. Oh well, if you really
want that Excel sheet
Editor's Notes
Red marks the use of examples from http://schema.org/, the joint effort by Google, Yahoo and Bing (Microsoft) that webmasters can use to mark up their pages in ways recognized by these search engines. See http://schema.rdfs.org/ for an RDF Schema representation of it.