CDISC2RDF
Making clinical data standards linkable, computable and queryable
The CDISC2RDF initiative exploits Semantic
Web standards and Linked Data principles for
clinical data standards from CDISC (Clinical
Data Interchange Standards Consortium).
Introduction
Clinical data standards have been identified as one of five
initial areas by TransCelerate BioPharma, the non-profit
organization formed by ten leading pharmaceutical companies
to accelerate the development of new medicines.
The European Medicines Agency (EMA) is developing a policy
on the proactive publication of clinical-trial data in the interests
of public health, including clear and understandable clinical
data formats. The FDA has a long-held goal of making better
use of submitted clinical trial data. Pharmaceutical companies
have attempted to use submission standards to create study
repositories.
Exploiting Semantic Web technologies stands to simplify the
interpretation of individual studies, and improve cross-study
integration.
Kerstin Forsberg, Informatics Scientist
kerstin.l.forsberg@astrazeneca.com
Analysis, Informatics & Knowledge Engineering Practice, AstraZeneca, Sweden
CDISC2RDF Schemas
The first version of the core CDISC2RDF schemas was
intentionally developed to represent a minimal part of the
ISO 11179 model for metadata registries.
The Meta Model Schema (mms) represents the core Data
Description part of the ISO 11179 model, Part 3: Registry
metamodel and basic attributes.
From human readable documentation and “Text strings”
In the domain of clinical research, CDISC, a non-profit
organization, has developed standards for study design
(SDM), study data collection (CDASH), study data analysis
(ADaM), and submission to the regulatory bodies (SDTM).
These define a limited set of data elements with names
such as “RACE” that also have a value set derived from the NCI
Thesaurus. However, most of the data elements are
containers, either for contextual variables with names such as
“VSDATE” and “AEACN” (date of measurement of vital signs and
action taken for adverse events) or for
the results of measurements. What was measured is only indirectly indicated
in variables called “TESTCD” by a term, or rather a text string,
such as “DIABP”, “BMI”, or “HGB” representing the measurement
procedure, as listed in the so-called controlled terminologies
(CT) for SDTM (Study Data Tabulation Model).
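To illustrate why bare text strings are awkward to compute on, the following minimal Python sketch resolves TESTCD strings against a small in-memory code list. The dictionary contents, the sample record, and the function name resolve_test_code are illustrative assumptions, not an extract of the published controlled terminology.

```python
# Hypothetical, hand-written excerpt of a code list. The real controlled
# terminology is published by CDISC and NCI EVS and is much larger.
TEST_CODES = {
    "DIABP": "Diastolic Blood Pressure",
    "BMI": "Body Mass Index",
    "HGB": "Hemoglobin",
}


def resolve_test_code(code: str) -> str:
    """Return the label for a TESTCD string, or raise KeyError if unknown."""
    key = code.strip().upper()
    if key not in TEST_CODES:
        raise KeyError(f"unknown test code: {code!r}")
    return TEST_CODES[key]


# A findings record carries only the text string, not a link to its meaning.
# The value and unit below are made up for the sketch.
record = {"TESTCD": "DIABP", "value": 80, "unit": "mmHg"}
label = resolve_test_code(record["TESTCD"])
print(f"{record['TESTCD']} = {label}")
```

The meaning of the string lives entirely in a lookup table that every consumer has to obtain and maintain separately, which is the gap that URI-identified terms are meant to close.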
Today, all data standards and controlled terminologies are
published as PDFs, Excel files, and traditional XML by CDISC
and NCI EVS.
Figure: from human-readable documentation in PDFs and Excel files (and some XML), via the CDISC2RDF schemas (based on the core of ISO 11179), to machine-processable linked data structured as RDF triples.
Schemas shown in the figure:
Meta model schema (mms): data definition, the core part of ISO 11179
Controlled Terminology schema (cts): a few additional properties from the NCI Thesaurus export
SDTM 1.2 schema (sdtms): classifiers (data element roles and types)
SDTM 3.1.2 IG schema (sdtmigs): a few additional properties
To machine processable RDF triples and “URIs”
The first deliverable from the CDISC2RDF project was
published in early 2013. It contained OWL/RDF files (triples) for
CDISC submission standards: SDTM 1.2, Implementation
Guideline (IG) 3.1.2 and Controlled Terminology (CT), plus
CTs for data capture standards (CDASH) and analysis
standards (ADaM).
Each data element (column), dataset, code list, classifier, etc.
has been assigned a URI (Uniform Resource Identifier);
example URIs are listed near the end of this poster.
The SDTM schema (sdtms) version 1.2 defines additional
classifiers in the underlying model, such as the data
element role Record Qualifier and the classifier Expected variable.
The Controlled Terminology schema (cts) adds to the
Meta Model Schema (mms) a few additional
classifications and properties to represent the existing NCI
Thesaurus EVS export.
The classes and properties are used to annotate the
Excel column headers, and the standard import
functionality in the TopBraid Composer tool has been
used to create the RDF triples in XML, Turtle, and JSON
formats.
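The idea of annotating column headers and generating triples from spreadsheet rows can be sketched in plain Python. This is a standard-library stand-in and not the TopBraid Composer import itself. The subject namespace is taken from the example URIs on this poster, while the property namespace (example.org), the header names, and the two sample rows are assumptions made for illustration.

```python
import csv
import io

# Namespace taken from the example URIs on this poster.
STD = "http://rdf.cdisc.org/sdtmig-3-1-2/std#"
# Hypothetical property namespace, used only for this sketch.
EX = "http://example.org/mms-sketch#"

# Each spreadsheet column header is annotated with the property it maps to.
HEADER_TO_PROPERTY = {
    "Variable Label": EX + "label",
    "Type": EX + "dataType",
}

# Stand-in for an exported spreadsheet (illustrative rows, not the standard).
SHEET = """Domain,Variable Name,Variable Label,Type
AE,AEACN,Action Taken with Study Treatment,Char
AE,AETERM,Reported Term for the Adverse Event,Char
"""


def escape_literal(text: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(sheet_text: str) -> list:
    """Turn each row into one N-Triples line per annotated column."""
    triples = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        subject = f"{STD}Column.{row['Domain']}.{row['Variable Name']}"
        for header, prop in HEADER_TO_PROPERTY.items():
            literal = escape_literal(row[header])
            triples.append(f'<{subject}> <{prop}> "{literal}" .')
    return triples


for line in rows_to_ntriples(SHEET):
    print(line)
```

N-Triples is used here only because it can be written line by line without a library; the project itself reports RDF/XML, Turtle, and JSON output.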
CDISC2RDF started as a cross-pharma pre-competitive
project with AstraZeneca, Roche,
TopQuadrant, Free University of Amsterdam
and W3C HCLS to showcase the use of
Semantic Web standards and Linked Data
principles.
It is now incorporated in the Semantic
Technology project, part of the FDA/PhUSE
working group on Emerging Technologies with
representatives across FDA, CDISC, pharmas,
CROs and software vendors.
We want to push back to CDISC and NCI, and other public and internal standards
groups, and show in practice how to “Use (semantic web) standards for standards”.
Example URIs for a column, a table (dataset), and a classifier:
http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEACN
http://rdf.cdisc.org/sdtmig-3-1-2/std#Table.AE
http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier
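As a rough illustration of how such identifiers can be handled programmatically, the sketch below splits the three example URIs into a namespace and a local name. The reading of the fragment as "Kind.Part.Part" is inferred from these three examples only and is an assumption, as is the helper name split_uri.

```python
from urllib.parse import urldefrag

EXAMPLE_URIS = [
    "http://rdf.cdisc.org/sdtmig-3-1-2/std#Column.AE.AEACN",
    "http://rdf.cdisc.org/sdtmig-3-1-2/std#Table.AE",
    "http://rdf.cdisc.org/sdtm-1-2/schema#Classifier.RecordQualifier",
]


def split_uri(uri: str) -> tuple:
    """Split a hash URI into (namespace, kind, list of local name parts)."""
    base, fragment = urldefrag(uri)
    # Assumed pattern: the first dot-separated token names the kind of
    # resource, the remaining tokens identify it within the standard.
    kind, *parts = fragment.split(".")
    return base + "#", kind, parts


for uri in EXAMPLE_URIS:
    namespace, kind, parts = split_uri(uri)
    print(f"{kind:<10} {'.'.join(parts):<16} in {namespace}")
```

Because the namespace encodes the standard and its version, the same local name (for example AEACN) can be told apart across versions without any side lookup.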
All OWL/RDF files, schemas and standards
are available on https://code.google.com/p/cdisc2rdf/

CDISC2RDF poster for Conference on Data Integration in the Life Sciences 2013
