Semantic Data Normalization for Efficient Clinical Trial Research
September 8th, 2016
Outline
• The specifics of clinical data
• What is RDF and how can we use it together with text analytics (TA)?
• Semantic annotations and their limitations
• What is semantic data normalization?
• Current state and next steps
Clinical Data
• Unstructured (Semi-Structured)
• Abundant
• Redundant
• Ambiguous
• Aggregated
To transform your clinical data into information, and even knowledge, you have to analyze it!
… but before that, you have to make it ready for analysis!
What is RDF
The RDF data model resolves all syntax-level ambiguities
It helps you express all data in a common data model
ID   GRAA_HUMAN     STANDARD;      PRT;   262 AA.
AC   P12544;
DT   01-OCT-1989 (Rel. 12, Created)
DT   01-OCT-1989 (Rel. 12, Last sequence update)
DT   15-JUN-2002 (Rel. 41, Last annotation update)
DE   Granzyme A precursor (EC 3.4.21.78) (Cytotoxic T-lymphocyte proteinase
DE   1) (Hanukkah factor) (H factor) (HF) (Granzyme 1) (CTL tryptase)
DE   (Fragmentin 1).
GN   GZMA OR CTLA3 OR HFSP.
OS   Homo sapiens (Human).
<PubmedArticle>
  <MedlineCitation Owner="NLM" Status="In-Process">
    <PMID Version="1">21500419</PMID>
    <DateCreated>
      <Year>2011</Year>
      <Month>04</Month>
      <Day>15</Day>
    </DateCreated>
    <Article PubModel="Print">
      <Journal>
        <ISSN IssnType="Electronic">1520-6882</ISSN>
        <JournalIssue CitedMedium="Internet">
          <Volume>82</Volume>
          <Issue>20</Issue>
          <PubDate>
            <Year>2010</Year>
            <Month>Oct</Month>
            <Day>15</Day>
          </PubDate>
        </JournalIssue>
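As a rough illustration of what "a common data model" means here, the two records above can both be reduced to subject-predicate-object triples. The sketch below uses plain Python and no RDF library. The `ex:` predicate names and the `uniprot:` / `pmid:` subject prefixes are assumptions made for this example, not a vocabulary from the talk, and the XML record is shortened and closed so that it parses.

```python
import xml.etree.ElementTree as ET

# Shortened copies of the two records shown above.
FLAT_RECORD = """\
ID   GRAA_HUMAN     STANDARD;      PRT;   262 AA.
AC   P12544;
GN   GZMA OR CTLA3 OR HFSP.
OS   Homo sapiens (Human).
"""

XML_RECORD = """\
<PubmedArticle><MedlineCitation Owner="NLM" Status="In-Process">
  <PMID Version="1">21500419</PMID>
  <Article PubModel="Print"><Journal>
    <ISSN IssnType="Electronic">1520-6882</ISSN>
    <JournalIssue CitedMedium="Internet">
      <Volume>82</Volume><Issue>20</Issue>
    </JournalIssue>
  </Journal></Article>
</MedlineCitation></PubmedArticle>
"""


def flat_file_to_triples(text):
    """Turn a line-coded protein record into (subject, predicate, object) triples."""
    fields = {}
    for line in text.splitlines():
        code, _, value = line.partition(" ")
        fields.setdefault(code, []).append(value.strip())
    subject = "uniprot:" + fields["AC"][0].rstrip(";")
    triples = [(subject, "ex:entryName", fields["ID"][0].split()[0])]
    for gene in fields["GN"][0].rstrip(".").split(" OR "):
        triples.append((subject, "ex:geneName", gene))
    triples.append((subject, "ex:organism", fields["OS"][0].rstrip(".")))
    return triples


def pubmed_xml_to_triples(text):
    """Turn a PubMed XML citation into triples with the same shape."""
    root = ET.fromstring(text)
    subject = "pmid:" + root.findtext(".//PMID")
    return [
        (subject, "ex:issn", root.findtext(".//ISSN")),
        (subject, "ex:volume", root.findtext(".//Volume")),
        (subject, "ex:issue", root.findtext(".//Issue")),
    ]


# Two different syntaxes end up in one list of uniform statements.
triples = flat_file_to_triples(FLAT_RECORD) + pubmed_xml_to_triples(XML_RECORD)
for triple in triples:
    print(triple)
```

Once both sources share this shape, the same lookup code works on either of them, which is the practical benefit the slide points at.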
Linked Data
How well interlinked is the linked data cloud?
• Many interesting queries are difficult to express in SPARQL
• String functions cannot be indexed
• Identifiers are often misplaced
[Figure: identifiers shown on the slide: P29965, UNIPROT, CD40L_HUMAN, cpath:CPATH-94138, cpath:CPATH-LOCAL-8467065, cpath:CPATH-LOCAL-8749236, uniprot:P29965, CD40L_HUMAN, TNF5_HUMAN, CD4L_HUMAN]
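One way to read the point about string functions: an exact-match lookup can be answered from an index, while a substring or case-folding filter has to inspect every literal. The sketch below mimics that in plain Python, with a dictionary standing in for the index. The `ex:label` predicate and the pairing of subjects with labels are made up for illustration, and nothing here describes how a particular triple store is implemented.

```python
from collections import defaultdict

# Identifiers come from the slide; which label belongs to which subject
# is an assumption made only for this sketch.
TRIPLES = [
    ("uniprot:P29965", "ex:label", "CD40L_HUMAN"),
    ("uniprot:P29965", "ex:label", "TNF5_HUMAN"),
    ("uniprot:P29965", "ex:label", "CD4L_HUMAN"),
    ("cpath:CPATH-94138", "ex:label", "P29965"),
    ("cpath:CPATH-LOCAL-8467065", "ex:label", "UNIPROT"),
]

# Index from object value to subjects: what an exact-match lookup can use.
by_object = defaultdict(set)
for subject, _predicate, obj in TRIPLES:
    by_object[obj].add(subject)


def exact_lookup(value):
    """One dictionary access, independent of the number of triples."""
    return by_object.get(value, set())


def substring_scan(fragment):
    """A CONTAINS-style filter: every literal has to be inspected."""
    needle = fragment.lower()
    return {s for s, _p, o in TRIPLES if needle in o.lower()}


print(exact_lookup("CD40L_HUMAN"))
print(sorted(substring_scan("p29965")))
```

The scan is also what recovers the misplaced identifier: the accession P29965 is only found because it was stored as a label string rather than as a proper link.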
Semantic Annotations
[Figure: article pmid:17714090 (Ian A Yang, Clinical and experimental pharmacology …) linked to concept annotations: umls:C0035204, COPD, Bronchial Diseases, Respiration Disorders, umls:C0006261, Chronic Obstructive Airway Diseases, Asthma, umls:C000496]
Semantic Annotations
• Good for:
– Generating machine-readable metadata
– Semantic indexing of large sets of documents
– Providing additional background knowledge
• Limitations:
– Incomplete knowledge extraction
– Does not fully capture the context
Semantic Data Normalization
• What is it?
– A text analytics approach that aims to capture the full context of the information and to provide clear references to concepts/objects, so that machines can interpret it easily.
• How do we do it?
– Work at the sentence level
– Extract the key phrases from the sentence
– Identify the main concept
– Identify all the qualifiers and negations
– Model the extracted data as RDF
Semantic Data Normalization
• Condition text:
– “Advanced Biliary Tract Adenocarcinoma” (Study ID = NCT01506973)
• Text Analysis
– One phrase is identified in the Condition text
– Advanced Biliary Tract Adenocarcinoma
• Data Schema
– One annotation object is created
– Main concept is “Adenocarcinoma”
– Qualifier concepts are “Advanced” and “Biliary tract”
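The example above can be sketched in a few lines: look up known terms in the phrase, keep the disease term as the main concept, and collect the rest as qualifiers. This is a toy dictionary lookup, not the text analytics pipeline from the talk. The concept codes are the ones that appear in the deck, but pairing each code with a term and a role is an assumption, and `LEXICON`, `ConditionAnnotation` and `annotate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Toy lexicon: term -> (concept code, role). The code-to-term pairing is assumed.
LEXICON = {
    "adenocarcinoma": ("C0001418", "disease"),
    "advanced": ("C0205179", "qualifier"),
    "biliary tract": ("C0005423", "qualifier"),
}


@dataclass
class ConditionAnnotation:
    phrase: str
    main_concept: Optional[str] = None
    qualifiers: List[str] = field(default_factory=list)


def annotate(phrase):
    """Greedy longest-match lookup of lexicon terms inside one phrase."""
    annotation = ConditionAnnotation(phrase=phrase)
    tokens = phrase.lower().split()
    i = 0
    while i < len(tokens):
        # Try the longest span starting at token i first.
        for j in range(len(tokens), i, -1):
            term = " ".join(tokens[i:j])
            if term in LEXICON:
                code, role = LEXICON[term]
                if role == "disease":
                    annotation.main_concept = code
                else:
                    annotation.qualifiers.append(code)
                i = j
                break
        else:
            i += 1  # no lexicon term starts at this token
    return annotation


result = annotate("Advanced Biliary Tract Adenocarcinoma")
print(result.main_concept, result.qualifiers)
```

Trying the longest span first is what keeps "Biliary Tract" together as one qualifier instead of two unrelated tokens.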
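NCT01506973 rdf:type ClinicalTrial
NCT01506973 ct:conditionText “Advanced Biliary Tract Adenocarcinoma”
NCT01506973 ct:conditionAnnotation ConditionAnnotationID
ConditionAnnotationID ca:hasDisease C0001418
ConditionAnnotationID ca:hasPhrase “Advanced Biliary Tract Adenocarcinoma”
ConditionAnnotationID ca:hasQualifiers QualifierGroupID
QualifierGroupID cg:hasQualifiers C0205179
QualifierGroupID cg:hasQualifiers C0005423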
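A minimal sketch of why this shape is useful: once the main concept and the qualifiers are separate nodes, a trial can be found by any combination of them rather than by matching the original string. The code below builds the statements for the example trial as plain Python tuples and runs one such lookup. The property names follow the slide; the generated node identifiers and the helper names are hypothetical.

```python
def condition_triples(trial_id, text, disease, qualifiers):
    """Build the condition annotation statements for one trial."""
    annotation = trial_id + "_condition_1"   # hypothetical node id
    group = annotation + "_qualifiers"       # hypothetical node id
    triples = [
        (trial_id, "rdf:type", "ClinicalTrial"),
        (trial_id, "ct:conditionText", text),
        (trial_id, "ct:conditionAnnotation", annotation),
        (annotation, "ca:hasDisease", disease),
        (annotation, "ca:hasPhrase", text),
        (annotation, "ca:hasQualifiers", group),
    ]
    triples += [(group, "cg:hasQualifiers", q) for q in qualifiers]
    return triples


def trials_with(triples, disease, qualifier):
    """Trials that have an annotation with the given disease and qualifier."""
    def objects(subject, predicate):
        return {o for s, p, o in triples if s == subject and p == predicate}

    hits = set()
    for trial, predicate, annotation in triples:
        if predicate != "ct:conditionAnnotation":
            continue
        if disease not in objects(annotation, "ca:hasDisease"):
            continue
        for group in objects(annotation, "ca:hasQualifiers"):
            if qualifier in objects(group, "cg:hasQualifiers"):
                hits.add(trial)
    return hits


graph = condition_triples(
    "NCT01506973",
    "Advanced Biliary Tract Adenocarcinoma",
    "C0001418",
    ["C0205179", "C0005423"],
)
print(trials_with(graph, "C0001418", "C0005423"))
```

In a real deployment the same question would be asked as a query against a triple store; the nested loops here only stand in for that.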
Demo Example
• Study Conditions
– Multiple phrases in a text
– Pre-coordinated concepts vs. post-coordinated
– Scoring of matching concepts
• Study Interventions
– Drug, route, form
– Drug dosage
• Adverse Events
– Normalization of AE
– Post-coordinated concepts
• Eligibility Criteria
– Semantic sectioning and categorization
– Negations
– Diseases, findings, treatments, age and gender
Intervention Annotation Model - Drugs
[Diagram: nodes NCT01506973 (rdf:type ClinicalTrial), NCT01506973_1_2, DrugAnnotationID, DrugDosageID, SingleDoseID, FrequencyID and PeriodID, with the values Value, Unit, Denominator Value and Denominator Unit. Properties: ct:hasIntervention, in:drugAnnotation, da:hasDrug, da:hasAdministrationRoute, da:hasAdministrationForm, da:hasDosage, do:hasSingleDose, do:hasFrequency, do:hasPeriod. Concept identifiers shown: 111418, SCTID:111418, SCTID:121681.]
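One possible reading of this model, written as Python dataclasses: a drug annotation points to a drug, an optional route and form, and a dosage made of a single dose, a frequency and a period. The class and field names are hypothetical, the way the value/unit/denominator fields are distributed is an assumption, and the dosage numbers in the usage lines are invented purely to exercise the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quantity:
    value: float
    unit: str


@dataclass
class Frequency:
    value: float               # number of administrations ...
    denominator_value: float   # ... per this many ...
    denominator_unit: str      # ... time units, e.g. "day"


@dataclass
class DrugDosage:
    single_dose: Quantity
    frequency: Frequency
    period: Quantity

    def total_amount(self):
        """Single dose times the number of administrations over the period."""
        if self.period.unit != self.frequency.denominator_unit:
            raise ValueError("period and frequency use different time units")
        administrations = (
            self.frequency.value * self.period.value / self.frequency.denominator_value
        )
        return Quantity(self.single_dose.value * administrations, self.single_dose.unit)


@dataclass
class DrugAnnotation:
    drug: str                  # concept identifier of the drug
    route: Optional[str]       # concept identifier of the administration route
    form: Optional[str]        # concept identifier of the administration form
    dosage: DrugDosage


# Invented dosage: 50 mg, twice per 1 day, for 14 days.
dosage = DrugDosage(Quantity(50, "mg"), Frequency(2, 1, "day"), Quantity(14, "day"))
annotation = DrugAnnotation(drug="111418", route=None, form=None, dosage=dosage)
total = annotation.dosage.total_amount()
print(total.value, total.unit)
```

Keeping value and unit as separate fields is what makes such arithmetic possible; a free-text dosage string would have to be parsed again for every comparison.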
Criteria Annotation Model
NCT01506973 rdf:type ClinicalTrial
NCT01506973 ct:hasCriteriaSection CriteriaSection
CriteriaSection rdf:type “Inclusion” / “Exclusion” / “Not defined”
CriteriaSection cs:hasText “… No extensive intraductal components on core biopsy, defined as intraductal carcinoma. Patients must not have recurrent invasive breast cancer. …”
CriteriaSection cs:hasCriterion Criterion
Criterion cr:hasText “Patients must not have recurrent invasive breast cancer.”
Criterion cr:hasAnnotation AnnotationId
AnnotationId rdf:type “Disease” / “Drug” / …
AnnotationId sa:Negation “True” / “False” / …
AnnotationId Property 1, Property 2, … Property N
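The sectioning and negation steps of this model can be sketched as follows: split the section text into sentence-level criteria, then flag each criterion that contains a negation cue. The cue list and the function names are assumptions made for this example and are far simpler than real negation detection. The third sentence in the sample text is made up to show a non-negated case.

```python
import re

# A small illustrative cue list, not the rules used in the talk.
NEGATION_CUES = re.compile(r"\b(no|not|without|absence of)\b", re.IGNORECASE)

SECTION_TEXT = (
    "No extensive intraductal components on core biopsy, "
    "defined as intraductal carcinoma. "
    "Patients must not have recurrent invasive breast cancer. "
    "Patients must be 18 years or older."  # made-up sentence, not from the slide
)


def split_criteria(section_text):
    """Split a criteria section into sentence-level criteria."""
    parts = re.split(r"(?<=[.!?])\s+", section_text.strip())
    return [part for part in parts if part]


def annotate_criterion(text):
    """Attach the text and a negation flag, using the slide's property names."""
    return {
        "cr:hasText": text,
        "sa:Negation": bool(NEGATION_CUES.search(text)),
    }


criteria = [annotate_criterion(c) for c in split_criteria(SECTION_TEXT)]
for criterion in criteria:
    print(criterion["sa:Negation"], criterion["cr:hasText"])
```

A cue list like this marks the whole sentence as negated; deciding which concept inside the sentence the negation applies to needs the phrase-level analysis described earlier.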
Current Status
• Work with ClinicalTrials.gov data as a public showcase
– > 215K clinical studies
– > 76 million RDF statements
• Coverage
– Conditions (197,154 objects): Diseases, Findings, Body locations, Qualifiers
– Interventions (rdf:type = ‘Drug’ and rdf:type = ‘Biologics’) (381,590 objects): Drugs, Dosages, Administration form, Administration route, Population group
– Adverse Events (1,226,754 objects): Diseases, Findings, Body locations, Qualifiers
– Criteria (semantic sectioning and categorization, negations) (7,216,361 objects): Diseases, Findings, Drugs, Population groups
• In total, more than 80 million RDF triples
How Can I Use This?
• Directly mine the public enhanced CT.gov version
• Apply the same approach to your internal clinical trials data
• Once the data is semantically normalized, you can “slice and dice” it as your use case requires
• Examples
– Top-down data exploration
– Linked data browsing
Next Steps
• Release an RDFized version of ClinicalTrials.gov
– Pre-loaded in GraphDB Free
– Pre-loaded on Ontotext S4 Cloud
– As an RDF serialization distribution
• Release all semantically structured information under a free-for-non-commercial-use license
• Extend the data schema to support not only concepts but also tokens which cannot be normalized to ontology instances
Thank You!
You can contact me by e-mail:
todor.primov@ontotext.com

Semantic Data Normalization For Efficient Clinical Trial Research

  • 1. Semantic Data Normalization For Efficient Clinical Trial Research September 8th, 2016
  • 2. Outline
      • The specifics of clinical data
      • What is RDF and how can we use it together with TA?
      • Semantic annotations and their limitations
      • What is semantic data normalization?
      • Current state and next steps
  • 3. Clinical Data
      • Unstructured (Semi-Structured)
      • Abundant
      • Redundant
      • Ambiguous
      • Aggregated
      To transform your clinical data into information, and even knowledge, you have to analyze it, but first you have to make it ready for analysis.
  • 4. What is RDF
      • The RDF data model resolves all syntax-level ambiguities
      • It helps you express all data in a common data model
      ID GRAA_HUMAN STANDARD; PRT; 262 AA. AC P12544; DT 01-OCT-1989 (Rel. 12, Created) DT 01-OCT-1989 (Rel. 12, Last sequence update) DT 15-JUN-2002 (Rel. 41, Last annotation update) DE Granzyme A precursor (EC 3.4.21.78) (Cytotoxic T-lymphocyte proteinase DE 1) (Hanukkah factor) (H factor) (HF) (Granzyme 1) (CTL tryptase) DE (Fragmentin 1). GN GZMA OR CTLA3 OR HFSP. OS Homo sapiens (Human).
      <PubmedArticle> <MedlineCitation Owner="NLM" Status="In-Process"> <PMID Version="1">21500419</PMID> <DateCreated> <Year>2011</Year> <Month>04</Month> <Day>15</Day> </DateCreated> <Article PubModel="Print"> <Journal> <ISSN IssnType="Electronic">1520-6882</ISSN> <JournalIssue CitedMedium="Internet"> <Volume>82</Volume> <Issue>20</Issue> <PubDate> <Year>2010</Year> <Month>Oct</Month> <Day>15</Day> </PubDate> </JournalIssue>
  • 5. Linked Data
      How well interlinked is the linked data cloud?
      • Many interesting queries are difficult to express in SPARQL
      • String functions cannot be indexed
      • Identifiers are often misplaced
      Diagram labels: P29965 UNIPROT CD40L_HUMAN cpath:CPATH-94138 cpath:CPATH-LOCAL-8467065 cpath:CPATH-LOCAL-8749236 uniprot:P29965 CD40L_HUMAN TNF5_HUMAN CD4L_HUMAN
  • 6. Semantic Annotations
      Diagram labels: pmid:17714090 umls:C0035204 COPD Bronchial Diseases Respiration Disorders umls:C0006261 Chronic Obstructive Airway Diseases Asthma umls:C000496 Ian A Yang Clinical and experimental pharmacology …
  • 7. Semantic Annotations
      • Good for:
          – Generation of machine-readable metadata
          – Semantic indexing of large sets of documents
          – Providing additional background knowledge
      • Limitations:
          – Incomplete knowledge extraction
          – Does not completely capture the context
  • 8. Semantic Data Normalization
      • What is it?
          – A text analytics approach that aims to capture the full context of the information and to provide clear references to concepts/objects, so that machines can interpret it easily.
      • How do we do it?
          – Work on sentence level
          – Extract the key phrases from the sentence
          – Identify the main concept
          – Identify all the qualifiers and negations
          – Model the extracted data as RDF
  • 9. Semantic Data Normalization
      • Condition text:
          – “Advanced Biliary Tract Adenocarcinoma” (Study ID = NCT01506973)
      • Text Analysis
          – One phrase is identified in the Condition text
          – Advanced Biliary Tract Adenocarcinoma
      • Data Schema
          – One annotation object is created
          – Main concept is “Adenocarcinoma”
          – Qualifier concepts are “Advanced” and “Biliary tract”
  • 10. Semantic Data Normalization
      • NCT01506973 rdf:type ClinicalTrial
      • NCT01506973 ct:conditionText “Advanced Biliary Tract Adenocarcinoma”
      • NCT01506973 ct:conditionAnnotation ConditionAnnotationID
      • ConditionAnnotationID ca:hasPhrase “Advanced Biliary Tract Adenocarcinoma”
      • ConditionAnnotationID ca:hasDisease C0001418
      • ConditionAnnotationID ca:hasQualifiers QualifierGroupID
      • QualifierGroupID cg:hasQualifiers C0205179
      • QualifierGroupID cg:hasQualifiers C0005423
  • 11. Demo Example
      • Study Conditions
          – Multiple phrases in a text
          – Pre-coordinated concepts vs. post-coordinated
          – Scoring of matching concepts
      • Study Interventions
          – Drug, route, form
          – Drug dosage
      • Adverse Events
          – Normalization of AE
          – Post-coordinated concepts
      • Eligibility Criteria
          – Semantic sectioning and categorization
          – Negations
          – Diseases, findings, treatments, age and gender
  • 12. Intervention Annotation Model - Drugs
      • Diagram nodes: NCT01506973, ClinicalTrial, NCT01506973_1_2, DrugAnnotationID, 111418, SCTID:111418, SCTID:121681, DrugDosageID, SingleDoseID, PeriodID, FrequencyID, Value, Unit, Denominator Value, Denominator Unit
      • Diagram properties: rdf:type, ct:hasIntervention, in:drugAnnotation, da:hasDrug, da:hasAdministrationRoute, da:hasAdministrationForm, da:hasDosage, do:hasSingleDose, do:hasPeriod, do:hasFrequency
  • 13. Criteria Annotation Model
      • NCT01506973 rdf:type ClinicalTrial
      • NCT01506973 ct:hasCriteriaSection CriteriaSection
      • CriteriaSection rdf:type “Inclusion”/“Exclusion”/“Not defined”
      • CriteriaSection cs:hasText “… No extensive intraductal components on core biopsy, defined as intraductal carcinoma. Patients must not have recurrent invasive breast cancer. …”
      • CriteriaSection cs:hasCriterion Criterion
      • Criterion cr:hasText “Patients must not have recurrent invasive breast cancer.”
      • Criterion cr:hasAnnotation AnnotationId
      • AnnotationId rdf:type “Disease”/“Drug”/…
      • AnnotationId sa:Negation “True”/“False”/…
      • AnnotationId Property 1, Property 2, Property N
  • 14. Current Status
      • Work with ClinicalTrials.gov data as a public showcase
          – > 215K clinical studies
          – > 76 million RDF statements
      • Coverage
          – Conditions (197,154 objects): Diseases, Findings, Body locations, Qualifiers
          – Interventions (rdf:type = ‘Drug’ and rdf:type = ‘Biologics’) (381,590 objects): Drugs, Dosages, Administration form, Administration route, Population group
          – Adverse Events (1,226,754 objects): Diseases, Findings, Body locations, Qualifiers
          – Criteria (semantic sectioning and categorization, negations) (7,216,361 objects): Diseases, Findings, Drugs, Population groups
      • In total, more than 80 million RDF triples
  • 15. How Can I Use This?
      • Directly mine the public enhanced CT.gov version
      • Apply the same approach to your internal clinical trials data
      • Once the data is semantically normalized, you can “slice and dice” it as your use case requires
      • Examples
          – Top-down data exploration
          – Linked data browsing
  • 16. Next Steps
      • Release an RDFized version of ClinicalTrials.gov
          – Pre-loaded in GraphDB Free
          – Pre-loaded on Ontotext S4 Cloud
          – As an RDF serialization distribution
      • Release all semantically structured information under a free-for-non-commercial-use license
      • Extend the data schema to support not only concepts but also tokens that cannot be normalized to ontology instances
  • 17. Thank You! You can contact me by e-mail: todor.primov@ontotext.com
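
As a rough illustration of the condition annotation model on slide 10, the sketch below stores the statements for study NCT01506973 as plain Python tuples and looks up trials by main concept and qualifier. The attachment of each property to its subject node is an assumption read off the flattened slide diagram, and the names `TRIPLES`, `objects` and `trials_with` are hypothetical helpers, not part of the presented system; a real deployment would use an RDF store and SPARQL instead.

```python
# Subject-predicate-object statements for study NCT01506973 (slides 9-10).
# Assumption: the edges below are reconstructed from the flattened diagram.
PHRASE = "Advanced Biliary Tract Adenocarcinoma"

TRIPLES = [
    ("NCT01506973", "rdf:type", "ClinicalTrial"),
    ("NCT01506973", "ct:conditionText", PHRASE),
    ("NCT01506973", "ct:conditionAnnotation", "ConditionAnnotationID"),
    ("ConditionAnnotationID", "ca:hasPhrase", PHRASE),
    ("ConditionAnnotationID", "ca:hasDisease", "C0001418"),
    ("ConditionAnnotationID", "ca:hasQualifiers", "QualifierGroupID"),
    ("QualifierGroupID", "cg:hasQualifiers", "C0205179"),
    ("QualifierGroupID", "cg:hasQualifiers", "C0005423"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def trials_with(triples, disease, qualifier):
    """Return trials with an annotation that has this main concept and qualifier."""
    hits = []
    for trial, predicate, annotation in triples:
        if predicate != "ct:conditionAnnotation":
            continue
        if disease not in objects(triples, annotation, "ca:hasDisease"):
            continue
        # Follow annotation -> qualifier group -> individual qualifier concepts.
        qualifiers = [
            concept
            for group in objects(triples, annotation, "ca:hasQualifiers")
            for concept in objects(triples, group, "cg:hasQualifiers")
        ]
        if qualifier in qualifiers:
            hits.append(trial)
    return hits


print(trials_with(TRIPLES, "C0001418", "C0205179"))
```

Because the main concept and its qualifiers are separate nodes, the same lookup works for any combination of concept identifiers, which is the kind of "slice and dice" access that slide 15 refers to.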