Awakening Clinical Data: Semantics for Scalable Medical Research Informatics
Satya S. Sahoo
Division of Medical Informatics
Electrical Engineering and Computer Science Department
Case Western Reserve University
Cleveland, OH, USA
Big Picture of Data in Clinical Research
•  Patient Reports: 143,961 patients per year (e.g. Emory); source: PRISM project CWRU
•  MRI, PET scans: MRI 50-100MB, PET 60-100MB; source: PRISM project, BME dept CWRU
•  Epilepsy Monitoring Unit (EMU) Data: 500-600MB per patient per stay in EMU; Case Western EMU: 250 TB
•  Polysomnograms: 1-20GB each; source: Physio-MIMI, PRISM CWRU
•  National Sleep Research Resource: 500 TB
•  Pathology Reports, Tissue Bank; source: NLM and Wikipedia
•  Wireless Health Data: ~5.6 billion wireless connections and growing; source: CWRU School of Engineering

•  Ultra-large volume of data, growing rapidly
•  Data is multi-modal and heterogeneous
•  Heterogeneity: syntactic, structural, semantic
Scalability in Medical Informatics: Beyond Volume
Exemplar: Sleep Medicine Research
•  Multi-center studies with differing administrative requirements – business logic
•  Dynamic data – grows over project duration
•  Data semantics as foundation to support a wide spectrum of users – clinicians, nurse practitioners, research fellows
A Wish List for Scalable Clinical Data Management
•  Reconcile Data Heterogeneity – most critical to successful
   translational research
   o  Syntactic heterogeneity – less of a problem, data dictionaries
      help
   o  Structural heterogeneity – problematic, XML somewhat helpful
   o  Semantic heterogeneity – a huge problem, ontologies to the
      rescue?
•  Provenance – essential for data quality, compliance, insight
   o  Blood Oxygen Baseline: oxygen saturation during the first 15 or
      30 seconds of sleep
   o  Patient blood report from last month as the cause of a change in medication
      – Domain Provenance (not just tuple provenance)
•  Intuitive access to information – clinical trials eligibility,
   cohort identification
•  Scalable - Data sources, research partners added or removed
   dynamically
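
The blood oxygen baseline item above can be illustrated with a small worked example. This is only a sketch under stated assumptions: the function name `spo2_baseline`, the 1 Hz sampling rate, and the trace values are hypothetical and do not come from any Physio-MIMI dataset. It shows why the definition used (15 or 30 seconds) has to travel with the value as domain provenance.

```python
from statistics import mean


def spo2_baseline(samples, window_s, sample_rate_hz=1.0):
    """Mean SpO2 (%) over the first `window_s` seconds after sleep onset.

    `samples` is assumed to start at sleep onset (an assumption of this sketch).
    """
    n = int(window_s * sample_rate_hz)
    if n <= 0 or n > len(samples):
        raise ValueError("window does not fit the recording")
    return {
        "value": mean(samples[:n]),
        # Domain provenance: how the number was defined, not only where it came from.
        "definition": f"mean SpO2 over first {window_s}s of sleep",
        "n_samples": n,
    }


# Hypothetical 1 Hz trace: 15 s at 97%, then 15 s at 94%.
trace = [97.0] * 15 + [94.0] * 15
b15 = spo2_baseline(trace, 15)
b30 = spo2_baseline(trace, 30)
print(b15["value"], b15["definition"])
print(b30["value"], b30["definition"])
```

The same recording yields two different "baselines" depending on the window, so two studies can only be compared if the definition is recorded alongside the value.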
A “not to do” list for Clinical Data Management
[Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch]
•  No Linked Open Patient Data – HIPAA, HITECH Act (US), Data Protection Act (UK)
   o  De-identified data – IRB approval
•  Ontology as global schema – but no RDF
   o  Vast majority of data held in RDBs
   o  Practical issues with RDF – URIs cannot be institution-specific (privacy)
Physio-MIMI: Multi-Modality, Multi-Resource Environment for Physiological and Clinical Research
[Figure labels: Clinical Researcher; SNOMED-CT, FMA, OGMS, …; Sleep Domain Ontology; Any number of new centers]
Physio-MIMI: Enabling Scalable Medical Research
•  NCRR-funded, multi-CTSA site project: Sleep medicine as exemplar
•  Federated data management – scalable, adapts to changing
   data access policies
•  Ontology-driven:
   o  Data mappings – Ontology class to data dictionary terms
      (manually curated)
   o  Drive query interface
   o  Manage provenance
•  Privacy aware, IRB-compliant
•  Collaboration among Case Western, U. of Michigan,
   Marshfield Clinic and U. of Wisconsin, Madison
   o  Now Harvard Medical School
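
A minimal sketch of the federated, ontology-driven idea on this slide, not the Physio-MIMI implementation. The class names `Center` and `Federation`, the ontology class name `ApneaHypopneaIndex`, the local column names, and the records are all hypothetical. Each center keeps its own rows and its own mapping from ontology class to data dictionary term; only aggregate counts leave a center, and centers can be added or removed at any time.

```python
from dataclasses import dataclass, field


@dataclass
class Center:
    name: str
    mapping: dict   # ontology class -> local data dictionary term (hypothetical)
    records: list   # local rows; they never leave the center in this sketch

    def count(self, ontology_class, predicate):
        """Count local records whose mapped field satisfies `predicate`."""
        local_term = self.mapping.get(ontology_class)
        if local_term is None:
            return 0  # this center has no mapping for the requested class
        return sum(
            1 for row in self.records
            if local_term in row and predicate(row[local_term])
        )


@dataclass
class Federation:
    centers: dict = field(default_factory=dict)

    def add(self, center):
        self.centers[center.name] = center

    def remove(self, name):
        self.centers.pop(name, None)

    def count(self, ontology_class, predicate):
        """Fan the query out and return aggregate counts per center."""
        return {
            name: center.count(ontology_class, predicate)
            for name, center in self.centers.items()
        }


fed = Federation()
fed.add(Center("site_a", {"ApneaHypopneaIndex": "ahi"},
               [{"ahi": 4.0}, {"ahi": 22.5}, {"ahi": 31.0}]))
fed.add(Center("site_b", {"ApneaHypopneaIndex": "AHI_TOTAL"},
               [{"AHI_TOTAL": 16.0}, {"AHI_TOTAL": 3.2}]))

counts_before = fed.count("ApneaHypopneaIndex", lambda v: v >= 15)
fed.remove("site_b")  # a partner leaves; the query itself is unchanged
counts_after = fed.count("ApneaHypopneaIndex", lambda v: v >= 15)
print(counts_before, counts_after)
```

The query is phrased once against the ontology class; the per-center mapping absorbs the differing local column names.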
Key Resource: Sleep Domain Ontology (SDO)
           https://mimi.case.edu/concepts
Data Mappings: SDO to Data Dictionary
                       Physio-Map Module
                       •  Visual interface
                       •  Stores mappings in XML –
                       moving towards rules
                       •  Dynamically executed in response
                       to user query




       User Voting
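
As a rough illustration of mappings stored in XML and executed at query time, the sketch below loads a mapping file and rewrites a concept-level condition into a per-source SQL fragment. The XML layout, the concept name `BodyMassIndex`, the source and column names, and the functions `load_mappings` and `translate` are assumptions made for this example; the actual Physio-Map schema is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping document: one ontology concept, two local columns.
MAPPINGS_XML = """
<mappings>
  <mapping concept="BodyMassIndex">
    <term source="site_a" column="bmi"/>
    <term source="site_b" column="BMI_CALC"/>
  </mapping>
</mappings>
"""

ALLOWED_OPS = {"<", "<=", "=", ">=", ">"}


def load_mappings(xml_text):
    """Parse the XML into {concept: {source: column}}."""
    root = ET.fromstring(xml_text)
    table = {}
    for mapping in root.findall("mapping"):
        table[mapping.get("concept")] = {
            term.get("source"): term.get("column")
            for term in mapping.findall("term")
        }
    return table


def translate(concept, op, value, source, table):
    """Rewrite a concept-level condition as a parameterized SQL fragment."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    column = table[concept][source]  # KeyError if no mapping exists
    return f"{column} {op} ?", (value,)


table = load_mappings(MAPPINGS_XML)
clause_a = translate("BodyMassIndex", ">=", 30, "site_a", table)
clause_b = translate("BodyMassIndex", ">=", 30, "site_b", table)
print(clause_a, clause_b)
```

The value is passed as a bound parameter rather than interpolated into the SQL text, and the operator is checked against a fixed whitelist.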
Provenance: Contextual Metadata for Clinical Research
Slide courtesy: Remo Mueller

Provenance: To Trace Variations in Data and Results
Slide courtesy: Remo Mueller

Modified from slide courtesy: Remo Mueller

Provenance: Source information for Patient Data
Slide courtesy: Remo Mueller
Intuitive Query Interface: Ontology (SDO)-driven Visual Aggregator and Explorer (VisAgE)
[Figure labels: DataSets; Ontology Concept – Type of Query Widget]
Physio-MIMI in National Sleep Research Resource
•  National Sleep Research Resource (NSRR) – scored and
   awaiting funding review
•  Collaboration between Harvard Medical School (domain
   experts) and Case Western (CS) with 15 projects
    o  50,000 sleep research studies – total size of 500TB
•  Semantic Data Integration – SDO and Sleep Provenance
   Ontology (extending W3C PROV Ontology PROV-O)
•  Signal processing tools – using a common format called
   European Data Format (EDF), XML-based
•  Domain analysis, cross-linking – secure Web access
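
To make the PROV-O reference above concrete, here is a small sketch that records provenance as plain triples using core W3C PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:Agent`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasAssociatedWith`). The `http://example.org/sleep#` namespace and the resource names are hypothetical, and the terms of the Sleep Provenance Ontology itself are not shown in the slides, so none are used here.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/sleep#"  # hypothetical namespace for this sketch

# A scored result derived from a polysomnogram by a scoring activity.
triples = {
    (EX + "psg_001", RDF_TYPE, PROV + "Entity"),
    (EX + "ahi_001", RDF_TYPE, PROV + "Entity"),
    (EX + "scoring_run_7", RDF_TYPE, PROV + "Activity"),
    (EX + "technician_42", RDF_TYPE, PROV + "Agent"),
    (EX + "scoring_run_7", PROV + "used", EX + "psg_001"),
    (EX + "scoring_run_7", PROV + "wasAssociatedWith", EX + "technician_42"),
    (EX + "ahi_001", PROV + "wasGeneratedBy", EX + "scoring_run_7"),
}


def lineage(entity, graph):
    """Walk wasGeneratedBy -> used links back to all upstream entities."""
    sources = set()
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        activities = [o for s, p, o in graph
                      if s == current and p == PROV + "wasGeneratedBy"]
        for activity in activities:
            for s, p, o in graph:
                if s == activity and p == PROV + "used" and o not in sources:
                    sources.add(o)
                    frontier.append(o)
    return sources


upstream = lineage(EX + "ahi_001", triples)
print(upstream)
```

A domain ontology would typically subclass these generic terms (for example, a scoring activity as a kind of `prov:Activity`), which keeps the generic lineage query working unchanged.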
Challenges: Semantics in Large Scale Clinical Data
•  Incentives for adopting RDF in clinical data management
   – what is not already possible in RDB?
•  OWL2, RDFS reasoning – Privacy aware reasoning,
   semantics-aware access control (Nguyen et al. 2012)
•  Missing Semantics?
    o  Variable, missing provenance in original study –
       re-create provenance with (limited) provenance?
    o  Fine-level granularity for semantic annotation of
       signal data – currently not scalable
•  A little semantics does not go too far in clinical data
    o  Need for greater involvement of Semantic Web
       community in development of EHR systems
Acknowledgements
•  Guo-Qiang Zhang, Remo Mueller, Samden Lhatoo, Susan Redline, Alireza Bozorgi
•  Division of Medical Informatics: Lingyun Luo, Joe Teagno, Meng Zhao, Jake Luo,
   Licong Cui, Chien-Hung Chen, Catherine Jayapandian
•  Physio-MIMI Team: http://physiomimi.case.edu/
•  Contact Information: satya.sahoo@case.edu,
   http://cci.case.edu/cci/index.php/Satya_Sahoo
