
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research and clinical practices


Talk presented in Spain (WiMS 2013/UAM-Madrid, UMA-Malaga), June 2013.

Replaces earlier version at: http://www.slideshare.net/apsheth/semantic-technology-empowering-real-world-outcomes-in-biomedical-research-and-clinical-practices

Biomedical and translational research as well as clinical practice are increasingly data driven. Activities routinely involve large numbers of devices, data and people, resulting in challenges associated with volume, velocity (change), variety (heterogeneity) and veracity (provenance, quality). Equally important is the challenge of serving the needs of broader ecosystems of people and organizations, extending beyond traditional stakeholders like drug makers, clinicians and policy makers to increasingly technology-savvy and information-empowered patients. We believe that semantics is becoming the centerpiece of informatics solutions that convert data into meaningful, contextually relevant information and insights that lead to optimal decisions for translational research and 360-degree health, fitness and well-being.
In this talk, I will provide a series of snapshots of efforts in which a semantic approach and technology are the key enablers. I will emphasize real-world, in-use projects, technologies and systems involving significant collaborations between my team and biomedical researchers or practicing clinicians. Examples include:
• Active Semantic Electronic Medical Record
• Semantics and Services enabled Problem Solving Environment for T.cruzi (SPSE)
• Data Mining of Cardiology data
• Semantic Search, Browsing and Literature Based Discovery
• PREscription Drug abuse Online Surveillance and Epidemiology (PREDOSE)
• kHealth: development of knowledge-enhanced sensing and mobile computing applications (using low-cost sensors and smartphones), along with the ability to convert low-level observations into clinically relevant abstractions

Further details are at http://knoesis.org/amit/hcls


Semantic Web & Web 3.0 empowering real world outcomes in biomedical research and clinical practices

  1. 1. 1 Semantic Web & Web 3.0 empowering real world outcomes in biomedical research and clinical practices Amit Sheth Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing Wright State University, Dayton, Ohio http://knoesis.org http://knoesis.org/amit/hcls Special thanks: Sujan Perera; Ack: Kno.e.sis HCLS team and collaborators Talk presented in Spain (WiMS 2013/UAM-Madrid, UMA-Malaga), June 2013
  2. 2. Integration
  3. 3. Semantics
  4. 4. Role of Semantic Web in HCLS • Objective: • Improve Insight from Biomedical Data • Improve Clinical Decision Making • Challenges: • Vastness/Volume • Velocity • Variety/Heterogeneity • Vagueness, Uncertainty, Inconsistency, Deceit • Approach: Improve the machine understandability and processing of all types of data by • Modeling and Background Knowledge • Annotation • Complex Querying/Analysis, Reasoning
  5. 5. Identifiers: URI Character set: UNICODE Syntax: XML Data interchange: RDF Querying: SPARQL Taxonomies: RDFS Ontologies: OWL Rules: RIF/SWRL Unifying logic Proof Trust Cryptography User interface and applications Querying Data/Knowledge Representation Knowledge Representation Lots of need for NLP, ML, IR, and other technologies – SW significantly empowers these and closes some critical gaps
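The data interchange and querying layers of this stack can be illustrated without any library. What follows is a minimal sketch, assuming facts are plain Python tuples and that `None` plays the role of a query variable; the `match` helper is hypothetical and is neither an RDF parser nor a SPARQL engine. The example triples are the ones shown on the PREDOSE architecture slide later in the deck.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
TRIPLES = [
    ("Buprenorphine", "has_slang_term", "bupe"),
    ("Suboxone", "subClassOf", "Buprenorphine"),
    ("Suboxone_Injection", "CAUSES", "Nausea"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Which classes are declared as subclasses of Buprenorphine?"
subclasses = [s for s, _, _ in match(TRIPLES, p="subClassOf", o="Buprenorphine")]
print(subclasses)
```

A real deployment would use an RDF store and SPARQL for this; the tuple list only shows the shape of the pattern-matching idea.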
  6. 6. HCLS Apps @ Kno.e.sis • Semantic Search and Browsing(Doozer++, SCOONER, iExplore) • Semantics and Services enabled Problem Solving Environment for T.cruzi(SPSE) • Active Semantic Electronic Medical Record(ASEMR) • Mining and Analysis of EMR(ezFIND, ezMeasure, ezCAC) • kHealth (ADHF, Asthma, …) • PREscription Drug abuse Online Surveillance and Epidemiology(PREDOSE) Biomedical Healthcare Epidemiology
  7. 7. Insights Better Understanding Intuitive Browsing Hypothesis Generation Personalization Knowledge Exploration Doozer++ iExplore SCOONER Kino Kno.e.sis Bioinformatics toolkit http://knoesis.org/opensource http://knoesis.org/showcase
  8. 8. Knowledge Acquisition – Doozer++ • Building an ontology is costly • Large volume of knowledge available in semi-structured/unstructured format • No assurance of the credibility of such knowledge
  9. 9. Knowledge Acquisition – Doozer++ Circle of Knowledge http://knoesis.org/node/71
  10. 10. Knowledge Acquisition – Doozer++
  11. 11. Knowledge Acquisition – Doozer++
  12. 12. Knowledge Acquisition – Doozer++ • j.1:category_science • j.1:category_neuroscience • j.1:category_cognitive_science • j.1:category_psychology • j.1:category_behavior • j.1:category_philosophy_of_mind • j.1:category_brain • j.1:category_psycholinguistics • j.1:category_neurology • j.1:category_neurophysiology • 10 classes…
  13. 13. Doozer++ Demo Knowledge Acquisition from Community-Generated Content Continuous Semantics to Analyze Real-Time Data , IEEE Internet Computing (Volume 14)
  14. 14. • Identify Relationships • Textual pattern-based extraction for known relationships • Facts available in background knowledge • Find evidence for such facts • Combined evidence from many different patterns increases the certainty of a relationship between the entities Beyond Hierarchy
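The idea that evidence from several textual patterns raises the certainty of a relationship can be sketched as follows. This is an illustration only, not Doozer++'s actual scoring: the templates, their reliability values, and the noisy-OR combination are all assumptions, and the two sentences are invented.

```python
# Hypothetical textual templates for one known relationship, each paired with
# an assumed reliability (chance that a match truly expresses the relation).
TEMPLATES = [
    ("{s} is a symptom of {o}", 0.8),
    ("{o} often presents with {s}", 0.5),
]


def matched_reliabilities(sentences, subject, obj, templates=TEMPLATES):
    """Collect the reliability of every template found somewhere in the corpus."""
    lowered = [sentence.lower() for sentence in sentences]
    found = []
    for template, reliability in templates:
        phrase = template.format(s=subject, o=obj).lower()
        if any(phrase in sentence for sentence in lowered):
            found.append(reliability)
    return found


def combined_confidence(reliabilities):
    """Noisy-OR: the fact is rejected only if every matching pattern misfired."""
    miss = 1.0
    for reliability in reliabilities:
        miss *= 1.0 - reliability
    return 1.0 - miss


# Invented sentences, for illustration only.
corpus = [
    "Syncope is a symptom of atrial fibrillation in some patients.",
    "Atrial fibrillation often presents with syncope.",
]
hits = matched_reliabilities(corpus, "syncope", "atrial fibrillation")
print(hits, round(combined_confidence(hits), 2))
```

With noisy-OR, each additional matching pattern can only raise the combined score, which mirrors the slide's claim; other combination schemes would also fit that description.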
  15. 15. • Evaluating acquired knowledge • Explicit • User can vote for facts • Facts presented based on user interests • Implicit • User’s browsing history used as an indication of which propositions are correct and interesting • Validated knowledge is then added back to the community Validating Knowledge
  16. 16. Base Hierarchy from Wikipedia SenseLab Neuroscience Ontologies Meta Knowledgebase PubMed Abstracts Focused pattern based extraction Initial KB creation Enriched Knowledgebase HPC Keywords Kno.e.sis: NLP based triples NLM: Rule based BKR triples Building Human Performance & Cognition Ontology (HPCO) Merge http://wiki.knoesis.org/index.php/Human_Performance_and_Cognition_Ontology
  17. 17. Use Case for HPCO • Number of Entities – 2 million • Number of non-trivial facts – 3 million • NLP Based*: calcium-binding protein S100B modulates long-term synaptic plasticity • Pattern Based**: Olfactory Bulb has physical part of anatomic structure Mitral cell * Joint Extraction of Compound Entities and Relationships from Biomedical Literature , Web Intel. 2008 * A Framework for Schema-Driven Relationship Discovery from Unstructured Text, ISWC 2006 ** On Demand Creation of Focused Domain Models using Top-down and Bottom-up Information Extraction, Technical Report
  18. 18. Knowledge-based Browsing - SCOONER • Knowledge-based browsing: relations window, inverse relations, creating trails • Persistent Projects: Work bench, Browsing history, Comments, Filtering • Collaboration: Comments, Dashboard, Exporting projects, Importing projects
  19. 19. SCOONER Demo SCOONER Details An Up-to-date Knowledge-Based Literature Search and Exploration Framework for Focused Bioscience Domains , IHI 2012- 2nd ACM SIGHIT International Health Informatics Symposium
  20. 20. Kino • An integrated suite of tools that enables scientists to annotate – unstructured resources – semi-structured resources • Annotates documents by accessing NCBO ontologies, via the NCBO Web API. • Includes two main components – A browser-based annotation front-end – An annotation-aware back-end index that provides faceted search capabilities
  21. 21. Kino Architecture
  22. 22. Example: Annotating Literature
  23. 23. Annotating the XML file with an NCBO ontology (figure labels: rel, rel, ontology concept)
  24. 24. Kino Search • Search the annotated documents for the concept of interest • Return all annotated documents with the selected concept
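The concept-based search described above can be pictured as an inverted index from ontology concepts to documents. The sketch below is an in-memory stand-in under stated assumptions: the `AnnotationIndex` class, the document IDs, and the `ex:` concept identifiers are hypothetical, and Kino's real back-end is not reproduced here.

```python
from collections import defaultdict


class AnnotationIndex:
    """Maps each ontology concept to the documents annotated with it."""

    def __init__(self):
        self._docs_by_concept = defaultdict(set)

    def add_annotation(self, doc_id, concept):
        self._docs_by_concept[concept].add(doc_id)

    def search(self, concept):
        """Return all documents annotated with the selected concept."""
        return sorted(self._docs_by_concept.get(concept, set()))

    def facet_counts(self):
        """Number of documents per concept, as a faceted UI would display."""
        return {c: len(docs) for c, docs in self._docs_by_concept.items()}


index = AnnotationIndex()
index.add_annotation("doc1", "ex:Tree")
index.add_annotation("doc2", "ex:Tree")
index.add_annotation("doc2", "ex:MaximumLikelihood")
print(index.search("ex:Tree"))
print(index.facet_counts())
```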
  25. 25. Kino Demo Kino: A Generic Document Management System for Biologists Using SA-REST and Faceted Search. ICSC 2011
  26. 26. iExplore Interactive Browsing and Exploring Biomedical Knowledge
  27. 27. Architecture
  28. 28. Generate Novel Hypothesis
  29. 29. iExplore video iExplore Demo
  30. 30. Turning to Applications with End Users
  31. 31. Active Semantic Electronic Medical Record - ASEMR • New Drugs • Adds interaction with current drugs • Changes possible procedures to treat an illness • Insurance coverage changes • Will pay for drug X, but not Y • May need certain diagnosis before expensive tests • Physicians are required to keep track of an ever-changing landscape
  32. 32. • A Document • With semantic annotations • entities linked to ontology • terms linked to specialized lexicon • With actionable information • rules over semantic annotations • rule violation indicated with alerts Atrial fibrillation with prior stroke, currently on Pradaxa, doing well. Mild glucose intolerance and hyperlipidemia, being treated by primary care. ASEMR – Active Semantic Document
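The notion of rules over semantic annotations that raise alerts when violated can be sketched as below. Everything named here is an assumption made for illustration: the `Annotation` type, the `check_rules` function, and the placeholder concepts `drug:X` and `drug:Y` (which echo the "drug X, but not Y" wording of the previous slide). No real formulary or interaction data is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    text: str      # phrase as written in the document
    concept: str   # ontology concept the phrase is linked to
    kind: str      # e.g. "drug" or "diagnosis"


def check_rules(annotations, covered_drugs, interacting_pairs):
    """Return one alert string for every rule the annotations violate."""
    alerts = []
    drugs = [a for a in annotations if a.kind == "drug"]
    # Rule 1: every prescribed drug must be on the insurer's formulary.
    for drug in drugs:
        if drug.concept not in covered_drugs:
            alerts.append(f"Formulary: {drug.text} is not covered")
    # Rule 2: no two prescribed drugs may form a known interacting pair.
    for i, first in enumerate(drugs):
        for second in drugs[i + 1:]:
            if frozenset((first.concept, second.concept)) in interacting_pairs:
                alerts.append(f"Interaction: {first.text} with {second.text}")
    return alerts


# Placeholder annotations for one document.
document = [
    Annotation("drug X", "drug:X", "drug"),
    Annotation("drug Y", "drug:Y", "drug"),
    Annotation("atrial fibrillation", "dx:AF", "diagnosis"),
]
alerts = check_rules(
    document,
    covered_drugs={"drug:X"},
    interacting_pairs={frozenset(("drug:X", "drug:Y"))},
)
print(alerts)
```

Keeping the rules separate from the annotations means a formulary or interaction update changes only the rule inputs, not the annotated documents.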
  33. 33. • Type of ASD • Three Ontologies • Practice Information about practice such as patient/physician data • Drug Information about drugs, interaction, formularies, etc. • ICD/CPT Describes the relationships between CPT and ICD codes ASEMR – Active Semantic Patient Record
  34. 34. ASEMR – Practice Ontology Hierarchy: owl:thing, encounter, ancillary, event, insurance_carrier, insurance, facility, insurance_plan, patient, person, practitioner, insurance_policy, ambularory_episode
  35. 35. ASEMR – Drug Ontology Hierarchy: owl:thing, prescription_drug_brand_name, brandname_undeclared, brandname_composite, prescription_drug, monograph_ix_class, cpnum_group, prescription_drug_property, indication_property, formulary_property, non_drug_reactant, interaction_property, property, formulary, brandname_individual, interaction_with_prescription_drug, interaction, indication, generic_individual, prescription_drug_generic, generic_composite, interaction_with_non_drug_reactant, interaction_with_monograph_ix_class
  36. 36. ASEMR
  37. 37. Before ASEMR [Chart: number of charts per month, Jan 04 to Jul 05 (vertical axis 0 to 600); series: Same Day, Back Log]
  38. 38. After ASEMR [Chart: number of charts per month, Sept 05 to Mar 06 (vertical axis 0 to 700); series: Same Day, Back Log]
  39. 39. • Error Prevention • Patient care • Insurance • Decision Support • Patient satisfaction • Reimbursement • Efficiency/Time • Real-time chart completion • “semantic” and automated linking with billing ASEMR - Benefits
  40. 40. ASEMR Demo Active Semantic Electronic Medical Record, ISWC 2006
  41. 41. Semantics and Services enabled Problem Solving Environment for T.cruzi - SPSE • Majority of experimental data resides in labs • Integration of lab data facilitates new insights • Formulating queries against such data requires deep technical knowledge A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi, 2012
  42. 42. SPSE • Data Sources • Internal Lab Data • External Database • Ontological Infrastructure • Parasite Lifecycle • Parasite Experiment • Query Processing • Cuebee
  43. 43. • Integrated internal data with external databases, such as KEGG, GO, and some datasets on TriTrypDB • Developed semantic provenance framework and influenced W3C community • SPSE supports complex biological queries that help find gene knockout, drug and/or vaccination targets. For example: • Show me proteins that are downregulated in the epimastigote stage and exist in a single metabolic pathway. • Give me the gene knockout summaries, both for plasmid construction and strain creation, for all gene knockout targets that are 2-fold upregulated in amastigotes at the transcript level and that have orthologs in Leishmania but not in Trypanosoma brucei. SPSE
  44. 44. Complex queries can also include: - on-the-fly Web services execution to retrieve additional data - inference rules to make implicit knowledge explicit SPSE
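The first example query on the preceding SPSE slide ("proteins that are downregulated in the epimastigote stage and exist in a single metabolic pathway") can be sketched over a toy set of facts. This is not how Cuebee or SPSE evaluates queries: the predicate names `downregulated_in` and `participates_in`, the protein and pathway identifiers, and the function itself are invented for illustration.

```python
def single_pathway_downregulated(triples, stage):
    """Proteins downregulated in `stage` that take part in exactly one pathway."""
    downregulated = {
        s for s, p, o in triples if p == "downregulated_in" and o == stage
    }
    pathways = {}
    for s, p, o in triples:
        if p == "participates_in":
            pathways.setdefault(s, set()).add(o)
    return sorted(x for x in downregulated if len(pathways.get(x, ())) == 1)


# Invented facts standing in for integrated lab and external data.
facts = [
    ("protein_A", "downregulated_in", "epimastigote"),
    ("protein_A", "participates_in", "pathway_1"),
    ("protein_B", "downregulated_in", "epimastigote"),
    ("protein_B", "participates_in", "pathway_1"),
    ("protein_B", "participates_in", "pathway_2"),
    ("protein_C", "downregulated_in", "amastigote"),
    ("protein_C", "participates_in", "pathway_2"),
]
print(single_pathway_downregulated(facts, "epimastigote"))
```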
  45. 45. • So many ontologies • Rich in number of concepts • Mostly concentrated on taxonomical relationships • Applications require domain relationships • A is_symptom_of B • C is_treated_with D Knowledge Enrichment from Data
  46. 46. Data Information Knowledge Knowledge Enrichment from Data
  47. 47. IntellegO Background knowledge Modified background knowledge EMR Knowledge Enrichment from Data Data Driven Knowledge Acquisition Method for Domain Knowledge Enrichment in Healthcare, BIBM 2012 An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web, Applied Ontology 2011
  48. 48. Knowledge Enrichment from Data atrial Fibrillation hypertension diabetes chest pain weight gain discomfort in chest rash skin cough weight loss headache edema shortness of breath fatigue syncope weight loss chest pain discomfort in chest dizzy shortness of breath nausea vomiting headache cough weight gain Diseases Symptoms Symptoms From EMR From KB Is edema symptom of atrial fibrillation? Is edema symptom of hypertension? Is edema symptom of diabetes?
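One way to read this slide is that symptoms found in the EMR but absent from the knowledge base become candidate relationships, posed as questions for validation. The sketch below follows that reading under stated assumptions: the function name is hypothetical, the symptom lists are shortened subsets of the slide's lists, and treating every patient disease as a candidate is a simplification.

```python
def candidate_questions(diseases, emr_symptoms, kb_symptoms):
    """Pose one question per (unexplained symptom, disease) pair."""
    unexplained = [s for s in emr_symptoms if s not in kb_symptoms]
    return [f"Is {s} symptom of {d}?" for s in unexplained for d in diseases]


diseases = ["atrial fibrillation", "hypertension", "diabetes"]
emr_symptoms = ["chest pain", "weight gain", "edema", "cough"]   # subset, from EMR
kb_symptoms = {"chest pain", "weight gain", "cough", "headache"}  # subset, from KB

for question in candidate_questions(diseases, emr_symptoms, kb_symptoms):
    print(question)
```

Answers confirmed by an expert or another source would then be added to the background knowledge as new `is_symptom_of` relationships.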
  49. 49. Knowledge Enrichment from Data with the above method + UMLS, healthline.com, druglib.com • Domains: Cardiology, Orthopedics, Oncology, Neurology, etc. • No. of concepts: 1008161 • Problems (diseases, symptoms): 125778 • Procedures: 262360 • Medicines: 298993 • Medical Devices: 33124 • Relationships: 77261 • is treated with (disease -> medication): 41182 • is relevant procedure (procedure -> disease): 3352 • is symptom of (symptom -> disease): 8299 • contraindicated drug (medication -> disease): 24428
  50. 50. • 80% unstructured healthcare data • Pose challenges in • Searching • Understanding • Mining • Knowledge discovery • Decision support • Evidence based medicine • Federal policies promote meaningful use and pose constraints to healthcare system Healthcare Challenge
  51. 51. Healthcare Challenge: Clinical Documentation & Coding-Billing Challenges • Coding Complexity: Diagnostic Codes: 14,000 (ICD-9) vs. 69,000 (ICD-10); Procedure Codes: 3,800 (ICD-9) vs. 72,000 (ICD-10) • ICD-9 (Current), ICD-10 Conversion (1st Oct, 2014) • Example: 821.01 is the ICD-9 code for a “closed” fractured femur, or thigh bone. It translates to 36 codes in ICD-10, with details regarding the precise nature of the fracture, which thigh was fractured, whether a delay in healing occurred, etc.
  52. 52. • Traditional methods don’t work • Understanding the context is crucial Need to Do Better Healthcare Challenge
  53. 53. Search Mining Decision Support Knowledge Discovery Evidence-based Medicine NLP + Semantics Healthcare Challenge – The Solution
  54. 54. ezHealth cTAKES ezNLP ezKB <problem value="Asthma" cui="C0004096"/> <med value="Losartan" code="52175:RXNORM" /> <med value="Spiriva" code="274535:RXNORM" /> <procedure value="EKG" cui="C1623258" /> ezFIND ezMeasure ezCDI ezCAC www.ezdi.us
  55. 55. ezHealth - Benefits • Advanced search • All hypertension patients with ejection fraction <40 • All MI patients who are taking either beta-blockers or ACE Inhibitors • Patients diagnosed with Atrial Fibrillation on Coumadin or Lovenox • Support core-measure initiative
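Once problems and measurements have been extracted into structured form, the first advanced-search example reduces to a filter. The sketch below assumes a hypothetical record layout (`id`, `problems`, `ejection_fraction`) and is not ezFIND's actual query interface.

```python
def find_patients(records, problem, max_ejection_fraction):
    """IDs of patients with `problem` and a documented EF below the threshold."""
    return [
        r["id"] for r in records
        if problem in r["problems"]
        and r.get("ejection_fraction") is not None
        and r["ejection_fraction"] < max_ejection_fraction
    ]


# Invented records standing in for NLP-extracted chart data.
records = [
    {"id": "p1", "problems": {"hypertension"}, "ejection_fraction": 35},
    {"id": "p2", "problems": {"hypertension"}, "ejection_fraction": 55},
    {"id": "p3", "problems": {"asthma"}, "ejection_fraction": 30},
    {"id": "p4", "problems": {"hypertension"}},  # EF never documented
]
print(find_patients(records, "hypertension", 40))
```

Patients with no documented ejection fraction are excluded rather than guessed at, which is one possible design choice among several.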
  56. 56. Error Detection EMR: 1. “Sepsis due to urinary tract infection….” 2. “Her prognosis is poor both short term and long term, however, we will do everything possible to keep her alive and battle this infection." MedLEE*: A syntax-based NLP extractor (such as MedLEE) can extract this term and annotate it as Problem SNM:40733004_infection. With usage of IntellegO: By utilizing IntellegO and cardiology background knowledge, we can more accurately annotate the term as Problem SNM:68566005_infection_urinary_tract. *MedLEE is an NLP engine optimized to parse clinical documents
  57. 57. Error Detection EMR: ”The patient is to receive 2 fluid boluses." MedLEE: A syntax-based NLP extractor (such as MedLEE) can extract this term and annotate it as Problem SNM:32457005_body_fluid. With IntellegO: Fluid is part of the boluses treatment, not a problem. By utilizing IntellegO and cardiology background knowledge, we can determine that this is not a symptom, hence the annotation is incorrect (Treatment, not Problem).
  58. 58. Resolve Inconsistency • “The balance of evidence would suggest that his episode of atrial fibrillation seems to be an isolated event” (NLP: Patient has atrial fibrillation) • “He has had no documented atrial fibrillation since that time” (NLP: Patient does not have atrial fibrillation) • Domain relationships: Syncope (Symptom) Is_symptom_of Atrial Fibrillation; Warfarin, Atenolol, Aspirin (Medication) Is_medication_for Atrial Fibrillation • Using domain relationships we validated that patient has atrial fibrillation
  59. 59. She denies any chest pain but is not really function due to leg stiffness, swelling an shortness of breath Regarding the shortness of breath, we will send for a dobutamine stress echocardiogram Patient does not have shortness of breath Patient has shortness of breath NLP NLP Shortness of Breath Is_symptom_of Obesity Hypertension Sleep Apnea Obstructive Resolve Inconsistency Using domain relationships we validated that patient has shortness of breath Disorder Disorder Disorder
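The two "Resolve Inconsistency" slides suggest checking contradictory NLP assertions against domain relationships. A minimal sketch of that check follows, using the atrial fibrillation relationships shown on the slide; the function names, the patient findings, and the `min_support` threshold are assumptions, not the authors' algorithm.

```python
# Domain relationships as listed on the atrial fibrillation slide.
KB = {
    ("syncope", "is_symptom_of", "atrial fibrillation"),
    ("warfarin", "is_medication_for", "atrial fibrillation"),
    ("atenolol", "is_medication_for", "atrial fibrillation"),
    ("aspirin", "is_medication_for", "atrial fibrillation"),
}


def supporting_evidence(condition, findings, kb=KB):
    """Findings in the record that the knowledge base links to `condition`."""
    return sorted(s for s, _, o in kb if o == condition and s in findings)


def resolve(condition, findings, kb=KB, min_support=2):
    """Accept the positive assertion when enough linked findings are present."""
    return len(supporting_evidence(condition, findings, kb)) >= min_support


# Hypothetical symptoms and medications extracted from one patient's record.
findings = {"syncope", "warfarin", "cough"}
print(supporting_evidence("atrial fibrillation", findings))
print(resolve("atrial fibrillation", findings))
```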
  60. 60. PREscription Drug abuse Online Surveillance and Epidemiology - PREDOSE • Non-medical use of Prescription Drugs • Fastest Growing Drug problem in US • Director ONDCP Gil Kerlikowske, Epidemic* • Pathway to heroin addiction • Escalating accidental overdose deaths • Current Epidemiological Data Systems • Interactive Interviews • Online Surveys • Manual Coding
  61. 61. Specific Aims Describe drug user’s knowledge, attitudes, and behaviors related to illicit use of Prescription Drugs (Content Analysis) Describe temporal patterns of non-medical use of Prescription Drugs (Trend Analysis)
  62. 62. Overall Approach 1. Automate Data Collection • Social Media - Online Web forums 2. Create Structured Domain Vocabulary • Drug Abuse Ontology (DAO) 3. Automate Information Extraction • Entity, Relationship, Triple, Sentiment, Template 4. Develop Tools for Data Analysis a) Content Analysis - Content Explorer, Template Pattern Explorer, Proximity Search b) Trend Analysis - Trend explorer, Emerging pattern explorer
  63. 63. PREDOSE Web Application • Stage 1. Data Collection: Web Forums, Web Crawler, Informal Text Database, Data Cleaning • Stage 2. Automatic Coding: Information Extraction Module (Entity Identification, Relationship Extraction, Triple Extraction, Sentiment Extraction), Drug Abuse Ontology (Schema), Triples/RDF Database, Semantic Web Database • Stage 3. Data Analysis and Interpretation: Qualitative and Quantitative Analysis of Drug User Knowledge, Attitudes and Behaviors; Temporal Analysis for Trend Detection • Entity types: Opioid, Cannabinoid, Side Effect, Feeling • Example triples: [Buprenorphine has_slang_term bupe] [Suboxone subClassOf Buprenorphine] [Suboxone_Injection CAUSES Nausea]
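The ontology-based entity identification step can be sketched with the three example triples from this slide. The lexicon construction, the tokenizer, and the sample post are assumptions made for illustration; PREDOSE's actual extraction module is not reproduced here.

```python
import re

# Example triples as shown on the slide.
TRIPLES = [
    ("Buprenorphine", "has_slang_term", "bupe"),
    ("Suboxone", "subClassOf", "Buprenorphine"),
    ("Suboxone_Injection", "CAUSES", "Nausea"),
]


def build_lexicon(triples):
    """Map lowercase surface forms (names and slang) to ontology concepts."""
    lexicon = {}
    for s, p, o in triples:
        lexicon.setdefault(s.lower(), s)
        if p == "has_slang_term":
            lexicon[o.lower()] = s   # slang resolves to the drug concept
        else:
            lexicon.setdefault(o.lower(), o)
    return lexicon


def identify_entities(post, lexicon):
    """Return the concepts mentioned in a post, in order of appearance."""
    tokens = re.findall(r"[a-z_]+", post.lower())
    return [lexicon[t] for t in tokens if t in lexicon]


lexicon = build_lexicon(TRIPLES)
# Invented forum post, for illustration only.
print(identify_entities("Took some bupe, then switched to Suboxone.", lexicon))
```

Resolving slang to the canonical concept is what allows slang mentions and drug-name mentions to be counted together in later trend analysis.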
  64. 64. Research Highlights Drug Abuse Ontology • First ontology on prescription drug abuse Ontology-based Entity Identification • Gold standard dataset 601 posts • Buprenorphine – 33:1 Slang-to-drug mentions • Loperamide – 24:1 Slang-to-drug mentions: • 85% Precision, 72% Recall
  65. 65. Research Highlights Template-based (Knowledge-Aware) Search • Complex Information Needs Solution 1. Ontology-based Search 2. Rule-based Search – Intensity, Frequency, Dosage, Interval 3. Context-Free Grammar – Queries Interpretable by PREDOSE 4. Data Sources – Ontology, Lexicon, Lexico-ontology, Alphabet
  66. 66. Entity +ve Sentiment Opiated Effect Extra-medical Use of Loperamide Loperamide-Withdrawal Discovery
  67. 67. kHealth 71 Health information is now available from multiple sources • medical records • background knowledge • social networks • personal observations • sensors • etc.
  68. 68. kHealth • Foursquare is an online application which integrates a person’s physical location and social network. • Community of enthusiasts that share experiences of self-tracking and measurement. • FitBit Community allows the automated collection and sharing of health-related data, goals, and achievements
  69. 69. 73 Sensors, actuators, and mobile computing are playing an increasingly important role in providing data for early phases of the health-care life-cycle This represents a fundamental shift: • people are now empowered to monitor and manage their own health; • and doctors are given access to more data about their patients kHealth
  70. 70. 74 kHealth
  71. 71. 75 Personal Health Dashboard kHealth
  72. 72. Personal Health Dashboard 1 → 2 → 3 Continuous Monitoring Personal Assessment Medical Service Auxiliary Information – background knowledge, social/community support, personal context, personal medical history kHealth
  73. 73. 77 ? kHealth
  74. 74. kHealth – Key Ingredients 78 Background Knowledge Social Network Input Personal Observations Personal Medical History
  75. 75. 79 Abstractions Observations kHealth
  76. 76. kHealth - Technology observes inheres in perceives sends focus sends observation Observer Quality Entity Perceiver
  77. 77. 82 kHealth - Technology Background Knowledge as Bi-partite Graph
  78. 78. 83 kHealth - Technology Explanation: is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building Discrimination: is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features
  79. 79. kHealth - Technology Explanatory Feature: a feature that explains the set of observed properties ExplanatoryFeature ≡ ∃ssn:isPropertyOf⁻.{p1} ⊓ … ⊓ ∃ssn:isPropertyOf⁻.{pn} elevated blood pressure clammy skin palpitations Hypertension Hyperthyroidism Pulmonary Edema Observed Property Explanatory Feature Explanation
  80. 80. 85 kHealth - Technology Discrimination Expected Property: would be explained by every explanatory feature ExpectedProperty ≡ ∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ∃ssn:isPropertyOf.{fn} elevated blood pressure clammy skin palpitations Hypertension Hyperthyroidism Pulmonary Edema Expected Property Explanatory Feature
  81. 81. 86 kHealth - Technology Discrimination Not Applicable Property: would not be explained by any explanatory feature NotApplicableProperty ≡ ¬∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ¬∃ssn:isPropertyOf.{fn} elevated blood pressure clammy skin palpitations Hypertension Hyperthyroidism Pulmonary Edema Not Applicable Property Explanatory Feature
  82. 82. kHealth - Technology Discrimination Discriminating Property: is neither expected nor not-applicable DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty elevated blood pressure clammy skin palpitations Hypertension Hyperthyroidism Pulmonary Edema Discriminating Property Explanatory Feature
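The four definitions on the explanation and discrimination slides translate directly into set operations over the bipartite background knowledge. The sketch below is a plain-Python illustration rather than the OWL or bit-vector implementations cited in the deck; the function names are hypothetical and the edges between features and properties are assumed for the example, since the slide text does not list them.

```python
# Background knowledge as a bipartite graph: feature -> properties it explains.
# The edges are assumed for illustration only.
GRAPH = {
    "Hypertension": {"elevated blood pressure", "palpitations"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "clammy skin"},
}


def explanatory_features(graph, observed):
    """Features that account for every observed property."""
    return {f for f, props in graph.items() if observed <= props}


def classify_properties(graph, observed):
    """Split all properties into expected, not applicable, and discriminating."""
    features = explanatory_features(graph, observed)
    if not features:
        # With no explanation the three categories are not meaningful.
        return set(), set(), set()
    all_props = set().union(*graph.values())
    expected = {p for p in all_props if all(p in graph[f] for f in features)}
    not_applicable = {p for p in all_props if not any(p in graph[f] for f in features)}
    discriminating = all_props - expected - not_applicable
    return expected, not_applicable, discriminating


observed = {"elevated blood pressure"}
print(sorted(explanatory_features(GRAPH, observed)))
print(classify_properties(GRAPH, observed))
```

Observing a discriminating property next is what narrows the set of explanations, which is why a resource-constrained device would ask about those properties first.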
  83. 83. 90 kHealth Demo An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web, Applied Ontology 2011 Representation of Parsimonious Covering Theory in OWL-DL (OWLED 2011) An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices (ISWC 2012) Data Driven Knowledge Acquisition Method for Domain Knowledge Enrichment in Healthcare, BIBM 2012
  84. 84. 91 kHealth
  85. 85. 92 kHealth - Asthma • Can we detect asthma/allergy early? – Using data from on-body sensors, and environmental sensors – Using knowledge from an asthma ontology, generated from asthma knowledge on the Web and domain experts – Generate a risk measure from collected data and background knowledge • Can we characterize asthma/allergy progression? – State of asthma patient may change over time – Identifying risky progressions before worsening of the patient state • Does the early detection of asthma/allergy, and subsequent intervention/treatment, lead to improved outcomes? – Improved outcomes could be improved health (less serious symptoms), less need for invasive treatments, preventive measures (e.g. avoiding risky environmental conditions), less cost, etc.
  86. 86. • GO (well controlled) – peak flow 80-100%* – Good breathing and sleep: Acceleration reading pattern – No cough: microphone – Good physical activity: Acceleration reading pattern • CAUTION (not well controlled) – peak flow 60-80%* – Cough and Wheeze: microphone – Tight chest: Acceleration readings – Wakes up at night: Acceleration reading pattern • STOP (poor control) – peak flow < 60%* – Medicine not helping: medicine = TRUE and still in STOP state – Breathing hard and fast: microphone – Can’t walk or talk well: Acceleration and microphone 93* Measured using peak flow meter Asthma Control Level and Corresponding Sensor Observations
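The peak-flow thresholds above map onto a simple classifier. This sketch uses only the peak-flow criterion and ignores the microphone and accelerometer observations; the function name is hypothetical, and assigning the shared boundary values (80% and 60%) to the better level is an assumption, since the slide's ranges overlap at those points.

```python
def asthma_control_level(peak_flow_percent):
    """Classify control level from peak flow, given as a percentage."""
    if peak_flow_percent < 0:
        raise ValueError("peak flow percentage cannot be negative")
    if peak_flow_percent >= 80:
        return "GO"        # well controlled
    if peak_flow_percent >= 60:
        return "CAUTION"   # not well controlled
    return "STOP"          # poor control


for reading in (95, 72, 48):
    print(reading, asthma_control_level(reading))
```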
  87. 87. 94 Physical Social http://ngs.ics.uci.edu/blog/?p=1478 Cyber Data Collection Analysis Action Take Medication before going to work Avoid going out in the evening due to high pollen levels Domain ExpertsDomain Knowledge Risk Model Action Model Overall Landscape
  88. 88. Personal Level Events Population Level Events (Personal Level Events) (Personalized Events) (Population Level Events) Population-level Events Relevant at the Personal-level Machine sensors: • Pollen levels • Pollution levels • Accelerometer • Peak flow meter • Medication tracking Personal sensors: • Symptoms (kHealth) (EventShop) Qualify & Quantify -Detect all the factors influencing asthma -Find the role of each factor in influencing asthma Asthma Risk Profile -Contextual information to personalize risk -Risk score computation Asthma Mitigation -Corrective action based on risk score What are the factors influencing my asthma? What is the contribution of each of these factors? How controlled is my asthma? (risk score) What will be my action plan to manage asthma? Storage Pose Questions Receive answers Access/update patient information Machine sensors: • Pollen levels • Pollution levels Personal sensors: • Symptoms • Asthma prevalence
  89. 89. Community Spaces Personal Spaces Personal Wheeze – Yes Do you have tightness of chest? – Yes Observations Physical-Cyber-Social System Health Signal Extraction Health Signal Understanding <Wheezing=Yes, time, location> <ChestTightness=Yes, time, location> <PollenLevel=Medium, time, location> <Pollution=Yes, time, location> <Activity=High, time, location> Wheezing ChestTightness PollenLevel Pollution Activity Wheezing ChestTightness PollenLevel Pollution Activity RiskCategory <PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory> <2, 1, 1, 3, 1, RiskCategory> <2, 1, 1, 3, 1, RiskCategory> <2, 1, 1, 3, 1, RiskCategory> <2, 1, 1, 3, 1, RiskCategory> . . . Actionable Information Action: contact doctor now Explanation: Increased activity is the primary cause of wheezing and high risk category Expert Knowledge Background Knowledge tweet reporting pollution level and asthma attacks Acceleration readings from on-phone sensors Sensor and personal observations Signals from personal, personal spaces, and community spaces Risk Category assigned by doctors Qualify Quantify Enrich Outdoor pollen and pollution
  90. 90. • Collaborators: AHC (Dr. Agrawal), CITAR-WSU, ezDI (ezdi.us), NLM (Dr. Bodenreider), CTEGD-UGA (Dr. Minning/Prof. Tarleton), NCBO - Stanford, Wellcome Trust, AFRL, Boonshoft Sch of Med – WSU (Dr. Forbis, …) • Funding: NIH (NHLBI R01: 1R01HL087795-01A1; NIDA: R21 DA030571), NSF, AFRL, Industry… Acknowledgements
  91. 91. Thank You Visit Us @ www.knoesis.org with additional background at http://knoesis.org/amit/hcls
  92. 92. Ohio Center of Excellence in Knowledge-enabled Computing - An Ohio Center of Excellence in BioHealth Innovation Wright State University
  93. 93. Amit Sheth’s PhD students Ashutosh Jadhav Hemant Purohit Vinh Nguyen Lu Chen Pavan Kapanipathi Pramod Anantharam Sujan Perera Alan Smith Pramod Koneru Maryam Panahiazar Sarasi Lalithsena Cory Henson Kalpa Gunaratna Delroy Cameron Sanjaya Wijeratne Wenbo Wang Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)

Editor's Notes

  • Heterogeneity of data to be integrated (Variety)
  • Quality: How do you fix it? Measure it? How do you decide
  • Consumers have changed: Clinicians + drug makers + insurance companies; technology-savvy users + gadgets. Put the text from 360
  • We have a lot of data and are trying to use it meaningfully, but customers (users) are still not satisfied. So we need computers to understand the data
  • What is semantic web? http://en.wikipedia.org/wiki/Semantic_Web Vast – huge data. Vague – define ‘young’, ‘tall’. Uncertainty – a patient might present a set of symptoms which correspond to a number of different distinct diagnoses, each with a different probability. Deceit – intentionally misleading
  • The technology stack and usage of most popular technologies
  • Kno.e.sis products
  • This slide is intended to justify the development of the tools Doozer, SCOONER, Kino and iExplore. A huge amount of knowledge exists in different formats and people are overloaded with knowledge/information; we need mechanisms for better exploration of knowledge that help them find what they require (SCOONER, iExplore) and derive new knowledge
  • Why Doozer? Knowledge is available in various formats, but it is hardly helpful if not in a structured format. But building a structured knowledge base from the available formats is a challenge
  • Human knowledge cycle. Doozer is one tool that supports this
  • Forms of open knowledge: Wikipedia, LOD, formal models
  • Knowledge acquisition through Model creation
  • Hierarchy creation from wikipedia
  • Big picture
  • Doozer’s way of identifying relationships
  • Last two steps of knowledge cycle
  • Big picture. Kno.e.sis: NLP based triples - Cartic Ramakrishnan's and Pablo's work on open Information Extraction from biomedical text. Sentences in MedLine abstracts are parsed and split into Subject, Predicate and Object. In the Merge phase, only those triples that have Subject and Object that can be mapped to the initial KB are added to the enriched KB. The difference with the BKR triples is that they were probably verified by NLM before being published, whereas the Kno.e.sis triples went into the KB unverified, apart from having to match initial KB concepts.
  • Last two steps of knowledge cycle
  • Why Scooner?
  • demo
  • Semantic annotation maps target data resources to concepts in ontologies. Extra information is added to the resource to connect it to its corresponding concept(s) in the ontology. This system includes two main components: a browser-based annotation front-end integrated with NCBO, and an annotation-aware back-end index that provides faceted search capabilities.
  • This slide illustrates the user interface of the annotator plug-in. When the user highlights a word or phrase and right-clicks on it, the browser’s context menu includes a menu item for annotating it as a phylogenetic concept. Selecting this menu item brings up the annotations window, where the highlighted term is searched using the NCBO RESTful API and a detailed view of the available ontological terms is shown for the user to select from. The user can search or browse for a concept in any ontology hosted in NCBO. Once all the annotations are added, users can submit them directly to a predefined Kino instance (configurable through an options dialog) by selecting the publish annotations menu item [3]. Kino supports generic domain annotations and is capable of providing facets on any domain. Kino is built on top of Apache Solr, a facet-capable indexing and search engine that is easily extensible. The current Kino framework supports three facets based on the SA-REST specification. The index manages the content of each annotation, the annotated text, and the content of the document, so users have the flexibility to search on the annotated concept as well as on the document content, as in a text-based search engine.
  • Novelty of this annotation process: annotate the term in XML with triples from the ontology, not just concepts from the ontology. Three kinds of object values in an annotation: a literal object value; a remote resource as an object value; a nested annotation as an object value. This is the first effort. Step 1: tree1 (s) is-close-match (p) tree (o). Step 2: tree (s) is-inferred-by (p) maximum-likelihood (o).
  • Users can search for a specific term and get all the documents annotated with that term.
  • Knowledge and data are separated. There is no way to validate whether my data adheres to the knowledge, and vice versa.
  • Architecture
  • Generate Novel hypothesis
  • The challenge. Why ASEMR?
  • How ASEMR?
  • How ASEMR?
  • The architecture
  • Why SPSE? Integrating data gives more insight, but the heterogeneity of the data stands in the way of integration.
  • How SPSE
  • Benefits
  • why
  • EMR documents contain not only data/information but knowledge too. But the scattered nature of that knowledge makes it difficult to discover.
  • The big picture. The built knowledge base should be able to explain the real-world data. We used this claim in reverse: real-world data can be used to enhance the knowledge base when it fails to explain the data. Scenario: extract all diseases from the document; generate all possible symptoms for these diseases using the knowledge base; extract all symptoms from the document. If the document contains symptoms that are not in the generated set, this indicates that we might be missing some relationship between diseases and symptoms. We use this indication to generate questions that can be answered by the domain expert, which allows us to enrich the knowledge base.
  • From the EMR, we extract the diseases and symptoms (we have already annotated the concepts in the EMR with our background knowledge). We generate the symptom coverage for the diseases found (the union of the symptoms that each disease is attached to in the knowledge base). Now we have the observed symptoms and all possible symptoms. Assumption: the observed symptoms should be a subset of all possible symptoms. Whenever we find a symptom in the observed list that is not in the all-possible list, we can generate a hypothesis and verify it with the domain expert. What we found is that edema is a symptom of hypertension. This method reduces the workload of the domain expert. Imagine we have 50 diseases and 100 symptoms: then there are 5,000 possibilities, and the domain expert would have to go through each one and validate it. With this method, we ask a question only if we find evidence.
  • What did we achieve? (Not sure whether this slide is required.) We used a lot of existing knowledge bases to build this knowledge base. We extract the knowledge from the listed websites by crawling them and then annotating the concepts using UMLS.
  • Unstructured data poses challenges in every field; here is our attempt to overcome the challenge in healthcare. Traditional methods: IR, data mining, traditional NLP.
  • People are waiting to harness the unstructured healthcare data for all these applications.
  • Architecture. Data cleaning: adding section headers, modifying malformed section headers. De-identification. CAC: Computer Assisted Coding. CDI: Clinical Document Improvement.
  • Emphasize the inferencing capability (possible only because we have a knowledge base), and point out how difficult it is to formulate such queries if a knowledge base is not available.
  • The EMR document has these two sentences. ‘Urinary tract infection’ (first sentence) is correctly annotated, but ‘infection’ in the second one is not. The second ‘infection’ actually refers to the ‘urinary tract infection’ in the first sentence, but the NLP engine does not understand this. We could find this because, according to our knowledge base, there is no evidence in the document to suggest ‘infection’. After detecting this issue, we could annotate the second ‘infection’ as urinary tract infection (this annotation is done manually). Detection is done with IntellegO. One could argue that annotating the second ‘infection’ as just infection does no harm, because a urinary tract infection is also an infection, but detecting such cases helps to improve the annotation.
  • The NLP engine annotates ‘fluid’ as ‘body_fluid’, which is a symptom. But here the term ‘fluid’ does not refer to a symptom; it refers to the form of the medication (‘boluses’). We could find this issue because there was no disease in the document to suggest ‘body fluid’.
  • In this case the NLP engine does not detect that the second statement is talking about history. But with the knowledge base we have, we can say the patient actually has AF. So we resolve the inconsistency here. Example from document 673.
  • The NLP engine does not understand the first sentence. It attaches ‘not’ to shortness of breath, which is wrong according to the semantics of the sentence. But we can resolve this issue by using the knowledge base. Example from document 595.
  • Why PREDOSE? Data collection practices: interactive interviews, surveys. Data analysis limitations: coding. * www.judiciary.senate.gov/hearings/hearing.cfm?id=e655f9e2809e5476862f735da16cf3a9
  • Specific Aim 1: has a special focus on the recently approved abuse-deterrent formulation; it is now more difficult to obtain prescribed OxyContin.
  • 1. Non-medical use of prescription drugs is the fastest-growing form of drug abuse in the US. Epidemic: Responding to America’s Prescription Drug Abuse Crisis. Gateway drug: per the National Survey on Drug Use and Health (NSDUH), nearly one-third of people aged 12 and over who used drugs for the first time in 2009 began by using a prescription drug non-medically. 2. Purdue Pharma: best known for its pain-treatment product OxyContin; OxyContin reformulation (Aug 2010). Pathway to heroin addiction [1] (2003). [1] Probable Relationship Between Opioid Abuse and Heroin Use, Robert G. Carlson, Ph.D., The Ohio Substance Abuse Monitoring (OSAM) Network (in Dayton, 10 subjects, aged 18 to 33 years, were interviewed). Accidental drug overdose death [2] (2008). [2] Recent changes in drug poisoning mortality in the United States by urban-rural status and by drug type. Paulozzi LJ, Xi Y. @ CDC, 2008; 1999-2004, degree of urbanization, found opioid abuse.
  • Buprenorphine is a partial opioid agonist used in the management of opioid addiction, including addiction to such opioids as heroin, OxyContin, and Vicodin. Prescribed daily dosages typically range from 4 to 24 mg.
  • Slang, abbreviations, equivalent concepts
  • Posts which mentioned Buprenorphine, benzodiazepine
  • Multimodal healthcare data
  • Recent advancement in observation mechanisms and data sharing
  • Sensors play a key role
  • But still we are here
  • We need to get here
  • Kno.e.sis kHealth idea. Ongoing work: simulating the first two phases. Our product is MobileMD. The demo is at the end of the slides.
  • The challenge: we have sensors to measure movement, heart rate, sleep, galvanic skin response, etc., but we don’t know how to aggregate the measurements.
  • Key ingredients that will help us understand the healthcare data (measurements)
  • Numbers -> abstractions -> knowledge integration (static knowledge about the domain, personal background) -> prediction. Advantages: early detection and alert generation.
  • http://www.vitalograph.com/products/monitors-screeners/asthma/asma-1-bluetooth
  • Observe data from different sensors at the same time.
  • System architecture: the figure shows an overview of the SemHealth architecture. Sensors: all are Bluetooth sensors already used by the current k-Health application to measure weight, heart rate, and blood pressure. Android application: reads sensor observations through Bluetooth; performs annotation on observations and generates percepts from those observations; uploads annotated observations and percepts to the server-side data store; retrieves data using the DSU API and feeds data to the DPU and/or DVU APIs; visualizes data through the DVU API (considered a “nice to have”, as the existing visualization may be used as-is; will use an existing graphing library for Android with an Open mHealth-style API that may be translated to the browser at a later time). Server side: Open mHealth-compliant DSU and DPU APIs; triple data storage replaces the existing SQLite database in the k-Health application; the existing k-Health reasoner is now the brains behind the DPU.
  • Example queries: Give a time during the last 5 days when blood pressure and heart rate were high for the selected patient. Give me the last time any patient exhibited pre-hypertension. Give me a patient who exhibited readings of pre-hypertension and a normal heart rate, along with the most recent timestamps for both readings.
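
The symptom-coverage check described in the knowledge base enrichment notes (observed symptoms should be a subset of the symptoms the knowledge base attaches to the diseases found) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names `symptom_coverage` and `candidate_questions` are hypothetical, and the knowledge base is a made-up toy dictionary. Only the edema/hypertension finding comes from the notes.

```python
from typing import Dict, List, Set, Tuple


def symptom_coverage(diseases: Set[str], kb: Dict[str, Set[str]]) -> Set[str]:
    """Union of the symptoms the knowledge base attaches to each disease."""
    covered: Set[str] = set()
    for disease in diseases:
        covered |= kb.get(disease, set())
    return covered


def candidate_questions(
    diseases: Set[str], observed: Set[str], kb: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Pair every unexplained observed symptom with every disease in the document.

    Each (symptom, disease) pair is a question for the domain expert:
    "Is <symptom> a symptom of <disease>?"
    """
    unexplained = sorted(observed - symptom_coverage(diseases, kb))
    return [(symptom, disease) for symptom in unexplained for disease in sorted(diseases)]


# Toy knowledge base (hypothetical content, for illustration only).
kb = {
    "hypertension": {"headache", "dizziness"},
    "diabetes": {"fatigue", "thirst"},
}

# Concepts annotated in one (hypothetical) EMR document.
diseases = {"hypertension"}
observed = {"headache", "edema"}

questions = candidate_questions(diseases, observed, kb)
print(questions)  # [('edema', 'hypertension')]
```

The expert is asked only about pairs for which a document provides evidence, instead of all 50 x 100 = 5,000 disease-symptom combinations mentioned in the notes.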
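
The first two example queries can be sketched over plain in-memory observations. This is only an illustration under stated assumptions: the notes describe a server-side triple store, whereas this sketch uses a Python list as a stand-in; the `Observation` class, the function names, and the thresholds (systolic at or above 140 mmHg as "high", 120 to 139 mmHg as pre-hypertension, heart rate above 100 bpm as "high") are assumptions chosen for the example, not values taken from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    patient: str
    time: datetime
    systolic: int    # mmHg
    heart_rate: int  # beats per minute


# Assumed thresholds, for illustration only.
HIGH_SYSTOLIC = 140
PREHYPERTENSION_SYSTOLIC = 120
HIGH_HEART_RATE = 100


def times_both_high(
    observations: List[Observation], patient: str, now: datetime, days: int = 5
) -> List[datetime]:
    """Times in the last `days` days when both readings were high for `patient`."""
    start = now - timedelta(days=days)
    return sorted(
        o.time
        for o in observations
        if o.patient == patient
        and start <= o.time <= now
        and o.systolic >= HIGH_SYSTOLIC
        and o.heart_rate > HIGH_HEART_RATE
    )


def last_prehypertension(observations: List[Observation]) -> Optional[Observation]:
    """Most recent observation, for any patient, in the pre-hypertension range."""
    hits = [
        o for o in observations
        if PREHYPERTENSION_SYSTOLIC <= o.systolic < HIGH_SYSTOLIC
    ]
    return max(hits, key=lambda o: o.time) if hits else None


# Made-up sample data.
now = datetime(2013, 6, 10, 12, 0)
observations = [
    Observation("p1", datetime(2013, 6, 9, 8, 0), 150, 110),  # both high, in window
    Observation("p1", datetime(2013, 6, 1, 8, 0), 155, 120),  # both high, too old
    Observation("p1", datetime(2013, 6, 8, 8, 0), 150, 80),   # only blood pressure high
    Observation("p2", datetime(2013, 6, 9, 9, 0), 130, 70),   # pre-hypertension
    Observation("p2", datetime(2013, 6, 7, 9, 0), 125, 72),   # pre-hypertension, older
]

both_high = times_both_high(observations, "p1", now)
latest = last_prehypertension(observations)
```

With percepts such as "high blood pressure" already attached to the observations by the Android application, the threshold comparisons here would presumably be replaced by lookups on those annotations.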
