Improving Text Mining with Controlled Natural Language:
A Case Study for Protein Interactions
Tobias Kuhn (speaker)
Loïc Royer
Norbert E. Fuchs
Michael Schroeder
DILS'06, Hinxton (UK)
21 July 2006
Tobias Kuhn, DILS'06, Hinxton (UK), 21 July 2006
Cooperation of
University of Zurich (Norbert E. Fuchs, Tobias Kuhn)
and
TU Dresden (Loïc Royer, Michael Schroeder)
Introduction
- Biomedical literature is growing at a tremendous pace
- PubMed contains 16 million articles and grows by over 600,000 articles per year
- Computational support is needed!
Today's Solution
NLP, manual annotation
Our Approach
- Let the researchers express their own results in a formal language
- Perfect processing of scientific results by computers
- This formal language has to be ...
  - easy to learn and understand
  - expressive enough to express even complicated scientific results
Knowledge Representation Languages
- OWL with RDF/XML
- Description Logics
- first-order logic
- ACE
- UML
Attempto Controlled English (ACE)
- Formal language that looks like natural English
- Unambiguously translatable into first-order logic
- Restricted grammar
- Unlimited vocabulary
- www.ifi.unizh.ch/attempto
Formal Summaries
ACE text:
BubR1 interacts-with a trunk-domain of Beta2-Adaptin.

Logical representation (DRS):
[A, B, C, D]
named(A, BubR1)-1
object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
named(B, Beta2-Adaptin)-1
object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
relation(C, trunk-domain, of, B)-1
predicate(D, unspecified, interact_with, A, C)-1
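The step from the DRS to first-order logic can be illustrated with a minimal sketch. It assumes a simplified, hypothetical encoding of the conditions above (the long object(...) entries are reduced to their noun, and the "unspecified" slots are dropped); the tuple layout and the to_fol helper are illustrative and are not the output format or API of the Attempto tools.

```python
# Hypothetical, simplified encoding of the DRS for
# "BubR1 interacts-with a trunk-domain of Beta2-Adaptin."
# DOMAIN lists the discourse referents; each condition is
# (predicate name, arguments).
DOMAIN = ["A", "B", "C", "D"]
CONDITIONS = [
    ("named", ("A", "BubR1")),
    ("named", ("B", "Beta2-Adaptin")),
    ("object", ("C", "trunk-domain")),
    ("relation", ("C", "trunk-domain", "of", "B")),
    ("predicate", ("D", "interact_with", "A", "C")),
]


def to_fol(domain, conditions):
    """Render a flat DRS as an existentially quantified conjunction."""
    quantifiers = " ".join(f"exists {ref}." for ref in domain)
    atoms = " & ".join(
        f"{name}({', '.join(args)})" for name, args in conditions
    )
    return f"{quantifiers} ({atoms})"


formula = to_fol(DOMAIN, CONDITIONS)
print(formula)
```

A flat DRS without negation or implication maps to a plain conjunction like this; nested DRS boxes would need a recursive translation, which this sketch does not attempt.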
Ontology for Protein Interactions
Empirical Study
- “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
- Manual translation of 273 facts about protein interactions
- These facts are subheadings of the “Results” sections of 89 articles (journals by Elsevier)
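Empirical Study
Total (273 facts):
- matched perfectly: 154
- matched partially: 57
- unmatched: 62
Non-perfect (119 facts), by reason: not covered by the model, relations of relations, fuzzy, not understood
(chart values 21, 56, 11, 31; the extracted text does not show which value belongs to which reason)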
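The shares behind these counts can be worked out with a short calculation. This sketch assumes that the three counts 154, 57 and 62 belong to "matched perfectly", "matched partially" and "unmatched" in the order listed; the percentages are derived here and do not appear on the slide.

```python
# Counts from the slide, paired with their labels in the listed order
# (an assumption about the chart layout).
counts = {"matched perfectly": 154, "matched partially": 57, "unmatched": 62}

n_facts = sum(counts.values())
shares = {label: round(100 * n / n_facts, 1) for label, n in counts.items()}
non_perfect = n_facts - counts["matched perfectly"]

# The four reason counts from the second chart should add up to the
# number of non-perfect facts, whichever reason each one belongs to.
reason_values = [21, 56, 11, 31]

print(n_facts, shares, non_perfect, sum(reason_values))
```

The total agrees with the 273 translated facts, and the reason values sum to the same 119 non-perfect facts, which is a useful consistency check on the extracted numbers.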
Authoring tool
- Helps with writing ACE sentences
- Shows, step by step, the possible continuations of the sentence
- New words can be created on the fly
- Awareness of the underlying ontology
- Users do not need to know the details of the ACE syntax or of the underlying ontology
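The idea of showing possible continuations can be sketched with a toy predictive editor. Everything here is a hypothetical stand-in: the LEXICON categories, the two sentence PATTERNS and the function names are assumptions for illustration, and the real ACE grammar and the tool's ontology are far richer.

```python
# Hypothetical toy lexicon and sentence patterns (not the ACE grammar).
LEXICON = {
    "PROTEIN": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "VERB": {"interacts-with"},
    "DET": {"a"},
    "DOMAIN": {"trunk-domain"},
    "OF": {"of"},
    "END": {"."},
}
PATTERNS = [
    ["PROTEIN", "VERB", "PROTEIN", "END"],
    ["PROTEIN", "VERB", "DET", "DOMAIN", "OF", "PROTEIN", "END"],
]


def continuations(tokens):
    """Return the words that may follow the given sentence prefix."""
    options = set()
    for pattern in PATTERNS:
        if len(tokens) >= len(pattern):
            continue
        # The prefix must match the pattern category by category.
        if all(tok in LEXICON[cat] for tok, cat in zip(tokens, pattern)):
            options |= LEXICON[pattern[len(tokens)]]
    return options


def add_word(category, word):
    """Extend the lexicon on the fly with a new word."""
    LEXICON[category].add(word)


print(sorted(continuations(["BubR1", "interacts-with"])))
```

Because the editor only ever offers words that keep the sentence inside the grammar, the user can build a valid sentence without knowing the syntax rules themselves.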
Authoring tool: Prototype demo
http://gopubmed.biotec.tu-dresden.de/AceWiki/
Benefits of our Approach
- Consistency / redundancy checks
  - “Is there a paper that contradicts my results?”
  - “Is there a paper that comes to the same or similar results?”
- Answer extraction
  - “Which proteins interact with a certain domain of protein X?”
- Automatically updated knowledge bases
  - “Give me an overview of the relations of a protein X to other proteins!”
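Answer extraction and redundancy checks over formal summaries can be sketched as simple lookups. The Fact and Domain types, the FACTS list and both functions are hypothetical; they stand in for a knowledge base built from the logical representations, and they ignore reasoning such as the symmetry of interactions.

```python
from typing import NamedTuple, Union


class Domain(NamedTuple):
    kind: str      # e.g. "trunk-domain"
    protein: str   # the protein the domain belongs to


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: Union[str, Domain]


# Hypothetical knowledge base assembled from formal summaries.
FACTS = [
    Fact("BubR1", "interacts-with", Domain("trunk-domain", "Beta2-Adaptin")),
    Fact("Act1", "interacts-with", "TRAF6"),
    Fact("MtFabD", "subunit-of", "FAS-II"),
]


def interactors_of_domain(facts, protein):
    """Which proteins interact with some domain of `protein`?"""
    return sorted({
        f.subject for f in facts
        if f.relation == "interacts-with"
        and isinstance(f.obj, Domain)
        and f.obj.protein == protein
    })


def is_redundant(facts, candidate):
    """Has exactly this result already been stated?"""
    return candidate in facts


print(interactors_of_domain(FACTS, "Beta2-Adaptin"))
print(is_redundant(FACTS, Fact("Act1", "interacts-with", "TRAF6")))
```

With structured facts the queries reduce to exact matching; detecting contradictions or "similar" results would need a logic reasoner on top, which is outside this sketch.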
Conclusions
- Formal summaries for scientific articles can make text mining easier and more powerful
- ACE combines the power of ontologies with the convenience of natural language
- Let the researchers formalize their own results!
Thank you for your attention!
Questions & Discussion
Subheadings: Example
Degree of Matching: Examples
- Matched perfectly:
  - Interaction of Act1 with TRAF6
  - → Act1 interacts-with TRAF6.
- Matched partially:
  - The mtFabD protein is part of the core of the FAS-II complex
  - → MtFabD is a subunit of FAS-II.
- Unmatched:
  - Cav1 interacts differentially with distinct Dyn2 forms
Reasons for Non-perfect Matching: Examples
- Not covered by the model:
  - Daxx Potentiates Fas-Mediated Apoptosis
- Relations of relations:
  - Kal-GEF1 activation of Pak does not require GEF activity
- Fuzzy:
  - ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
- Not understood:
  - hSrb7 does not interact with other nuclear receptors

 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions

  • 1. Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions. Tobias Kuhn (speaker), Loïc Royer, Norbert E. Fuchs, Michael Schroeder. DILS'06, Hinxton (UK), 21 July 2006
  • 2. Cooperation of University of Zurich (Norbert E. Fuchs, Tobias Kuhn) and TU Dresden (Loïc Royer, Michael Schroeder)
  • 3. Introduction
      • Biomedical literature is growing at a tremendous pace
      • PubMed contains 16 million articles and grows by over 600'000 articles per year
      • Computational support is needed!
  • 4. Today's Solution: NLP, manual annotation
  • 5. Our Approach
      • Let the researchers express their own results in a formal language
      • Perfect processing of scientific results by computers
      • This formal language has to be ...
          • easy to learn and understand
          • expressive enough to express even complicated scientific results
  • 6. Knowledge Representation Languages (diagram labels: OWL with RDF/XML, Description Logics, first-order logic, ACE, UML, "has")
  • 7. Attempto Controlled English (ACE)
      • Formal language that looks like natural English
      • Unambiguously translatable into first-order logic
      • Restricted grammar
      • Unlimited vocabulary
      • www.ifi.unizh.ch/attempto
  • 8. Formal Summaries
  • 9. Formal Summaries
      ACE text: BubR1 interacts-with a trunk-domain of Beta2-Adaptin.
      Logical representation (DRS):
        [A, B, C, D]
        named(A, BubR1)-1
        object(A, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
        named(B, Beta2-Adaptin)-1
        object(B, atomic, named_entity, object, cardinality, count_unit, eq, 1)-1
        object(C, atomic, trunk-domain, unspecified, cardinality, count_unit, eq, 1)-1
        relation(C, trunk-domain, of, B)-1
        predicate(D, unspecified, interact_with, A, C)-1
  • 10. Ontology for Protein Interactions
  • 11. Empirical Study
      • “How suitable is ACE together with our ontology to express scientific results of protein interactions?”
      • Manual translation of 273 facts about protein interactions
      • These facts are subheadings of the “Results”-sections of 89 articles (journals by Elsevier)
  • 12. Empirical Study (chart)
      • Total: 154 matched perfectly, 57 matched partially, 62 unmatched
      • Non-perfect: not covered by the model, relations of relations, fuzzy, not understood (chart values 21, 56, 11, 31; the pairing of values with categories is not recoverable from the transcript)
  • 13. Authoring tool
      • Helps writing ACE sentences
      • Shows step by step the possible continuations of the sentence
      • New words can be created on-the-fly
      • Awareness of the underlying ontology
      • The users do not need to know the details of the ACE syntax and of the underlying ontology
  • 14. Authoring tool: Prototype demo http://gopubmed.biotec.tu-dresden.de/AceWiki/
  • 15. Benefits of our Approach
      • Consistency / redundancy checks
          • “Is there a paper that contradicts my results?”
          • “Is there a paper that comes to the same or similar results?”
      • Answer extraction
          • “Which proteins interact with a certain domain of protein X?”
      • Automatically updated knowledge bases
          • “Give me an overview of the relations of a protein X to other proteins!”
  • 16. Conclusions
      • Formal summaries for scientific articles can make text mining easier and more powerful
      • ACE combines the power of ontologies with the convenience of natural language
      • Let the researchers formalize their own results!
  • 17. Thank you for your attention! Questions & Discussion
  • 18. Subheadings: Example
  • 19. Degree of Matching: Examples
      • Matched perfectly:
          • Interaction of Act1 with TRAF6
          • → Act1 interacts-with TRAF6.
      • Matched partially:
          • The mtFabD protein is part of the core of the FAS-II complex
          • → MtFabD is a subunit of FAS-II.
      • Unmatched:
          • Cav1 interacts differentially with distinct Dyn2 forms
  • 20. Reasons for Non-perfect Matching: Examples
      • Not covered by the model:
          • Daxx Potentiates Fas-Mediated Apoptosis
      • Relations of relations:
          • Kal-GEF1 activation of Pak does not require GEF activity
      • Fuzzy:
          • ANKRD1 contains potential CASQ2 binding sequences located in both its NT- and CT-regions
      • Not understood:
          • hSrb7 does not interact with other nuclear receptors
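
To make the link between the DRS on slide 9 and the "answer extraction" benefit on slide 15 concrete, here is a minimal Python sketch. It is not the Attempto toolchain: the tuple encoding, the shortening of the `object` conditions to referent and noun, and the function names `names` and `interactors_with_domain_of` are all assumptions made for illustration.

```python
# DRS conditions transcribed from slide 9. The eight-argument object(...)
# conditions are shortened to (referent, noun); that shortening is an
# assumption for readability, not the ACE parser's output format.
DRS = [
    ("named", "A", "BubR1"),
    ("named", "B", "Beta2-Adaptin"),
    ("object", "C", "trunk-domain"),
    ("relation", "C", "trunk-domain", "of", "B"),
    ("predicate", "D", "interact_with", "A", "C"),
]


def names(drs):
    """Map each referent that has a named(...) condition to its name."""
    return {c[1]: c[2] for c in drs if c[0] == "named"}


def interactors_with_domain_of(drs, protein):
    """Names of entities that interact with something that is 'of' `protein`."""
    name_of = names(drs)
    # Referents C such that relation(C, _, of, B) holds and B is named `protein`.
    domains = {
        c[1]
        for c in drs
        if c[0] == "relation" and c[3] == "of" and name_of.get(c[4]) == protein
    }
    # Subjects A of predicate(_, interact_with, A, C) where C is such a domain.
    return sorted(
        name_of[c[3]]
        for c in drs
        if c[0] == "predicate"
        and c[2] == "interact_with"
        and c[4] in domains
        and c[3] in name_of
    )


print(interactors_with_domain_of(DRS, "Beta2-Adaptin"))  # ['BubR1']
```

The query corresponds to the slide 15 question "Which proteins interact with a certain domain of protein X?" with X set to Beta2-Adaptin. A real system would presumably reason over first-order logic rather than match tuples.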
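
Slide 13 says the authoring tool shows, step by step, the possible continuations of a sentence. The sketch below illustrates that idea with a toy finite-state grammar in Python. The grammar, the lexicon, and the function `continuations` are hypothetical and cover only the two sentence shapes seen on slides 9 and 19; the actual prototype and the full ACE grammar are far richer.

```python
# Toy lexicon: word categories and their members (hypothetical, built from
# the example sentences on slides 9 and 19).
LEXICON = {
    "name": {"BubR1", "Beta2-Adaptin", "Act1", "TRAF6"},
    "verb": {"interacts-with"},
    "det": {"a"},
    "noun": {"trunk-domain"},
    "of": {"of"},
    "end": {"."},
}

# Toy grammar as a state machine: state -> {word category: next state}.
TRANSITIONS = {
    "start": {"name": "subject"},
    "subject": {"verb": "verb"},
    "verb": {"name": "object", "det": "det"},
    "det": {"noun": "noun"},
    "noun": {"of": "of"},
    "of": {"name": "object"},
    "object": {"end": "done"},
    "done": {},
}


def continuations(tokens):
    """Return the sorted words that may follow the partial sentence `tokens`."""
    state = "start"
    for tok in tokens:
        for category, next_state in TRANSITIONS[state].items():
            if tok in LEXICON[category]:
                state = next_state
                break
        else:
            raise ValueError(f"unexpected token {tok!r} in state {state!r}")
    return sorted(w for category in TRANSITIONS[state] for w in LEXICON[category])


print(continuations(["BubR1"]))  # ['interacts-with']
print(continuations(["BubR1", "interacts-with", "a", "trunk-domain", "of"]))
```

Creating a new word on the fly, as the slide describes, would amount to adding it to the matching `LEXICON` category in this sketch.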