CHIC – Converting Hamburgers Into Cows Joseph Townsend jat45@cam.ac.uk
The Scholarly Publication Cycle
What is a Cow? The character encoding is clearly stated; the document uses a mark-up technology to identify components; the components have meaning and possibly behaviour associated with them; unreduced data is available.
What we thought the workflow should look like Standoff Annotation File
OSCAR http://sourceforge.net/projects/oscar3-chem/ http://www.omii.ac.uk/wiki/Nwsltr1209OSCAR http://tinyurl.com/yakzgkd
Article Front Matter Abstract Introduction Discussion Results Experimental References
Experimental Front Matter Set up	 Abstract Introduction Compound Name Discussion Results Synthesis Experimental Analysis References
DOCX Workflow (part 1)
DOCX Workflow (part 2)
OREChem PDF PSU Soton Atom Atom SVG Text Cam CrystalEye PubChem Atom Molecules Gaussian  workflow ORE Triplestore IU http://research.microsoft.com/en-us/projects/orechem/
What can we do with a Cow? 5-Cyclobutyl-2,3-dihydro-[1H]-2-benzazepine 82: Potassium carbonate (0.63 g, 4.56 mmol) and thiophenol (0.19 g, 1.69 mmol) were added to the 2-nitrobenzene sulfonamide 50 (0.50 g, 1.302 mmol) in N,N-dimethylformamide (33 cm3) at room temperature and the mixture was stirred for 16 h. Deionised water (50 cm3) was added and the aqueous phase was extracted with ethyl acetate (5 x 50 cm3). The organic extracts were dried (MgSO4) and concentrated under reduced pressure to give the title compound 82 (0.259 g, 1.302 mmol, ca. 100%) as an oil used without further purification.
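As a rough illustration of what machine-readable text makes possible, the sketch below pulls the numeric quantities out of the first sentence of the preparation above. It is a minimal regular-expression example, not the OSCAR pipeline used in the talk; the unit list and the function name `extract_quantities` are assumptions made for this example, and the sentence is reproduced with its spacing normalized.

```python
import re

# First sentence of the preparation from the slide, spacing normalized.
PREPARATION = (
    "Potassium carbonate (0.63 g, 4.56 mmol) and thiophenol (0.19 g, 1.69 mmol) "
    "were added to the 2-nitrobenzene sulfonamide 50 (0.50 g, 1.302 mmol) in "
    "N,N-dimethylformamide (33 cm3) at room temperature and the mixture was "
    "stirred for 16 h."
)

# A number followed by one of a few units. The unit list is an assumption
# chosen to cover this one sentence, not a general chemistry grammar.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(mmol|cm3|g|h)\b")


def extract_quantities(text):
    """Return (value, unit) pairs in order of appearance."""
    return [(float(value), unit) for value, unit in QUANTITY.findall(text)]


quantities = extract_quantities(PREPARATION)
for value, unit in quantities:
    print(value, unit)
```

Compound numbers such as "50" and locants such as the "2" in "2-nitrobenzene" are skipped because no unit follows them. A pattern this simple would not survive real papers, which is one reason the workflow relies on entity recognition rather than regular expressions alone.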
Parsing and Semantics
Tokenization and Chunking
Phrase identification
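A toy version of phrase identification can be sketched by looking for trigger verbs in each clause. This is only an illustration under strong assumptions: the lexicon `ACTION_TRIGGERS`, the clause splitting on " and ", and the function name `identify_phrases` are all invented for this example, whereas the approach described in the talk works on tokens, part-of-speech tags and chunks.

```python
import re

# Trigger words per phrase type. This tiny lexicon is an assumption made for
# illustration; it is not taken from the talk or from OSCAR.
ACTION_TRIGGERS = {
    "Add": ("added",),
    "Stir": ("stirred",),
    "Extract": ("extracted",),
    "Dry": ("dried",),
    "Concentrate": ("concentrated",),
}


def identify_phrases(text):
    """Return (phrase_type, clause) pairs for clauses containing a trigger word."""
    found = []
    # Split on sentence ends and on " and " as a crude clause boundary.
    for clause in re.split(r"\.\s+|\s+and\s+", text):
        words = set(re.findall(r"[a-z]+", clause.lower()))
        for phrase_type, triggers in ACTION_TRIGGERS.items():
            if words.intersection(triggers):
                found.append((phrase_type, clause.strip()))
    return found


TEXT = (
    "Deionised water (50 cm3) was added and the aqueous phase was extracted "
    "with ethyl acetate (5 x 50 cm3). The organic extracts were dried (MgSO4) "
    "and concentrated under reduced pressure."
)

phrases = identify_phrases(TEXT)
for phrase_type, clause in phrases:
    print(phrase_type, "->", clause)
```

Keyword matching of this kind cannot tell which substance a verb applies to, which is why tokenization, entity recognition and chunking come first in the workflow.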
RDF of reaction components
Double Circles: Oil


Editor's Notes

  1. Most scientific research is communicated in a formal manner. Group vs. rest of community. Full text and supplementary information. More data points require semantics. Sliding scale: syntax, vocabulary, ontology, model. (Re)use: very hard; it has required human glue before now. This is why we need semantics.
  2. Scan of a printout. Picture with text. Computational chemistry has more structure but is still hard. Free text.
  3. Character encoding: many papers are unreadable because the various glyphs are unresolved. Markup: XML, RDF, Semantic Web. The components have meaning and possibly behaviour associated with them: ontology. Not just interpreted data. Not the whole document: sometimes entities, sometimes sections.
  4. PDF to text is hard. SAF (Standoff Annotation File). OSCAR.
  5. NCEs. Chemical terms. Chemical data. OMII. Sections are important: false positives.
  6. The only way to determine sections correctly is to preprocess the document before it goes into OSCAR, using SciXML to hold the section information. This is hard with PDF because of the loss of line breaks, text from pictures.
  7. SciXML: sections, formatting. Embedded objects can be turned directly into CML (JumboConverters). Suddenly find DataXML too.
  8. DataXML loses formatting: regex. Hard to recombine. Need to know what data is associated with what preparation, and hence which molecule. Each step adds semantics: incremental addition of information.
  9. Object Reuse and Exchange
  10. We know that this is a preparation. Bold numbers. Stir phrase. Add phrase.
  11. Tokens. Entities. POS. Chunking.
  12. Tokens in boxes. Double boxes = entities.
  13. Chunks.
  14. Complete description of the reaction and added data (structures). The following query could be used to search for all reactions using N,N-dimethylformamide as a solvent and with yields greater than 80%:
      SELECT ?preparation
      WHERE {
        ?preparation hasSubstance ?substance .
        ?substance hasMolecule <http://www.polymerinformatics.com/#DMF> .
        ?substance hasRole <http://www.polymerinformatics.com/#Solvent> .
        ?preparation hasSubstance ?product .
        ?product hasYield ?yield .
        FILTER(?yield > 80) .
      }
  15. Maps outside. 55 compounds made. Completely new view of this thesis.
  16. University of Cambridge (UC) and the University of Southern Queensland (USQ), funded by the JISC. Integrated repository deposition into author workflow. Fine-grained embargo. ICE allows linking / inclusion of external data files. Chem4Word. Semantic authoring for chemistry. Linked zones. Chemically intelligent authoring.
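
The query in note 14 can be mimicked on a handful of in-memory triples to show what it selects. This is a minimal sketch in plain Python, not a SPARQL engine and not the project's triplestore. The predicate names, DMF and Solvent come from the note; the preparation identifiers, the THF entry and the yields of `prepA` and `prepB` are hypothetical data invented for this example (the 100 yield for `prep82` follows the "ca. 100%" on the slide).

```python
# Triples as (subject, predicate, object). Identifiers such as "prep82_dmf",
# "prepA" and "prepB" are hypothetical and exist only for this illustration.
TRIPLES = [
    ("prep82", "hasSubstance", "prep82_dmf"),
    ("prep82_dmf", "hasMolecule", "DMF"),
    ("prep82_dmf", "hasRole", "Solvent"),
    ("prep82", "hasSubstance", "prep82_product"),
    ("prep82_product", "hasYield", 100.0),
    # Hypothetical preparation in DMF with a low yield.
    ("prepA", "hasSubstance", "prepA_dmf"),
    ("prepA_dmf", "hasMolecule", "DMF"),
    ("prepA_dmf", "hasRole", "Solvent"),
    ("prepA", "hasSubstance", "prepA_product"),
    ("prepA_product", "hasYield", 45.0),
    # Hypothetical high-yield preparation in a different solvent.
    ("prepB", "hasSubstance", "prepB_thf"),
    ("prepB_thf", "hasMolecule", "THF"),
    ("prepB_thf", "hasRole", "Solvent"),
    ("prepB", "hasSubstance", "prepB_product"),
    ("prepB_product", "hasYield", 95.0),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a stored triple."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def preparations_in_solvent(molecule, min_yield):
    """Preparations using `molecule` as solvent with some yield above `min_yield`."""
    result = []
    preparations = sorted({s for s, p, _ in TRIPLES if p == "hasSubstance"})
    for prep in preparations:
        substances = objects(prep, "hasSubstance")
        uses_solvent = any(
            molecule in objects(sub, "hasMolecule")
            and "Solvent" in objects(sub, "hasRole")
            for sub in substances
        )
        good_yield = any(
            y > min_yield
            for sub in substances
            for y in objects(sub, "hasYield")
        )
        if uses_solvent and good_yield:
            result.append(prep)
    return result


matches = preparations_in_solvent("DMF", 80)
print(matches)
```

As in the query, the solvent and the product are matched as two separate substances of the same preparation, so the yield does not need to be attached to the solvent entry.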