Slides presented at a webinar arranged by MetaSolutions as part of a Vinnova project: http://metasolutions.se/2014/03/webbinarium-med-kerstin-forsberg-om-lankade-data-i-lakemedelsforskningen/
1. The use of Linked Data
principles and Semantic Web
standards in pharmaceutical research
Kerstin Forsberg (@kerfors on Twitter, SlideShare etc.)
AZ IT | R&D Information
Competence building around linked
open data: dialogue, webinars and white paper
Webinar arranged by MetaSolutions
18 March 2014
2. About AstraZeneca
• Alongside our own R&D, we partner
with others, combining skills and
resources to broaden the potential
for successful innovation.
• We believe that only by working together with others
who have a part to play in improving healthcare can real
progress be made.
• We work closely with others in the healthcare
community, including physicians and those who pay for
healthcare, to understand their challenges and how we
can combine skills and resources to achieve a
common goal: improved health.
Kerstin Forsberg | Vinnova webinar, 18 March 2014 | AZ IT | R&D Information
3. Linked Data in Pharmaceutical Research
Two examples of how AstraZeneca works with European
research collaborations and international standards
organisations to make chemistry data and clinical study data
easier to use with the help of next-generation web technology.
4. The Web turned 25 on 12 March
Web of (Linked) Data
Web of Documents
An Intro To The Semantic Web: Why You Need To
Know About It Sooner Than Later, by Samantha Wong
Image Source: Frederic Martin
5. RDF (the foundation of the Semantic Web) turned 15 on 22 February
Web of (Linked) Data
Web of Documents
subject predicate object
Common Model (“Triples”)
Resource Description Framework
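The subject, predicate, object model named on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not an RDF library: the `Triple` type, the `match` helper and every `http://example.org/` URI are assumptions made up for the example.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: a subject, a predicate and an object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace, for illustration only

# A tiny "graph" is just a collection of triples.
graph = [
    Triple(EX + "compoundX", EX + "inhibits", EX + "targetY"),
    Triple(EX + "compoundX", EX + "label", "Compound X"),
    Triple(EX + "targetY", EX + "label", "Target Y"),
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything stated about compound X, and every label in the graph.
about_x = match(graph, subject=EX + "compoundX")
labels = match(graph, predicate=EX + "label")
```

Because every statement has the same three-part shape, data from different sources can be put in one collection and queried with the same pattern, which is the point of the "common model".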
6. The Project
The Innovative Medicines
Initiative
• EC funded public-private
partnership for
pharmaceutical research
• Focus on key problems
– Efficacy, Safety,
Education & Training,
Knowledge
Management
The Open PHACTS Project
• Create a semantic integration hub (“Open
Pharmacological Space”)…
• Delivering services to support on-going drug
discovery programs in pharma and public domain
• Not just another project; Leading academics in
semantics, pharmacology and informatics, driven by
solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies,
3 biotechs
• Work split into clusters:
• Technical Build
• Scientific Drive
• Community & Sustainability
7. Pre-competitive Informatics:
Pharma are all accessing, processing, storing & re-processing external research data
[Figure: external sources (Literature, PubChem, GenBank, Patents, Databases, Downloads) feed Data Integration and Data Analysis in Firewalled Databases; repeated at each company]
Lowering industry firewalls: pre-competitive informatics in drug discovery
Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
9. Business Question Driven Approach
Number | Sum | Nr of 1 | Question
15 | 12 | 9 | All oxidoreductase inhibitors active <100nM in both human and mouse
18 | 14 | 8 | Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24 | 13 | 8 | Given a target, find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32 | 13 | 8 | For a given interaction profile, give me compounds similar to it.
37 | 13 | 8 | The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38 | 13 | 8 | Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41 | 13 | 8 | A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? I.e. return all compounds active in assays where the resolution is at least at the level of the target family (i.e. PKC), both from structured assay databases and the literature.
44 | 13 | 8 | Give me all active compounds on a given target with the relevant assay data
46 | 13 | 8 | Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59 | 14 | 8 | Identify all known protein-protein interaction inhibitors
http://www.sciencedirect.com/science/article/pii/S1359644613001542
10. [Figure: platform architecture diagram]
Core Platform: Data Cache (Virtuoso Triple Store), Semantic Workflow Engine, Linked Data API (RDF/XML, TTL, JSON), Indexing
Services: Domain Specific Services, Identity Resolution Service, Identifier Management Service, Chemistry Registration Normalisation & Q/C
Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
Data sources (Db, VoID, Nanopub): Public Content, Commercial, Public Ontologies, User Annotations
Consumers: Apps
12. Clinical data standards, today’s documentation
Human-readable
documentation in PDFs of 200+
pages and Excel files (and
some in XML).
13. Clinical data standards in the Semantic Web
Enable end-to-end interoperable data standards
for clinical research
14. Clinical data standards in the Semantic Web
Example: 14 RDF triples describing one variable (“AEACN”)
RDF triples describing one variable/data element
and also linked to related standard parts
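A rough sketch of what "triples describing one variable, linked to related standard parts" can look like in code. The namespace, the predicate names (`label`, `partOf`, `codelist`) and the handful of triples below are assumptions for illustration; they are not the 14 triples on the slide and not the published CDISC2RDF schema.

```python
from collections import defaultdict

EX = "http://example.org/sdtm#"  # hypothetical namespace, not the CDISC2RDF one

# Illustrative triples only: (subject, predicate, object).
triples = [
    (EX + "AEACN", EX + "label", "Action Taken with Study Treatment"),
    (EX + "AEACN", EX + "partOf", EX + "AE"),
    (EX + "AEACN", EX + "codelist", EX + "ACN"),
    (EX + "AE", EX + "label", "Adverse Events"),
    (EX + "ACN", EX + "label", "Action Taken (illustrative codelist)"),
]


def describe(subject, data):
    """Group everything stated about one subject as predicate -> objects."""
    description = defaultdict(list)
    for s, p, o in data:
        if s == subject:
            description[p].append(o)
    return dict(description)


def linked_resources(subject, data):
    """Objects of the subject that are themselves described in the data."""
    described = {s for s, _, _ in data}
    return sorted(o for s, _, o in data if s == subject and o in described)


aeacn = describe(EX + "AEACN", triples)
related = linked_resources(EX + "AEACN", triples)

# Follow each link one step to read the label of the related part.
for resource in related:
    print(resource, describe(resource, triples)[EX + "label"])
```

The second helper is what distinguishes this from a row in a spreadsheet: the variable's description points at other resources (here a domain and a codelist) that can be looked up in the same way.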
15. Clinical standards in the Semantic Web
Community building and knowledge sharing
• CDISC2RDF started in Oct 2012 as a pre-competitive project with AZ, Roche, W3C et al. to showcase Semantic Web standards and Linked Data principles.
• FDA meeting, Nov 2012: Solutions for Study Data Exchange Standards Meeting, with a W3C Semantic Web presentation
• June 2013: the Semantic Technology project, an FDA/PhUSE working group for Emerging Technologies, with 25+ representatives from FDA, CDISC, pharma companies, CROs and software vendors.
• Oct 2013 press release: Representing existing standards (SDTM, CDASH, SEND, ADaM) in RDF.
CDISC Interchange Europe 2011 and 2012
presentations from Roche and AstraZeneca
16. AstraZeneca’s view on “Semantics”
Enabling the hyperconnected enterprise
“We need to build a linked
data architecture enabling us
to ask questions and solve
business problems across a
heterogeneous information
landscape extending beyond
the traditional boundaries of
the enterprise.”
Semantics connects us all
17. Acknowledgements
AZ’s Linked Data of Practice members:
Tom Plasterer (lead), Jim Morris, Courtland Yockey, Sorana
Popa, Rob Hernandez, Mike Westaway, Rajan Desai, Simon
Rakov, Dana Crowley, Ian Dix, Johan Törnqvist
Collaborators and Advisors:
• Charlie Mead – IO Informatics
• Dean Allemang – Working Ontologist
• Frederik Malfait – IMOS consulting / Roche
• Phil Ashworth – TopQuadrant
Thank you! Kerstin.l.forsberg@astrazeneca.com
Editor's Notes
Mx/psa, how calculated, who did it? Mash-up. With your data too: top layer joins together, but need them all, commercial
10: Can go get everything. OPS not a repo of the world, specific sources