Workflow tools for Life Science
Research
Apr 2017
nick@openphactsfoundation.org
This webinar is being
recorded and will be uploaded
to Slideshare etc afterwards
@Open_PHACTS
LinkedIn Group
RSS & Newsletter
Agenda
Introduction to common workflow language (CWL) -
Michael Crusoe
Accessing Open PHACTS with Knime nodes to support
Life Science Business questions - James Lumley, Eli
Lilly & Company
Pipeline Pilot workflows with Open PHACTS Examples -
Jean-Marc Neefs, Janssen
Panel discussion on where next with Workflow and
supporting Life Science research
Our speakers & panel
Michael Crusoe, Common Workflow Language co-founder
James Lumley, Informatics, Eli Lilly & Company
Jean-Marc Neefs, Janssen
Panel:
– Michael Crusoe, James Lumley, Jean-Marc Neefs
– Derek Marren, Eli Lilly
– Daniela Digles, University of Vienna
– Andrei Caracoti, Biovia
Workflow Examples
The Application of the Open Pharmacological Concepts Triple Store (Open
PHACTS) to Support Drug Discovery Research
PLoS ONE 2014 DOI: 10.1371/journal.pone.0115460
Drug discovery FAQs: workflows for answering multidomain drug discovery
questions
Drug Discovery Today 2015 DOI: 10.1016/j.drudis.2014.11.006
Open PHACTS computational protocols for in silico target validation of
cellular phenotypic screens: knowing the knowns
Med. Chem. Commun. 2016 DOI: 10.1039/c6md00065g
Selectivity profiling of BCRP versus P-gp inhibition: from automated
collection of polypharmacology data to multi-label learning
J Cheminform 2016 DOI: 10.1186/s13321-016-0121-y
https://goo.gl/Aujxza
Portable life science workflows with
the Common Workflow Language
Michael R. Crusoe CWL Community Engineer
2017-04-24 @biocrusoe / #CommonWL
Open PHACTS: Workflow tools for Life Science Research
Why use a workflow management system?
Features can include:
● separation of concerns: focus on the science being
done first; then optimize execution later
● automatic job execution: start a complicated
analysis involving many pieces with a single command
● scaling (across nodes, clusters, and possibly
continents)
● automatically generated graphical user interfaces
(example: Galaxy)
● How was this file made? (automatic provenance
tracking)
Existing computational research workflow
systems
https://github.com/common-workflow-language/common-workflow-language/wiki/Existing-Workflow-systems
Why have a standard?
● Standards create a surface for collaboration that
promotes innovation
● Researchers frequently dip in and out of different
systems, but interoperability is not a basic
feature.
● Funders, journals, and other sources of
incentives prefer standards over proprietary or
single-source approaches
Common Workflow Language v1.0
● Common format for bioinformatics (and more!) tool
& workflow execution
● Community based standards effort, not a specific
software package; Very extensible
● Defined with a schema, specification, & test
suite
● Designed for shared-nothing clusters, academic
clusters, cloud environments, and local execution
● Supports the use of containers (e.g. Docker) and
shared research computing clusters with locally
installed software
Participating Organizations & Projects
Your logo here?
Why use the Common Workflow Language?
Develop your pipeline on your local computer
(optionally with containers)
Execute on your research cluster or in the cloud
Deliver to users via workbenches like Arvados, Rabix, and
Toil. Support in Galaxy, Apache Taverna, AWE, and Funnel (GCP)
is in alpha stage.
CWL Design principles
● Low barrier to entry for implementers
● Support tooling such as generators, GUIs, converters
● Allow extensions, but must be well marked
● Be part of linked data ecosystem
● Be pragmatic
Linked Data & CWL
● Hyperlinks are common currency
● Bring your own RDF ontologies for metadata
● Supports SPARQL to query
Example: can use the EDAM ontology (ELIXIR-DK) to
specify file formats and reason about them:
“FASTQ Sanger” encoding is a type of FASTQ file
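The EDAM example above ("FASTQ Sanger" is a type of FASTQ file) comes down to walking "is a kind of" links. The following is a minimal sketch of that idea only. The PARENT table is a hand-written toy, not data read from EDAM, and the labels in it are illustrative; a real tool would load these relationships from the ontology file named in $schemas.

```python
# Toy "is a kind of" table. Hypothetical and hand-written for illustration;
# a real tool would read these links from the EDAM ontology itself.
PARENT = {
    "FASTQ-sanger": "FASTQ",
    "FASTQ-illumina": "FASTQ",
}


def is_a(fmt, expected):
    """Return True if fmt equals expected or is a descendant of it."""
    while fmt is not None:
        if fmt == expected:
            return True
        fmt = PARENT.get(fmt)  # step up to the parent format, if any
    return False


# A tool that accepts any FASTQ file can take a FASTQ-sanger file,
# but a tool that needs FASTQ-sanger cannot take an arbitrary FASTQ file.
print(is_a("FASTQ-sanger", "FASTQ"))
print(is_a("FASTQ", "FASTQ-sanger"))
```

The check is deliberately one-directional: a more specific format satisfies a more general expectation, not the other way round.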
Use Cases for the CWL standards
Publication reproducibility, reusability
Workflow creation & improvement across institutions
and continents
Contests & challenges
Analysis on non-public data sets, possibly using GA4GH
job & workflow submission API
Early Adopters
(US) National Cancer Institute Cloud Pilots (Seven
Bridges Genomics, Institute for Systems Biology)
Cincinnati Children’s Hospital Medical Research Center
(Andrey Kartashov & Artem Barski)
bcbio: Validated, scalable, community developed
variant calling, RNA-seq and small RNA analysis (docs,
BOSC 2016 talk: video, slides) (Brad Chapman et al.)
Duke University, Center for Genomic and Computational
Biology: GENOMICS OF GENE REGULATION project (BOSC
2016 talk: video, slides, poster)(Dan Leehr et al.)
NCI DREAM SMC-RNA Challenge (Kyle Ellrott et al.)
Presentation
Adapted from Peter Amstutz’s presentation, licensed CC-BY-SA
Sample Real World CWL Workflow
Courtesy US NIH NCI Genomic Data Commons, visualization from
https://view.commonwl.org/workflows/github.com/NCI-GDC/gdc-dnaseq-cwl/tree/master/workflows/dnaseq/transform.cwl
Announcing: v1.0!
http://www.commonwl.org/v1.0/
Authors:
Peter Amstutz, Arvados Project, Curoverse
Michael R. Crusoe, Common Workflow Language project
Nebojša Tijanić, Seven Bridges Genomics
Contributors:
Brad Chapman, Harvard Chan School of Public Health
John Chilton, Galaxy Project, Pennsylvania State University
Michael Heuer, UC Berkeley AMPLab
Andrey Kartashov, Cincinnati Children's Hospital
Dan Leehr, Duke University
Hervé Ménager, Institut Pasteur
Maya Nedeljkovich, Seven Bridges Genomics
Matt Scales, Institute of Cancer Research, London
Stian Soiland-Reyes, University of Manchester
Luka Stojanovic, Seven Bridges Genomics
How did we do it?
Initial group started at BOSC Codefest 2014
Moved to open mailing list and extended onto GitHub &
then Gitter chat
Frequent (twice a month or more) video chats to work
through design issues with summaries emailed
Some participants doing CWL community work during
their day jobs, some on “nights & weekends”.
In October 2015 Seven Bridges sponsored one of the
co-founders (M. Crusoe) to work full time on the
project
Community Based Standards development
Different model than traditional nation-based or
regulatory approach
We adopted the Open-Stand.org Modern Paradigm for
Standards: Cooperation, Adherence to Principles (Due
process, Broad consensus, Transparency, Balance,
Openness), Collective Empowerment, (Free)
Availability, Voluntary Adoption
Challenges
Giving a standard to a community that is “free as in
puppies”: How does the community participate? How will
maintenance be funded?
CWL isn’t the only effort that has these needs; can we
join with related efforts?
A Grand Opportunity
if:
properly funded and embraced by the wider community
then:
the researchobject.org standards + CWL could fulfill
the huge need for an executable and complete
description of how computationally derived research
results were made
What’s next for the Common Workflow
Language?
Public charity to own the standard
Tooling improvements
More implementations (Galaxy, Taverna, Kepler, Xenon,
…?)
Integration with researchobject.org standards for
attribution, provenance, and metadata guidance.
Thanks!
http://commonwl.org
Michael R. Crusoe, who is this guy?
Phoenix, Arizona (Sonoran Desert), USA
Studied at Arizona State University: Computer Science;
time in industry as a developer & system administrator
(Google, others); returned to academia to study
Microbiology.
Introduced to bioinformatics via Anolis (lizard)
genome assembly and analysis (Kenro Kusumi, Arizona
State University)
Returned to software engineering as a Research
Software Engineer for k-h-mer project (C. Titus Brown,
Michigan State University, then U. of California,
Davis)
Adapted from Peter Amstutz’s presentation, licensed CC-BY-SA
Example: samtools-sort.cwl

# File type & metadata
class: CommandLineTool
cwlVersion: v1.0
doc: Sort by chromosomal coordinates

# Input parameters
inputs:
  aligned_sequences:
    type: File
    format: edam:format_2572  # BAM binary alignment format
    inputBinding:
      position: 1

# Output parameters
outputs:
  sorted_aligned_sequences:
    type: stdout
    format: edam:format_2572

# Executable
baseCommand: [samtools, sort]

# Runtime environment
hints:
  DockerRequirement:
    dockerPull: quay.io/cancercollaboratory/dockstore-tool-samtools-sort

# Linked data support
$namespaces: { edam: "http://edamontology.org/" }
$schemas: [ "http://edamontology.org/EDAM_1.15.owl" ]
File type & metadata
class: CommandLineTool
cwlVersion: v1.0
doc: Sort by chromosomal coordinates
● Identify as a CommandLineTool object
● Core spec includes simple comments
● Metadata about tool extensible to arbitrary RDF
vocabularies, e.g.
○ Biotools & EDAM
○ Dublin Core Terms (DCT)
○ Description of a Project (DOAP)
● GA4GH Tool Registry project will develop best
practices for metadata & attribution
Runtime Environment
hints:
  DockerRequirement:
    dockerPull: quay.io/[...]samtools-sort
● Define the execution environment of the tool
● “requirements” must be fulfilled, or execution fails with an error
● “hints” are soft requirements (they express a preference,
but it is not an error if they are not satisfied)
● Also used to enable optional CWL features
○ Mechanism for defining extensions
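The difference between "requirements" and "hints" can be sketched in a few lines. This is an illustration of the rule stated on the slide, not code from any CWL implementation. The function name check_runtime and the supported set are assumptions made for the example, and the map form of requirements and hints follows the slide.

```python
import warnings


def check_runtime(tool, supported):
    """Raise on unmet requirements; warn on unmet hints and return them."""
    for name in tool.get("requirements", {}):
        if name not in supported:
            # A requirement that cannot be met is a hard error.
            raise RuntimeError(f"unsupported requirement: {name}")
    unmet = [name for name in tool.get("hints", {}) if name not in supported]
    for name in unmet:
        # A hint is only a preference, so execution may continue.
        warnings.warn(f"hint not satisfied, continuing without it: {name}")
    return unmet


tool = {
    "hints": {
        "DockerRequirement": {
            "dockerPull": "quay.io/cancercollaboratory/dockstore-tool-samtools-sort"
        }
    }
}

# A hypothetical runner without container support still accepts the tool.
print(check_runtime(tool, supported={"ScatterFeatureRequirement"}))
```

Moving DockerRequirement from "hints" to "requirements" in the same dictionary would turn the warning into an error on a runner that lacks container support.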
Input parameters
● Specify name & type of input parameters
○ Based on the Apache Avro type system
○ null, boolean, int, string, float, array, record
○ File formats can be IANA Media/MIME types, or from domain
specific ontologies, like EDAM for bioinformatics
● “inputBinding”: describes how to turn parameter
value into actual command line argument
inputs:
  aligned_sequences:
    type: File
    format: edam:format_2572 # BAM binary format
    inputBinding:
      position: 1
Command Line Building
● Associate input values with parameters
● Apply input bindings to generate strings
● Sort by “position”
● Prefix “base command”
Tool description:
inputs:
  aligned_sequences:
    type: File
    format: edam:format_2572
    inputBinding:
      position: 1
baseCommand: [samtools, sort]
Input object:
aligned_sequences:
  class: File
  location: example.bam
  format: http://edamontology.org/format_2572
Resulting command line:
["samtools", "sort", "example.bam"]
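The four command line building steps can be sketched as follows. This is a simplified illustration under stated assumptions: the function name build_command_line is made up for the example, only the "position" field of an input binding is handled, and a File value is rendered by its "location". A real CWL runner also handles prefixes, arrays, staged file paths and more.

```python
def build_command_line(base_command, input_params, input_object):
    """Build an argv list from a tool description and an input object."""
    bound = []
    for name, param in input_params.items():
        binding = param.get("inputBinding")
        if binding is None or name not in input_object:
            continue  # no binding or no value: nothing to add to the command line
        value = input_object[name]  # associate the input value with its parameter
        # Apply the binding: File values are dicts with a "location",
        # any other value is turned into a string as-is.
        text = value["location"] if isinstance(value, dict) else str(value)
        bound.append((binding.get("position", 0), name, text))
    bound.sort()  # sort by position, ties broken by parameter name
    # Prefix the base command.
    return list(base_command) + [text for _, _, text in bound]


tool_inputs = {
    "aligned_sequences": {"type": "File", "inputBinding": {"position": 1}},
}
job = {"aligned_sequences": {"class": "File", "location": "example.bam"}}

print(build_command_line(["samtools", "sort"], tool_inputs, job))
# prints ['samtools', 'sort', 'example.bam']
```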
Output parameters
● Specify name & type of output parameters
● In this example, capture the STDOUT stream from
“samtools sort” and tag it as being BAM formatted.
outputs:
  sorted_aligned_sequences:
    type: stdout
    format: edam:format_2572
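Capturing a tool's standard output and tagging the resulting file with a format can be sketched with the Python standard library. The helper name run_and_capture is an assumption for this example, and a short Python one-liner stands in for "samtools sort" so the sketch runs without samtools installed. The stand-in prints text, so the BAM format tag is shown only to illustrate where the tag goes.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_capture(argv, out_path, fmt):
    """Run argv, write its standard output to out_path, return a File-like record."""
    with open(out_path, "wb") as handle:
        subprocess.run(argv, stdout=handle, check=True)
    return {"class": "File", "location": str(out_path), "format": fmt}


# Stand-in command (assumption): replaces ["samtools", "sort", "example.bam"].
stand_in = [sys.executable, "-c", "print('stand-in for samtools sort')"]
out_file = Path(tempfile.mkdtemp()) / "sorted_aligned_sequences.out"

result = run_and_capture(stand_in, out_file, "http://edamontology.org/format_2572")
print(result["class"], result["format"])
```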
Workflows
● Specify data dependencies between steps
● Scatter/gather on steps
● Can nest workflows in steps
● Still working on:
● Conditionals & looping
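Scatter/gather means running one step once per element of an input array and then handing the collected results to the next step. A minimal sketch in plain Python follows. The function names and the in-memory "files" (lists of lines) are assumptions for illustration, and a real workflow engine may run the scattered jobs in parallel rather than in a loop.

```python
def scatter(step, values, **fixed):
    """Run step once per element of values, keeping results in input order."""
    return [step(value, **fixed) for value in values]


def grep_lines(lines, pattern):
    """Keep the lines that contain pattern."""
    return [line for line in lines if pattern in line]


def count_lines(groups):
    """Gather: count the lines across all scattered results."""
    return sum(len(group) for group in groups)


# Two in-memory stand-ins for input files.
infiles = [["alpha", "beta"], ["gamma", "alphabet"]]

matches = scatter(grep_lines, infiles, pattern="alpha")
print(matches)               # prints [['alpha'], ['alphabet']]
print(count_lines(matches))  # prints 2
```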
Example: grep & count
class: Workflow
cwlVersion: v1.0
inputs:
  pattern: string
  infiles: File[]
outputs:
  outfile:
    type: File
    outputSource: wc/outfile
requirements:
  - class: ScatterFeatureRequirement
steps:
  grep:
    run: grep.cwl
    in:
      pattern: pattern
      infile: infiles
    scatter: infile
    out: [outfile]
  wc:
    run: wc.cwl
    in:
      infiles: grep/outfile
    out: [outfile]
Source file:
https://github.com/common-workflow-language/workflows/blob/2855f2c3ea875128ff62101295897d8d11d99b94/workflows/presentation-demo/grep-and-count.cwl
Example: grep & count (annotated)
● “run: grep.cwl” and “run: wc.cwl”: tool to run
● “scatter: infile”: scatter over input array
● “infiles: grep/outfile”: connect output of “grep” to input of “wc”
● “outputSource: wc/outfile”: connect output of “wc” to workflow output
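Because each step names the source of its inputs, an engine can derive the execution order from the data dependencies alone. The sketch below does this for the grep & count steps using graphlib from the Python standard library (Python 3.9 or later). The function name execution_order is an assumption, and the sketch treats any source of the form "step/output" as a dependency on that step, which is a simplification of the real rules.

```python
from graphlib import TopologicalSorter

# The "in" sections of the two steps, copied from the workflow.
steps = {
    "grep": {"in": {"pattern": "pattern", "infile": "infiles"}},
    "wc": {"in": {"infiles": "grep/outfile"}},
}


def execution_order(steps):
    """Order steps so that every step runs after the steps it reads from."""
    graph = {}
    for name, step in steps.items():
        upstream_steps = set()
        for source in step["in"].values():
            upstream = source.split("/")[0]
            # "grep/outfile" refers to step "grep"; a bare name is a workflow input.
            if "/" in source and upstream in steps:
                upstream_steps.add(upstream)
        graph[name] = upstream_steps
    return list(TopologicalSorter(graph).static_order())


print(execution_order(steps))  # prints ['grep', 'wc']
```

Nothing in the workflow states that "grep" runs first; the order falls out of the fact that "wc" consumes grep/outfile.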
Accessing the
Open PHACTS Linked Data API
with KNIME
James A. Lumley
Research IT, Eli Lilly
April 2017
The KNIME Analytics Platform
Open source platform for data analytics. Over 1000 modules (or nodes) to connect to all major data
sources; support for many data types inc. XML/JSON/Images/Docs/Chemical Formats; Math and Stats
functions, Predictive modelling and machine learning; Tool blending for Python/R/Weka/SQL/Java;
Interactive data views and reporting. “a toolbox for any data scientist”.
https://www.knime.org/knime-analytics-platform
OPS-KNIME Nodes
♦ 2016 (VU Amsterdam)*
• Original nodes and workflows by Ronald Siebes, VU Amsterdam
• OPS_Swagger and OPS_JSON nodes used to create and execute the
parameterized API calls, as well as transforming the output to a tabular form
♦ Q2 2017 (Eli Lilly)
• Update of the Erl Wood KNIME nodes will add a new OPS node developed internally
at Eli Lilly with input from OPS
– KNIME Node: Luke Bullard
– Team input: James Lumley / Derek Marren (Lilly); Daniela Digles / Nick Lynch (OPS);
Randy Kerber (d2discovery)
– Workflows: James Lumley
• A single node allows the user to select the call of interest and return both JSON and
tabular results
• Focus of development: updating to the new API, improving usability
• Further iterations possible once feedback is received
* http://www.openphactsfoundation.org/wp/wp-content/uploads/2016/02/2016-02-25_Creating-workflows-for-drug-discovery-with-Open-PHACTS-and-KNIME.pdf
OPS & Erl Wood Community Nodes
♦ View based on internal Beta
of Lilly opensource Erl Wood
nodes due for release Q2
2017
♦ Community → Erlwood Nodes
→ Open PHACTS
♦ Open PHACTS sub-folder
contains single OPS Linked
Data API node that will allow a
configured call/return
Configuring the OPS Linked Data API node
♦ Preferences panel allows client/workflow
level control of API URL Endpoint and API
Id/Key, avoiding the need to configure
these in the node
Using the OPS Linked Data API node
App Id and App Key fields are automatically populated if they
are set in the preferences
Drop down ‘Select Method Type’ allows selection of API call
Logically grouped methods match developer API docs (swagger) at
https://dev.openphacts.org/docs/2.1
Input port is optional. Toggle on input field allows user string
input or selection of input table column
Allows formatted results table or full
JSON/XML return for debug/analysis
First output port returns formatted data table
(corresponding to API param “_format=tsv”)
Second output port is optional and, if requested, will return
JSON or XML response (via second API call
without _format param)
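The two output ports correspond to two calls that differ only in the _format parameter. A sketch of how such a pair of request URLs could be built is shown below. The base URL, the method path and the query parameter "q" are placeholders invented for the example, and the parameter names app_id and app_key are assumed from the App Id and App Key fields described above. No request is sent.

```python
from urllib.parse import urlencode


def build_calls(base_url, method_path, params, app_id, app_key):
    """Return (table_url, raw_url): the same call with and without _format=tsv."""
    common = {**params, "app_id": app_id, "app_key": app_key}
    table_query = urlencode({**common, "_format": "tsv"})  # first output port
    raw_query = urlencode(common)                          # second output port
    return (
        f"{base_url}{method_path}?{table_query}",
        f"{base_url}{method_path}?{raw_query}",
    )


# Placeholder endpoint and credentials (assumptions, not real values).
table_url, raw_url = build_calls(
    "https://api.example.org/2.1",
    "/example/method",
    {"q": "aspirin"},
    app_id="MY_APP_ID",
    app_key="MY_APP_KEY",
)
print(table_url)
print(raw_url)
```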
User input and example return
User input
Raw Tabular Return:
Pivoted to show Column Names and Values:
Optional JSON Output as raw JSON Object
Rather than parsing the JSON to
understand the raw output, the node also
has an attached ‘View’ with a hierarchically
formatted tree view of the JSON output:
Generic JSON Extraction to
flat table shows additional
data returned from API;
deeper JSON processing
can be done using KNIME
JSON nodes
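Extracting a nested JSON response into a flat table of names and values can be sketched generically. This is not the logic of the KNIME node, only an illustration of the idea. The function name flatten, the dotted path convention and the sample response are assumptions made for the example.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a {path: leaf value} mapping."""
    rows = {}
    if isinstance(value, dict):
        for key, child in value.items():
            rows.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            rows.update(flatten(child, f"{prefix}[{index}]"))
    else:
        rows[prefix] = value  # leaf: one row of the flat table
    return rows


# Made-up response shape, for illustration only.
response = {"result": {"label": "example", "items": [{"id": 1}, {"id": 2}]}}

for column, cell in flatten(response).items():
    print(column, cell)
```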
JSON/XML Support in KNIME 3.3
Extensive native support for JSON or XML parsing with KNIME 3.3 allows
complete/custom parsing of the return JSON object for full debugging
Chemistry Support on input SMI
Input columns of differing
chemical types are
automatically converted to
SMILES via Marvin if the API
param is SMILES based
API Timeouts and URL changes
Advanced developers can
change the API timeout value or
edit the API URL on a single
node using the Web Service
panel
Summary
1. A new KNIME 3.3 compatible “OpenPHACTS Linked Data API”
node will be released in Q2 2017
2. Designed for users, it provides easy configuration of API settings
and parameters with easy-to-use tabular data return (via the API
_format parameter)
3. Designed for developers, it allows an additional full JSON/XML
response that the expert user can view and parse to see the raw
response
4. Further example workflows will be released once the node is
available
Pipeline Pilot workflows with
Open PHACTS Examples
Jean-Marc Neefs – Janssen
Open PHACTS 2012-2013
Targets
Pathways
Compounds
UniProt
Gene Ontology
Enzyme Classification
ChEMBL
DrugBank
ChEBI
WikiPathways
Reactome
KEGG
List compounds active on target X
Open PHACTS + Pipeline Pilot Workflow:
1. Search target information
• [OPS API call ‘Free Text to Concept’]
2. Get active compounds on that target
• [OPS API call ‘Target Pharmacology: List’]
Schematic Pipeline
Open PHACTS
components
Integrated Pipeline includes more data sources
Open PHACTS 2013-2014
Targets
Diseases
Tissues
Pathways
Compounds
DisGeNet
Human Protein Atlas
NextProt
Find compounds against Alzheimer’s targets
Open PHACTS + Pipeline Pilot Workflow:
1. Search for disease
• [OPS API call ‘Free Text to Concept’]
2. Search target information
• [OPS API call ‘Targets for Disease: List’]
3. Get active compounds on that target
• [OPS API call ‘Target Pharmacology: List’]
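The three-step workflow is a chain in which each call feeds the next. The sketch below shows only that chaining. The call function, its keyword arguments and the FAKE lookup table are all made up for illustration: they stand in for the real OPS API calls named above and contain no real identifiers or results.

```python
def compounds_for_disease(call, disease_text):
    """Chain the three calls: disease text -> diseases -> targets -> compounds."""
    compounds = []
    for disease in call("Free Text to Concept", q=disease_text):
        for target in call("Targets for Disease: List", uri=disease):
            compounds.extend(call("Target Pharmacology: List", uri=target))
    return compounds


# Made-up stand-in data so the sketch runs offline (not real API output).
FAKE = {
    ("Free Text to Concept", "example disease"): ["disease:1"],
    ("Targets for Disease: List", "disease:1"): ["target:A", "target:B"],
    ("Target Pharmacology: List", "target:A"): ["compound:x"],
    ("Target Pharmacology: List", "target:B"): ["compound:y", "compound:z"],
}


def fake_call(method, **params):
    """Look up a canned answer for (method, single parameter value)."""
    (value,) = params.values()
    return FAKE.get((method, value), [])


print(compounds_for_disease(fake_call, "example disease"))
# prints ['compound:x', 'compound:y', 'compound:z']
```

Passing the call function in as an argument keeps the chaining logic separate from how the API is reached, so the same function could be driven by a real client.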
Real Pipeline
Open PHACTS 2015-2016
Targets
Diseases
Tissues
Pathways
Compounds
More Datasets
on
Phenotypic Screening
Source: Digles et al. Medchemcomm. 2016; 7(6): 1237–1244, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5063042/
Complex workflows: Collecting information from phenotypic screens
Panel Questions and
discussions
Thanks for your engagement

CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Open PHACTS April 2017 Science webinar Workflow tools

  • 1. Workflow tools for Life Science Research Apr 2017 nick@openphactsfoundation.org
  • 2. This webinar is being recorded and will be uploaded to Slideshare etc afterwards @Open_PHACTS LinkedIn Group RSS & Newsletter
  • 3. Agenda Introduction to common workflow language (CWL) - Michael Crusoe Accessing Open PHACTS with Knime nodes to support Life Science Business questions - James Lumley, Eli Lilly & Company Pipeline Pilot workflows with Open PHACTS Examples Jean-Marc Neefs, Janssen Panel discussion on where next with Workflow and supporting Life Science research
  • 4. Our speakers & panel Michael Crusoe, Common Workflow Language co-founder James Lumley, Informatics, Eli Lilly & Company Jean-Marc Neefs, Janssen Panel: – Michael Crusoe, James Lumley, Jean-Marc Neefs – Derek Marren, Eli Lilly – Daniela Digles, University of Vienna – Andrei Caracoti, Biovia
  • 5. Workflow Examples The Application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to Support Drug Discovery Research PLoS ONE 2014 DOI: 10.1371/journal.pone.0115460 Drug discovery FAQs: workflows for answering multidomain drug discovery questions Drug Discovery Today 2015 DOI: 10.1016/j.drudis.2014.11.006 Open PHACTS computational protocols for in silico target validation of cellular phenotypic screens: knowing the knowns Med. Chem. Commun. 2016 DOI: 10.1039/c6md00065g Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning J Cheminform 2016 DOI: 10.1186/s13321-016-0121-y
  • 6. https://goo.gl/Aujxza Portable life science workflows with the Common Workflow Language. Michael R. Crusoe, CWL Community Engineer, 2017-04-24, @biocrusoe / #CommonWL. Open PHACTS: Workflow tools for Life Science Research
  • 7. Why use a workflow management system? Features can include: ● separation of concerns: focus on the science being done first; then optimize execution later ● automatic job execution: start a complicated analysis involving many pieces with a single command ● scaling (across nodes, clusters, and possibly continents) ● automatically generated graphical user interfaces (example: Galaxy) ● How was this file made? (automatic provenance tracking)
  • 8. Existing computational research workflow systems https://github.com/common-workflow-language/common-workflow-language/wiki/Existing-Workflow-systems
  • 12. Why have a standard? ● Standards create a surface for collaboration that promotes innovation ● Researchers frequently dip in and out of different systems, but interoperability is not a basic feature ● Funders, journals, and other sources of incentives prefer standards over proprietary or single-source approaches
  • 13. Common Workflow Language v1.0 ● Common format for bioinformatics (and more!) tool & workflow execution ● Community based standards effort, not a specific software package; very extensible ● Defined with a schema, specification, & test suite ● Designed for shared-nothing clusters, academic clusters, cloud environments, and local execution ● Supports the use of containers (e.g. Docker) and shared research computing clusters with locally installed software
  • 15. Why use the Common Workflow Language? Develop your pipeline on your local computer (optionally with containers). Execute on your research cluster or in the cloud. Deliver to users via workbenches like Arvados, Rabix, Toil. Support in Galaxy, Apache Taverna, AWE, and Funnel (GCP) is in alpha stage.
  • 16. CWL Design principles ● Low barrier to entry for implementers ● Support tooling such as generators, GUIs, converters ● Allow extensions, but must be well marked ● Be part of linked data ecosystem ● Be pragmatic
  • 17. Linked Data & CWL ● Hyperlinks are common currency ● Bring your own RDF ontologies for metadata ● Supports SPARQL to query Example: can use the EDAM ontology (ELIXIR-DK) to specify file formats and reason about them: “FASTQ Sanger” encoding is a type of FASTQ file
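The kind of reasoning mentioned on this slide ("FASTQ Sanger" is a type of FASTQ) can be sketched in a few lines of Python. This is only an illustration: the `PARENT` table is a hand-written, hypothetical fragment that uses plain labels, whereas the real EDAM ontology identifies formats by IRIs and would normally be queried through an RDF library or SPARQL.

```python
# Hypothetical, hand-written fragment of a format hierarchy (child -> parent).
# Real EDAM terms are IRIs; plain labels are used here for readability.
PARENT = {
    "FASTQ-sanger": "FASTQ",
    "FASTQ-illumina": "FASTQ",
    "FASTQ": "sequence format",
}


def is_a(fmt, ancestor):
    """Return True if fmt equals ancestor or descends from it."""
    while fmt is not None:
        if fmt == ancestor:
            return True
        fmt = PARENT.get(fmt)  # walk one level up; None at the root
    return False


# A tool that accepts any FASTQ file can accept a FASTQ-sanger file,
# but not the other way round.
print(is_a("FASTQ-sanger", "FASTQ"))
print(is_a("FASTQ", "FASTQ-sanger"))
```

A workflow engine could use a check like this to decide whether the declared format of an output is acceptable as the input of the next step.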
  • 18. Use Cases for the CWL standards Publication reproducibility, reusability Workflow creation & improvement across institutions and continents Contests & challenges Analysis on non-public data sets, possibly using GA4GH job & workflow submission API
  • 19. Early Adopters (US) National Cancer Institute Cloud Pilots (Seven Bridges Genomics, Institute for Systems Biology) Cincinnati Children’s Hospital Medical Research Center (Andrey Kartashov & Artem Barski) bcbio: Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis (docs, BOSC 2016 talk: video, slides) (Brad Chapman et al.) Duke University, Center for Genomic and Computational Biology: GENOMICS OF GENE REGULATION project (BOSC 2016 talk: video, slides, poster)(Dan Leehr et al.) NCI DREAM SMC-RNA Challenge (Kyle Ellrott et al.) Presentation
  • 20. Adapted from Peter Amstutz’s presentation, licensed CC-BY-SA Sample Real World CWL Workflow Courtesy US NIH NCI Genomic Data Commons, visualization from https://view.commonwl.org/workflows/github.com/NCI-GDC/gdc-dnaseq-cwl/tree/master/workflows/dnaseq/transform.cwl
  • 21. Announcing: v1.0! http://www.commonwl.org/v1.0/ Authors: Peter Amstutz, Arvados Project, Curoverse Michael R. Crusoe, Common Workflow Language project Nebojša Tijanić, Seven Bridges Genomics Contributors: Brad Chapman, Harvard Chan School of Public Health John Chilton, Galaxy Project, Pennsylvania State University Michael Heuer, UC Berkeley AMPLab Andrey Kartashov, Cincinnati Children's Hospital Dan Leehr, Duke University Hervé Ménager, Institut Pasteur Maya Nedeljkovich, Seven Bridges Genomics Matt Scales, Institute of Cancer Research, London Stian Soiland-Reyes, University of Manchester Luka Stojanovic, Seven Bridges Genomics
  • 22. How did we do it? Initial group started at BOSC Codefest 2014 Moved to open mailing list and extended onto GitHub & then Gitter chat Frequent (twice a month or more) video chats to work through design issues with summaries emailed Some participants doing CWL community work during their day jobs, some on “nights & weekends”. In October 2015 Seven Bridges sponsored one of the co-founders (M. Crusoe) to work full time on the project
  • 23. Community Based Standards development Different model than traditional nation-based or regulatory approach We adopted the Open-Stand.org Modern Paradigm for Standards: Cooperation, Adherence to Principles (Due process, Broad consensus, Transparency, Balance, Openness), Collective Empowerment, (Free) Availability, Voluntary Adoption
  • 24. Challenges Giving a standard to a community that is “free as in puppies”: How does the community participate? How will maintenance be funded? CWL isn’t the only effort that has these needs; can we join with related efforts?
  • 25. A Grand Opportunity if: properly funded and embraced by the wider community then: the researchobject.org standards + CWL could fulfill the huge need for an executable and complete description of how computationally derived research results were made
  • 26. What’s next for the Common Workflow Language? Public charity to own the standard Tooling improvements More implementations (Galaxy, Taverna, Kepler, Xenon, …?) Integration with researchobject.org standards for attribution, provenance, and metadata guidance.
  • 28. Michael R. Crusoe, who is this guy? Phoenix, Arizona (Sonoran Desert), USA Studied at Arizona State University: Computer Science; time in industry as a developer & system administrator (Google, others); returned to academia to study Microbiology. Introduced to bioinformatics via Anolis (lizard) genome assembly and analysis (Kenro Kusumi, Arizona State University) Returned to software engineering as a Research Software Engineer for k-h-mer project (C. Titus Brown, Michigan State University, then U. of California, Davis)
  • 29. Example: samtools-sort.cwl
        # File type & metadata
        class: CommandLineTool
        cwlVersion: v1.0
        doc: Sort by chromosomal coordinates
        # Input parameters
        inputs:
          aligned_sequences:
            type: File
            format: edam:format_2572  # BAM binary alignment format
            inputBinding:
              position: 1
        # Output parameters
        outputs:
          sorted_aligned_sequences:
            type: stdout
            format: edam:format_2572
        # Executable
        baseCommand: [samtools, sort]
        # Runtime environment
        hints:
          DockerRequirement:
            dockerPull: quay.io/cancercollaboratory/dockstore-tool-samtools-sort
        # Linked data support
        $namespaces: { edam: "http://edamontology.org/" }
        $schemas: [ "http://edamontology.org/EDAM_1.15.owl" ]
  • 30. File type & metadata
        class: CommandLineTool
        cwlVersion: v1.0
        doc: Sort by chromosomal coordinates
      ● Identify as a CommandLineTool object
      ● Core spec includes simple comments
      ● Metadata about tool extensible to arbitrary RDF vocabularies, e.g.
        ○ Biotools & EDAM
        ○ Dublin Core Terms (DCT)
        ○ Description of a Project (DOAP)
      ● GA4GH Tool Registry project will develop best practices for metadata & attribution
  • 31. Runtime Environment
        hints:
          DockerRequirement:
            dockerPull: quay.io/[...]samtools-sort
      ● Define the execution environment of the tool
      ● “requirements” must be fulfilled; an unmet requirement is an error
      ● “hints” are soft requirements (they express a preference, but it is not an error if they are not satisfied)
      ● Also used to enable optional CWL features
        ○ Mechanism for defining extensions
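The difference between "requirements" and "hints" can be sketched as follows. This is a minimal Python illustration of the rule stated on the slide, not the code of any real CWL runner: `check_runtime`, `UnsupportedRequirement`, and the `supported` set are hypothetical names introduced here.

```python
import warnings


class UnsupportedRequirement(Exception):
    """Raised when a hard requirement cannot be met (hypothetical name)."""


def check_runtime(requirements, hints, supported):
    """Fail on unmet requirements; only warn about unmet hints.

    Returns the list of hints that could not be honoured.
    """
    for name in requirements:
        if name not in supported:
            raise UnsupportedRequirement(name)
    unmet_hints = [name for name in hints if name not in supported]
    for name in unmet_hints:
        warnings.warn(f"hint {name} not satisfied; continuing without it")
    return unmet_hints


# An engine without Docker can still run a tool that only *hints* at Docker...
supported = {"ScatterFeatureRequirement"}
print(check_runtime([], ["DockerRequirement"], supported))

# ...but must refuse a tool that *requires* it.
try:
    check_runtime(["DockerRequirement"], [], supported)
except UnsupportedRequirement as err:
    print("cannot run, unmet requirement:", err)
```

Listing Docker under "hints", as the samtools example does, keeps the tool usable on shared clusters where the software is installed locally.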
  • 32. Input parameters
        inputs:
          aligned_sequences:
            type: File
            format: edam:format_2572  # BAM binary format
            inputBinding:
              position: 1
      ● Specify name & type of input parameters
        ○ Based on the Apache Avro type system
        ○ null, boolean, int, string, float, array, record
        ○ File formats can be IANA Media/MIME types, or from domain specific ontologies, like EDAM for bioinformatics
      ● “inputBinding”: describes how to turn parameter value into actual command line argument
  • 34. Command Line Building
        inputs:
          aligned_sequences:
            type: File
            format: edam:format_2572
            inputBinding:
              position: 1
        baseCommand: [samtools, sort]
      Input object:
        aligned_sequences:
          class: File
          location: example.bam
          format: http://edamontology.org/format_2572
      Resulting command line: ["samtools", "sort", "example.bam"]
      ● Associate input values with parameters
      ● Apply input bindings to generate strings
      ● Sort by “position”
      ● Prefix “base command”
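The four command line building steps can be sketched in Python. This is a deliberately simplified model, assuming only positional bindings and File or scalar values; the function name `build_command_line` is hypothetical, and real CWL runners also handle prefixes, arrays, staged file paths, and more.

```python
def build_command_line(base_command, inputs_spec, input_object):
    """Build an argv list from a tool description and an input object."""
    bound = []
    for name, spec in inputs_spec.items():
        binding = spec.get("inputBinding")
        # Step 1: associate input values with parameters; parameters without
        # a binding or without a value add nothing to the command line.
        if binding is None or name not in input_object:
            continue
        value = input_object[name]
        # Step 2: apply the binding to generate a string. A File value is
        # passed as its location; anything else is stringified.
        if isinstance(value, dict) and value.get("class") == "File":
            text = value["location"]
        else:
            text = str(value)
        bound.append((binding.get("position", 0), name, text))
    # Step 3: sort by position (ties broken by parameter name).
    bound.sort()
    # Step 4: prefix the base command.
    return list(base_command) + [text for _, _, text in bound]


inputs_spec = {
    "aligned_sequences": {
        "type": "File",
        "format": "edam:format_2572",
        "inputBinding": {"position": 1},
    }
}
input_object = {
    "aligned_sequences": {
        "class": "File",
        "location": "example.bam",
        "format": "http://edamontology.org/format_2572",
    }
}
argv = build_command_line(["samtools", "sort"], inputs_spec, input_object)
print(argv)  # ['samtools', 'sort', 'example.bam']
```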
  • 35. Output parameters
        outputs:
          sorted_aligned_sequences:
            type: stdout
            format: edam:format_2572
      ● Specify name & type of output parameters
      ● In this example, capture the STDOUT stream from “samtools sort” and tag it as being BAM formatted.
  • 36. Workflows ● Specify data dependencies between steps ● Scatter/gather on steps ● Can nest workflows in steps ● Still working on: ○ Conditionals & looping
  • 37. Example: grep & count
        class: Workflow
        cwlVersion: v1.0
        inputs:
          pattern: string
          infiles: File[]
        outputs:
          outfile:
            type: File
            outputSource: wc/outfile
        requirements:
          - class: ScatterFeatureRequirement
        steps:
          grep:
            run: grep.cwl
            in:
              pattern: pattern
              infile: infiles
            scatter: infile
            out: [outfile]
          wc:
            run: wc.cwl
            in:
              infiles: grep/outfile
            out: [outfile]
      Source file: https://github.com/common-workflow-language/workflows/blob/2855f2c3ea875128ff62101295897d8d11d99b94/workflows/presentation-demo/grep-and-count.cwl
  • 38. Example: grep & count (annotated)
        class: Workflow
        cwlVersion: v1.0
        inputs:
          pattern: string
          infiles: File[]
        outputs:
          outfile:
            type: File
            outputSource: wc/outfile    # Connect output of “wc” to workflow output
        requirements:
          - class: ScatterFeatureRequirement
        steps:
          grep:
            run: grep.cwl               # Tool to run
            in:
              pattern: pattern
              infile: infiles
            scatter: infile             # Scatter over input array
            out: [outfile]
          wc:
            run: wc.cwl
            in:
              infiles: grep/outfile     # Connect output of “grep” to input of “wc”
            out: [outfile]
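The data flow of this workflow (scatter `grep` over the input array, gather the results, feed them to a single `wc`) can be mimicked in plain Python. The sketch below is an analogy only: the `grep` and `wc` functions are hypothetical stand-ins for `grep.cwl` and `wc.cwl`, the "files" are in-memory lists of lines, and the sample data is invented.

```python
def grep(pattern, lines):
    """Stand-in for grep.cwl: keep the lines that contain the pattern."""
    return [line for line in lines if pattern in line]


def wc(files):
    """Stand-in for wc.cwl: count lines across all the files it is given."""
    return sum(len(lines) for lines in files)


def grep_and_count(pattern, infiles):
    # Scatter: one independent grep job per element of the input array.
    # A workflow engine is free to run these jobs in parallel.
    grep_outfiles = [grep(pattern, infile) for infile in infiles]
    # Gather: the per-job outputs form an array that feeds the single wc step.
    return wc(grep_outfiles)


infiles = [
    ["ACGT", "TTGA", "ACGA"],  # invented contents of a first file
    ["GGCC", "ACGG"],          # invented contents of a second file
]
total = grep_and_count("ACG", infiles)
print(total)  # 3
```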
  • 39. Accessing the Open PHACTS Linked Data API with KNIME James A. Lumley Research IT, Eli Lilly April 2017
  • 40. The KNIME Analytics Platform Open source platform for data analytics. Over 1000 modules (or nodes) to connect to all major data sources; support for many data types inc. XML/JSON/Images/Docs/Chemical Formats; Math and Stats functions, Predictive modelling and machine learning; Tool blending for Python/R/Weka/SQL/Java; Interactive data views and reporting. “a toolbox for any data scientist”. https://www.knime.org/knime-analytics-platform
  • 41. OPS-KNIME Nodes ♦ 2016 (VU Amsterdam)* • Original Nodes and workflows by Ronald Siebes, VU Amsterdam • OPS_Swagger and OPS_JSON nodes used to create and execute the parameterized API calls, as well as transforming the output to a tabular form ♦ Q2 2017 (Eli Lilly) • Update of Erl Wood KNIME Nodes will add new OPS node developed internally at Eli Lilly with input from OPS – KNIME Node: Luke Bullard – Team input: James Lumley / Derek Marren (Lilly); Daniela Digles / Nick Lynch (OPS); Randy Kerber (d2discovery) – Workflows: James Lumley • Single Node allows user to select the call of interest and return both JSON and Tabular results • Focus of development: Updating to new API, improving usability • Further iterations possible once feedback received * http://www.openphactsfoundation.org/wp/wp-content/uploads/2016/02/2016-02-25_Creating-workflows-for-drug-discovery-with-Open-PHACTS-and-KNIME.pdf
  • 42. OPS & Erl Wood Community Nodes ♦ View based on internal Beta of Lilly open source Erl Wood nodes due for release Q2 2017 ♦ Community → Erlwood Nodes → Open PHACTS ♦ Open PHACTS sub-folder contains single OPS Linked Data API node that will allow a configured call/return
  • 43. Configuring the OPS Linked Data API node ♦ Preferences panel allows client/workflow level control of API URL Endpoint and API Id/Key, avoiding the need to configure these in the node
  • 44. Using the OPS Linked Data API node. App Id and App Key fields are automatically populated if they are set in the preferences. Drop down ‘Select Method Type’ allows selection of API call.
  • 45. Using the OPS Linked Data API node. Input port is optional. Toggle on input field allows user string input or selection of input table column. First output port returns formatted data table (corresponding to API param “_format=tsv”).
  • 46. Using the OPS Linked Data API node. Drop down ‘Select Method Type’ allows selection of API call. Logically grouped methods match developer API docs (swagger) at https://dev.openphacts.org/docs/2.1
  • 47. Allows formatted results table or full JSON/XML return for debug/analysis. First output port returns formatted data table (corresponding to API param “_format=tsv”). Second output port is optional and, if requested, will return JSON or XML response (via second API call without _format param).
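The two-call behaviour described here (one request with `_format=tsv` for the table, an optional second request without it for the raw response) can be sketched as URL construction in Python. Only the `_format` parameter and the App Id / App Key pair come from the slides; the base URL, the `app_id` / `app_key` parameter spellings, the `/compound` method path, and the example URI are assumptions made for illustration, and this is not the KNIME node's actual code.

```python
from urllib.parse import urlencode

BASE_URL = "https://beta.openphacts.org/2.1"  # assumed API endpoint


def build_call(method_path, params, app_id, app_key, tabular):
    """Return the request URL for one API call.

    tabular=True adds _format=tsv (first output port: data table);
    tabular=False omits it (second output port: raw JSON/XML response).
    """
    query = dict(params, app_id=app_id, app_key=app_key)
    if tabular:
        query["_format"] = "tsv"
    return f"{BASE_URL}{method_path}?{urlencode(query)}"


params = {"uri": "http://example.org/compound/1"}  # placeholder concept URI
table_url = build_call("/compound", params, "my_id", "my_key", tabular=True)
raw_url = build_call("/compound", params, "my_id", "my_key", tabular=False)
print(table_url)
print(raw_url)
```

Keeping the credentials as arguments mirrors the preferences panel: they are set once and reused for every call rather than configured per node.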
  • 48. User input and example return User input
  • 49. User input and example return Raw Tabular Return: Pivoted to show Column Names and Values:
  • 50. User input and example return Optional JSON Output as raw JSON Object
  • 51. User input and example return Rather than parsing the JSON to understand the raw output, the node also has an attached ‘View’ with a hierarchically formatted tree view of the JSON output:
  • 52. User input and example return Generic JSON Extraction to flat table shows additional data returned from API; deeper JSON processing can be done using KNIME JSON nodes
  • 53. JSON/XML Support in KNIME 3.3 Extensive native support for JSON or XML parsing with KNIME 3.3 allows complete/custom parsing of the return JSON object for full debugging
  • 54. Chemistry Support on input SMI Input columns of differing chemical types are automatically converted to SMILES via Marvin if the API param is SMILES based
  • 55. API Timeouts and URL changes Advanced developers can change the API timeout value or edit the API URL on a single node using the Web Service panel
  • 56. Summary 1. A new KNIME 3.3 compatible “OpenPHACTS Linked Data API” node will be released in Q2 2017 2. Designed for users, it provides easy configuration of API settings and parameters with easy-to-use tabular data return (via API _format parameter) 3. Designed for developers, it allows an additional full JSON/XML response that can be viewed/parsed by the expert user to see the raw response 4. Further example workflows will be released once the node is available
  • 57. Pipeline Pilot workflows with Open PHACTS Examples Jean-Marc Neefs – Janssen
  • 58. Open PHACTS 2012-2013 Targets Pathways Compounds UniProt Gene Ontology Enzyme Classification ChEMBL DrugBank ChEBI WikiPathways Reactome KEGG
  • 59. List compounds active on target X Open PHACTS + Pipeline Pilot Workflow: 1. Search target information • [OPS API call ‘Free Text to Concept’] 2. Get active compounds on that target • [OPS API call ‘Target Pharmacology: List’]
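The two-step workflow can be sketched as a chain of calls in Python. This is a sketch under stated assumptions: `call_api` is a hypothetical callable standing in for whatever client issues the request, the response shapes (`uri`, `compound` keys) are simplified inventions, and `fake_api` returns canned data so the example runs without the service. Only the two method names come from the slide.

```python
def compounds_for_target(target_name, call_api):
    """Chain the two calls: free-text lookup, then pharmacology for the top hit."""
    # Step 1: resolve the free-text name to a concept URI.
    concepts = call_api("Free Text to Concept", {"q": target_name})
    if not concepts:
        return []
    target_uri = concepts[0]["uri"]
    # Step 2: list the activities recorded against that target.
    activities = call_api("Target Pharmacology: List", {"uri": target_uri})
    return [row["compound"] for row in activities]


def fake_api(method, params):
    """Canned responses standing in for the real service (invented data)."""
    if method == "Free Text to Concept":
        return [{"uri": "http://example.org/target/X", "label": params["q"]}]
    if method == "Target Pharmacology: List":
        return [{"compound": "compound-A"}, {"compound": "compound-B"}]
    raise ValueError(f"unknown method: {method}")


result = compounds_for_target("target X", fake_api)
print(result)  # ['compound-A', 'compound-B']
```

Passing the client in as an argument keeps the workflow logic independent of the transport, which is roughly what the Pipeline Pilot and KNIME components do by wrapping each API call in a reusable node.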
  • 61. Integrated Pipeline includes more data sources
  • 63. Find compounds against Alzheimer’s targets Open PHACTS + Pipeline Pilot Workflow: 1. Search for disease • [OPS API call ‘Free Text to Concept’] 2. Search target information • [OPS API call ‘Targets for Disease: List’] 3. Get active compounds on that target • [OPS API call ‘Target Pharmacology: List’]
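The disease-driven variant adds a fan-out: one disease can map to several targets, and the pharmacology call is repeated for each. The Python sketch below illustrates that shape only; `call_api`, the response keys, and the canned data in `fake_api` are hypothetical, and only the three method names are taken from the slide.

```python
def compounds_for_disease(disease_name, call_api):
    """Disease -> targets -> compounds, returned as {target_uri: [compounds]}."""
    # Step 1: resolve the disease name to a concept URI.
    diseases = call_api("Free Text to Concept", {"q": disease_name})
    if not diseases:
        return {}
    # Step 2: list the targets associated with the disease.
    targets = call_api("Targets for Disease: List", {"uri": diseases[0]["uri"]})
    # Step 3: fan out, one pharmacology call per target, removing duplicates.
    compounds_by_target = {}
    for target in targets:
        rows = call_api("Target Pharmacology: List", {"uri": target["uri"]})
        compounds_by_target[target["uri"]] = sorted({row["compound"] for row in rows})
    return compounds_by_target


def fake_api(method, params):
    """Canned responses standing in for the real service (invented data)."""
    if method == "Free Text to Concept":
        return [{"uri": "http://example.org/disease/D"}]
    if method == "Targets for Disease: List":
        return [{"uri": "http://example.org/target/1"},
                {"uri": "http://example.org/target/2"}]
    if method == "Target Pharmacology: List":
        if params["uri"].endswith("/1"):
            return [{"compound": "compound-B"}, {"compound": "compound-A"},
                    {"compound": "compound-A"}]
        return [{"compound": "compound-C"}]
    raise ValueError(f"unknown method: {method}")


hits = compounds_for_disease("example disease", fake_api)
for target_uri, compounds in hits.items():
    print(target_uri, compounds)
```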
  • 66. Complex workflows: Collecting information from phenotypic screens. Source: Digles et al. Medchemcomm. 2016; 7(6): 1237–1244, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5063042/