The Taverna Workflow
Management Software Suite:
Past, Present, Future.
Prof Carole Goble CBE FREng FBCS CITP
The University of Manchester, UK
Software Sustainability Institute UK
carole.goble@manchester.ac.uk
http://www.taverna.org.uk
http://www.mygrid.org.uk
More of what we generally do!
e-Science,
Computational Science, Scientific Computing
• Support global scientific collaboration,
enable large scale resource, tools and
results sharing, assist scientific
processing, avoid unnecessary
repeated work.
• Accelerate scientific discovery,
improve scientific productivity,
stimulate technological innovation.
• Cope with scales and speed of
scientific innovation and data.
Data-centric Computation
Scientific workflows over Distributed
Cyber-Infrastructure.
Data sharing Social Methods
libraries and catalogues for all types of
scientific artefacts and all types of
scientists.
Knowledge Management
Metadata, semantics digital exchange,
preservation, publishing
Software Engineering
Software sustainability, software and
data policy, training
Products Methods
Systems Biology
Chemistry
Astro-Physics
Astronomy
Biology
Social Science
Library
Digital
Preservation
Biodiversity
Public Health
Applications
Computer
Science
Software
Engineering
Scientific Informatics
Computational Science
THEORY PRACTICE APPLICATION
fundamental applied
PRODUCT
(Open Source)
PRINCIPLE
Science
“USE CASE”
Long Tail Little science
Self-organising groups
Disconnected, independent, distributed scientists
Disconnected, independent, distributed resources
Open in the wild.
Organised science
Organised groups
Clubs of scientists
Organised, planned and in-house resources
Closed and well behaved services.
VPH-Share
Models of Human
Physiology
Eagle Genomics
Next Generation
Sequencing
based Patient
Diagnostics
Astronomy &
HelioPhysics
Document
Preservation
Digitisation
Systems Biology
OpenTox Project
Chemistry
Development Kit
Drug Toxicity Ecological
Niche
Modelling
Population
Modelling
Metagenomics
Phylogenetics
• Data cleaning
• Data movement
• Data retrieval and
annotation
• Data analysis
• Data mining
• knowledge
management
• Data curation and data
warehouse population
• Data visualisation
• Parameter sweeps over
simulations
Drug discovery,
small molecules,
targets,
compounds
OpenPHACTS
BioSTIF
Inputs:
data, parameters,
configurations
Outputs
Workflow in a nutshell
• Orchestrate series of
automated / interactive
steps
– Process pipelines
– Analytic and synthesis
procedures
– Repetitive code-run
sweeps
• Housekeeping tasks
– Process data at scale
– Auto documentation
• Mix in house & public
resources, native hosting
– Chain and choreograph
components
– Handle interoperability
– Bridge resources
– Shield operational
complexity and change
Services & Resources
Infrastructures
Taverna Workflow Management
http://www.taverna.org.uk
• Dataflow
– Computational Lambda Calculus with a monad extension*
– Simple control flows, iterations over collections
– Data type agnostic, domain independent
– Data movement, monitoring, staging, reference
– Custom (VO Tables), XML, JSON
• Mixed steps
– Services, codes & command line tools
– SOAP + REST Web Services
– Scripts: R, “In Workflow Programming” Beanshell scripting …
– Codes: Java, libraries, HPC, Grid and ~Cloud platforms etc …
– Nested workflows
– Interactions and Batch
*Turi et al Taverna Workflows: Syntax and Semantics e-Science 2007: 441-448;
Sroka et al A formal semantics for the Taverna 2 workflow model J. Comput. Syst. Sci. 76(6): 490-508 (2010)
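The "iterations over collections" point above can be illustrated with a small sketch. It is not Taverna code: `run_step` and its `strategy` argument are hypothetical names, and the result is flattened into a single list, whereas a real engine may keep nested collections. It assumes a step written for single items is called once per item when it receives a list, combining several list inputs either as a cross product or a dot product.

```python
from itertools import product
from typing import Any, Callable, Sequence


def run_step(func: Callable[..., Any], inputs: Sequence[Any], strategy: str = "cross") -> Any:
    """Run `func` once, or once per item when any input is a list.

    Hypothetical helper: scalar inputs are reused for every invocation.
    """
    lengths = [len(value) for value in inputs if isinstance(value, list)]
    if not lengths:
        # No collections involved: a single, ordinary invocation.
        return func(*inputs)
    if strategy == "cross":
        # Every combination of items from every list input.
        as_lists = [v if isinstance(v, list) else [v] for v in inputs]
        combos = product(*as_lists)
    elif strategy == "dot":
        # Items paired position by position, up to the shortest list.
        n = min(lengths)
        as_lists = [v[:n] if isinstance(v, list) else [v] * n for v in inputs]
        combos = zip(*as_lists)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [func(*combo) for combo in combos]


def label(prefix: str, name: str) -> str:
    """A step written for single values only."""
    return f"{prefix}:{name}"


single = run_step(label, ["a", "x"])
crossed = run_step(label, [["a", "b"], ["x", "y"]], "cross")
dotted = run_step(label, [["a", "b"], ["x", "y"]], "dot")
```

The step author only writes `label` for one pair of values; the iteration is supplied by the caller, which is the property the slide attributes to the dataflow model.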
• Computational Lambda Calculus
• Visual Programming
• Process mining
• Adaptive & parallel computing
• Cloud computing
• SOA, Semantic Web Services
• Data integration, data quality
• Semantic representation and linked data
• Reporting & tracking, credit propagation
• Workflow reusability, quality, discovery
• Security, monitoring, fault detection
• AI planning, re-run analysis, auto-planning,
auto-repair, auto-composition, auto-
annotation, service discovery, service matching,
auto-substitution
E.Science laboris
Tools
Standards
Services
Weeks -> Hours
Surprise predicted result tested in
lab. DAXX Gene
Genetic differences between breeds
Noyes, PNAS 2011 108(22) 9304-9309
BioDiversity Invasive
Species Modelling
American Horseshoe Crabs
in the Baltic
Trypanosomiasis
resistance in African
Cattle
Software as a Service /
(Cloud) Appliance
Analytic bottleneck
Repetitive, unbiased,
accurate record,
taming data,
transparency, avoiding
shortcuts.
Interactive steps
Dev. Years->Weeks
Runs. Weeks -> Hours
Generalised ENM data
mapping and overlaying
pipelines.
Workflow-based Computation
#SummerSchool 24-Jun-13
VPH-Share @neurist Aneurysm Morphology Workflow
Patient Pseudoidentifier (PID), Demographics, Height, Weight, Vital Signs, Heart Rate, Blood Pressure, Flow Rate, Transient Pressure, Aneurysm Properties, Tissue Properties, Wall Thickness, Risk Factors, Medical Images, Medications
Aneurysm Rupture Profile, Morphology Profile, Haemodynamic Profile, Mechanobiological Profile, Prediction Uncertainty
Patients, Patient Avatar, Disease Simulation Workflow, Patient Avatar updated
Systemic Factors
Gene Expression Profile
RISK
[Susheel Varma] http://www.vph-share.eu/
• Morphological, hemodynamic and structural analyses have been linked to
aneurysm genesis, growth and rupture.
• Evidence indicating differences in morphology and flow between ruptured
and unruptured aneurysms has been shown for reduced patient cohorts.
• Structural wall mechanics has been used to justify the growth and
remodelling happening at the aneurysm level.
Confidence in
physical measures
+
images
+ BC,
material
+ BC,
material
Morphological
analysis
Direct
diagnostic power
+
Morphological
descriptors
Structural descriptors
Hemodynamic
descriptors
Haemodynamic
analysis
Structural analysis
Practically,
morphological
characterizations might
currently have the
highest predictive
capabilities with respect
to the other analyses.
Morphological Workflow
[Susheel Varma]
Medical image
from imaging equipment
@neurIST
morphological descriptors
Complex indices (Zernike moment invariants)
Basic size indices describing aneurysm sac
depth
neck
Morphological Analysis Workflow
[Susheel Varma]
Implementation in VPH-Share
The @neurIST morphological workflow specification in Taverna
[Susheel Varma]
Biodiversity
marine monitoring and health assessment
ecological niche modelling
Data Intensive Science
Collaborative Science
Pilumnus hirtellus
Enclosed sea problem
(Ready et al., 2010)
Sarah Bourlat
Ecological Niche
Modeling
Step 1: Explorative modeling
-Use unfiltered data
-Use fixed parameters: Mahalanobis distance
-Native projections
-Test the model, distribution of points, number of points
Step 2: Deep modeling
-Filtering environmentally unique points with BioClim algorithm
-ENM with Support Vector Machine and Maximum Entropy
-Parameter optimization (if necessary) on the model test results
-2 masks (model generate, model project)
Data discovery
Data assembly, cleaning, and refinement
Ecological Niche Modeling
Statistical analysis
Analytical cycle
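Step 1 above names Mahalanobis distance as the fixed-parameter model. The following is a minimal sketch of that idea, not the BioVeL implementation. It assumes each occurrence point is described by just two environmental variables (hypothetically, temperature and salinity) and scores a candidate site by its Mahalanobis distance from the centre of the known occurrences. All function names are illustrative.

```python
from math import sqrt
from typing import List, Tuple

Point = Tuple[float, float]  # two environmental variables per site (assumed)


def mean_and_covariance(points: List[Point]) -> Tuple[Point, Tuple[float, float, float]]:
    """Return the mean and the sample covariance terms (sxx, sxy, syy)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two occurrence points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point: Point, mean: Point, cov: Tuple[float, float, float]) -> float:
    """Distance of `point` from `mean`, scaled by the covariance of the data."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Uses the closed-form inverse of the 2x2 matrix [[sxx, sxy], [sxy, syy]].
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(max(squared, 0.0))


# Toy occurrence points chosen only to make the arithmetic easy to follow.
occurrences = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
centre, covariance = mean_and_covariance(occurrences)
at_centre = mahalanobis((1.0, 1.0), centre, covariance)
further_out = mahalanobis((3.0, 1.0), centre, covariance)
```

A smaller distance means the candidate site is environmentally closer to where the species has been observed. A real niche model would use many more variables and a general matrix inverse.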
The workflows work over large geographical,
taxonomic, and environmental scales, incl.
terrestrial ecosystems
Baltic species invasions of various crabs/sea
creatures
Interactions of different forest insects and trees
BioSTIF
www.biovel.eu
Ecological Niche
Modeling
Workflow (ENM)
data
configuration
parameters
steps
Data and Parameter Sweeps
Hosted installation
Local installations
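A data and parameter sweep of the kind listed above can be sketched as follows. This is an illustration under stated assumptions, not how Taverna schedules runs. `sweep` and `toy_workflow` are hypothetical names, and the workflow is modelled as a plain function that takes one dataset plus keyword parameters.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Sequence


def sweep(
    workflow: Callable[..., Any],
    datasets: Sequence[Any],
    parameter_grid: Dict[str, Sequence[Any]],
) -> List[Dict[str, Any]]:
    """Run `workflow` for every dataset and every parameter combination."""
    names = sorted(parameter_grid)  # fixed order so runs are reproducible
    runs = []
    for data in datasets:
        for values in product(*(parameter_grid[name] for name in names)):
            params = dict(zip(names, values))
            runs.append({
                "data": data,
                "params": params,
                "output": workflow(data, **params),
            })
    return runs


def toy_workflow(data: float, scale: float, offset: float) -> float:
    """Stand-in for a real workflow run."""
    return data * scale + offset


runs = sweep(toy_workflow, [1, 2], {"scale": [1, 10], "offset": [0, 5]})
```

Each entry records the data and parameters next to the output, so an ensemble of runs can be compared afterwards.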
Taverna:
a Knowledge Discovery Framework
• Asthma sputum inflammatory phenotypes, a transcriptome analysis. Saeedeh
Maleki-Dizaji, Chris Newby, Rachid Berair, Rod Smallwood, Chris Brightling, 2014
(to be submitted)
• A systematic approach to a transcriptome analysis to asthma sputum inflammatory
phenotypes, ISMB 2014.
• The Battle of the Sexes starts in the oviduct: modulation of oviductal transcriptome
by X and Y-bearing spermatozoa. Almiñana C, Caballero I, Heath PR, Maleki-Dizaji
S, Parrilla I, Cuello C, Gil MA, Vazquez JL, Vazquez JM, Roca J, Martinez EA, Holt
WV and Fazeli A. Submitted to BMC Genomics, 2014 (in press)
• transcription regulation network involving E2F6, IRF7 and STAT1. Thomas R.J.
Lovewell, Andrew J.G. McDonagh, Andrew G. Messenger, Saeedeh Maleki-Dizaji,
Mimoun Azzouz and Rachid Tazi-Ahnini. Submitted to PNAS, 2014
• Kiran, M., Bicak, M., Maleki-Dizaji, S., Holcombe, M. FLAME: A Platform for High
Performance Computing of Complex Systems. Journal of Acta Physica Polonica,
2011.
• Maleki-Dizaji S, Holcombe M, Rolfe MD, Fisher P, Green J, Poole RK, Graham AI.
A Systematic Approach to Understanding Escherichia coli Responses to
Oxygen: From Microarray Raw Data to Pathways and Published Abstracts.
Online J Bioinformatics, (1):51-59, 2009
[Saeedeh Maleki-Dizaji]
Application
Runtime
Middleware
Resources/Codes/Services Infrastructures
Repositories
Execution Activity Plug-ins
Application
Scufl
Runtime
Middleware
Resources/Codes/Services
Platforms
Repositories
Taverna Desktop
Workbench
Taverna Online
Web Tool
Portals and Applications
Engine Server
Player
Cmd line
Provenance
Third Party
Servers
BioSTIF
Workflows & workflow
components
PROV, OPM
Data
Provenance
Registries
Taverna Workflow Management
Open extensibility
• Plug-in framework
– Command line tool
– Data Services: VOTables for AstroTaverna
– Optimisations: E.g. Holl. model parameter sweeps
– Infrastructures: Grid, HPC, Web Services
– Domains: CDK, BioMart, VOTable
– Commodities: Excel Spreadsheets, Open Refine, R
• Plug into other frameworks & platforms
– Portals: Scratchpads
– Interactive platforms: iPython Notebook
– Wfms: KNIME Node, Galaxy tool, Kepler Actor
• Third party applications
– Taverna Online
– XworX
– OGC chainer
Taverna Online: 3rd party app
Dr Vadim Surpin and Vitaly Sharanutsa, Institute for Information Transmission
Problems of Russian Academy of Sciences (IITP RAS)
An online, in-browser application for assembling and running Taverna
Workflows over a HPC platform http://onlinehpc.com/site/main
Interoperability: Data format/identity mismatches
Service interface handling
Components: Well described, behaved,
curated, annotated modularised
workflow modules
• Semantic annotations, prescribed
failover, formats, provenance
• Organised into common families
Taverna Directions
Access
Framework to access and leverage heterogeneous
legacy applications, services, datasets and codes.
Shielding from complexity.
Customise
Rapid development: Flexibility, Extensibility,
Adaptability, Reuse. Reusable Workflow
Components
Process
Automated plumbing + Interaction
Systematic, repetitive and unbiased analysis and
processing and error handling
Ensembles, comparisons, “what ifs”
Access
Cloud and Scale, Registries
Standards data formats, programmatic interfaces.
Adapting to change. Security.
Governance of components
Process
Seamless, pluggable wf as a service.
Scale. Adaptability. Specific-Generic tension.
Easier development, user experience
Workflow commodities, Research Objects
Design practices for reuse. Credit
Executable interactive notebooks. Provenance
A tool for reproducibility
Report
Embed
Workflows in common applications
Integration into reporting & publishing
Underpin integrative platforms.
Service based science and science as a service
Fix on demand.
Notify as needed.
Monitor for decay
Workflow/Service Monitors
3rd Party Monitors
Workflow analytics
Detect and Repair
QUASAR toolkit
[Zhao et al. Why workflows break e-Science 2012]
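The "monitor for decay, notify as needed" idea can be sketched as below. This is a hypothetical illustration, not the QUASAR toolkit or any Taverna monitor. Each service is represented by a probe function that returns True when the service still behaves as expected, and the mapping from workflows to the services they call is assumed to be known.

```python
from typing import Callable, Dict, List


def check_services(probes: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run every probe and classify the service it represents."""
    status = {}
    for name, probe in probes.items():
        try:
            status[name] = "ok" if probe() else "decayed"
        except Exception as error:  # a probe that fails counts as unreachable
            status[name] = f"unreachable ({type(error).__name__})"
    return status


def workflows_at_risk(
    workflow_services: Dict[str, List[str]], status: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each affected workflow to the services that need attention."""
    at_risk = {}
    for workflow, services in workflow_services.items():
        broken = [service for service in services if status.get(service) != "ok"]
        if broken:
            at_risk[workflow] = broken
    return at_risk


def failing_probe() -> bool:
    """Stand-in for a service that no longer answers."""
    raise TimeoutError("no response")


# Hypothetical services and workflows, used only to exercise the functions.
status = check_services({
    "alignment": lambda: True,
    "annotation": lambda: False,
    "mapping": failing_probe,
})
to_notify = workflows_at_risk(
    {"wf1": ["alignment"], "wf2": ["alignment", "mapping"]}, status
)
```

In practice the probes would call the live services. Keeping them as injected functions makes the monitor testable without network access.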
The Execution Provenance Gap
Data tracking
Summarisation,
Labelling,
Distillations,
Selective tracking
Filtering
Big
Fine grain
1 White box
One System
Special tools
Collection
A Big Graph
What do I cite?
What did I do?
N Black boxes
Many Systems
My Lab Book
Analytics
Smart in situ Presentation
Why am I citing?
Pinar Alper, Khalid Belhajjame, Carole A. Goble, Pinar Karagoz: Enhancing and abstracting scientific workflow provenance for data
publishing. EDBT/ICDT Workshops 2013: 313-318
Sarah Cohen Boulakia, Jiuqiang Chen, Paolo Missier, Carole A. Goble, Alan R. Williams, Christine Froidevaux: Distilling structure in
Taverna scientific workflows: a refactoring approach. BMC Bioinformatics 15(S-1): S12 (2014)
http://provenanceweek.dlr.de
Tracking Provenance
File Stores Lab Books Repositories
• Granularity
• Scales
• Blackbox
• Hybrid
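The granularity and black-box points above can be made concrete with a sketch of step-level provenance capture. It is an assumption-laden illustration rather than Taverna's provenance model. The record keys borrow the PROV notions of an activity that used and generated entities, but the `ProvenanceLog` class, the `fingerprint` helper and the record layout are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional, Set


def fingerprint(value: Any) -> str:
    """Short content hash used as a stand-in for an entity identifier."""
    encoded = json.dumps(value, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


class ProvenanceLog:
    """Records, for each step run, what it used and what it generated."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def run(self, step_name: str, func: Callable[..., Any], **inputs: Any) -> Any:
        started = datetime.now(timezone.utc).isoformat()
        output = func(**inputs)  # the step itself stays a black box
        self.records.append({
            "activity": step_name,
            "startedAtTime": started,
            "used": {name: fingerprint(value) for name, value in inputs.items()},
            "generated": fingerprint(output),
        })
        return output

    def lineage(self, value_id: str, _seen: Optional[Set[str]] = None) -> List[str]:
        """List the activities that led to `value_id`, earliest first."""
        seen = set() if _seen is None else _seen
        if value_id in seen:
            return []
        seen.add(value_id)
        for record in reversed(self.records):
            if record["generated"] == value_id:
                steps: List[str] = []
                for used_id in record["used"].values():
                    steps.extend(self.lineage(used_id, seen))
                return steps + [record["activity"]]
        return []


log = ProvenanceLog()
cleaned = log.run("clean", lambda raw: [x for x in raw if x is not None], raw=[1, None, 3])
total = log.run("sum", lambda values: sum(values), values=cleaned)
history = log.lineage(fingerprint(total))
```

Recording only inputs and outputs per step is coarse-grained. Finer tracking inside a step would need the step's cooperation, which is the trade-off the bullets point at. Identifying values by content hash is a simplification, since two unrelated values with equal content would be confused.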
Research Objects
• Bundles and relates multi-hosted digital resources of a scientific
experiment or investigation using standard mechanisms
• Descriptive reproducibility
• Exchange, Releasing paradigm for publishing
http://www.researchobject.org/
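The bundling idea can be sketched as a manifest that aggregates resources hosted in different places and relates them to one another. The layout below is hypothetical and is not the Research Object specification. The key names, the `build_manifest` helper and the example.org style URLs are all assumptions made for illustration.

```python
import json
from typing import Dict, List, Tuple


def build_manifest(
    title: str,
    resources: Dict[str, str],
    relations: List[Tuple[str, str, str]],
) -> dict:
    """Aggregate named resources (name to URI) and relate them to each other."""
    for subject, _predicate, obj in relations:
        for name in (subject, obj):
            if name not in resources:
                raise KeyError(f"relation refers to unknown resource: {name}")
    return {
        "title": title,
        "aggregates": [
            {"name": name, "uri": uri} for name, uri in sorted(resources.items())
        ],
        "relations": [
            {"subject": s, "predicate": p, "object": o} for s, p, o in relations
        ],
    }


manifest = build_manifest(
    "Hypothetical niche modelling study",
    {
        "workflow": "https://example.org/workflows/enm",
        "input-data": "https://example.net/data/occurrences.csv",
        "results": "https://example.com/store/model-output.zip",
    },
    [
        ("workflow", "used", "input-data"),
        ("results", "wasGeneratedBy", "workflow"),
    ],
)
serialised = json.dumps(manifest, indent=2)
```

The resources stay where they are hosted. Only the manifest travels, which is what makes the bundle exchangeable.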
Flexibility
Review, Revise/Discard
Scale
Deploy
into tools
Comparison
Personal
Group
Production
Research Reporting
Harden
http://nbviewer.ipython.org/github/myGrid/DataHackLeiden/blob/alan/Player_example.ipynb
https://www.youtube.com/watch?v=QVQwSOX5S08 ?
Archiving
Publishing
Component Libraries
Preserving
Recording
Storing
Exchanging
Versioning
Sharing
PACKS
SEEK4Science
Sharing and interlinking Methods, Models, Data…
Data
Model
Article
External
Databases
Metadata
Virtual Liver Network
BMBF “Großprojekt“ (large-scale project)
• ~45 organisations, ~70 groups
• multiscale rep. of the liver
• clinical impact
• general public portal
• Same key requirements:
yellow pages, exchange of all
sops/data/models, sharing
rights
• Different biology
– Multiscale data
– Multiscale models
– Imaging
• Different project structure
– Hierarchies (A, A1, A1.2)
– Regional groups of groups
• Flexibility, extensibility, open
sourceness of SEEK key
simulate models
project mgt,
access control
reporting, citation
governance &
policies
yellow pages
of peers
projects,
experts
catalogue and link
data, models, samples,
specimens, sops,
experiments,
publications using
standards
curate &
annotate data
and models using
standards
access, link to and
deposit in public
data and model
repositories
manage, store and
exchange different
types and scales of
data
integrate local and
project tools and
data systems
scaled-out
collection &
processing
experimentalists,
modellers, X-
informaticians,
computational Xs,
software engineers,
computer scientists,
systems
administrators,
resource providers,
tool builders
social scientists,
librarians, curators
Social Computation
Storing, Sharing and Reusing data, methods, models,
between collaborating and competing scientists
e-Laboratories, collaboratories, VREs, repositories
An ego-system
Computer
Scientist
Software
Engineer
Social
Engineer
Knowledge Computation
•Accurate, intelligible and comparable descriptions
•Data interoperability
•Machine readable metadata
Semantic technologies, Ontologies,
Linked Data, Data schema
Semantic Description
Describing and linking data in terms of
shared concepts, relationships and identifiers
Data
object property
data property
subClassOf
Ontology
Person
Organization
Place
State
name
birthdate
bornIn
worksFor state
name
phone
name
livesIn
City
Event
ceo
location
organizer
nearby
startDate
endDate
title
isPartOf
postalCode
Column 1         Column 2   Column 3    Column 4       Column 5
Bill Gates       Oct 1955   Microsoft   Seattle        WA
Mark Zuckerberg  May 1984   Facebook    White Plains   NY
Larry Page       Mar 1973   Google      East Lansing   MI
[Taheriyan et al
adapted]
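The table and ontology terms above suggest how unlabelled columns become linked data. The sketch below assumes one plausible reading of the slide: Column 1 is a Person's name, Column 2 the birthdate, Column 3 the Organization the person worksFor, Column 4 the City the person was bornIn, and Column 5 the State of that city. The identifier scheme (`person/0`, `org/...`) is hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Rows taken from the slide; the meaning assigned to each column is an assumption.
ROWS = [
    ("Bill Gates", "Oct 1955", "Microsoft", "Seattle", "WA"),
    ("Mark Zuckerberg", "May 1984", "Facebook", "White Plains", "NY"),
    ("Larry Page", "Mar 1973", "Google", "East Lansing", "MI"),
]


def row_to_triples(index: int, row: Tuple[str, str, str, str, str]) -> List[Triple]:
    """Describe one table row using the ontology's classes and properties."""
    name, birthdate, organization, city, state = row
    person_id = f"person/{index}"
    org_id = f"org/{organization}"
    city_id = f"city/{city}"
    state_id = f"state/{state}"
    return [
        (person_id, "rdf:type", "Person"),
        (person_id, "name", name),            # data property
        (person_id, "birthdate", birthdate),  # data property
        (person_id, "worksFor", org_id),      # object property
        (org_id, "rdf:type", "Organization"),
        (org_id, "name", organization),
        (person_id, "bornIn", city_id),       # object property
        (city_id, "rdf:type", "City"),
        (city_id, "name", city),
        (city_id, "state", state_id),         # object property
        (state_id, "rdf:type", "State"),
        (state_id, "name", state),
    ]


triples = [t for i, row in enumerate(ROWS) for t in row_to_triples(i, row)]
```

Once the columns are tied to shared concepts, a second table that uses different headers for the same properties can be merged by identifier rather than by guessing at column names.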
Curation Knowledge Ramps
Populous
http://www.rightfield.org.uk
Katy Wolstencroft
Pathways
Pharmacological
Activities
Biological
Processes
Transcripts
Pathological
Processes
Diseases
Genes
Proteins
Interactions
Clinical Drug
Applications
Indications
Drugs
Compounds
Pharmacological data for drug discovery
combining public and private datasets
Pre-competitive silo-breaking for competitive analytics
“Find me compounds
that inhibit targets in
NFkB pathway assayed
in only functional assays
with a potency <1 μM”
“What is the
selectivity profile of
known p38 inhibitors?”
“Let me compare
MW, logP and PSA
for known
oxidoreductase
inhibitors”
Broad data:
combining public and private datasets
Nanopub
Db
VoID
Data Cache (Virtuoso Triple Store)
Semantic Workflow Engine
Linked Data API (RDF/XML, TTL, JSON)
Domain Specific Services
Identity Resolution Service
Chemistry Registration Normalisation & Q/C
Identifier Management Service
Indexing
Core Platform
P12374
EC2.43.4
CS4532
“Adenosine receptor 2a”
Public Content
Commercial
Public Ontologies
User Annotations
Apps
ChemBio Navigator
Target Dossier
Pipeline Pilot
Under the hood
Strict Relaxed
Analysing Browsing
Dynamic Equality
skos:closeMatch
(Drug Name)
skos:closeMatch
(Drug Name)
skos:exactMatch
(InChI)
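The strict versus relaxed contrast above can be sketched as a choice of which link types count as "the same thing". This is a toy illustration and not the Open PHACTS identity service. The identifiers, the `LENSES` table and the `equivalents` function are hypothetical; only the SKOS predicate names and their justifications (InChI, drug name) come from the slide.

```python
from collections import deque
from typing import List, Set, Tuple

# (identifier, predicate, identifier, justification). Hypothetical links.
LINKS: List[Tuple[str, str, str, str]] = [
    ("source1:A", "skos:exactMatch", "source2:B", "InChI"),
    ("source2:B", "skos:closeMatch", "source3:C", "Drug Name"),
    ("source3:C", "skos:closeMatch", "source4:D", "Drug Name"),
]

# Which predicates each mode of working is willing to follow.
LENSES = {
    "strict": {"skos:exactMatch"},                      # analysing
    "relaxed": {"skos:exactMatch", "skos:closeMatch"},  # browsing
}


def equivalents(identifier: str, lens: str) -> Set[str]:
    """Return every identifier reachable through links the lens accepts."""
    accepted = LENSES[lens]
    found = {identifier}
    queue = deque([identifier])
    while queue:
        current = queue.popleft()
        for left, predicate, right, _justification in LINKS:
            if predicate not in accepted:
                continue
            # Links are followed in both directions.
            for here, there in ((left, right), (right, left)):
                if here == current and there not in found:
                    found.add(there)
                    queue.append(there)
    return found


strict_set = equivalents("source1:A", "strict")
relaxed_set = equivalents("source1:A", "relaxed")
```

The same stored links give different answers depending on the lens chosen at query time, which is one way to read "dynamic equality".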
CS
Research
Software
Engineering
Science Engage
Delivery & Support
2001-2006
CS
Research
Software
Engineering
Science Engage
Delivery & Support
2006-today
“Startup-Like”
Balance Innovation with Usefulness
Software Engineering
Research Software Engineers.
Sustainable software.
Zeeya Merali , Nature 467, 775-777 (2010) | doi:10.1038/467775a
Computational science: ...Error…why scientific programming does not compute.
Training
• Training infrastructure
• Scalable training approaches
• Review needs
• Coordinate activities and materials
• Liaise with Nodes and Hub
Lemberger T Mol Syst Biol 2014;10:715
©2014 by European Molecular Biology Organization
Born Reproducible | Exchangeable |
Reusable
Rich descriptions
Open & Available
Transparent
Method
Re-executable
• myGrid
– http://www.mygrid.org.uk
• Taverna
– http://www.taverna.org.uk
• myExperiment
– http://www.myexperiment.org
• BioCatalogue
– http://www.biocatalogue.org
• SEEK and SysMO-SEEK
– http://www.seek4science.org
– http://seek.sysmo-db.org
• RightField
– http://www.rightfield.org.uk
• BioVeL
– http://www.biovel.eu
• Wf4ever
– http://www.wf4ever-project.org
• Research Object
– http://www.researchobject.org
• Software Sustainability Institute
– http://www.software.ac.uk
The Taverna Workflow Management Software Suite - Past, Present, Future

Mais conteúdo relacionado

Mais procurados

Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Alejandra Gonzalez-Beltran
 
Final Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.KeyFinal Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.Keyguest3d0531
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsDuncan Hull
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to BioinformaticsLeighton Pritchard
 
FAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseFAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseRothamsted Research, UK
 
Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Susanna-Assunta Sansone
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata managementPistoia Alliance
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance
 
Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Philippe Rocca-Serra
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EITESANGO
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionYannick Djoumbou
 
Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryResearch Information Network
 
Reproducibility and Scientific Research: why, what, where, when, who, how
Reproducibility and Scientific Research: why, what, where, when, who, how Reproducibility and Scientific Research: why, what, where, when, who, how
Reproducibility and Scientific Research: why, what, where, when, who, how Carole Goble
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
 
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsRamil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsGigaScience, BGI Hong Kong
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsCarole Goble
 
Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019Pistoia Alliance
 
Interoperable Data for KnetMiner and DFW Use Cases
Interoperable Data for KnetMiner and DFW Use CasesInteroperable Data for KnetMiner and DFW Use Cases
Interoperable Data for KnetMiner and DFW Use CasesRothamsted Research, UK
 
Data for AI models, the past, the present, the future
Data for AI models, the past, the present, the futureData for AI models, the past, the present, the future
Data for AI models, the past, the present, the futurePistoia Alliance
 

Mais procurados (20)

Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
 
Final Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.KeyFinal Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.Key
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
 
FAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseFAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use Case
 
Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata management
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
 
Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite Prediction
 
Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK Story
 
Reproducibility and Scientific Research: why, what, where, when, who, how
Reproducibility and Scientific Research: why, what, where, when, who, how Reproducibility and Scientific Research: why, what, where, when, who, how
Reproducibility and Scientific Research: why, what, where, when, who, how
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07
 
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsRamil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019Ai in drug design webinar 26 feb 2019
Ai in drug design webinar 26 feb 2019
 
Interoperable Data for KnetMiner and DFW Use Cases
Interoperable Data for KnetMiner and DFW Use CasesInteroperable Data for KnetMiner and DFW Use Cases
Interoperable Data for KnetMiner and DFW Use Cases
 
Data for AI models, the past, the present, the future
Data for AI models, the past, the present, the futureData for AI models, the past, the present, the future
Data for AI models, the past, the present, the future
 

Semelhante a The Taverna Workflow Management Software Suite - Past, Present, Future

2014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES22014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES2myGrid team
 
Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014Joel Saltz
 
20170110_IOuellette_CV
20170110_IOuellette_CV20170110_IOuellette_CV
20170110_IOuellette_CVIan Ouellette
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
The Taverna Workflow Management Software Suite - Past, Present, Future

  • 1. The Taverna Workflow Management Software Suite: Past, Present, Future. Prof Carole Goble CBE FREng FBCS CITP The University of Manchester, UK Software Sustainability Institute UK carole.goble@manchester.ac.uk http://www.taverna.org.uk http://www.mygrid.org.uk
  • 2. More of what we generally do! Prof Carole Goble CBE FREng FBCS CITP The University of Manchester, UK Software Sustainability Institute UK carole.goble@manchester.ac.uk http://www.taverna.org.uk http://www.mygrid.org.uk
  • 3. e-Science, Computational Science, Scientific Computing • Support global scientific collaboration, enable large-scale sharing of resources, tools and results, assist scientific processing, avoid unnecessary repeated work. • Accelerate scientific discovery, improve scientific productivity, stimulate technological innovation. • Cope with the scale and speed of scientific innovation and data.
  • 4. Data-centric Computation: Scientific workflows over Distributed Cyber-Infrastructure. Data sharing, Social Methods: libraries and catalogues for all types of scientific artefacts and all types of scientists. Knowledge Management: Metadata, semantics, digital exchange, preservation, publishing. Software Engineering: Software sustainability, software and data policy, training. Products, Methods. Applications: Systems Biology, Chemistry, Astro-Physics, Astronomy, Biology, Social Science, Library Digital Preservation, Biodiversity, Public Health.
  • 5. Computer Science Software Engineering Scientific Informatics Computational Science THEORY PRACTICE APPLICATION fundamental applied PRODUCT (Open Source) PRINCIPLE Science “USE CASE”
  • 6. Long Tail Little science Self-organising groups Disconnected, independent, distributed scientists Disconnected, independent, distributed resources Open in the wild. Organised science Organised groups Clubs of scientists Organised, planned and in-house resources Closed and well behaved services.
  • 7. VPH-Share Models of Human Physiology Eagle Genomics Next Generation Sequencing based Patient Diagnostics Astronomy & HelioPhysics Document Preservation Digitisation Systems Biology OpenTox Project Chemistry Development Kit Drug Toxicity Ecological Niche Modelling Population Modelling Metagenomics Phylogenetics • Data cleaning • Data movement • Data retrieval and annotation • Data analysis • Data mining • Knowledge management • Data curation and data warehouse population • Data visualisation • Parameter sweeps over simulations Drug discovery, small molecules, targets, compounds OpenPHACTS
  • 8. BioSTIF Inputs: data, parameters, configurations Outputs Workflow in a nutshell • Orchestrate series of automated / interactive steps – Process pipelines – Analytic and synthesis procedures – Repetitive code-run sweeps • Housekeeping tasks – Process data at scale – Auto documentation • Mix in house & public resources, native hosting – Chain and choreograph components – Handle interoperability – Bridge resources – Shield operational complexity and change Services & Resources Infrastructures
  • 9. Taverna Workflow Management http://www.taverna.org.uk • Dataflow – Computational Lambda Calculus with a monad extension* – Simple control flows, iterations over collections – Data type agnostic, domain independent – Data movement, monitoring, staging, reference – Custom (VO Tables), XML, JSON • Mixed steps – Services, codes & command line tools – SOAP + REST Web Services – Scripts: R, “In Workflow Programming” Beanshell scripting … – Codes: Java, libraries, HPC, Grid and ~Cloud platforms etc … – Nested workflows – Interactions and Batch *Turi et al Taverna Workflows: Syntax and Semantics e-Science 2007: 441-448; Sroka et al A formal semantics for the Taverna 2 workflow model J. Comput. Syst. Sci. 76(6): 490-508 (2010)
  • 10. • Computational Lambda Calculus • Visual Programming • Process mining • Adaptive & parallel computing • Cloud computing • SOA, Semantic Web Services • Data integration, data quality • Semantic representation and linked data • Reporting & tracking, credit propagation • Workflow reusability, quality, discovery • Security, monitoring, fault detection • AI planning, re-run analysis, auto-planning, auto-repair, auto-composition, auto-annotation, service discovery, service matching, auto-substitution E.Science laboris Tools Standards Services
  • 11. Workflow-based Computation. Trypanosomiasis resistance in African Cattle: genetic differences between breeds; DAXX gene; surprise predicted result tested in lab; Weeks -> Hours (Noyes, PNAS 2011 108(22) 9304-9309). BioDiversity: Invasive Species Modelling, American Horseshoe Crabs in the Baltic; generalised ENM data mapping and overlaying pipelines; interactive steps; Dev. Years -> Weeks; Runs. Weeks -> Hours. Software as a Service / (Cloud) Appliance. Analytic bottleneck. Repetitive, unbiased, accurate record, taming data, transparency, avoiding shortcuts.
  • 12. VPH-Share @neurIST Aneurysm Morphology Workflow (#SummerSchool, 24-Jun-13). Patient Avatar: Patient Pseudoidentifier (PID), Demographics, Height, Weight, Vital Signs, Heart Rate, Blood Pressure, Flow Rate, Transient Pressure, Aneurysm Properties, Tissue Properties, Wall Thickness, Risk Factors, Medical Images, Medications. Disease Simulation Workflow: Patients, Patient Avatar, Systemic Factors, Gene Expression Profile. Patient Avatar Updated: Aneurysm Rupture Profile, Morphology Profile, Haemodynamic Profile, Mechanobiological Profile, Prediction Uncertainty, RISK. [Susheel Varma] http://www.vph-share.eu/
  • 13. • Morphological, hemodynamic and structural analyses have been linked to aneurysm genesis, growth and rupture. • Evidence of differences in morphology and flow between ruptured and unruptured aneurysms has been shown for reduced patient cohorts. • Structural wall mechanics has been used to justify the growth and remodelling happening at the aneurysm level. Diagram labels: Confidence in physical measures; Direct diagnostic power; Morphological analysis (+ images), Morphological descriptors; Haemodynamic analysis (+ BC, material), Hemodynamic descriptors; Structural analysis (+ BC, material), Structural descriptors. Practically, morphological characterizations might currently have the highest predictive capabilities with respect to the other analyses. Morphological Workflow [Susheel Varma]
  • 14. Medical image from imaging equipment. @neurIST morphological descriptors: complex indices (Zernike moment invariants); basic size indices describing the aneurysm sac (depth, neck). Morphological Analysis Workflow [Susheel Varma]
  • 15. Implementation in VPH-Share The @neurIST morphological workflow specification in Taverna [Susheel Varma]
  • 16. Biodiversity marine monitoring and health assessment, ecological niche modelling. Data Intensive Science. Collaborative Science. Pilumnus hirtellus. Enclosed sea problem (Ready et al., 2010). Sarah Bourlat
  • 18. Ecological Niche Modeling. Step 1: Explorative modeling - Use unfiltered data - Use fixed parameters: Mahalanobis distance - Native projections - Test the model, distribution of points, number of points. Step 2: Deep modeling - Filtering environmentally unique points with BioClim algorithm - ENM with Support Vector Machine and Maximum Entropy - Parameter optimization (if necessary) on the model test results - 2 masks (model generate, model project). Analytical cycle: Data discovery; Data assembly, cleaning, and refinement; Ecological Niche Modeling; Statistical analysis. Pilumnus hirtellus. Enclosed sea problem (Ready et al., 2010). The workflows work over large geographical, taxonomic, and environmental scales, incl. terrestrial ecosystems. Baltic species invasions of various crabs/sea creatures. Interactions of different forest insects and trees.
  • 19. Ecological Niche Modeling. Step 1: Explorative modeling - Use unfiltered data - Use fixed parameters: Mahalanobis distance - Native projections - Test the model, distribution of points, number of points. Step 2: Deep modeling - Filtering environmentally unique points with BioClim algorithm - ENM with Support Vector Machine and Maximum Entropy - Parameter optimization (if necessary) on the model test results - 2 masks (model generate, model project). Analytical cycle: Data discovery; Data assembly, cleaning, and refinement; Ecological Niche Modeling; Statistical analysis. Pilumnus hirtellus. Enclosed sea problem (Ready et al., 2010). The workflows work over large geographical, taxonomic, and environmental scales, incl. terrestrial ecosystems. Baltic species invasions of various crabs/sea creatures. Interactions of different forest insects and trees. BioSTIF
  • 25. Taverna: a Knowledge Discovery Framework • Asthma sputum inflammatory phenotypes, a transcriptome analysis, Saeedeh Maleki-Dizaji, Chris Newby, Rachid Berair, Rod Smallwood, Chris Brightling, 2014 (to be submitted) • A systematic approach to a transcriptome analysis to asthma sputum inflammatory phenotypes, ISMB 2014. • The Battle of the Sexes starts in the oviduct: modulation of oviductal transcriptome by X and Y-bearing spermatozoa: Almiñana C, Caballero I, Heath PR, Maleki-Dizaji S, Parrilla I, Cuello C, Gil MA, Vazquez JL, Vazquez JM, Roca J, Martinez EA, Holt WV and Fazeli A. Submitted to BMC Genomics 2014 (In Press) • transcription regulation network involving E2F6, IRF7 and STAT1, Thomas R.J. Lovewell, Andrew J.G. McDonagh, Andrew G. Messenger, Saeedeh Maleki-Dizaji, Mimoun Azzouz and Rachid Tazi-Ahnini, submitted to PNAS, 2014 • Kiran, M., Bicak, M., Maleki-Dizaji, S., Holcombe, M. FLAME: A Platform for High Performance Computing of Complex Systems. Journal of Acta Physica Polonica 2011. • Maleki-Dizaji S, Holcombe M, Rolfe MD, Fisher P, Green J, Poole RK, Graham AI, A Systematic Approach to Understanding Escherichia coli Responses to Oxygen: From Microarray Raw Data to Pathways and Published Abstracts, Online J Bioinformatics, (1):51-59, 2009 [Saeedeh Maleki-Dizaji]
  • 26. Application Runtime Middleware Resources/Codes/Services Infrastructures Repositories Execution Activity Plug-ins Application Scufl Runtime Middleware Resources/Codes/Services Platforms Repositories Taverna Desktop Workbench Taverna Online Web Tool Portals and Applications Engine Server Player Cmd line Provenance Third Party Servers BioSTIF Workflows & workflow components PROV, OPM Data Provenance Registries
  • 27. Taverna Workflow Management Open extensibility • Plug-in framework – Command line tool – Data Services: VOTables for AstroTaverna – Optimisations: E.g. Holl. model parameter sweeps – Infrastructures: Grid, HPC, Web Services – Domains: CDK, BioMart, VOTable – Commodities: Excel Spreadsheets, Open Refine, R • Plug into other frameworks & platforms – Portals: Scratchpads – Interactive platforms: iPython Notebook – Wfms: KNIME Node, Galaxy tool, Kepler Actor • Third party applications – Taverna Online – XworX – OGC chainer
  • 28. Taverna Online: 3rd party app Dr Vadim Surpin and Vitaly Sharanutsa, Institute for Information Transmission Problems of Russian Academy of Sciences (IITP RAS) An online, in-browser application for assembling and running Taverna Workflows over a HPC platform http://onlinehpc.com/site/main
  • 29. Interoperability: Data format/identity mismatches Service interface handling Components: Well described, behaved, curated, annotated modularised workflow modules • Semantic annotations, prescribed failover, formats, provenance • Organised into common families
  • 30. Taverna Directions. Access: Framework to access and leverage heterogeneous legacy applications, services, datasets and codes. Shielding from complexity. Customise: Rapid development: Flexibility, Extensibility, Adaptability, Reuse. Reusable Workflow Components. Process: Automated plumbing + Interaction. Systematic, repetitive and unbiased analysis and processing and error handling. Ensembles, comparisons, “what ifs”. Access: Cloud and Scale, Registries. Standards data formats, programmatic interfaces. Adapting to change. Security. Governance of components. Process: Seamless, pluggable wf as a service. Scale. Adaptability. Specific-Generic tension. Easier development, user experience. Workflow commodities, Research Objects. Design practices for reuse. Credit. Executable interactive notebooks. Provenance. A tool for reproducibility. Report, Embed: Workflows in common applications. Integration into reporting & publishing. Underpin integrative platforms. Service based science and science as a service.
  • 31. Fix on demand. Notify as needed. Monitor for decay Workflow/Service Monitors 3rd Party Monitors Workflow analytics Detect and Repair QUASAR toolkit [Zhao et al. Why workflows break e-Science 2012]
  • 32. The Execution Provenance Gap Data tracking Summarisation, Labelling, Distillations, Selective tracking Filtering Big Fine grain 1 White box One System Special tools Collection A Big Graph What do I cite? What did I do? N Black boxes Many Systems My Lab Book Analytics Smart in situ Presentation Why am I citing? Pinar Alper, Khalid Belhajjame, Carole A. Goble, Pinar Karagoz: Enhancing and abstracting scientific workflow provenance for data publishing. EDBT/ICDT Workshops 2013: 313-318 Sarah Cohen Boulakia, Jiuqiang Chen, Paolo Missier, Carole A. Goble, Alan R. Williams, Christine Froidevaux: Distilling structure in Taverna scientific workflows: a refactoring approach. BMC Bioinformatics 15(S-1): S12 (2014) http://provenanceweek.dlr.de
  • 33. Tracking Provenance File Stores Lab Books Repositories • Granularity • Scales • Blackbox • Hybrid
  • 34. Research Objects • Bundles and relates multi-hosted digital resources of a scientific experiment or investigation using standard mechanisms • Descriptive reproducibility • Exchange, Releasing paradigm for publishing http://www.researchobject.org/
  • 38. SEEK4Science Sharing and interlinking Methods, Models, Data… Data Model Article External Databases Metadata
  • 39. Virtual Liver Network, BMBF “Großprojekt” • ~45 organisations, ~70 groups • multiscale representation of the liver • clinical impact • general public portal • Same key requirements: yellow pages, exchange of all SOPs/data/models, sharing rights • Different biology: multiscale data, multiscale models, imaging • Different project structure: hierarchies (A, A1, A1.2), regional groups of groups • Flexibility, extensibility, open sourceness of SEEK key
  • 40. simulate models project mgt, access control reporting, citation governance & policies yellow pages of peers projects, experts catalogue and link data, models, samples, specimens, sops, experiments, publications using standards curate & annotate data and models using standards access, link to and deposit in public data and model repositories manage, store and exchange different types and scales of data integrate local and project tools and data systems scaled-out collection & processing
  • 41. experimentalists, modellers, X- informaticians, computational Xs, software engineers, computer scientists, systems administrators, resource providers, tool builders social scientists, librarians, curators Social Computation Storing, Sharing and Reusing data, methods, models, between collaborating and competing scientists e-Laboratories, collaboratories, VREs, repositories An ego-system
  • 43. Knowledge Computation •Accurate, intelligible and comparable descriptions •Data interoperability •Machine readable metadata Semantic technologies, Ontologies, Linked Data, Data schema
  • 44. Semantic Description: describing and linking data in terms of shared concepts, relationships and identifiers. [Ontology diagram] Legend: Data, object property, data property, subClassOf. Classes: Person, Organization, Place, State, City, Event. Properties: name, birthdate, bornIn, worksFor, state, phone, livesIn, ceo, location, organizer, nearby, startDate, endDate, title, isPartOf, postalCode. Example table (Column 1 | Column 2 | Column 3 | Column 4 | Column 5): Bill Gates | Oct 1955 | Microsoft | Seattle | WA; Mark Zuckerberg | May 1984 | Facebook | White Plains | NY; Larry Page | Mar 1973 | Google | East Lansing | MI. [Taheriyan et al adapted]
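As a rough illustration of the semantic description on this slide, the sketch below maps the five table columns onto ontology classes and properties and emits subject-predicate-object triples. The column-to-property mapping, the `ex:` prefix and the `slug` identifier scheme are assumptions made for the example; the slide names the classes and properties but does not spell out the mapping.

```python
# Hypothetical sketch: turn table rows into triples using a column-to-ontology mapping.
# The mapping, the "ex:" prefix and the slug() identifier scheme are illustrative assumptions.

ROWS = [
    ("Bill Gates", "Oct 1955", "Microsoft", "Seattle", "WA"),
    ("Mark Zuckerberg", "May 1984", "Facebook", "White Plains", "NY"),
    ("Larry Page", "Mar 1973", "Google", "East Lansing", "MI"),
]


def slug(text):
    """Build a simple local identifier from a label (assumed scheme)."""
    return "ex:" + text.replace(" ", "_")


def row_to_triples(row):
    """Describe one table row as (subject, predicate, object) triples."""
    name, birthdate, org, city, state = row
    person, organization, place, region = slug(name), slug(org), slug(city), slug(state)
    return [
        (person, "rdf:type", "Person"),
        (person, "name", name),
        (person, "birthdate", birthdate),
        (person, "worksFor", organization),
        (person, "bornIn", place),
        (organization, "rdf:type", "Organization"),
        (organization, "name", org),
        (place, "rdf:type", "City"),
        (place, "name", city),
        (place, "state", region),
        (region, "rdf:type", "State"),
        (region, "name", state),
    ]


triples = [t for row in ROWS for t in row_to_triples(row)]

# Show the first few triples for the first row.
for s, p, o in triples[:5]:
    print(s, p, o)
```

Once rows are expressed this way, two tables that use different column names can still be linked through the shared classes and properties.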
  • 47. Pathways Pharmacological Activities Biological Processes Transcripts Pathological Processes Diseases Genes Proteins Interactions Clinical Drug Applications Indications Drugs Compounds “Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM” “What is the selectivity profile of known p38 inhibitors?” “Let me compare MW, logP and PSA for known oxidoreductase inhibitors” Broad data: combining public and private datasets
  • 48. Under the hood [architecture diagram]. Sources: Public Content, Commercial, Public Ontologies, User Annotations (each with Nanopub, Db, VoID). Core Platform: Data Cache (Virtuoso Triple Store), Semantic Workflow Engine, Linked Data API (RDF/XML, TTL, JSON), Domain Specific Services, Identity Resolution Service, Chemistry Registration Normalisation & Q/C, Identifier Management Service, Indexing. Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”. Apps: ChemBio Navigator, Target Dossier, Pipeline Pilot.
  • 49. Strict Relaxed Analysing Browsing Dynamic Equality skos:closeMatch (Drug Name) skos:closeMatch (Drug Name) skos:exactMatch (InChI)
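The strict/relaxed distinction on this slide can be sketched as a filter over mapping links: a strict lens (for analysing) follows only `skos:exactMatch` links, while a relaxed lens (for browsing) also follows `skos:closeMatch`. The identifiers, the link table and the `equivalents` helper below are hypothetical and only illustrate the idea.

```python
from collections import deque

# Hypothetical mapping links: (identifier A, SKOS predicate, identifier B, justification).
LINKS = [
    ("chem:1", "skos:exactMatch", "drugdb:A", "InChI"),
    ("chem:1", "skos:closeMatch", "drugdb:B", "Drug Name"),
    ("drugdb:B", "skos:exactMatch", "chem:2", "InChI"),
]

# Each lens names the predicates it is willing to treat as equality.
LENSES = {
    "strict": {"skos:exactMatch"},
    "relaxed": {"skos:exactMatch", "skos:closeMatch"},
}


def equivalents(identifier, lens):
    """Return every identifier reachable from `identifier` through links the lens accepts."""
    accepted = LENSES[lens]
    seen = {identifier}
    queue = deque([identifier])
    while queue:
        current = queue.popleft()
        for a, predicate, b, _justification in LINKS:
            if predicate not in accepted:
                continue
            # Links are treated as symmetric for this sketch.
            for source, target in ((a, b), (b, a)):
                if source == current and target not in seen:
                    seen.add(target)
                    queue.append(target)
    return seen


print(sorted(equivalents("chem:1", "strict")))
print(sorted(equivalents("chem:1", "relaxed")))
```

Following links transitively is a simplification: SKOS does not define `skos:closeMatch` as transitive, so a real system would probably limit how far relaxed matches are chained.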
  • 53. Software Engineering Research Software Engineers. Sustainable software.
  • 54. Zeeya Merali , Nature 467, 775-777 (2010) | doi:10.1038/467775a Computational science: ...Error…why scientific programming does not compute.
  • 55. Training • Training infrastructure • Scalable training approaches • Review needs • Coordinate activities and materials • Liaise with Nodes and Hub
  • 56. Data-centric Computation Scientific workflows over Distributed Cyber-Infrastructure. Data sharing Social Methods libraries and catalogues for all types of scientific artefacts and all types of scientists. Knowledge Management Metadata, semantics digital exchange, preservation, publishing Software Engineering Software sustainability, software and data policy, training Products Methods Systems Biology Chemistry Astro-Physics Astronomy Biology Social Science Library Digital Preservation Biodiversity Public Health Applications
  • 57. Lemberger T Mol Syst Biol 2014;10:715 ©2014 by European Molecular Biology Organization Born Reproducible | Exchangeable | Reusable Rich descriptions Open & Available Transparent Method Re-executable
  • 58. • myGrid – http://www.mygrid.org.uk • Taverna – http://www.taverna.org.uk • myExperiment – http://www.myexperiment.org • BioCatalogue – http://www.biocatalogue.org • SEEK and SysMO-SEEK – http://www.seek4science.org – http://seek.sysmo-db.org • RightField – http://www.rightfield.org.uk • BioVeL – http://www.biovel.eu • Wf4ever – http://www.wf4ever-project.org • Research Object – http://www.researchobject.org • Software Sustainability Institute – http://www.software.ac.uk

Editor's Notes

  1. Mature workflow platform – since 2004
  2. Mature workflow platform – since 2004
  3. Bioinformaticians in the wild No predetermined VOs Exploratory investigations Services in the wild Natively and distributedly hosted Data and Platform agnostic Production level engine to handle cross cutting concerns and large data collections Customisation opportunities Experiment with Semantic Technologies Domain independence Restrictive vs open worlds OPEN STUFF Independent life science informaticians in the field Expert bioinformaticians but not programmers An open community Open applications Independent third party world-wide service providers, local and remote over the web In house applications, tools and datasets Open (and closed) worlds.
  4. Open Source. Managed worlds. Wild worlds.
  5. Underpin integrative platforms. Powering service based science and science as a service A tool for reproducibility logos Coordinate execution of services and codes. Dataflow at scale Reusable variants Comparable repetitions Import own data / codes + public libraries/datasets Honour hosted codes Shield operational complexity Auto-document provenance Package up dependencies
  6. Aimed at different layers of the software stack. “The Many Faces of IT as Service”, Foster, Tuecke, 2005. “Provisioning”: reservation to configuration to … make sure the resource will do what I want it to do, with the right qualities of service. Virtualization = separation of concerns between provider & consumer of “content”: client and service; service provider and resource provider. Provisioning = assemble & configure resources to meet user needs. Management = sustain desired qualities of service despite dynamic environment.
  7. It’s a framework! Provenance collection… W3C PROV+, OPM formats. OAuth security plug-in. Java, Grid services, R scripts, libraries (BioConductor, libSBML…). Just released Taverna 2.5; since 17 April it’s now 642 workbench and 500 CLT downloads. 100,000+ downloads over its lifetime. Audit last year to track startups – just under 1000 unique starts in one month.
  8. Interaction: Visual programming, workflow reusability, workflow quality, workflow discovery. Service oriented computing, cloud computing, grid computing, optimisation, parallelism, adaptation, security, monitoring and fault correction. AI & Semantics: re-run analysis, auto-planning, auto-repair, auto-composition, auto-annotation, service discovery, service matching, auto-substitution. Data integration, data mapping, service integration, provenance tracking, credit propagation, data spaces, data quality.
  9. Understanding genetic differences between breeds of cattle. Ecological niche modeling of Baltic invasives. Collection, Preparation & Production Pipelines. Exploratory analytics. Simulation codes. Text mining. Auto recommendations. Visual analytics.
  10. Morphological, hemodynamic and structural analyses have been linked to aneurysm genesis, growth and rupture. Evidence indicating differences in morphology and flow between ruptured and unruptured aneurysms have been shown for reduced patient cohorts. Structural wall mechanics has been used to justify the growth and remodelling happening at the aneurysm level.
  11. Collecting, processing and management of big data: metagenomics, genotyping, genome sequencing, phylogenetics, gene expression analysis, proteomics, metabolomics, auto sampling. Analytics and management of broad data from many different disciplines. Coupling analytical metagenomics with meaningful ecological interpretations. Continuous development of novel methods and technologies. Functional trait-based ecology approach proposed by Barberán et al. 2012.
  12. Not all things are batch VPH-Share opens a VNC connection spawned instance. Taverna Interaction Service Users interact with a workflow (wherever it is running) in a web browser. Interaction Service Workbench Plug-in
  13. The BioVeL Ecological Niche Modelling workflow running while embedded into the AntKey Scratchpads site
  14. Custom resources and platforms Components Plug-in Framework Infrastructures: Grid, HPC, Web Services (SOAP, REST) Domain: CDK, BioMart, VOTable, SADI Common Tools: Excel Spreadsheets, Open Refine, R
  15. COMPUTING POWER. The service provides two types of computing nodes. Amazon AWS cluster computing instances: automatic configuration of computer clusters on AWS cloud resources. In-house powerful computing cluster: hundreds of Intel Xeon (3.00 GHz, 4 cores) nodes available, 2 CPUs per node, 8 cores per node, 2 GB of RAM per core, 100 GB of local storage per node. Providers: contact us to access your own computing facilities with our service. OnlineHPC looks for partnership with supercomputer providers all over the world. Contact us for details.
  16. Large number of re-usable, versioned components 26 ENM components 42 components in myExperiment A workflow in their own right Test by running individually Annotatable for semantic description of profile Create new workflows remixing any components – like the ENM ones we have made.
  17. Research Objects, Metadata structuring Annotation by Stealth, Shared Templates Other communities Workflows Apps Workflow commodities Adaptability, Tiers of infrastructure Computational Reproducibility is hard in the wild: description / execution
  18. Added after the fact. Shims – beanshell programming in the small. Mapping services for names. Curated service signatures. Data and semantic interoperability in the services, service families and service collections (that is where your types are). Data agnostic, semantic layering. Shim services: workflow flexibility and reusability, but makes things untidy. Next steps – shim libraries and packaged components. Annotation: what do the services DO? And HOW? Expert curation. One size does not fit all: scientists need simplish metadata for decision support; automated validation, configuration and repair need rich metadata for decision making. Next steps – BioCatalogue social & auto curation through myExperiment.
  19. Workflow Run RO Bundle. Folder structure or Zip file with some JSON. Unpack into local file system, ship to myExperiment or notebook.
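A minimal sketch of the idea in this note, assuming a bundle is a ZIP archive that carries a JSON manifest alongside the files of a workflow run. The `.ro/manifest.json` location, the `aggregates` field and the file names are assumptions used for illustration, not a complete or authoritative rendering of the bundle schema.

```python
import io
import json
import zipfile

# Assumed layout: a ZIP holding the run's files plus a JSON manifest.
# The manifest path, its fields and the file names are illustrative placeholders.
MANIFEST_PATH = ".ro/manifest.json"


def pack_bundle(files, manifest):
    """Write files and a JSON manifest into an in-memory ZIP and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bundle:
        bundle.writestr(MANIFEST_PATH, json.dumps(manifest, indent=2))
        for path, content in files.items():
            bundle.writestr(path, content)
    return buffer.getvalue()


def unpack_bundle(data):
    """Read the manifest and the remaining files back out of the ZIP bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as bundle:
        manifest = json.loads(bundle.read(MANIFEST_PATH))
        files = {
            name: bundle.read(name).decode("utf-8")
            for name in bundle.namelist()
            if name != MANIFEST_PATH
        }
    return manifest, files


# Placeholder contents standing in for a workflow definition, inputs and outputs.
run_files = {
    "workflow.t2flow": "<workflow/>",
    "inputs/sequence.txt": "ACGT",
    "outputs/result.txt": "42",
}
run_manifest = {"aggregates": sorted(run_files)}

bundle_bytes = pack_bundle(run_files, run_manifest)
manifest, files = unpack_bundle(bundle_bytes)
print(manifest["aggregates"])
```

Keeping the manifest as plain JSON inside an ordinary ZIP means the bundle can be unpacked with standard tools and inspected without any workflow software installed.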
  20. 1 constantly running server for workflows that aren’t security sensitive. Multiple command-line tools. For secure workflows, spawn own server and own command line in a bubble. Start-up performance issues: start server, start command line, start image, start apps. VPH-Share plugin exposes in Taverna Online a list of tools you can instantiate on their VM. Execution deals with requesting start and close down of the VM. WSDL at a specific location rebinds the tool. BioCatalogue work by Dimitri for unbound WSDL for the tools.
  21. Player needs a workflow file from the portal or myExperiment or something else. Rails plugin for running Taverna Workflows. Integrates into any Rails app. Embed workflows into any web page. Job queuing system scales runs with the number of workflows the servers can handle. Each run in parallel with its own worker. Input provenance: setup, input gathering, parameters and data used. Runs: Taverna Server operations, interactions, run workflow, re-run / restart. Results management: storing, viewing, downloading, result type rendering. Service credential management: for secure services within workflows. Look and results rendering fully customizable: LifeWatch, Scratchpads, personal web page, … Just like embedding a YouTube video. Gets bigger when it needs to & tells you when it’s full. Result type rendering: Text, XML, JSON, HTML, Images, PDF, Workflow errors, Links for types that browsers cannot show inline, more…
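The job-queuing behaviour described in this note (runs queued, executed in parallel up to what the servers can handle, each with its own worker) can be sketched with a bounded worker pool. This is not Taverna Player's implementation, which is a Rails plugin; `run_workflow`, `MAX_CONCURRENT_RUNS` and the job names are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_RUNS = 2  # assumed capacity of the workflow servers


def run_workflow(name, inputs):
    """Stand-in for submitting a run to a workflow server and collecting its outputs."""
    outputs = {key: str(value).upper() for key, value in inputs.items()}
    return {"workflow": name, "outputs": outputs}


def run_all(jobs):
    """Queue every job; at most MAX_CONCURRENT_RUNS execute at once, each on its own worker."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_RUNS) as pool:
        futures = [pool.submit(run_workflow, name, inputs) for name, inputs in jobs]
        # Results are collected in submission order, regardless of completion order.
        return [future.result() for future in futures]


jobs = [
    ("enm", {"species": "ant"}),
    ("blast", {"sequence": "acgt"}),
    ("enm", {"species": "bee"}),
]
results = run_all(jobs)
print(len(results), "runs finished")
```

Jobs beyond the pool size simply wait in the queue, which is one way the number of simultaneous runs could be tied to server capacity.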
  22. Taverna Server spawns the command line tool for user separation. The components of the architecture: An OSGi platform, with the Taverna Platform API implemented by Taverna Core (executes a workflow using the Taverna Engine; uses Activity plugins for the different service types: WSDL, REST, BioMart, R scripts, command line tools, etc.) and also implemented by the Taverna Server client (which uses the Java client library to proxy running of a workflow on the Taverna Server). The Taverna Workbench, to design and run workflows (UI plugins for each service type; executes workflows using the Taverna Platform API). The Taverna command line, which executes workflows using the Taverna Platform API. A Taverna Server, which exposes the Taverna Platform API as a REST API and SOAP API for executing workflows. Taverna Player, which uses the Ruby client library to execute workflows on the Taverna Server. Taverna Lite, which also uses the Ruby client library to execute workflows, but also manages a repository of workflows and allows user interactions. The OSGi framework (OSGi being an acronym for "Open Services Gateway initiative") is a module system and service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments. Applications or components (coming in the form of bundles for deployment) can be remotely installed, started, stopped, updated, and uninstalled without requiring a reboot; management of Java packages/classes is specified in great detail. Application life cycle management (start, stop, install, etc.) is done via APIs that allow for remote downloading of management policies. The service registry allows bundles to detect the addition of new services, or the removal of services, and adapt accordingly.
The OSGi specifications have moved beyond the original focus of service gateways, and are now used in applications ranging from mobile phones to the open source Eclipse IDE. Other application areas include automobiles, industrial automation, building automation, PDAs, grid computing, entertainment, fleet management and application servers.
  23. ENCODE threads exchange between tools and researchers bundles and relates digital resources of a scientific experiment or investigation using standard mechanisms
  24. Explore, Personal…. Recording and reporting Production…. Reporting.
  25. Issues: non-secure HTML served over http inside a secure https iframe in IPython doesn’t work; the interaction service needs updating to deliver over https.
  26. Variety: common metadata models, rich metadata collection, ecosystem. Validity: auto record of experiment set-up, citable and shareable descriptions, curation, publication, mixed stewardship, third-party availability, model executability, citability, QC/QA, trust. Social issues of understanding the culture of risk, reward, sharing and reporting.
  27. Blending SEEK and openBIS together
  28. It’s a lot like a start-up Software Engineering for Science, Software sustainability, software and data policy, training
  29. Why did I start as a Computer Scientist and, proudly, end up as a Software Engineer and Social Worker? Web Science related activity. Making people think it’s their idea. Nearly every time I ask people, they ask for today’s and not tomorrow’s.
  30. Sample of three commercial datasets Information on handful of targets only Gemma Sattertwaite mentioned this
  31. Sample of three commercial datasets Information on handful of targets only
  32. Cache copies of data Chemistry data normalisation/alignment through ChemSpider Domain specific API API calls populate SPARQL queries
  33. It’s like a start up Social Software Engineering T shaped people
  34. “As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software” An aside
  35. Training infrastructure A pilot training e-support service platform Share training material Scalable training approaches Training the trainers, Support network Trainer pool, Share know-how Review needs Cooperating training sectors Manage and monitor outcomes Coordinate activities and materials Workshops, bootcamps, online Pop-up training provision Liaise with Nodes and Hub Programmes retain branding
  36. The multidimensional paper. A scientific article can be envisioned as juxtaposed layers (Title, Abstract, Synopsis, Article, Expanded View and Datasets) that provide access to the paper with increasing resolution and allow readers to zoom in or out to access the information at the required level of granularity.