Workflows are increasingly used to manage and share scientific computations and methods. Workflow tools can be used to design, validate, execute, and visualize scientific workflows and their execution results. Other tools manage workflow libraries or mine their contents. Despite much recent work on workflow system integration and on common workflow interlinguas, interoperability among workflow systems remains a challenge. Ideally, these tools would form a workflow ecosystem in which a workflow can be created with one tool, executed with another, visualized with a third, and mined from a repository of such workflows or their executions with yet another. In this paper, we describe our approach to creating a workflow ecosystem through the use of standard models for provenance (OPM and W3C PROV) and extensions (P-PLAN and OPMW) to represent workflows. The ecosystem integrates different workflow tools with diverse functions (workflow generation, execution, browsing, mining, and visualization) created by a variety of research groups. This is, to our knowledge, the first time that such a variety of workflow systems and functions have been integrated.
Towards Workflow Ecosystems Through Semantic and Standard Representations
1. 1
Yolanda Gil
Information Sciences Institute
and Department of Computer Science
University of Southern California
http://www.isi.edu/~gil
@yolandagil
gil@isi.edu
Daniel Garijo, Oscar Corcho
OEG-DIA
Facultad de Informática, Universidad Politécnica de Madrid
http://purl.org/net/dgarijo
@dgarijo,@ocorcho
{dgarijo,ocorcho}@fi.upm.es
2. 2
Outline
Motivation
What is a workflow ecosystem
The WEST workflow ecosystem
Semantic and standard representations in WEST
3. 3
Proliferation of Workflow Systems and Workflow Functions
Workflow design
Workflow validation
Workflow execution
Workflow visualization
Workflow mining
End users need a more fluid way to utilize workflow functions based on initial application requirements and as those requirements evolve over time
4. 4
Workflow ecosystems
Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:
•Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
•Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
•Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
5. 5
Interoperability of Workflow Systems: Prior Work
Integrations of workflow systems
•Pegasus: Condor (execution) [Deelman et al 2006], PASOA (provenance) [Miles et al 2007], WINGS (generation) [Gil et al 2007], nanoHUB (creation) [McLennan et al 2013]
•Taverna: Galaxy (execution) [Abouelhoda et al 2012], myExperiment (repository) [De Roure et al 2009], DistillFlow (mining) [Starlinger et al 2012]
Workflow interchange languages
•Provenance: Open Provenance Model (OPM) [Moreau et al 2011], W3C PROV standard [Gil and Miles 2013]
•Workflows: OPMW [Garijo et al 2013], D-PROV [Missier et al 2013], Wfprov [Belhajjame et al 2013], P-PLAN [Garijo and Gil 2012]
•IWIR [Plankensteiner et al WORKS’11]
•WS-BPEL, BPMN
6. 6
Workflow ecosystems
Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:
•Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
–Today: 1-2 workflow tools with 2-3 workflow functions developed by 2-3 institutions
•Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
–Today: only one consumer tool, and if there is more than one, they have the same function (e.g., two execution engines)
•Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
–Today: workflows exported must be fully imported
18. 18
Overview of Semantic and Standard Representations in WEST
[Figure: diagram relating the vocabularies OPM, PROV, P-Plan, and OPMW to what they represent. Labels: generic provenance, plan definition, plan execution, workflow template, workflow execution, "execution of".]
19. 19
Overview of Semantic and Standard Representations in WEST: PROV and OPM
[Moreau et al 2013]
[Moreau et al 2011]
20. 20
Overview of Semantic and Standard Representations in WEST: PROV and P-PLAN
[Moreau et al 2013]
[Garijo and Gil 2013]
21. 21
Overview of Semantic and Standard Representations in WEST: OPMW
[Garijo et al 2012]
22. 22
Abstraction Heterogeneity: Mapping Across Models Through Queries
Returning P-Plan from OPMW objects:
CONSTRUCT {
  ?activity2 p-plan:isPrecededBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowTemplateProcess .
  ?activity2 a opmw:WorkflowTemplateProcess .
  ?activity2 opmw:uses / opmw:isGeneratedBy ?activity .
}
Returning PROV from OPMW objects:
CONSTRUCT {
  ?activity a prov:Activity .
  ?activity2 a prov:Activity .
  ?activity2 prov:used ?u1 .
  ?u1 prov:wasGeneratedBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
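To make the OPMW-to-P-Plan mapping on this slide concrete, the following is a minimal Python sketch of what that CONSTRUCT query computes. It assumes triples are held as plain (subject, predicate, object) tuples of prefixed names rather than in an RDF store; the function name `opmw_to_pplan` and the `ex:` resources are hypothetical and used only for illustration. An actual deployment would presumably run the SPARQL query itself against a triple store.

```python
# Illustrative sketch only: mimics the OPMW -> P-Plan CONSTRUCT query
# over in-memory tuples. Names prefixed with "ex:" are hypothetical.
RDF_TYPE = "rdf:type"


def opmw_to_pplan(triples):
    """Derive p-plan:isPrecededBy links between OPMW template processes."""
    # Resources typed as opmw:WorkflowTemplateProcess.
    processes = {s for s, p, o in triples
                 if p == RDF_TYPE and o == "opmw:WorkflowTemplateProcess"}

    # Index: data variable -> set of processes that generate it.
    generators = {}
    for s, p, o in triples:
        if p == "opmw:isGeneratedBy":
            generators.setdefault(s, set()).add(o)

    # Follow the path opmw:uses / opmw:isGeneratedBy from each process.
    derived = set()
    for s, p, o in triples:
        if p == "opmw:uses" and s in processes:
            for producer in generators.get(o, ()):
                if producer in processes:
                    derived.add((s, "p-plan:isPrecededBy", producer))
    return derived


# Hypothetical two-step template: ex:sort consumes what ex:align produces.
template = [
    ("ex:align", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:sort", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:alignedReads", "opmw:isGeneratedBy", "ex:align"),
    ("ex:sort", "opmw:uses", "ex:alignedReads"),
]
result = opmw_to_pplan(template)
print(sorted(result))
```

The sketch keeps only the step ordering and drops the intermediate data variable, which is the sense in which a consuming tool can import a more abstract view of the same workflow.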
23. 23
Abstraction Heterogeneity: Mapping to Other Models Through Queries
Returning WfProv from OPMW objects:
CONSTRUCT {
  ?activity a wfprov:ProcessRun .
  ?activity2 a wfprov:ProcessRun .
  ?activity2 wfprov:usedInput ?u1 .
  ?u1 wfprov:wasOutputFrom ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
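The PROV and WfProv queries share the same WHERE pattern and differ only in the terms they emit, so the choice of target vocabulary can be treated as a lookup table. The sketch below illustrates that idea in Python under stated assumptions: triples are plain tuples of prefixed names, and `MAPPINGS`, `export_execution`, and the `ex:` resources are hypothetical names introduced for this example, not part of WEST.

```python
# Illustrative sketch only: one matching routine, two target vocabularies.
# The term pairs are read off the CONSTRUCT queries on these slides.
RDF_TYPE = "rdf:type"

MAPPINGS = {
    "prov": {
        "type": "prov:Activity",
        "used": "prov:used",
        "generated": "prov:wasGeneratedBy",
    },
    "wfprov": {
        "type": "wfprov:ProcessRun",
        "used": "wfprov:usedInput",
        "generated": "wfprov:wasOutputFrom",
    },
}


def export_execution(triples, target):
    """Re-express matching OPMW execution triples in the target vocabulary."""
    terms = MAPPINGS[target]
    runs = {s for s, p, o in triples
            if p == RDF_TYPE and o == "opmw:WorkflowExecutionProcess"}
    # (artifact, producing run) pairs from opmv:wasGeneratedBy.
    generated = [(s, o) for s, p, o in triples
                 if p == "opmv:wasGeneratedBy" and o in runs]

    out = set()
    for consumer, p, artifact in triples:
        if p != "opmv:used" or consumer not in runs:
            continue
        for made, producer in generated:
            if made == artifact:
                # Emit the four triples of the CONSTRUCT template.
                out.add((producer, RDF_TYPE, terms["type"]))
                out.add((consumer, RDF_TYPE, terms["type"]))
                out.add((consumer, terms["used"], artifact))
                out.add((artifact, terms["generated"], producer))
    return out


# Hypothetical execution trace: ex:run2 reads a file written by ex:run1.
trace = [
    ("ex:run1", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:run2", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:file1", "opmv:wasGeneratedBy", "ex:run1"),
    ("ex:run2", "opmv:used", "ex:file1"),
]
as_prov = export_execution(trace, "prov")
as_wfprov = export_execution(trace, "wfprov")
print(sorted(as_prov))
print(sorted(as_wfprov))
```

Adding another compatible vocabulary would, under these assumptions, only require a new entry in the table rather than a new matching routine.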
30. 30
Conclusions: Workflow ecosystems and WEST
Workflow ecosystems scale up integration:
•Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
–Today: 1-2 workflow tools with 2-3 workflow functions from 2-3 institutions
–WEST: 9 workflow functions from 6 research groups
•Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
–Today: only 1 consumer tool, or if several then same function
–WEST: 2-5 consumer tools with different functions
•Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
–Today: workflows exported must be fully imported
–WEST: 4+ models provide different granularity
31. 31
Benefits
Interoperability across tools with different workflow functions
Flexibility to interchange data at different granularities across tools
Facilitating the integration of content modeled in other (compatible) vocabularies
32. 32
Limitations and Future Work
Less expressivity than IWIR and D-PROV
Converters should be included in each workflow tool
No general “workflow ecosystem APIs” yet
33. 33
Thank you!
http://www.wings-workflows.org
http://www.isi.edu/~gil
Wings contributors: Varun Ratnakar, Daniel Garijo (UPM), Ricky Sethi, Hyunjoon Jo, Jihie Kim, Yan Liu, Dave Kale, Ralph Bergmann (U Trier), William Cheung (HKBU), Pedro Gonzalez & Gonzalo Castro (UCM), Paul Groth (VUA)
Wings collaborators: Ewa Deelman & Gaurang Mehta & Karan Vahi (USC), Sofus Macskassy (ISI), Natalia Villanueva & Ari Kassin (UTEP)
Wings/OODT: Chris Mattmann (JPL), Paul Ramirez (JPL), Dan Crichton (JPL), Rishi Verma (JPL)
Biomedical workflows: Phil Bourne & Sarah Kinnings (UCSD), Chris Mason (Cornell), Joel Saltz & Tahsin Kurc (Emory U.), Jill Mesirov & Michael Reich (Broad), Randall Wetzel (CHLA), Shannon McWeeney & Christina Zhang (OHSU)
Geosciences workflows: Chris Duffy (PSU), Paul Hanson (U Wisconsin), Tom Harmon & Sandra Villamizar (U Merced), Tom Jordan & Phil Maechling (USC), Kim Olsen (SDSU)
And many others!