7. Astronomy Use Case: A Repeater's Story
● Dealing with large amounts of tabular
data
● Many small scripts, to avoid creating
a black-box process
● Local resource sharing, public
access only after publication
● Data must be frequently updated
from external data repositories
● Data updates must be tested before
being executed
● Data must be locally stored with
versioning
● “... we don't like to spread [the tasks]
and lose controls who is doing
what ...”
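The requirements above (test an update before applying it, then keep a versioned local copy) can be sketched as follows. This is a minimal illustration, not the astronomers' actual tooling: the column names, the validation rules, and the file naming scheme are all assumptions made for the example.

```python
import csv
import hashlib
import io
from pathlib import Path

# Hypothetical catalogue columns; a real table would define its own schema.
REQUIRED_COLUMNS = {"object_id", "ra", "dec"}


def validate_update(csv_text):
    """Return a list of problems found in a candidate update (empty if it passes)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):
        try:
            ra, dec = float(row["ra"]), float(row["dec"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric coordinates")
            continue
        if not (0.0 <= ra < 360.0 and -90.0 <= dec <= 90.0):
            problems.append(f"line {line_no}: coordinates out of range")
    return problems


def store_version(csv_text, store_dir):
    """Test the update first; only if it passes, save it as a new numbered version."""
    problems = validate_update(csv_text)
    if problems:
        raise ValueError("update rejected: " + "; ".join(problems))
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    # Older versions are never overwritten; each accepted update gets the next number.
    version = len(list(store_dir.glob("v*.csv"))) + 1
    digest = hashlib.sha256(csv_text.encode("utf-8")).hexdigest()[:12]
    path = store_dir / f"v{version:04d}_{digest}.csv"
    path.write_text(csv_text, encoding="utf-8")
    return path
```

Keeping the check and the store as two small functions follows the slide's preference for small scripts over a black-box process: each step can be read and run on its own.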
8. Research Objects
http://www.wf4ever-project.org
● Aggregation – Pointers or literals of
internal and external content;
● Identity – Equivalence, equality;
● Metadata – A reusable object;
● Lifecycle – Stages of development.
Impacts on available functionality;
● Versioning – Recording changes;
● Security – Access, authentication,
ownership, trust;
● Graceful Degradation of
Understanding – Opaque RO
domain content.
● Mixed stewardship
● Provenance
ROs are Content Aware Objects
● Of compound objects
that bundle things together
● Of evolutions
● Of dynamic objects and static
objects
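A few of the properties listed above (aggregation, identity, metadata, versioning) can be pictured as a small data structure. The sketch below is only an illustration under assumed names (`Resource`, `ResearchObject`, `aggregate`); it is not the Wf4Ever Research Object model or any published API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Resource:
    """One aggregated item: a pointer (URI) to internal or external content."""
    uri: str
    external: bool = False


@dataclass
class ResearchObject:
    """Hypothetical container bundling resources with identity, metadata and history."""
    identifier: str                                                # identity
    metadata: Dict[str, str] = field(default_factory=dict)         # metadata
    resources: List[Resource] = field(default_factory=list)        # aggregation
    history: List[Tuple[int, str]] = field(default_factory=list)   # versioning

    @property
    def version(self):
        # The version number is simply the count of recorded changes.
        return len(self.history)

    def aggregate(self, resource, note):
        """Add a resource and record the change, so the evolution stays visible."""
        self.resources.append(resource)
        self.history.append((self.version + 1, note))
```

A short usage example, with made-up identifiers:

```python
ro = ResearchObject("ro-1", {"title": "demo"})
ro.aggregate(Resource("workflow.t2flow"), "added workflow")
ro.aggregate(Resource("http://example.org/data.csv", external=True), "added external data")
```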
9. Biology Use Case: A Reuser's Story
● Takes a set of genes from the results of a gene experiment
performed by others, as read in a scientific paper
● Performs a 'dry' analysis to understand which genes and
which biological processes were disturbed by which
chemical compounds
● Basic Affymetrix data processing
● Statistical analysis to identify genes that are significantly
differentially expressed under different conditions (with/without the
compounds)
● Find the pathways that are most prominent among the filtered
genes
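The statistical step above can be illustrated with a two-sample comparison per gene. This is a simplified sketch, not the workflow used in the use case: it computes Welch's t statistic from the standard library and filters on an arbitrary threshold, whereas a real analysis would normally work with p-values and a multiple-testing correction. The function names, the threshold, and the data layout are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(treated, control):
    """Welch's t statistic for two samples with possibly unequal variances.

    Each sample needs at least two values, and the samples must not both be constant.
    """
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se


def differentially_expressed(expression, threshold=4.0):
    """Return the genes whose |t| exceeds the (illustrative) threshold, sorted by name.

    `expression` maps a gene name to a pair (treated values, control values).
    """
    return sorted(
        gene
        for gene, (treated, control) in expression.items()
        if abs(welch_t(treated, control)) > threshold
    )
```

The filtered gene list is what the pathway step would then take as input.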
10. Biology Use Case: A Reuser's Story
● Searches for existing experiments on
myExperiment (http://myexperiment.org)
● Challenge: Understand the workflow
● Perform test runs with test data and his own data
● Read others' logs
● Read annotations to workflows
● Reuse scripts from colleagues and perform
tests that his colleagues are familiar with
11. How Can It be Supported?
● A reference to the source of the data and the people to acknowledge for it.
● The initial hypothesis
● The conceptual workflow or a summary of the experiment plan
● References to workflows that were tested, with comments on their application for
the user's use case
● The user's own workflow, possibly with a backlog of previous versions that the
user wishes to keep for reference (with notes and comments)
● The runs of the user's own workflow, results and the recorded steps that led to
the results, in some cases with comments for later reference (e.g. 'here I used
parameter A, next time I may try B')
● The final hypothesis, with comments.
● A reference to the results of the workflow
● Design logs that record the user's considerations while making the workflow
● Run logs that record the user's considerations while running and interpreting the
workflow
15. Take home
● Provenance should be user-driven
● Linked Data should be a means to an end
● http://www.wf4ever-project.org
16. Acknowledgement
● Marco Roos of Leiden University (NL) and Jose
Enrique Ruiz of Instituto de Astrofísica de
Andalucía (Spain)
● Carole Goble of University of Manchester (UK)
and Jose Manuel Gomez of iSOCO (Spain)
● Hui Hua and Jenny Molly of University of
Oxford (UK)