Data Provenance and Scientific Workflow Management 
Data Provenance 
Neuroscience Data 
Scientific Workflow Management 
(and Questionnaires) 
Kelly Rosa Braghetto 
kellyrb@ime.usp.br 
Departamento de Ciência da Computação 
Instituto de Matemática e Estatística 
Universidade de São Paulo 
June 5, 2013
Agenda 
1 Data Provenance 
2 Neuroscience Data 
CARMEN Project 
NEMO Project 
3 Scientific Workflow Management Systems (SWMS) 
Taverna 
4 Questionnaires 
Data Provenance 
Frequently asked questions for Scientists 
Where was a document found? 
How was this data set produced? 
Were all facts included in this decision? 
Were all the latest figures included in this diagram? 
Can this scientific experiment be reproduced? 
Source: http://openprovenance.org/ 
Data Provenance 
What is Provenance? 
Provenance refers to the sources of information, such as entities and 
processes, involved in producing or delivering an artifact. 
Why does Provenance matter? 
The provenance of information is crucial in deciding whether 
information is to be trusted, how it should be integrated with other 
diverse information sources, and how to give credit to its originators 
when reusing it. 
In an open and inclusive environment such as the Web, users find 
information that is often contradictory or questionable. 
People make trust judgments based on provenance that may or may 
not be explicitly offered to them. Problem: lack of a standard 
model. 
Source: http://www.w3.org/2011/prov/wiki/Main_Page
Works devoted to Data Provenance 
Provenance Working Group, maintained by W3C 
“Mission: to support the widespread publication and use of 
provenance information of Web documents, data, and 
resources.” 
http://www.w3.org/2011/prov/wiki/Main_Page 
Wf4Ever project 
“Wf4Ever addresses some of the challenges associated to the 
preservation of scientific experiments in data-intensive science.” 
http://www.wf4ever-project.org/ 
Open Provenance Model (OPM) 
http://openprovenance.org/ 
Open Provenance Model (OPM) 
The Open Provenance Model is a model of provenance that is 
designed to meet the following requirements: 
1 To allow provenance information to be exchanged between 
systems, by means of a compatibility layer based on a shared 
provenance model. 
2 To allow developers to build and share tools that operate on 
such a provenance model. 
3 To define provenance in a precise, technology-agnostic manner. 
4 To support a digital representation of provenance for any 
’thing’, whether produced by computer systems or not. 
5 To allow multiple levels of description to coexist. 
6 To define a core set of rules that identify the valid inferences 
that can be made on provenance representation. 
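Requirements 1 and 6 can be illustrated with a minimal sketch, assuming a toy in-memory encoding. The edge names "used" and "wasGeneratedBy" come from OPM; the class ProvenanceGraph, the method upstream_artifacts, and the file names are hypothetical. The inference shown is deliberately loose: it only says which artifacts may have influenced a result, not that the result was necessarily derived from them.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Toy provenance graph with OPM-style 'used' and 'wasGeneratedBy' edges."""
    used: set = field(default_factory=set)              # (process, artifact)
    was_generated_by: set = field(default_factory=set)  # (artifact, process)

    def record(self, process, inputs, outputs):
        """Record one process run: which artifacts it used and generated."""
        for artifact in inputs:
            self.used.add((process, artifact))
        for artifact in outputs:
            self.was_generated_by.add((artifact, process))

    def upstream_artifacts(self, artifact):
        """Return every artifact that may have influenced `artifact`."""
        found = set()
        pending = [artifact]
        while pending:
            current = pending.pop()
            # Processes that generated the current artifact.
            producers = {p for (a, p) in self.was_generated_by if a == current}
            for process, source in self.used:
                if process in producers and source not in found:
                    found.add(source)
                    pending.append(source)
        return found


# Hypothetical two-step analysis of an EEG recording.
graph = ProvenanceGraph()
graph.record("filter", inputs=["raw_eeg.dat"], outputs=["filtered_eeg.dat"])
graph.record("average", inputs=["filtered_eeg.dat"], outputs=["erp.dat"])
print(sorted(graph.upstream_artifacts("erp.dat")))
```

Because the graph is plain data (two sets of pairs), it could be serialized and exchanged between systems, which is the point of a shared provenance model.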
Neuroscience Data 
Projects recording provenance of neuroscience data
Code Analysis, Repository & Modelling for e-Neuroscience 
(CARMEN) 
http://www.carmen.org.uk/ 
“CARMEN is an e-Science Pilot Project funded by the Engineering 
and Physical Sciences Research Council (UK). It will deliver a 
virtual laboratory for neurophysiology, enabling sharing and 
collaborative exploitation of data, analysis code and expertise. 
Neural activity recordings (signals and image series) are the primary 
data types.” 
Neural ElectroMagnetic Ontologies (NEMO) 
http://nemo.nic.uoregon.edu/wiki/NEMO 
[More details in the next slides...] 
CARMEN Project 
The CARMEN consortium 
“A core part of our work is the development of minimum reporting 
guidelines for annotation of data and other computational resources 
for the purpose of sharing” 
Result: a MINI module for Electrophysiology 
MINI (Minimum Information about a Neuroscience 
investigation) – is a family of reporting guideline documents 
A module represents the minimum information that should be
reported about a dataset to:
facilitate computational access and analysis
allow a reader to interpret and critically evaluate the process
performed and the conclusions reached
support their experimental corroboration
MINI module for Electrophysiology 
The reporting recommendations cover both extracellular and
intracellular electrophysiology
Covered data: 
date stamps and responsible persons 
the subject under study 
the subject task or stimulus if appropriate 
the recording protocol 
and the resulting description of time series data 
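As a rough illustration, the covered categories could be captured in a record type that reports which entries are still missing. This is a sketch only: the class ElectrophysiologyRecord, its field names, and the sample values are hypothetical and are not taken from the MINI document.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ElectrophysiologyRecord:
    """Hypothetical container mirroring the five covered categories."""
    date_and_responsible: Optional[str] = None
    subject: Optional[str] = None
    task_or_stimulus: Optional[str] = None  # only "if appropriate"
    recording_protocol: Optional[str] = None
    time_series_description: Optional[str] = None


# The task/stimulus entry is treated as optional, following the wording above.
OPTIONAL_FIELDS = {"task_or_stimulus"}


def missing_fields(record):
    """List the required categories that have not been filled in."""
    return [
        f.name
        for f in fields(record)
        if f.name not in OPTIONAL_FIELDS and not getattr(record, f.name)
    ]


# Made-up sample values, for illustration only.
record = ElectrophysiologyRecord(
    date_and_responsible="2013-06-05, J. Doe",
    subject="adult rat, specimen 12",
    recording_protocol="extracellular, 32-channel array",
)
print(missing_fields(record))
```

Here the check reports that the time series description has not been provided yet.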
The entire module is described in: 
http://www.carmen.org.uk/standards/mini.pdf 
The module is registered in the MIBBI portal 
(http://www.biosharing.org/standards/mibbi and 
http://mibbi.sourceforge.net/legacy.shtml). 
MIBBI (Minimum Information for Biological and Biomedical
Investigations) is a pioneering project that aims to coordinate
guidelines for reporting of metadata across domains
NEMO Project 
Neural ElectroMagnetic Ontologies (NEMO) 
An NIH-funded project
Aims to create EEG and MEG ontologies and ontology-based
tools. These resources will be used to support representation,
classification, and meta-analysis of brain electromagnetic data.
Based on three pillars: DATA, ONTOLOGY, and DATABASE 
Data – raw EEG, averaged EEG (ERPs), and ERP data 
analysis results 
Ontologies – include concepts related to ERP data (including 
spatial and temporal features of ERP patterns), data 
provenance, and the cognitive and linguistic paradigms that 
were used to collect the data 
Database – the NEMO database portal is a large repository 
that stores NEMO consortium data, data analysis results, and 
data provenance 
Site: http://nemo.nic.uoregon.edu 
Ontology (informal definition) 
In both computer science and information science, an ontology 
represents a set of concepts within a domain and the 
relationships between those concepts. It is used to reason 
about the objects within that domain. 
Ontologies are used as a form of knowledge representation 
about the world or some part of it. 
Ontologies generally describe: 
Individuals: the basic or “ground level” objects 
Classes: sets, collections, or types of objects 
Attributes: properties, features, characteristics, or parameters 
that objects can have and share 
Relations: ways that objects can be related to one another 
Events: the changing of attributes or relations 
Source: http://neurolex.org 
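The five kinds of description can be made concrete with a small sketch, assuming a dictionary-based encoding. The class Ontology, its methods, and the example terms ("ERP pattern", "pattern_01", and so on) are hypothetical and do not come from the NEMO ontology; real ontologies are normally written in a dedicated language such as OWL.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology holding the five kinds of description listed above."""
    subclass_of: dict = field(default_factory=dict)  # class -> parent class
    instance_of: dict = field(default_factory=dict)  # individual -> class
    attributes: dict = field(default_factory=dict)   # individual -> {name: value}
    relations: set = field(default_factory=set)      # (subject, relation, object)
    events: list = field(default_factory=list)       # log of attribute changes

    def classes_of(self, individual):
        """Return the individual's class followed by its ancestor classes."""
        result = []
        current = self.instance_of.get(individual)
        # The membership test guards against accidental cycles.
        while current is not None and current not in result:
            result.append(current)
            current = self.subclass_of.get(current)
        return result

    def set_attribute(self, individual, name, value):
        """Change an attribute and record the change as an event."""
        values = self.attributes.setdefault(individual, {})
        self.events.append((individual, name, values.get(name), value))
        values[name] = value


onto = Ontology()
onto.subclass_of["ERP pattern"] = "EEG measurement"
onto.instance_of["pattern_01"] = "ERP pattern"
onto.relations.add(("pattern_01", "recorded_from", "subject_01"))
onto.set_attribute("pattern_01", "peak_latency_ms", 300)
onto.set_attribute("pattern_01", "peak_latency_ms", 310)

print(onto.classes_of("pattern_01"))
print(onto.events)
```

Walking up the class hierarchy in classes_of is a very simple case of reasoning about the objects in a domain: the individual is inferred to belong to every ancestor class as well.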
MINEMO – an extension of the MINI module for 
Electrophysiology 
MINEMO = Minimal Information for Neural Electromagnetic 
Ontologies 
“A standards-compliant method for analysis and integration of 
event-related potentials (ERP) data”; in other words: a 
checklist for the description of ERP studies 
The checklist comprises no more than 60 fields; 20 of these 
fields are considered “mandatory” 
MINEMO promotes the use of controlled vocabularies (or 
lexicons) for data annotation. Aim: to conduct cross-lab 
meta-analysis 
Each MINEMO checklist item is linked to a term defined in 
the NEMO ontology 
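The checklist idea (mandatory fields, a controlled vocabulary, and a link from each item to an ontology term) can be sketched as follows. Everything here is an assumption made for illustration: the field names, the "funding_source" entry, and the "TERM:..." identifiers are placeholders, not actual MINEMO fields or NEMO term identifiers.

```python
# Hypothetical checklist: field name -> (mandatory?, ontology term identifier).
CHECKLIST = {
    "research_lab": (True, "TERM:lab"),
    "experiment": (True, "TERM:experiment"),
    "publication": (True, "TERM:publication"),
    "eeg_data_collection": (True, "TERM:eeg_collection"),
    "funding_source": (False, "TERM:funding"),
}


def validate(description):
    """Return (missing mandatory fields, fields outside the vocabulary)."""
    missing = [
        name
        for name, (mandatory, _) in CHECKLIST.items()
        if mandatory and not description.get(name)
    ]
    unknown = [name for name in description if name not in CHECKLIST]
    return missing, unknown


def annotate(description):
    """Re-key each known field by its linked ontology term."""
    return {
        CHECKLIST[name][1]: value
        for name, value in description.items()
        if name in CHECKLIST
    }


# Made-up study description, for illustration only.
study = {
    "research_lab": "Lab A",
    "experiment": "oddball",
    "eeg_data_collection": "128 channels",
    "notes": "pilot",
}
print(validate(study))
print(annotate(study))
```

Once two labs describe their studies with the same term identifiers, their records can be compared field by field, which is what makes cross-lab meta-analysis feasible.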
Subset of “mandatory” MINEMO terms 
1 Research lab (General features) 
2 Experiment (General features) 
3 Publication 
4 Study subjects (Group characteristics) 
5 Experiment condition 
6 Stimulus representation 
7 Behavioral data collection 
8 EEG data collection 
9 EEG/ERP data preprocessing 
10 EEG/ERP data file 
The entire set of terms is defined in the article: 
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3235514/ 
They are also in the MIBBI portal: 
More about NEMO... 
Data in the NEMO Portal are aligned with the MINEMO 
checklist and ontology 
https://portal.nemo.nic.uoregon.edu 
NIF (the Neuroscience Information Framework project –
http://www.neuinfo.org/) uses the NEMO ontology. NIF
aggregates online sources of neuroscience data, including
databases, web sites, and publications, and provides a search
interface across these disparate sources
The NEMO ontology can be seen in: 
http://bioportal.bioontology.org/ontologies/40522 
A “detail” to worry about... 
The MINI module for Electrophysiology and MINEMO do not cover 
the description of image data 
To see later: 
MIfMRI – Minimum Information about an fMRI Study 
http://www.fmrimethods.org/ 
Scientific Workflow Management Systems (SWMS) 
Scientific Workflows 
A data analysis (or processing) can generally be described as a
workflow, i.e., a set of computational tasks that “transform”
data
In Bioinformatics, a workflow is frequently called a pipeline
In a workflow, the output data of a task is generally used as
input data for one or more other tasks. So, the flow of data
defines an execution order for the workflow's tasks
Frequently, the same task appears in more than one
workflow
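The way data flow fixes an execution order can be shown with a topological sort. This is a minimal sketch using Python's standard graphlib module (Python 3.9 or later); the task names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks whose output it consumes.
workflow = {
    "filter": {"load"},
    "detect_spikes": {"filter"},
    "plot": {"filter", "detect_spikes"},
}

# A valid execution order runs every task after the tasks it depends on.
order = list(TopologicalSorter(workflow).static_order())
print(order)  # ['load', 'filter', 'detect_spikes', 'plot']
```

In this example the dependencies form a single chain, so only one order is possible. In general several orders can be valid, and independent tasks could run in parallel.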
Scientific Workflow Management System 
(SWMS) 
A computational tool that controls the execution of workflows 
It provides mechanisms for a scientist to describe his/her 
workflow using “intuitive” modeling languages 
It can optimize the execution considering the characteristics of 
the available computational resources 
It helps to generate provenance data of an analysis process. In 
addition, it improves the reproducibility of analyses 
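How a workflow engine can produce provenance data as a side effect of execution is sketched below. This is a toy runner written for illustration, not the mechanism of any particular SWMS; the function run_workflow, the task names, and the log format are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies, initial_input):
    """Run tasks in dependency order and log what each one used and generated."""
    results = {}
    history = []
    for name in TopologicalSorter(dependencies).static_order():
        upstream = sorted(dependencies.get(name, ()))
        # Source tasks receive the initial input; others receive upstream outputs.
        inputs = [results[u] for u in upstream] or [initial_input]
        results[name] = tasks[name](*inputs)
        history.append({"task": name, "used": inputs, "generated": results[name]})
    return results, history


# Hypothetical two-task workflow: double every value, then sum the result.
tasks = {
    "double": lambda values: [2 * v for v in values],
    "total": lambda values: sum(values),
}
dependencies = {"double": set(), "total": {"double"}}

results, history = run_workflow(tasks, dependencies, [1, 2, 3])
for entry in history:
    print(entry)
```

Since the log keeps the inputs and outputs of every step, an analysis can be re-run and checked step by step against the recorded values, which is where the gain in reproducibility comes from.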
Most successful SWMSs 
Taverna – http://www.taverna.org.uk 
VisTrails – http://www.vistrails.org 
Kepler – https://kepler-project.org 
Galaxy – http://galaxyproject.org 
Online workflow repositories – collaborative 
science 
myExperiment project (http://www.myexperiment.org/):
Users upload their workflow models
Models are categorized according to their research domain
Users can search and download models uploaded by other users
The site stores models from different SWMSs (Taverna, Kepler,
etc.)
Taverna 
Features: 
Graphical user interface for the description of the workflows 
Easy installation and use 
Recording of the “execution history” and intermediate results 
(= provenance data of the entire analysis) 
Provenance export capability to OPM 
Questionnaires 
Automatic Generation of Online Questionnaires 
There are computational tools that automatically generate
electronic questionnaires.
One of the most widely used is LimeSurvey
(https://www.limesurvey.org/).
LimeSurvey features:
Generates online questionnaires
Offers a large set of question types
Stores questionnaire data in a real database
Manages users
Creates a print version of questionnaires
Performs basic statistical analysis
... 
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
 
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
 
Desafios matemáticos e computacionais da neurociência
Desafios matemáticos e computacionais da neurociênciaDesafios matemáticos e computacionais da neurociência
Desafios matemáticos e computacionais da neurociência
 
Introdução elementar à modelagem estocástica de cadeias simbólicas
Introdução elementar à modelagem estocástica de cadeias simbólicasIntrodução elementar à modelagem estocástica de cadeias simbólicas
Introdução elementar à modelagem estocástica de cadeias simbólicas
 

Último

AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 

Último (20)

AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 

Data Provenance and Scientific Workflow Management

  • 1. Data Provenance and Scientific Workflow Management: Data Provenance, Neuroscience Data, Scientific Workflow Management (and Questionnaires). Kelly Rosa Braghetto, kellyrb@ime.usp.br, Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo. June 5, 2013.
  • 2. Agenda: (1) Data Provenance; (2) Neuroscience Data: CARMEN Project, NEMO Project; (3) Scientific Workflow Management Systems (SWMS): Taverna; (4) Questionnaires.
  • 3. Data Provenance: frequently asked questions for scientists. Where was a document found? How was this data set produced? Were all facts included in this decision? Were all the latest figures included in this diagram? Can this scientific experiment be reproduced? Source: http://openprovenance.org/
  • 4. Data Provenance. What is provenance? Provenance refers to the sources of information, such as entities and processes, involved in producing or delivering an artifact. Why does provenance matter? The provenance of information is crucial in deciding whether information is to be trusted, how it should be integrated with other diverse information sources, and how to give credit to its originators when reusing it. In an open and inclusive environment such as the Web, users find information that is often contradictory or questionable. People make trust judgments based on provenance that may or may not be explicitly offered to them. Problem: lack of a standard model. Source: http://www.w3.org/2011/prov/wiki/Main_Page
  • 5. Works devoted to Data Provenance. Provenance Working Group, maintained by W3C: “Mission: to support the widespread publication and use of provenance information of Web documents, data, and resources.” http://www.w3.org/2011/prov/wiki/Main_Page. Wf4Ever project: “Wf4Ever addresses some of the challenges associated to the preservation of scientific experiments in data-intensive science.” http://www.wf4ever-project.org/. Open Provenance Model (OPM): http://openprovenance.org/
  • 6. Open Provenance Model (OPM). The Open Provenance Model is a model of provenance designed to meet the following requirements: (1) to allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model; (2) to allow developers to build and share tools that operate on such a provenance model; (3) to define provenance in a precise, technology-agnostic manner; (4) to support a digital representation of provenance for any 'thing', whether produced by computer systems or not; (5) to allow multiple levels of description to coexist; (6) to define a core set of rules that identify the valid inferences that can be made on provenance representation.
  • 7. Projects recording provenance of neuroscience data. Code Analysis, Repository & Modelling for e-Neuroscience (CARMEN), http://www.carmen.org.uk/: “CARMEN is an e-Science Pilot Project funded by the Engineering and Physical Sciences Research Council (UK). It will deliver a virtual laboratory for neurophysiology, enabling sharing and collaborative exploitation of data, analysis code and expertise. Neural activity recordings (signals and image series) are the primary data types.” Neural ElectroMagnetic Ontologies (NEMO), http://nemo.nic.uoregon.edu/wiki/NEMO [More details in the next slides...]
  • 8. CARMEN Project. The CARMEN consortium: “A core part of our work is the development of minimum reporting guidelines for annotation of data and other computational resources for the purpose of sharing.” Result: a MINI module for Electrophysiology. MINI (Minimum Information about a Neuroscience Investigation) is a family of reporting guideline documents. A module represents the minimum information that should be reported about a dataset in order to: facilitate computational access and analysis; allow a reader to interpret and critically evaluate the process performed and the conclusions reached; and support their experimental corroboration.
  • 9. CARMEN Project: MINI module for Electrophysiology. The reporting recommendations cover both extracellular and intracellular electrophysiology. Covered data: date stamps and responsible persons; the subject under study; the subject task or stimulus, if appropriate; the recording protocol and the resulting description of time series data. The entire module is described in: http://www.carmen.org.uk/standards/mini.pdf. The module is registered in the MIBBI portal (http://www.biosharing.org/standards/mibbi and http://mibbi.sourceforge.net/legacy.shtml). MIBBI (Minimum Information for Biological and Biomedical Investigations) is a pioneering project that aims to coordinate guidelines for the reporting of metadata across domains.
  • 10. NEMO Project: Neural ElectroMagnetic Ontologies (NEMO). An NIH-funded project that aims to create EEG and MEG ontologies and ontology-based tools. These resources will be used to support representation, classification, and meta-analysis of brain electromagnetic data. Based on three pillars: DATA, ONTOLOGY, and DATABASE. Data: raw EEG, averaged EEG (ERPs), and ERP data analysis results. Ontologies: include concepts related to ERP data (including spatial and temporal features of ERP patterns), data provenance, and the cognitive and linguistic paradigms that were used to collect the data. Database: the NEMO database portal is a large repository that stores NEMO consortium data, data analysis results, and data provenance. Site: http://nemo.nic.uoregon.edu
  • 11. NEMO Project: Ontology (informal definition). In both computer science and information science, an ontology represents a set of concepts within a domain and the relationships between those concepts. It is used to reason about the objects within that domain. Ontologies are used as a form of knowledge representation about the world or some part of it. Ontologies generally describe: Individuals (the basic or “ground level” objects); Classes (sets, collections, or types of objects); Attributes (properties, features, characteristics, or parameters that objects can have and share); Relations (ways that objects can be related to one another); Events (the changing of attributes or relations). Source: http://neurolex.org
  • 12. NEMO Project: MINEMO, an extension of the MINI module for Electrophysiology. MINEMO = Minimal Information for Neural Electromagnetic Ontologies. “A standards-compliant method for analysis and integration of event-related potentials (ERP) data”; in other words, a checklist for the description of ERP studies. The checklist comprises no more than 60 fields; 20 of these fields are considered “mandatory”. MINEMO promotes the use of controlled vocabularies (or lexicons) for data annotation. Aim: to conduct cross-lab meta-analysis. Each MINEMO checklist item is linked to a term defined in the NEMO ontology.
  • 13. NEMO Project: subset of “mandatory” MINEMO terms. (1) Research lab (general features); (2) Experiment (general features); (3) Publication; (4) Study subjects (group characteristics); (5) Experiment condition; (6) Stimulus representation; (7) Behavioral data collection; (8) EEG data collection; (9) EEG/ERP data preprocessing; (10) EEG/ERP data file. The entire set of terms is defined in the article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3235514/ They are also in the MIBBI portal:
  • 14. NEMO Project: more about NEMO. Data in the NEMO Portal are aligned with the MINEMO checklist and ontology: https://portal.nemo.nic.uoregon.edu. NIF (the Neuroscience Information Framework project, http://www.neuinfo.org/) uses the NEMO ontology. NIF aggregates online sources of neuroscience data, including databases, web sites, and publications, and provides a search interface across these disparate sources. The NEMO ontology can be browsed at: http://bioportal.bioontology.org/ontologies/40522
  • 15. NEMO Project: a “detail” to worry about. The MINI module for Electrophysiology and MINEMO do not cover the description of image data. To see later: MIfMRI (Minimum Information about an fMRI Study), http://www.fmrimethods.org/
  • 16. Scientific Workflows. A data analysis (or data processing) procedure can generally be described as a workflow, i.e., a set of computational tasks that “transform” data. In Bioinformatics, a workflow is frequently called a pipeline. In a workflow, the output data of a task is generally used as input data for one or more other tasks, so the flow of data defines an execution order for the workflow's tasks. Frequently, the same task appears in more than one workflow.
  • 17. Scientific Workflow Management System (SWMS). A computational tool that controls the execution of workflows. It provides mechanisms for scientists to describe their workflows using “intuitive” modeling languages. It can optimize the execution by taking into account the characteristics of the available computational resources. It helps to generate provenance data for an analysis process and, in addition, improves the reproducibility of analyses.
  • 18. Most successful SWMSs. Taverna: http://www.taverna.org.uk; VisTrails: http://www.vistrails.org; Kepler: https://kepler-project.org; Galaxy: http://galaxyproject.org
  • 19. Online workflow repositories: collaborative science. myExperiment project (http://www.myexperiment.org/): users upload their workflow models; models are categorized according to their research domain; users can search for and download models uploaded by other users; the site stores models from different SWMSs (Taverna, Kepler, etc.).
  • 20. Taverna features: graphical user interface for the description of workflows; easy installation and use; recording of the “execution history” and intermediate results (= provenance data of the entire analysis); provenance export capability to OPM.
  • 21. Questionnaires: automatic generation of online questionnaires. There are computational tools that automatically generate electronic questionnaires. One of the most widely used is LimeSurvey (https://www.limesurvey.org/). LimeSurvey functionalities: generates online questionnaires; offers a large set of question types; keeps questionnaire data in a real database; manages users; creates a print version of questionnaires; performs basic statistical analysis; ...
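
The workflow slides (16 and 17) say that the flow of data between tasks defines an execution order, and that an SWMS can record provenance while it runs. The Python sketch below illustrates both ideas under stated assumptions: the three tasks (`load`, `sort_values`, `summarize`), the `WORKFLOW` table and the `used`/`generated` record fields are hypothetical names chosen for illustration. They are not taken from Taverna, OPM, or any other tool mentioned in the slides. The ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each receives a dict holding the outputs of its upstream tasks.
def load(_inputs):
    return [3.0, 1.0, 2.0]

def sort_values(inputs):
    return sorted(inputs["load"])

def summarize(inputs):
    values = inputs["sort_values"]
    return {"min": values[0], "max": values[-1]}

# Hypothetical workflow description: task name -> (function, upstream task names).
WORKFLOW = {
    "load": (load, []),
    "sort_values": (sort_values, ["load"]),
    "summarize": (summarize, ["sort_values"]),
}

def run_workflow(workflow):
    """Run tasks in an order compatible with the data flow and log provenance."""
    # Map each task to the set of tasks it depends on.
    graph = {name: set(deps) for name, (_, deps) in workflow.items()}
    outputs = {}
    provenance = []
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        func, deps = workflow[name]
        outputs[name] = func({dep: outputs[dep] for dep in deps})
        # Illustrative provenance record: which task ran, what it used, what it produced.
        provenance.append(
            {"task": name, "used": list(deps), "generated": repr(outputs[name])}
        )
    return outputs, provenance

outputs, provenance = run_workflow(WORKFLOW)
for record in provenance:
    print(record)
```

Deriving the order from declared dependencies, rather than hard-coding it, is what allows the same task to be reused in several workflows. The provenance list is only an in-memory log. A real SWMS would persist such records and could export them to a standard model.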