by Barend Mons 
Brought to you for 
in V parts 
A Plea 
For Professional Data Publishing 
Bringing Data to Broadway
FAIR play 
For Research Data and other Research Objects 
Findable 
Accessible 
Interoperable 
Reusable 
The Cast
Part I 
Moaning and Lamenting
Singers and Dancers
A - The Curse of Multidisciplinarity
I cannot keep my data experts!!!
2005: Text Mining? 
Why bury it first and then mine it again!
Part II 
The Explicitome 
and the Elusive Part 
(our own fault) 
The Explicitome: everything we already asserted
The Elusive Explicitome Phenomenon 
example from: Yepes & Verspoor, 2013 
narrative 
Tables/figures 
abstract 
# of assertions 
Supplementary data 
5 500* 1000 50K-1M+ 
# of SNP-Phen: 2% 4% 50%* 
The Elusive Explicitome: what escapes us (95%) 
Hurdle 1: 
Paywalls 
Hurdle 2: 
‘TIF’walls 
Hurdle 3: 
The Wall of Broken Links
Data loss is real and significant, while data growth is staggering 
Nature news, 19 December 2013 
• Computer speed and storage capacity is doubling every 18 months and this rate is steady 
• DNA sequence data is doubling every 6-8 months over the last 3 years and looks to continue for this decade 
‘Oops, that link was the laptop of my PhD student’
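As a rough worked example of what the doubling times quoted above imply, the sketch below compares the two growth rates over three years. The function name `growth_factor` and the three-year horizon are illustrative assumptions, not taken from the slides.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth after `years`, given a doubling period in months."""
    return 2 ** (years * 12 / doubling_months)


# Storage capacity: doubling every 18 months (figure from the slide).
storage_growth = growth_factor(3, 18)

# DNA sequence data: doubling every 6 to 8 months (range from the slide).
sequence_growth_slow = growth_factor(3, 8)
sequence_growth_fast = growth_factor(3, 6)

print(f"storage:  x{storage_growth:.0f}")
print(f"sequence: x{sequence_growth_slow:.1f} to x{sequence_growth_fast:.0f}")
```

Under these assumptions, storage grows about 4-fold in three years while sequence data grows somewhere between roughly 23-fold and 64-fold, which is the gap the slide points at.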
The trends in e-Science 
Computer Analytics 
(takes charge) 
Enormity of datasets 
(beyond narrative) 
Collaborative Intelligence 
(calls for million minds) 
Irreversible movement 
(towards OA) 
FAIR 
? 
Data 
Publishing & 
Stewardship
Professionalise Data Stewardship 
F 
A 
I 
R 
Educate, Reward and Keep Data Experts
Part III 
INTERMEZZO 
Unavoidable: some science of ‘our own’ 
Some Research… but sorry for the LS examples
Simplified eScience 
RO’s 
The Explicitome 
+ 
WorkFlows 
Ridiculogram 
New 
dataset 
User 
New 
Insights
Thanks to 
Peter Wittenburg
FAIR for computers FAIR for people 
AERIAL SURVEY 
pattern recognition in 
Ridiculograms 
HUMAN EXCAVATION 
rationalisation and 
‘confirmational reading’ 
X 
‘Why would I believe this association’???
For KD we need each association only once 
Cardinal Assertion 
(<10^11) 
n identical 
assertions 
‘n’ different 
provenances
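The idea of keeping each association once while retaining every provenance can be sketched as follows. This is a minimal illustration under stated assumptions: the `Assertion` type, the `to_cardinal` function and the example identifiers are all hypothetical and do not come from the BioSemantics pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Assertion(NamedTuple):
    """One published statement plus the source it came from (hypothetical shape)."""
    subject: str
    predicate: str
    obj: str
    provenance: str


Triple = Tuple[str, str, str]


def to_cardinal(assertions: Iterable[Assertion]) -> Dict[Triple, Set[str]]:
    """Collapse identical assertions into one entry per (subject, predicate, object),
    keeping the set of provenances that asserted it."""
    grouped: Dict[Triple, Set[str]] = defaultdict(set)
    for a in assertions:
        grouped[(a.subject, a.predicate, a.obj)].add(a.provenance)
    return dict(grouped)


# Made-up example data: three sources assert the same association.
example = [
    Assertion("gene:A", "associated_with", "disease:X", "paper-1"),
    Assertion("gene:A", "associated_with", "disease:X", "paper-2"),
    Assertion("gene:A", "associated_with", "disease:X", "database-3"),
    Assertion("gene:B", "associated_with", "disease:X", "paper-2"),
]
cardinal = to_cardinal(example)
for triple, sources in cardinal.items():
    print(triple, "asserted by", len(sources), "source(s)")
```

Here four assertions reduce to two cardinal assertions, and the number of distinct provenances per entry remains available as evidence for each association.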
We publish about fewer than a million LS concepts! 
10^6 concept clusters (Knowlets)
www.biosemantics.org LUMC - LIACS 
BioSemantics Knowledge Discovery Pipeline 
[Diagram labels: data sources; ‘coordinated’ data; nanopub cache; cardinal assertion store; semantic data; indexing; modelling; reasoning algorithms; trends; phase transitions; ‘new’ data alerts; differentials; funding priorities; semantic query (gene, disease)]
© Phortos Consultants 
44,000 hypotheses (PPI) 
What about the other 43,999?
Part IV 
Towards Solutions 
Bigger is not Better 
Zipping the Explicitome 
Electronic 
Health 
Databases 
The Rescued Explicitome 
Value 
Added 
Databases 
narrative 
Tables/figures 
Supplementary data 
abstract 
PROVENANCE 
Total Explicitome 
an estimated 10^14 asserted associations in 2,500 data sources 
ETL to 
FAIR 
FAIR 
to 
read
Assertions 
Concepts 
10^14 
10^11 
10^6 
Semantic MedLine 
U+C+CT+EG+GO = 36 M 
80% 
20% 
Cardinal 
Zipping the Explicitome
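As a back-of-the-envelope check on what "zipping" buys, the sketch below reads the slide's three figures as powers of ten (about 10^14 assertions, at most 10^11 cardinal assertions, about 10^6 concepts). That reading and the variable names are assumptions made for illustration only.

```python
# Orders of magnitude read from the slide (assumed to be powers of ten).
TOTAL_ASSERTIONS = 10**14      # all asserted associations
CARDINAL_ASSERTIONS = 10**11   # upper bound on unique associations
CONCEPTS = 10**6               # concept clusters (Knowlets)

# Average number of times each unique association is repeated.
redundancy = TOTAL_ASSERTIONS // CARDINAL_ASSERTIONS

# Average number of unique associations per concept.
per_concept = CARDINAL_ASSERTIONS // CONCEPTS

print(f"redundancy: about {redundancy}x")
print(f"cardinal assertions per concept: about {per_concept}")
```

On these figures, each unique association is stated about a thousand times on average, so storing it once with its provenances shrinks the assertion set by roughly three orders of magnitude.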
Part V 
(FAIR) data should take 
CENTER STAGE 
DOI 
PID 
ARK 
Handles 
UUID 
TURI’s 
?
PID 
Metadata (intrinsic) 
'provenance' (user defined) 
Data (elements) 
A simplified diagram of a Digital (data) Object irrespective of technological choices and naming
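The four layers of the diagram can be written down as a small data structure. This is a sketch only: the class name `DigitalObject`, the field names, the helper `new_digital_object` and the use of a UUID-based URN as the PID are illustrative assumptions, in keeping with the slide's point that the structure is independent of technological choices.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DigitalObject:
    """Hypothetical container mirroring the four layers in the diagram."""
    pid: str                                                    # persistent identifier
    metadata: Dict[str, Any] = field(default_factory=dict)      # intrinsic metadata
    provenance: Dict[str, Any] = field(default_factory=dict)    # user-defined provenance
    data: List[Any] = field(default_factory=list)               # data elements


def new_digital_object(
    data: List[Any],
    metadata: Optional[Dict[str, Any]] = None,
    provenance: Optional[Dict[str, Any]] = None,
) -> DigitalObject:
    """Wrap data elements in a DigitalObject, minting a UUID-based PID (an assumption;
    a DOI, ARK or Handle could serve equally well)."""
    return DigitalObject(
        pid=f"urn:uuid:{uuid.uuid4()}",
        metadata=metadata or {},
        provenance=provenance or {},
        data=list(data),
    )


obj = new_digital_object(
    data=["gene:A associated_with disease:X"],
    metadata={"format": "text/plain"},
    provenance={"asserted_by": "example-lab"},
)
print(obj.pid, obj.metadata, obj.provenance, len(obj.data))
```

Keeping the PID, metadata and provenance separate from the data elements means the outer layers can be exposed and indexed even when the data itself is under restricted access.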
Digital Object Architecture 
PID 
Metadata (intrinsic) 
'provenance' (user defined) 
Data (elements) 
s are Digital Objects 
Some Research Objects Nanopublications are Research Objects 
are
Data as increasingly FAIR Digital Objects 
[Figure: the Digital Object diagram (PID, Metadata (intrinsic), 'provenance' (user defined), Data (elements)) repeated for each stage below] 
Totally UNFAIR 
Usable for Humans 
Findable 
FAIR metadata 
FAIR data - restricted access 
FAIR data - Open Access 
FAIR data - Open Access/Functionally Linked
The Data Stewardship Cycle 
5%
FAIRport proof of concept 
[Diagram labels: Data Owners; (supp) data; Databases; Repositories; ELIXIR FAIR Data Search Index; ELIXIR Data FAIRport; ELIXIR federated data; ELIXIR semantic data repository; FAIR L1, L2, L3, L4; End-users: search for datasets, download data (sub)sets in many formats (xml, rdf, json etc.); ASPs, in-house IT, bioinformatics etc.; Tools & Applications; ELIXIR nodes: Fin., Esp., Nor., NL, UK, SWE] 
www.nanopubmed.org
Parties needed Typical Candidates NL-example 
Trusted Party 
Usually Public Sector 
With 'data stewardship' mandate 
1 
Executive Party/ 
Coordinator 
Usually Public or Private Sector 
With Expert Knowledge on Project 
and relation management 
2 
Technology 
Providers 
3 4 PID/ARTA stewards 
DTL/ELIXIR-nl 
others 
5 DOA architecture/IMS CNRI + EURETOS 
6 Publishing pipeline EURETOS 
7 Repository Software 
8 eInfrastructure
Malpractices……. 
Journal Impact Factor 
Ignore Altmetrics 
No data stewardship plan 
Obstruct Tenure 
Data Experts 
‘supplementary data’ 
Knowledge Sharing Impaired
NITRD 
FORCE11 ORCID VIVO 
4/10/14 
EUDAT 
DATAVERSE 
BD2K 
DANS 
ELIXIR 
NIH Commons 
H2020 
DRYAD RDA 
FigShare 
Nanopub 
Biosharing 
Elsevier 
Science Nature 
SageBio 
HVP 
DataCite 
EGA 
Research Objects 
Nebulus 
Embassy 
SADI 
EURETOS 
YARCdata 
IMI 
interoperability 
ISA 
Open PHACTS 
Data Fabric
Good practices (apart from collaborating) 
‘professional data publishing’ 
RO Impact Factor 
Award Altmetrics 
5% for 
data stewardship plan 
Train & Tenure 
Data Experts 
FAIR play
THE END 
Thank you!
Endorsed by 82 organisations and [y] individuals 
1. FAIR guiding principles with public discussion forum: 
https://www.force11.org/group/fairgroup/fairprinciples 
2. Notes and Annexes: https://www.force11.org/node/6062/ 
3. Group home page https://www.force11.org/group/fairgroup 
COMMENT: (till October 1st) 
ENDORSE: (after October 1st)

Prof. Barend Mons, Biosemantics Group at Leiden University Medical Center and Head of Node of ELIXIR-NL - Keynote "Bringing Data to Broadway"

08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Prof. Barend Mons, Biosemantics Group at Leiden University Medical Center and Head of Node of ELIXIR-NL - Keynote "Bringing Data to Broadway"

  • 1. Bringing Data to Broadway. A Plea For Professional Datapublishing, in V parts. Brought to you by Barend Mons
  • 2. FAIR play For Research Data and other Research Objects. The Cast: Findable, Accessible, Interoperable, Reusable
  • 3. Part I Moaning and Lamenting
  • 5. A: The Curse of Multidisciplinarity
  • 6. I cannot keep my data experts!!!
  • 7. 2005: Text Mining? Why bury it first and then mine it again!
  • 10. Part II: The Explicitome and the Elusive Part (our own fault). The Explicitome: everything we already asserted
  • 11. The Elusive Explicitome Phenomenon (example from: Yepes & Verspoor, 2013). Sources: narrative, tables/figures, abstract, supplementary data. # of assertions: 5, 500*, 1000, 50K-1M+. # of SNP-Phen: 2%, 4%, 50%*. The Elusive Explicitome: what escapes us (95%). Hurdle 1: Paywalls. Hurdle 2: ‘TIF’walls. Hurdle 3: The Wall of Broken Links
  • 12. Data loss is real and significant, while data growth is staggering (Nature news, 19 December 2013). Computer speed and storage capacity are doubling every 18 months, and this rate is steady. DNA sequence data has been doubling every 6-8 months over the last 3 years, and this looks set to continue for this decade. ‘Oops, that link was the laptop of my PhD student’
  • 13. The trends in e-Science: Computer Analytics (takes charge); Enormity of datasets (beyond narrative); Collaborative Intelligence (calls for million minds); Irreversible movement (towards OA). FAIR? Data Publishing & Stewardship
  • 18. Professionalise Data Stewardship (F A I R): Educate, Reward and Keep Data Experts
  • 19. Part III: INTERMEZZO. Some Research… Unavoidable: some science of ‘our own’, but as examples (sorry for the LS examples)
  • 20. Simplified eScience (diagram labels): RO’s; The Explicitome + WorkFlows; Ridiculogram; New dataset; User; New Insights
  • 21. Thanks to Peter Wittenburg
  • 22. FAIR for computers: AERIAL SURVEY (pattern recognition in Ridiculograms). FAIR for people: HUMAN EXCAVATION (rationalisation and ‘confirmational reading’). ‘Why would I believe this association???’
  • 23. For KD we need each association only once. Cardinal Assertion (<10^11): n identical assertions, ‘n’ different provenances
  • 24. We publish about less than a million LS Concepts! 10^6 concept clusters (Knowlets)
  • 25. LUMC - LIACS BioSemantics Knowledge Discovery Pipeline (www.biosemantics.org). Diagram labels: data sources; ‘coordinated’ data; nanopub cache; cardinal assertion store; semantic data indexing; modelling; reasoning algorithms; trends; phase transitions; ‘new’ data alerts; differentials; funding priorities; semantic query (gene, disease)
  • 26. 44,000 hypotheses (PPI). What about the other 43,999? (© Phortos Consultants)
  • 27. Part IV: Towards Solutions. Bigger is not Better: Zipping the Explicitome
  • 28. The Rescued Explicitome. Sources: Electronic Health Databases; Value Added Databases; narrative; tables/figures; supplementary data; abstract. PROVENANCE. Total Explicitome: an estimated 10^14 asserted associations in 2,500 data sources. ETL to FAIR; FAIR to read
  • 29. Zipping the Explicitome: Assertions (10^14), Cardinal (10^11), Concepts (10^6). Semantic MedLine; U+C+CT+EG+GO = 36 M; 80% / 20%
  • 30. Part V: (FAIR) data should take CENTER STAGE
  • 31. DOI, PID, ARK, Handles, UUID, TURIs?
  • 32. A simplified diagram of a Digital (data) Object, irrespective of technological choices and naming: PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements)
  • 33. Digital Object Architecture: PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements). Research Objects are Digital Objects. Some Research Objects are Nanopublications. Nanopublications are Research Objects
  • 34. Data as increasingly FAIR Digital Objects. Each stage is drawn as a Digital Object (PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements)). Stages: Totally UNFAIR; Usable for Humans; Findable; FAIR metadata; FAIR data, restricted access; FAIR data, Open Access; FAIR data, Open Access/Functionally Linked
  • 35. The Data Stewardship Cycle: 5%
  • 37. FAIRport proof of concept (diagram labels): Data Owners; (supp) data; Databases; Repositories; ELIXIR FAIR Data Search Index; ELIXIR Data FAIR Port; ELIXIR federated data; ELIXIR semantic data repository; FAIR L1, FAIR L2, FAIR L3, FAIR L4; End-users: Search for datasets, Download data (sub)sets in many formats (xml, rdf, json etc.); ASPs, Inhouse IT, Bioinformatics etc.; Tools & Applications; Elixir Fin., Elixir Esp., Elixir Nor., Elixir UK, Elixir SWE, Elixir NL..; www.nanopubmed.org
  • 38. Parties needed / Typical Candidates / NL-example. 1: Trusted Party (usually Public Sector, with 'data stewardship' mandate). 2: Executive Party/Coordinator (usually Public or Private Sector, with Expert Knowledge on Project and relation management). 3: Technology Providers. 4: PID/ARTA stewards (DTL/ELIXIR-nl, others). 5: DOA architecture/IMS (CNRI + EURETOS). 6: Publishing pipeline (EURETOS). 7: Repository Software. 8: eInfrastructure
  • 39. Malpractices: Journal Impact Factor; Ignore Altmetrics; No data stewardship plan; Obstruct Tenure Data Experts; ‘supplementary data’. Knowledge Sharing Impaired
  • 40. (4/10/14) NITRD, FORCE11, ORCID, VIVO, EUDAT, DATAVERSE, BD2K, DANS, ELIXIR, NIH Commons, H2020, DRYAD, RDA, FigShare, Nanopub, Biosharing, Elsevier, Science, Nature, SageBio, HVP, DataCite, EGA, Research Objects, Nebulus, Embassy, SADI, EURETOS, YARCdata, IMI interoperability, ISA, Open PHACTS, Data Fabric
  • 41. Good practices (apart from collaborating): ‘professional data publishing’; RO Impact Factor; Award Altmetrics; 5% for data stewardship plan; Train & Tenure Data Experts. FAIR play
  • 43. Endorsed by 82 organisations and [y] individuals. 1. FAIR guiding principles with public discussion forum: https://www.force11.org/group/fairgroup/fairprinciples 2. Notes and Annexes: https://www.force11.org/node/6062/ 3. Group home page: https://www.force11.org/group/fairgroup COMMENT: (till October 1st). ENDORSE: (after October 1st)
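
Slides 23 and 29 describe "zipping the Explicitome": n identical assertions with n different provenances collapse into one cardinal assertion. A minimal sketch of that idea follows, assuming an assertion can be modelled as a subject/predicate/object triple plus a provenance string. The names `Assertion` and `zip_assertions` and the sample identifiers are hypothetical illustrations, not taken from the slides or from any nanopublication tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One asserted association plus where it came from (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    provenance: str


def zip_assertions(assertions):
    """Collapse identical triples into one cardinal assertion each,
    keeping every distinct provenance attached to it."""
    cardinal = defaultdict(set)
    for a in assertions:
        cardinal[(a.subject, a.predicate, a.obj)].add(a.provenance)
    return dict(cardinal)


# Illustrative data only: two sources assert the same association.
raw = [
    Assertion("gene:A", "associated_with", "disease:X", "paper-1"),
    Assertion("gene:A", "associated_with", "disease:X", "database-2"),
    Assertion("gene:B", "associated_with", "disease:Y", "paper-3"),
]
cardinal = zip_assertions(raw)
print(len(raw), "assertions ->", len(cardinal), "cardinal assertions")
```

Keeping the provenances as a set per triple reflects the slide's point that each association is needed only once for knowledge discovery, while the evidence behind it is not thrown away.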