SlideShare uma empresa Scribd logo
1 de 17
Anita de Waard
VP Research Data Collaborations
Elsevier, Jericho, VT
Some Thoughts on Collectively Creating
Networks of Ideas, Data and Software
How do we unify the needs of the
collective and the individual?
“Let us endeavor to build systems that allow
a kid in Mali who wants to learn about proteomics
to not be overwhelmed by the irrelevant and the untrue.”
- John Perry Barlow, iAnnotate 2014
Collectively create nimble and robust systems of
knowledge management that interconnect ideas, data
and software.
Automated caption/body
text splitting & linking
Precision Recall F-score
56.3 76.0 64.7
Statement type
Connecting Ideas: Big Mechanism
Content
Universal
schema
Surface form
relations
Structured
relations
Factorization
model
Matrix
Construction
Open
Information
Extraction
Entity
Resolution
Matrix
Factorization
Knowledge
graph
Curation
Predicted
relations
Matrix
Completion
Taxonomy
Triple
Extraction
Connecting Ideas:
Towards an Elsevier Knowledge Graph
14M articles from
Science Direct
3.3M triples
475M triples
49M triples p x r matrix p x k, k x r latent factor
matrices
~102 triples
920K concepts
from EMMeT
• Ongoing proof-of-concept work by Paul Groth, Sujit Pal and Ron Daniel of
Elsevier Labs
• Unsupervised, scalable and built with off-the-shelf technologies
• Based on recent work at University College London and University of
Massachusetts Amherst
Riedel, Sebastian, Limin Yao, Andrew McCallum, and Benjamin M. Marlin. "Relation extraction with
matrix factorization and universal schemas." (2013).
Connecting Research Data:
https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
Linking Papers to Data, Phase 1
• Supplementary data at PANGAEA
• Bidirectional links between PANGAEA &
ScienceDirect
• Data visualized next to the article
http://www.elsevier.com/databaselinking
Linking Papers to Data, Phase 2
• ICSU/WDS/RDA Publishing Data
Service Working group
• Currently creating linked-data
model for exposing DOI to DOI
links outside publisher’s firewall
• Merged with National Data
Service pilot with the same goal
• Collaboration between CrossRef,
DataCite, Europe PubMed
Central, ANDS, Thompson
Reuters, Elsevier
• About to deliver:
http://dliservice.research-
infrastructures.eu/#/api
Objective: move from
a plethora of (mostly)
bilateral arrangements
between the different
players…
.. a one-for-all cross-
referencing service for
articles and data
.. to ..
Researchers
Funding
AgencyInstitution
Data
Repository
Dataset
JournalPaper
Current Systems for Linking Data
1. Researcher creates datasets
2. Researcher writes paper & publishes in journal
3. (Sometimes,) dataset gets posted to repository
4. Researcher reports (post-hoc) to Institution and Funder
2
2
1
3
4
4
Researchers
Funding
AgencyInstitution
Data
Repository
Dataset
JournalPaper
Issues with the Current Situation:
2
2
1
3
4
4
iii. No link between data
and paper
iv. Funders/Institutions informed as an afterthought
i. Too much work for researchers
ii. Data posting not mandatory
Researchers
Funding
Agency
Institution
Data
Repository
Dataset
Journal
Paper
A Proposal To Address These Issues:
1. Researcher creates datasets and posts to repository
(under embargo)
2. Funder is automatically notified of dataset publication
3. Researcher writes paper & publishes in journal; embargo is lifted and data linked
- NB this also allows release of non-used data for negative result and reproducibility
4. Funder and institution get report on publication and embargo lifting
2
1
1
3
3
3
4
4
i. Less
Work!
iv. Better
Tracking!
iii. Better
Linking!
ii. More
Data
Stored!
One piece of the puzzle: Mendeley Data:
https://data.mendeley.com/datasets/xz6gv65m6d/6
Linked to published
papers – or not
Linked to Github – or not
Versioning and
provenance
Another Piece of the Puzzle: DataSearch:
http://datasearchdemo.elsevier.com/indexed#/search/mercury
Federated
Poor API
Rich API
FTP & Index
Federated
Poor API
Rich API
FTP & Index
Federated
Poor API
Rich API
FTP & Index
Data
Enrichment
Manual
Automated
(User) Intent
Ranking
Filtering (how
to mix
federated &
indexed rich &
poor)
Search
Rendering
Search all data
Faceted query/Results
refinement
Store & Use results
Gener
al
UI
Doma
in
UI
Filterin
g
Feeding
user signals
back into
Search
ranking
Evaluation
How Do We Evaluate Discoverability?
Birds of a Feather on Data Search: https://rd-alliance.org/bof-data-search.html
How do we pay for all this?
RDA Cost Recovery WG
• Cochair with Ingrid Dillo (DANS), Simon Hodson (CODATA)
• Goal: write a report regarding new potential funding models for
data repositories, allow them to start sharing this knowledge
• Interviewed 24 repositories on their funding (current and future)
• Now summarising stories and trends – will present at RDAP7
Terms of funding for main income stream (in %)
Software As A First-Class Knowledge Object:
Working with Networks of Partners
Force11:
– Multi-stakeholder, member-driven organisation
– Unites scholars, tool developers, librarians, publishers, funding agencies etc. etc.
– E.g. Software citation group, akin to Data Citation Group
– Will present at Force16 in Portland, OR April 17-19, 2016
National Data Service:
– Multi-stakeholder group, based around supercomputing centres
– Aims to be a ‘connective tissue’ between data creation, curation, storage etc projects.
– Inviting Pilots: two or more partners who have not worked together, interested in collaborating
on a data-centric project to solve a real-world needs: can include software sharing
– E.g. Datasearch, Data Linking systems
RDA:
– CoLead Data publishing, linking group
– Colead Cost Recovery group
– Active in Chemistry, Earth Science groups
– Starting BoF Data Search
The National
DATA SERVICE
Anita de Waard
VP Research Data Collaborations, Elsevier
a.dewaard@elsevier.com
@anitadewaard
In summary:
Let’s collectively enable ‘an account of the present
undertakings, studies and labours of the ingenious in many
considerable parts of the world’,
by connecting ideas, data, and software
through interconnected partnerships!

Mais conteúdo relacionado

Mais procurados

A Blueprint for the Research Data Landscape
A Blueprint for the Research Data LandscapeA Blueprint for the Research Data Landscape
A Blueprint for the Research Data LandscapeSayeed Choudhury
 
NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016Susanna-Assunta Sansone
 
Who will use the open data? Mark Humphries keynote
Who will use the open data? Mark Humphries keynoteWho will use the open data? Mark Humphries keynote
Who will use the open data? Mark Humphries keynoteJisc RDM
 
Authority files - Jisc Digital Festival 2014
Authority files - Jisc Digital Festival 2014Authority files - Jisc Digital Festival 2014
Authority files - Jisc Digital Festival 2014Jisc
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingMerce Crosas
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data networkJisc RDM
 
Why does research data matter to libraries
Why does research data matter to librariesWhy does research data matter to libraries
Why does research data matter to librariesJisc RDM
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodeiASIS&T
 
THOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing ElsevierTHOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing ElsevierMaaike Duine
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...Robin Rice
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseAnita de Waard
 
NIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery IndexNIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery IndexSusanna-Assunta Sansone
 
A Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesA Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesLIBER Europe
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...EDINA, University of Edinburgh
 
Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...Varsha Khodiyar
 
EDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEnvironmental Data Initiative
 

Mais procurados (20)

A Blueprint for the Research Data Landscape
A Blueprint for the Research Data LandscapeA Blueprint for the Research Data Landscape
A Blueprint for the Research Data Landscape
 
NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016NIH BD2K DataMed metadata model - Force11, 2016
NIH BD2K DataMed metadata model - Force11, 2016
 
Who will use the open data? Mark Humphries keynote
Who will use the open data? Mark Humphries keynoteWho will use the open data? Mark Humphries keynote
Who will use the open data? Mark Humphries keynote
 
Valen Metadata and the [Data] Repository
Valen Metadata and the [Data] RepositoryValen Metadata and the [Data] Repository
Valen Metadata and the [Data] Repository
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Authority files - Jisc Digital Festival 2014
Authority files - Jisc Digital Festival 2014Authority files - Jisc Digital Festival 2014
Authority files - Jisc Digital Festival 2014
 
FAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data SharingFAIR Data Management and FAIR Data Sharing
FAIR Data Management and FAIR Data Sharing
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data network
 
Why does research data matter to libraries
Why does research data matter to librariesWhy does research data matter to libraries
Why does research data matter to libraries
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodei
 
THOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing ElsevierTHOR Workshop - Data Publishing Elsevier
THOR Workshop - Data Publishing Elsevier
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
NIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery IndexNIH BD2K bioCADDIE DataMed: Data Discovery Index
NIH BD2K bioCADDIE DataMed: Data Discovery Index
 
A Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data RepositoriesA Data Citation Roadmap for Scholarly Data Repositories
A Data Citation Roadmap for Scholarly Data Repositories
 
Meadows apr28-1
Meadows apr28-1Meadows apr28-1
Meadows apr28-1
 
NISO Training Thursday Crafting a Scientific Data Management Plan
NISO Training Thursday Crafting a Scientific Data Management PlanNISO Training Thursday Crafting a Scientific Data Management Plan
NISO Training Thursday Crafting a Scientific Data Management Plan
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...
 
Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...Facilitating good research data management practice as part of scholarly publ...
Facilitating good research data management practice as part of scholarly publ...
 
EDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable Units
 

Destaque

Taking your ball and going home
Taking your ball and going homeTaking your ball and going home
Taking your ball and going homePhil Cryer
 
Data hosting infrastructure for primary biodiversity data
Data hosting infrastructure for primary biodiversity dataData hosting infrastructure for primary biodiversity data
Data hosting infrastructure for primary biodiversity dataPhil Cryer
 
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...Phil Cryer
 
Clustered and distributed
 storage with
 commodity hardware 
and open source ...
Clustered and distributed
 storage with
 commodity hardware 
and open source ...Clustered and distributed
 storage with
 commodity hardware 
and open source ...
Clustered and distributed
 storage with
 commodity hardware 
and open source ...Phil Cryer
 
Getting started with Mantl
Getting started with MantlGetting started with Mantl
Getting started with MantlPhil Cryer
 
Pets versus Cattle: servers evolved
Pets versus Cattle: servers evolvedPets versus Cattle: servers evolved
Pets versus Cattle: servers evolvedPhil Cryer
 
What if Petraeus Was a Hacker?
What if Petraeus Was a Hacker?What if Petraeus Was a Hacker?
What if Petraeus Was a Hacker?Phil Cryer
 
Moving towards unified logging
Moving towards unified loggingMoving towards unified logging
Moving towards unified loggingPhil Cryer
 

Destaque (8)

Taking your ball and going home
Taking your ball and going homeTaking your ball and going home
Taking your ball and going home
 
Data hosting infrastructure for primary biodiversity data
Data hosting infrastructure for primary biodiversity dataData hosting infrastructure for primary biodiversity data
Data hosting infrastructure for primary biodiversity data
 
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...
GBIF (Global Biodiversity Information Facility) Position Paper: Data Hosting ...
 
Clustered and distributed
 storage with
 commodity hardware 
and open source ...
Clustered and distributed
 storage with
 commodity hardware 
and open source ...Clustered and distributed
 storage with
 commodity hardware 
and open source ...
Clustered and distributed
 storage with
 commodity hardware 
and open source ...
 
Getting started with Mantl
Getting started with MantlGetting started with Mantl
Getting started with Mantl
 
Pets versus Cattle: servers evolved
Pets versus Cattle: servers evolvedPets versus Cattle: servers evolved
Pets versus Cattle: servers evolved
 
What if Petraeus Was a Hacker?
What if Petraeus Was a Hacker?What if Petraeus Was a Hacker?
What if Petraeus Was a Hacker?
 
Moving towards unified logging
Moving towards unified loggingMoving towards unified logging
Moving towards unified logging
 

Semelhante a Collaboratively creating a network of ideas, data and software

Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?Anita de Waard
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?Elena Simperl
 
Building a Public Research Center for the HathiTrust Digital Library
Building a Public Research Center for the HathiTrust Digital LibraryBuilding a Public Research Center for the HathiTrust Digital Library
Building a Public Research Center for the HathiTrust Digital LibraryRobert H. McDonald
 
The Economics of Data Sharing
The Economics of Data SharingThe Economics of Data Sharing
The Economics of Data SharingAnita de Waard
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so farElena Simperl
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - IntroductionMaaike Duine
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Lucy McKenna
 
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsReal-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsAnita de Waard
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) CommonsJames Hendler
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystemMaryann Martone
 
If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?Philip Bourne
 
Data management plans
Data management plansData management plans
Data management plansBrad Houston
 
Data Collaboration Stack
Data Collaboration StackData Collaboration Stack
Data Collaboration StackPierre Brunelle
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseRinke Hoekstra
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017Vivien Bonazzi
 

Semelhante a Collaboratively creating a network of ideas, data and software (20)

The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
 
Big Data & DS Analytics for PAARL
Big Data & DS Analytics for PAARLBig Data & DS Analytics for PAARL
Big Data & DS Analytics for PAARL
 
Data, Data Everywhere: What's A Publisher to Do?
Data, Data Everywhere: What's  A Publisher to Do?Data, Data Everywhere: What's  A Publisher to Do?
Data, Data Everywhere: What's A Publisher to Do?
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?
 
Big Data for Library Services (2017)
Big Data for Library Services (2017)Big Data for Library Services (2017)
Big Data for Library Services (2017)
 
Building a Public Research Center for the HathiTrust Digital Library
Building a Public Research Center for the HathiTrust Digital LibraryBuilding a Public Research Center for the HathiTrust Digital Library
Building a Public Research Center for the HathiTrust Digital Library
 
The Economics of Data Sharing
The Economics of Data SharingThe Economics of Data Sharing
The Economics of Data Sharing
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - Introduction
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
 
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data EcosystemsReal-World Data Challenges: Moving Towards Richer Data Ecosystems
Real-World Data Challenges: Moving Towards Richer Data Ecosystems
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) Commons
 
FORCE11: Creating a data and tools ecosystem
FORCE11:  Creating a data and tools ecosystemFORCE11:  Creating a data and tools ecosystem
FORCE11: Creating a data and tools ecosystem
 
If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?If Data Are The New Oil, How Do We Prevent Global Warming?
If Data Are The New Oil, How Do We Prevent Global Warming?
 
Data management plans
Data management plansData management plans
Data management plans
 
Data Collaboration Stack
Data Collaboration StackData Collaboration Stack
Data Collaboration Stack
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS case
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 

Mais de Anita de Waard

Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Anita de Waard
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataAnita de Waard
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsAnita de Waard
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesAnita de Waard
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Anita de Waard
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data ManagementAnita de Waard
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of PublishingAnita de Waard
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryAnita de Waard
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingAnita de Waard
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumAnita de Waard
 
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataElsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataAnita de Waard
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016Anita de Waard
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...Anita de Waard
 
Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papersAnita de Waard
 
Ten Habits of Highly Effective Data
Ten Habits of Highly Effective DataTen Habits of Highly Effective Data
Ten Habits of Highly Effective DataAnita de Waard
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful DataAnita de Waard
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with dataAnita de Waard
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective dataAnita de Waard
 
The habits of highly successful data:
The habits of highly successful data: The habits of highly successful data:
The habits of highly successful data: Anita de Waard
 

Mais de Anita de Waard (20)

Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR Data
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data Commons
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring Guidelines
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
 
History of the future
History of the futureHistory of the future
History of the future
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of Publishing
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost Recovery
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly Publishing
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective DataElsevier‘s RDM Program: Ten Habits of Highly Effective Data
Elsevier‘s RDM Program: Ten Habits of Highly Effective Data
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
 
Argumentation in biology papers
Argumentation in biology papersArgumentation in biology papers
Argumentation in biology papers
 
Ten Habits of Highly Effective Data
Ten Habits of Highly Effective DataTen Habits of Highly Effective Data
Ten Habits of Highly Effective Data
 
Ten Habits of Highly Successful Data
Ten Habits of Highly Successful DataTen Habits of Highly Successful Data
Ten Habits of Highly Successful Data
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with data
 
Ten habits of highly effective data
Ten habits of highly effective dataTen habits of highly effective data
Ten habits of highly effective data
 
The habits of highly successful data:
The habits of highly successful data: The habits of highly successful data:
The habits of highly successful data:
 

Último

User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx023NiWayanAnggiSriWa
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)itwameryclare
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 

Último (20)

User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 

Collaboratively creating a network of ideas, data and software

  • 1. Anita de Waard VP Research Data Collaborations Elsevier, Jericho, VT Some Thoughts on Collectively Creating Networks of Ideas, Data and Software
  • 2. How do we unify the needs of the collective and the individual? “Let us endeavor to build systems that allow a kid in Mali who wants to learn about proteomics to not be overwhelmed by the irrelevant and the untrue.” - John Perry Barlow, iAnnotate 2014 Collectively create nimble and robust systems of knowledge management that interconnect ideas, data and software.
  • 3. Automated caption/body text splitting & linking Precision Recall F-score 56.3 76.0 64.7 Statement type Connecting Ideas: Big Mechanism
  • 4. Content Universal schema Surface form relations Structured relations Factorization model Matrix Construction Open Information Extraction Entity Resolution Matrix Factorization Knowledge graph Curation Predicted relations Matrix Completion Taxonomy Triple Extraction Connecting Ideas: Towards an Elsevier Knowledge Graph 14M articles from Science Direct 3.3M triples 475M triples 49M triples p x r matrix p x k, k x r latent factor matrices ~102 triples 920K concepts from EMMeT • Ongoing proof-of-concept work by Paul Groth, Sujit Pal and Ron Daniel of Elsevier Labs • Unsupervised, scalable and built with off-the-shelf technologies • Based on recent work at University College London and University of Massachusetts Amherst Riedel, Sebastian, Limin Yao, Andrew McCallum, and Benjamin M. Marlin. "Relation extraction with matrix factorization and universal schemas." (2013).
  • 6. Linking Papers to Data, Phase 1 • Supplementary data at PANGAEA • Bidirectional links between PANGAEA & ScienceDirect • Data visualized next to the article http://www.elsevier.com/databaselinking
  • 7. Linking Papers to Data, Phase 2 • ICSU/WDS/RDA Publishing Data Service Working group • Currently creating linked-data model for exposing DOI to DOI links outside publisher’s firewall • Merged with National Data Service pilot with the same goal • Collaboration between CrossRef, DataCite, Europe PubMed Central, ANDS, Thompson Reuters, Elsevier • About to deliver: http://dliservice.research- infrastructures.eu/#/api Objective: move from a plethora of (mostly) bilateral arrangements between the different players… .. a one-for-all cross- referencing service for articles and data .. to ..
  • 8. Researchers Funding AgencyInstitution Data Repository Dataset JournalPaper Current Systems for Linking Data 1. Researcher creates datasets 2. Researcher writes paper & publishes in journal 3. (Sometimes,) dataset gets posted to repository 4. Researcher reports (post-hoc) to Institution and Funder 2 2 1 3 4 4
  • 9. Researchers Funding AgencyInstitution Data Repository Dataset JournalPaper Issues with the Current Situation: 2 2 1 3 4 4 iii. No link between data and paper iv. Funders/Institutions informed as an afterthought i. Too much work for researchers ii. Data posting not mandatory
  • 10. Researchers Funding Agency Institution Data Repository Dataset Journal Paper A Proposal To Address These Issues: 1. Researcher creates datasets and posts to repository (under embargo) 2. Funder is automatically notified of dataset publication 3. Researcher writes paper & publishes in journal; embargo is lifted and data linked - NB this also allows release of non-used data for negative result and reproducibility 4. Funder and institution get report on publication and embargo lifting 2 1 1 3 3 3 4 4 i. Less Work! iv. Better Tracking! iii. Better Linking! ii. More Data Stored!
  • 11. One piece of the puzzle: Mendeley Data: https://data.mendeley.com/datasets/xz6gv65m6d/6 Linked to published papers – or not Linked to Github – or not Versioning and provenance
  • 12. Another Piece of the Puzzle: DataSearch: http://datasearchdemo.elsevier.com/indexed#/search/mercury
  • 13. Federated Poor API Rich API FTP & Index Federated Poor API Rich API FTP & Index Federated Poor API Rich API FTP & Index Data Enrichment Manual Automated (User) Intent Ranking Filtering (how to mix federated & indexed rich & poor) Search Rendering Search all data Faceted query/Results refinement Store & Use results Gener al UI Doma in UI Filterin g Feeding user signals back into Search ranking Evaluation How Do We Evaluate Discoverability? Birds of a Feather on Data Search: https://rd-alliance.org/bof-data-search.html
  • 14. How do we pay for all this? RDA Cost Recovery WG • Cochair with Ingrid Dillo (DANS), Simon Hodson (CODATA) • Goal: write a report regarding new potential funding models for data repositories, allow them to start sharing this knowledge • Interviewed 24 repositories on their funding (current and future) • Now summarising stories and trends – will present at RDAP7 Terms of funding for main income stream (in %)
  • 15. Software As A First-Class Knowledge Object:
  • 16. Working with Networks of Partners Force11: – Multi-stakeholder, member-driven organisation – Unites scholars, tool developers, librarians, publishers, funding agencies etc. etc. – E.g. Software citation group, akin to Data Citation Group – Will present at Force16 in Portland, OR April 17-19, 2016 National Data Service: – Multi-stakeholder group, based around supercomputing centres – Aims to be a ‘connective tissue’ between data creation, curation, storage etc projects. – Inviting Pilots: two or more partners who have not worked together, interested in collaborating on a data-centric project to solve a real-world needs: can include software sharing – E.g. Datasearch, Data Linking systems RDA: – CoLead Data publishing, linking group – Colead Cost Recovery group – Active in Chemistry, Earth Science groups – Starting BoF Data Search The National DATA SERVICE
  • 17. Anita de Waard VP Research Data Collaborations, Elsevier a.dewaard@elsevier.com @anitadewaard In summary: Let’s collectively enable ‘an account of the present undertakings, studies and labours of the ingenious in many considerable parts of the world’, by connecting ideas, data, and software through interconnected partnerships!