ODIN –
      ORCID and DATACITE Interoperability Network




           Presentation to CLOSER Leadership Team

           November 2012

           John Kaye – British Library



www.slideshare.net/johnkayebl                  www.odin-project.eu


              Funded by The European Union Seventh Framework
                                Programme
Overview

• Overview
• Project Structure
• Humanities and Social Science Proof of
  Concept
• High Energy Physics Proof of Concept
• Results
• Commonalities
• Risks
Overview

•   2-year project funded under EC FP7 Coordination and Action Programme

•   ORCID (Open Researcher and Contributor ID Initiative)

•   DataCite Consortium – BL is UK registration agent

•   Partners: ORCID, DataCite, BL, CERN, Dryad, arXiv, ANDS

•   Build on ORCID and DataCite initiatives to uniquely identify and connect
    scientists and datasets

•   ‘Datasets’ has a broad definition (anything but journals) so can include grey
    literature, presentations, code etc.

•   Connect information across multiple services and infrastructures for
    scholarly communications
Overview

•   Infrastructure already exists for researchers to build up an open
    portfolio of research objects

•   Register an ORCID ID at www.orcid.org and link published papers
    using ORCID’s tools

•   Unpublished outputs (working papers, datasets) can be deposited
    in figshare (http://figshare.com/), given a DataCite DOI, then linked
    back and added to the ORCID profile

•   ODIN wants to expand on this principle and engage with data
    centres and institutional repositories to allow easier, more open
    discovery of non-traditional research outputs.
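The linking step above (an ORCID ID on one side, a DataCite DOI on the other) can be sketched in a few lines. This is a minimal illustration, not ORCID's or figshare's tooling: `link_output` and the record layout are hypothetical, and the DOI is a made-up placeholder. The check-digit routine follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers, and the sample identifier is the one used in ORCID's own documentation.

```python
import re

ORCID_SHAPE = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the 0000-0000-0000-0000 shape and the trailing check character."""
    if not ORCID_SHAPE.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


def link_output(orcid: str, doi: str) -> dict:
    """Hypothetical record pairing a researcher with one research output."""
    if not is_valid_orcid(orcid):
        raise ValueError(f"not a valid ORCID identifier: {orcid}")
    # DOIs are case-insensitive, so store them in one canonical case.
    return {"orcid": orcid, "doi": doi.lower()}


# Placeholder DOI, for illustration only.
record = link_output("0000-0002-1825-0097", "10.1234/EXAMPLE.5678")
```

Validating the identifier before storing the link keeps mistyped identifiers out of the network; it does not confirm that the identifier belongs to the right person.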
Project Structure
Proofs of Concept Objectives

 •   Develop two disciplinary proofs of the concept of open and interoperable
     persistent identifiers of data and contributors in scholarly communication, in
     a variety of current and future scenarios.

 Specific goals:

     •   Prove the ability to navigate across data and contributors in the Humanities and Social
         Sciences (HSS) where data and contributors are separated in space and time, with curators
         bridging the gap;

     •   Prove the ability to navigate across data and contributors in High-Energy Physics (HEP),
         where multiple versions of articles in preliminary and final form, with several thousand
         contributors, need to be associated with a corresponding dataset hosted in different
         systems;

     •   Identify, by a critical analysis of the proofs of concept, common issues in open and
         interoperable permanent identifiers of data and contributors, by establishing a common
         cross-disciplinary view on the relevant workflows
Deliverables and Time frames

 • D3.1 HSS Proof of Concept – Aug 2013
 • D3.2 HEP Proof of Concept – Aug 2013
 • D3.3 Commonalities – Sept 2014

 • MS5 Commonalities Identified – Jan 2014

 • D3.1 and D3.2 Validated by the community at 1st year
   event
 • Input from ANDS and arXiv
Humanities and
Social Sciences
HSS: Birth Cohort Studies

• Why Birth Cohort Studies?
  •   Investment
  •   Established/Long history
  •   Tradition of data curation
  •   High Re-use
  •   Derived Data
  •   Multi-disciplinary
  •   BL Involvement in CLOSER (Cohort and Longitudinal
      Studies Enhancement Resource)
HSS: Current Status

•    HSS British Birth Cohort characteristics:
    •   High re-use of data
    •   Data analysed across cohorts (e.g. 1958 questions alongside 2000)
    •   Derived data often kept outside original repository
    •   Lots of ‘grey literature’ (working papers, pre-prints etc.)
    •   Different publication spaces (publishers, institutional repositories)

•    Challenges:
    •    Uniquely associate articles/datasets with authors/contributors from a range of
         data sources
    •    Authors/creators/researchers go back a long way (could be as early as 1946)
    •    How to deal with non-digital research outputs
    •    How to deal with cross-cohort analysis (multiple datasets, derived datasets)
    •    Associate datasets with articles and track impact of data re-use
    •    Survey questions often more important to identify than actual survey (survey
         contains thousands of variables)
HSS: Objectives

•   Identify workflows and develop conceptual model

•   Provide technical solutions for identifying and connecting data creators,
    authors, researchers, contributors and research objects related to British
    Birth Cohort Studies

•   Identify, use and link existing identifiers and data sources where possible

•   Identify deficiencies in identification or relationship data and develop or
    propose solutions

•   Work with the research community to develop user case studies and data
    collection and enhancement

•   Create an open and interoperable network linking people and research
    objects to allow Impact Tracking and Resource Discovery
HSS Proof of Concept

[Diagram: network of people and research objects around the 1958 and 1970 birth cohorts, with External Data (Census, Health etc.) feeding in]

Legend:
  •   Data Creator, Researcher, Author
  •   Birth Cohort Study dataset
  •   Non-Birth Cohort Study dataset
  •   Derived dataset
  •   Grey Literature
  •   Published article

  •   Citation
  •   Data Creator
  •   Derived Data Creator
  •   External Data input
  •   Author: Grey lit
  •   Author: Article
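One way to picture the conceptual model is as a small graph of typed nodes and typed links. The sketch below is illustrative only: the `Node` and `ResearchGraph` names, the identifiers, and the choice to record a derived dataset as citing its source dataset are all assumptions, not part of the ODIN design. It shows how impact tracking could follow citation links back from a cohort dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str  # placeholder identifier; in practice a DOI, ORCID ID, etc.
    kind: str     # one of the entity types in the diagram legend


class ResearchGraph:
    def __init__(self):
        # target node -> [(link type, source node)]
        self._incoming = defaultdict(list)

    def link(self, source, link_type, target):
        self._incoming[target].append((link_type, source))

    def impact(self, dataset):
        """Everything that cites the dataset, directly or via another output."""
        seen, stack = set(), [dataset]
        while stack:
            current = stack.pop()
            for link_type, source in self._incoming[current]:
                if link_type == "citation" and source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen

    def contributors(self, output):
        """People attached to an output by any non-citation link."""
        return {src for kind, src in self._incoming[output] if kind != "citation"}


cohort_1958 = Node("dataset:1958", "Birth Cohort Study dataset")
derived = Node("dataset:derived-1", "Derived dataset")
paper = Node("article:1", "Published article")
working_paper = Node("grey:1", "Grey Literature")
researcher = Node("person:a", "Data Creator, Researcher, Author")

graph = ResearchGraph()
graph.link(derived, "citation", cohort_1958)   # assumed: derived data cites its source
graph.link(paper, "citation", derived)
graph.link(working_paper, "citation", cohort_1958)
graph.link(researcher, "derived data creator", derived)
graph.link(researcher, "author: article", paper)

reuse = graph.impact(cohort_1958)
```

Here `reuse` contains the derived dataset, the working paper and the article, even though the article never cites the 1958 cohort directly; that indirect path is the case the cross-cohort and derived-data challenges describe.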
HSS: Identifiers and
Data Sources
  Researchers etc.: ORCID, ISNI, JISC Names, SCOPUS, Surveys, Citation DBs,
  UK Data Service, Catalogue metadata

  Source Datasets: DataCite DOIs, ESDS

  Derived Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Surveys, Institutional
  Repositories

  ‘External’ Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Other data centres,
  NHS, Institutional etc.

  Grey Literature: DataCite DOIs, Institutional IDs, No IDs, Surveys, ESDS,
  Institutions

  Published Literature: CrossRef DOIs, Institutional IDs, No IDs, SCOPUS, Surveys,
  ESDS, Institutions, Citation DBs, Catalogue metadata
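The list above mixes identifier schemes with the data sources that hold them, and allows for objects with no ID at all. A rough sketch of how a record's best available identifier might be chosen, and how gaps might be flagged, follows. The scheme keys, their order of preference, and the sample records are assumptions made for illustration; only the scheme names themselves come from the list.

```python
# Identifier schemes per entity type, in an assumed order of preference.
PREFERRED_SCHEMES = {
    "researcher": ["orcid", "isni"],
    "source_dataset": ["datacite_doi"],
    "derived_data": ["datacite_doi", "institutional_id"],
    "external_data": ["datacite_doi", "institutional_id"],
    "grey_literature": ["datacite_doi", "institutional_id"],
    "published_literature": ["crossref_doi", "institutional_id"],
}


def pick_identifier(entity_type, known_ids):
    """Return (scheme, value) for the most preferred identifier, or None."""
    for scheme in PREFERRED_SCHEMES[entity_type]:
        value = known_ids.get(scheme)
        if value:
            return scheme, value
    return None


def missing_identifiers(records):
    """Titles of records that fall into the 'No IDs' case."""
    return [
        rec["title"]
        for rec in records
        if pick_identifier(rec["type"], rec["ids"]) is None
    ]


# Made-up sample records.
records = [
    {"title": "Derived income variables", "type": "derived_data",
     "ids": {"institutional_id": "repo-0001"}},
    {"title": "Working paper on cohort attrition", "type": "grey_literature",
     "ids": {}},
]

gaps = missing_identifiers(records)
```

Records returned by `missing_identifiers` are the candidates for the "identify deficiencies ... and develop or propose solutions" objective, for example by minting a DataCite DOI.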
High Energy Physics
Current status (I)

•   HEP (High-Energy Physics) field specificities:
    •   Multiversioning: from preprint versions to final publication
    •   Hyperauthorship: hundreds or thousands of scientists signing the
        same article
    •   Data levels of abstraction (CERN, Inspire, HEPData)
    •   Different publication spaces (arXiv, Inspire, publishers)

•   Challenges:
    •   Author identification: improving the disambiguation
        process currently in place
    •   Uniquely associate articles/datasets with authors/contributors
    •   Version management during the long publication process
Current status (II)

[Screenshot: current Inspire interface]
Current status (III)

•   Disambiguation process among thousands of authors:
    •   Names and affiliations
    •   Different ways to write the same information
    •   Clustering algorithm

[Screenshot: current Inspire interface]
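To make the disambiguation problem concrete, here is a deliberately naive sketch that groups author signatures by a normalised name and affiliation. It is not the clustering algorithm Inspire uses; the function names and sample signatures are invented for illustration, and a real system would need far more evidence (co-authors, subject, dates) than this.

```python
import unicodedata
from collections import defaultdict


def normalise_name(raw):
    """Reduce a name to (surname, first initial), ignoring accents, case and dots."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(c for c in text if not unicodedata.combining(c)).lower()
    if "," in text:  # "Surname, Given" form
        surname, given = [part.strip() for part in text.split(",", 1)]
    else:            # "Given Surname" form
        parts = text.replace(".", " ").split()
        if not parts:
            return "", ""
        surname, given = parts[-1], " ".join(parts[:-1])
    given = given.replace(".", " ").strip()
    return surname, given[:1]


def cluster_signatures(signatures):
    """Group (name, affiliation) pairs that share a normalised key."""
    clusters = defaultdict(list)
    for name, affiliation in signatures:
        affiliation_key = " ".join(affiliation.lower().split())
        clusters[normalise_name(name) + (affiliation_key,)].append((name, affiliation))
    return dict(clusters)


# Invented signatures showing different ways to write the same information.
signatures = [
    ("Kaye, John", "British Library"),
    ("J. Kaye", "british library"),
    ("John Kaye", "British  Library"),
    ("J. Kaye", "CERN"),
]

clusters = cluster_signatures(signatures)
```

The first three signatures fall into one cluster and the fourth into another. The sketch would wrongly merge two different people with the same surname, initial and institution, and wrongly split one person whose affiliation is spelled in two unrelated ways, which is why persistent contributor identifiers are attractive here.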
Phase 2:
Results and Commonalities
•   Results to feed into Hackathon event and strategy
•   Assessment and validation by research community and international
    partners
•   BL and CERN come together to find commonalities in the disciplines to
    inform WP4 (interoperability)
      • This process will incorporate knowledge from the results of the
        Hackathon as well as the conceptual model for global interoperability of
        data and contributor identifiers developed in WP4
      • This task will result in a more comprehensive view on disciplinary and
        interdisciplinary needs, and will produce information, internally
        transferred to the other work packages
Questions?

John Kaye – Lead Curator Digital Social Sciences
The British Library
96 Euston Road
London NW1 2DB

john.kaye@bl.uk

Twitter: @johnkayebl

Telephone: 020 7412 7450

Project Website http://odin-project.eu/

Blog: http://britishlibrary.typepad.co.uk/socialscience/

ODIN Project Presentation to CLOSER Leadership Team

  • 1. ODIN – ORCID and DATACITE Interoperability Network Presentation to CLOSER Leadership Team November 2012 John Kaye – British Library www.slideshare.net/johnkayebl www.odin-project.eu Funded by The European Union Seventh Framework Programme
  • 2. Overview • Overview • Project Structure • Humanities and Social Science Proof of Concept • High Energy Physics Proof of Concept • Results • Commonalties • Risks
  • 3. Overview • 2-year project funded under EC FP7 Coordination and Action Programme • ORCID (Open Researcher and Contributor ID Initiative) • DataCite Consortium – BL is UK registration agent • Partners: ORCID, DataCite, BL, CERN, Dryad, arXiv, ANDS • Build on ORCID and DataCite initiatives to uniquely identify and connect scientists and datasets • ‘Datasets’ has a broad definition (anything but journals) so can include grey literature, presentations, code etc. • Connect information across multiple services and infrastructures for scholarly communications
  • 4. Overview • Infrastructure already exists for researchers to build up an open portfolio of research objects • Register an ORCID ID at www.orcid.org and link published papers using ORCID’s tools • Non-published outputs (working papers, datasets) can be deposited in figshare (http://figshare.com/), given a DataCite DOI, then linked back and added to an ORCID profile • ODIN wants to expand on this principle and engage with data centres and institutional repositories to allow easier, more open discovery of non-traditional research outputs.
  • 6. Proofs of Concept Objectives • Develop two disciplinary proofs of the concept of open and interoperable persistent identifiers of data and contributors in scholarly communication, in a variety of current and future scenarios. Specific goals: • Prove the ability to navigate across data and contributors in the Humanities and Social Sciences (HSS), where data and contributors are separated in space and time, with curators bridging the gap; • Prove the ability to navigate across data and contributors in High-Energy Physics (HEP), where multiple versions of articles in preliminary and final form, with several thousand contributors, need to be associated with a corresponding dataset hosted in different systems • Identify, by a critical analysis of the proofs of concept, common issues in open and interoperable permanent identifiers of data and contributors, by establishing a common cross-disciplinary view on the relevant workflows
  • 7. Deliverables and Time frames • D3.1 HSS Proof of Concept – Aug 2013 • D3.2 HEP Proof of Concept – Aug 2013 • D3.3 Commonalities – Sept 2014 • MS5 Commonalities Identified Jan 2014 • D3.1 and D3.2 Validated by the community at 1st year event • Input from ANDS and arXiv
  • 9. HSS: Birth Cohort Studies • Why Birth Cohort Studies? • Investment • Established/Long history • Tradition of data curation • High Re-use • Derived Data • Multi-disciplinary • BL Involvement in CLOSER (Cohort and Longitudinal Studies Enhancement Resource)
  • 10. HSS: Current Status • HSS British Birth Cohort characteristics: • High re-use of data • Data analysed across cohorts (e.g. 1958 questions alongside 2000) • Derived data often kept outside original repository • Lots of ‘grey literature’ (working papers, pre-prints etc.) • Different publication spaces (publishers, institutional repositories) • Challenges: • Uniquely associate articles/datasets with authors/contributors from a range of data sources • Authors/creators/researchers go back a long way (could be as early as 1946) • How to deal with non-digital research outputs • How to deal with cross-cohort analysis (multiple datasets, derived datasets) • Associate datasets with articles and track impact of data re-use • Survey questions often more important to identify than actual survey (survey contains thousands of variables)
  • 11. HSS: Objectives • Identify workflows and develop conceptual model • Provide technical solutions for identifying and connecting data creators, authors, researchers, contributors and research objects related to British Birth Cohort Studies • Identify, use and link existing identifiers and data sources where possible • Identify deficiencies in identification or relationship data and develop or propose solutions • Work with the research community to develop user case studies and data collection and enhancement • Create an open and interoperable network linking people and research objects to allow Impact Tracking and Resource Discovery
  • 12–15. HSS Proof of Concept [Diagram, built up over four slides. Legend: Data Creator, Researcher, Author • Birth Cohort Study dataset • Non-Birth Cohort Study dataset • Derived dataset • Grey Literature • Published article • Citation. Nodes: 1958 and 1970 cohort datasets • External Data (Census, Health etc.). Roles: Data Creator • Derived Data Creator • External Data input • Author: Grey lit • Author: Article]
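The network of people and research objects shown in the diagram can be pictured as a set of typed links. The sketch below is only an illustration under stated assumptions: the identifiers (`person:A`, `dataset:cohort-1958`, ...) and relation names (`created`, `derived_from`, `cites`, `authored`) are hypothetical and are not taken from the ODIN conceptual model. It shows how following citation and derivation links backwards from a source dataset would surface the outputs that re-use it, which is the basis of impact tracking.

```python
from collections import defaultdict

# Hypothetical identifiers and relation names, invented for illustration only.
links = [
    ("person:A", "created", "dataset:cohort-1958"),
    ("person:B", "created", "dataset:derived-1"),
    ("dataset:derived-1", "derived_from", "dataset:cohort-1958"),
    ("paper:working-1", "cites", "dataset:derived-1"),
    ("article:1", "cites", "dataset:cohort-1958"),
    ("person:B", "authored", "article:1"),
]


def build_reverse_index(triples):
    """Map each link target to the (source, relation) pairs pointing at it."""
    index = defaultdict(list)
    for source, relation, target in triples:
        index[target].append((source, relation))
    return index


def outputs_reusing(dataset, triples):
    """Collect objects that cite or derive from `dataset`, directly or via derived data."""
    index = build_reverse_index(triples)
    found, pending = set(), [dataset]
    while pending:
        current = pending.pop()
        for source, relation in index[current]:
            if relation in {"cites", "derived_from"} and source not in found:
                found.add(source)
                pending.append(source)
    return found


reuse = outputs_reusing("dataset:cohort-1958", links)
```

Here `reuse` contains the derived dataset, the published article, and the working paper that cites only the derived data, so re-use is still attributed to the original cohort study.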
  • 16. HSS: Identifiers and Data Sources • Researchers etc.: ORCID, ISNI, JISC Names, SCOPUS, Surveys, Citation DBs, UK Data Service, Catalogue metadata • Source Datasets: DataCite DOIs, ESDS • Derived Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Surveys, Institutional Repositories • ‘External’ Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Other datacentres, NHS, Institutional etc. • Grey Literature: DataCite DOIs, Institutional IDs, No IDs, Surveys, ESDS, Institutions • Published Literature: CrossRef DOIs, Institutional IDs, No IDs, SCOPUS, Surveys, ESDS, Institutions, Citation DBs, Catalogue metadata
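When identifiers are gathered from this many sources, a basic sanity check before linking can catch transcription errors. As one example, an ORCID iD ends in a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below validates that character; the function names are illustrative and are not part of any ORCID library.

```python
import re

# Four groups of four characters; the last character may be a digit or "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if `orcid` is well formed and its check character matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]
```

For instance, `is_valid_orcid("0000-0002-1825-0097")` passes, while the same string with its last digit altered fails. A passing check only shows the iD is well formed, not that it is registered or belongs to the right person.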
  • 18. Current status (I) • HEP (High-Energy Physics) field specificities: • Multiversioning: from preprint versions until final publications • Hyperauthorship: hundreds/thousands of scientists signing the same article • Data levels of abstraction (CERN, Inspire, HEPData) • Different publication spaces (arXiv, Inspire, publishers) • Challenges: • Author identification, improvement of the disambiguation process done in place • Uniquely associate articles/datasets with authors/contributors • Version management during the long publication process
  • 19. Current status (II) Current Inspire interface
  • 20. Current status (III) • Disambiguation process among thousands of authors: • Names and affiliations • Different ways to write the same information • Clustering algorithm • [Screenshot: Current Inspire interface]
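To make the disambiguation problem concrete, here is a deliberately simplified sketch. It is not the clustering algorithm used by Inspire, which the slides do not describe. It assumes each author signature is a non-empty name plus an affiliation string, reduces the different spellings of a name to a (surname, first initial) key, and groups signatures that share that key and affiliation. The names and affiliations are invented.

```python
import unicodedata
from collections import defaultdict


def name_key(raw_name):
    """Reduce a name to (surname, first initial), ignoring case, accents and dots."""
    text = unicodedata.normalize("NFKD", raw_name)
    text = "".join(c for c in text if not unicodedata.combining(c)).lower()
    if "," in text:
        # "Surname, Given" form
        surname, given = [part.strip() for part in text.split(",", 1)]
    else:
        # "Given Surname" form; the last token is taken as the surname
        parts = text.replace(".", " ").split()
        surname, given = parts[-1], " ".join(parts[:-1])
    given = given.replace(".", " ").strip()
    return surname, given[:1]


def cluster_signatures(signatures):
    """Group (name, affiliation) pairs sharing a name key and normalised affiliation."""
    clusters = defaultdict(list)
    for name, affiliation in signatures:
        key = name_key(name) + (" ".join(affiliation.lower().split()),)
        clusters[key].append((name, affiliation))
    return dict(clusters)


# Invented signatures: three spellings of one name across two affiliations.
signatures = [
    ("Müller, Anna", "CERN"),
    ("A. Muller", "cern"),
    ("Anna Müller", "DESY"),
]
clusters = cluster_signatures(signatures)
```

The first two signatures fall into one cluster and the third into another, which shows the limit of the approach: a researcher who changes institution is split, and two people with the same initial at one institution are merged. That ambiguity is what persistent contributor identifiers such as ORCID are meant to resolve.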
  • 21. Phase 2: Results and Commonalities • Results to feed into Hackathon event and strategy • Assessment and validation by research community and international partners • BL and CERN come together to find commonalities in the disciplines to inform WP4 (interoperability) • This process will incorporate knowledge from the results of the Hackathon as well as the conceptual model for global interoperability of data and contributor identifiers developed in WP4 • This task will result in a more comprehensive view on disciplinary and interdisciplinary needs, and will produce information, internally transferred to the other work packages
  • 22. Questions? John Kaye – Lead Curator Digital Social Sciences The British Library 96 Euston Road London NW1 2DB john.kaye@bl.uk Twitter: @johnkayebl Telephone: 020 7412 7450 Project Website http://odin-project.eu/ Blog: http://britishlibrary.typepad.co.uk/socialscience/