Reuse of Repository Data

              Valerie Enriquez
Motivation
 Data deposit vs. data reuse
 Why track the reuse of data?
    Transparency
    Collaboration
        Confirm existing data
        Refute existing data
        Combine with existing data to form new conclusions
    Healthy Competition
    Invigoration
Initial Questions
 How is data currently cited and how often?
 How do we find data citations using available
 resources (search engines, databases, etc.)?
 How difficult is it to find data citations using these
 tools and why?
 What are the best/worst ways to find data citations?
 How do the citations vary across discipline,
 repository and publication?
 What is the most common citation? Repository
 name? Data author name? Unique identifier like a
 study number or DOI?
To whose benefit?
 Scientists
 Academic researchers
 Students
 Anyone who uses or deposits data
 Anyone interested in the citation or reuse of data
 Similar projects
  See also: list of projects, discussion and editorials on
  the OpenWetware DataONE Web Resources page:
  http://openwetware.org/wiki/User:Valerie_Enriquez/Notebook/DataONE_Web_resources
Methods
 Initial search process: Test TreeBASE searches
 Focused search
  Repositories
  1.  TreeBASE
  2.  Pangaea
  3.  ORNL DAAC
  Databases
  1.  ISI Web of Science Cited Reference Search
  2.  Scirus
  3.  Google Scholar
 Limits
  Date range: 2008-2010
  Language: English
  Journal articles only
 Repository-specific search terms
  TreeBASE: repository name, study accession number (S####), data author name
  Pangaea: repository name, DOI prefix: 10.1594/PANGAEA.######, data author name
  ORNL DAAC: repository name, DOI prefix: 10.3334/ORNLDAAC/###, data author name,
  project name (BOREAS, FLUXNET, etc.)
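 The repository-specific identifiers above could be matched in article text with a few regular expressions. This is a minimal sketch, not the search procedure used in the project: the digit counts are guessed from the placeholder forms on the slide (S####, 10.1594/PANGAEA.######, 10.3334/ORNLDAAC/###), and the names IDENTIFIER_PATTERNS and find_identifiers are hypothetical.

```python
import re

# Patterns are assumptions based on the placeholder forms shown on the slide;
# real identifiers may have more or fewer digits.
IDENTIFIER_PATTERNS = {
    "TreeBASE": re.compile(r"\bS\d{3,6}\b"),
    "Pangaea": re.compile(r"\b10\.1594/PANGAEA\.\d+", re.IGNORECASE),
    "ORNL DAAC": re.compile(r"\b10\.3334/ORNLDAAC/\d+", re.IGNORECASE),
}


def find_identifiers(text):
    """Return {repository: [matched identifiers]} for each pattern that matches."""
    found = {}
    for repository, pattern in IDENTIFIER_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[repository] = matches
    return found
```

 A short pattern such as S#### is likely to match unrelated strings (supplementary table labels, for instance), so pairing it with the repository name in the same search may give fewer false positives.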
Initial Analysis
1.   Search comparison spreadsheet hosted here
     Search methods, terms and datasets used to construct
     search terms were captured as well as the total number of
     results followed by respective hits and misses.
     Percentages of hits vs. misses calculated within the
     spreadsheet.
     Reasons for miss captured
     Reasons for hit captured
2.   Shared fields template from Sarah with my input
     data hosted here
     Hosts data about individual articles, including DOIs as
     applicable, metadata and coding for hits and misses.
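 The hit and miss percentages described above were calculated inside the spreadsheet. The same arithmetic can be sketched in Python as follows; the function name is hypothetical, and treating the total as hits plus misses is an assumption about how the spreadsheet counts results.

```python
def hit_miss_percentages(hits, misses):
    """Return (hit %, miss %) of all results; (0.0, 0.0) when a search returned nothing."""
    total = hits + misses
    if total == 0:
        return 0.0, 0.0
    return 100.0 * hits / total, 100.0 * misses / total
```

 For a made-up search with 3 hits and 1 miss, hit_miss_percentages(3, 1) gives 75% hits and 25% misses.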
Stumbles and other Worrisome Things
 Finding focus and the difficulty of going beyond the obvious
 “Missing” searches
 How broad is too broad? How narrow is too narrow?
 Article cited vs. data cited
 Image courtesy of:
 http://currentskateofmind.com/2008/03/25/glossary-of-skating-falls/
Initial Findings
 Key: $ = effective search; # = ineffective search; * = invalid field input

 TreeBASE
  ISI Web of Science
   1.  $ Repository name
   2.  *
   3.  $ Cited author name/original publication title/date
  Scirus
   1.  $ Repository name
   2.  # Study accession number
   3.  # Cited author name/original publication title/date
  Google Scholar
   1.  # Repository name
   2.  # Study accession number
   3.  # Cited author name/original publication title/date

 Pangaea
  ISI Web of Science
   1.  $ Repository name
   2.  *
   3.  $ Cited author name/original publication title/date
  Scirus
   1.  Repository name
   2.  $ DOI prefix
   3.  # Cited author name/original publication title/date
  Google Scholar
   1.  # Repository name
   2.  $ DOI prefix
   3.  # Cited author name/original publication title/date

 ORNL DAAC
  ISI Web of Science
   1.  $ Repository name
   2.  *
   3.  $ Cited author name/original publication title/date
  Scirus
   1.  $ Repository name
   2.  $ DOI prefix
   3.  $ Cited author name/project name/original publication title/date
  Google Scholar
   1.  # Repository name
   2.  $ DOI prefix
   3.  $ Cited author name/project name/original publication title/date
Lessons Learned
 Hey, I think I found that data citation you were looking for.
 Image courtesy of: http://www.squidoo.com/stop_information_overload
Where do we go from here?
 Solidify conclusions from initial findings.
 Compare data with other interns.
 Examine other repositories, search terms and
 databases.
 Write an article about how difficult it is to find data reuse
 citations. Some possible publications:
   Collection Management
   D-Lib Magazine (link provided by Heather)
   Information Services & Use (author guidelines)
   Informing Science
   International Digital Curation Conference (call for papers; link provided by Nic)
   Journal of the American Society for Information Science & Technology
   Journal of Information Science
   Library Technology Reports
   Scientometrics
