Purdue University Research Repository - OR2013
1. The Purdue University
Research Repository (PURR):
Providing institutional data services with a virtual
research environment, data publication, and archiving
July 8th, 2013
Courtney Matthews
Digital Data Repository
Specialist
Purdue
University
Libraries
Michael Witt
Interdisciplinary Research
Librarian & PURR Director
3. PURR
INSTITUTIONAL COLLABORATION
Libraries, ITaP, and OVPR initiative
A Purdue research core*
A hub**
A Libraries institutional repository***
* Purdue Research Cores: http://www.purdue.edu/research/cores/
** HUBzero platform: http://hubzero.org/
*** Purdue e-Pubs: http://docs.lib.purdue.edu/
*** e-Archives: http://e-archives.lib.purdue.edu/
4. PURR
SERVICE AND PLATFORM
PURR is a free online research data collaboration platform and service solution for
Purdue faculty, graduate students, and staff.
Research data - spreadsheets, images, output from sensors and
instruments, transcripts, surveys, software source code and tools, video, and
observation logs
PURR provides:
Data management plan (DMP) resources and consultation
Collaborative research data project space
Dataset publication with Digital Object Identifier (DOI) *
Long-term preservation and management
* PURR uses Datacite DOIs. To learn more go to http://datacite.org/whatisdoi
5. DMP REQUIREMENTS
GRANT FUNDING AGENCIES & PRESIDENTIAL DIRECTIVE
NSF DMP Guide - https://purr.purdue.edu/overview
OSTP Memo - http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
National Science Foundation
President’s Office of Science
and Technology Policy
6. SERVICES & SUPPORT
GUIDES, TOOLS
DMP boilerplate text
DMP guides and tutorials
DMP Q & A
• Storage & pricing
• Knowledge base
• Contact Us
See https://purr.purdue.edu/dmp
7. DATA PROJECT
COLLABORATION
A project is a dedicated working space on PURR for you to
collaborate and prepare data for publication and curation
for a research project or study.
8. DATA PUBLISHING
DATACITE & DIGITAL OBJECT IDENTIFIERS
A dataset publication is a collection of files and metadata chosen from a
project to be disseminated with a Datacite Digital Object Identifier (DOI).
Example: 10.4231/D39P2W550
Purdue is a member of Datacite, an international not-for-profit organization
that advocates for access to research data on the internet.
Purdue is one of three Datacite DOI registrars in the United States.
Publishing with Datacite DOIs increases your dataset’s:
Citation, Impact, Validation, Reuse
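As a small illustration of how a DOI such as the example above can be turned into a citable link, here is a minimal sketch in Python. It assumes the public resolver at https://doi.org/; the function name doi_to_url is hypothetical and is not part of PURR or any Datacite tooling.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver; not something PURR itself provides.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI of the form '10.<registrant>/<suffix>'."""
    doi = doi.strip()
    prefix, sep, suffix = doi.partition("/")
    # A DOI has a prefix beginning with '10.' and a non-empty suffix after the first '/'.
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode any characters that are unsafe in a URL path, keeping '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.4231/D39P2W550"))
```

Putting the resulting link in a data citation lets a reader follow it to wherever the dataset's landing page currently lives, which is the point of a persistent identifier.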
9. DATA PRESERVATION
DATASET PUBLICATIONS
Maintained by PURR for 10 years
PURR is seeking ISO 16363 trusted digital repository certification
After 10 years they are selected / deselected for permanent inclusion
in the Libraries collection
11. PURR BY THE NUMBERS
PURR IS GROWING
720+ grant proposals cite PURR in their DMPs
48 of these grants have been awarded
761 registered users
217 research projects
13. THE PURDUE UNIVERSITY
RESEARCH REPOSITORY:
PURR
Courtney Earl Matthews
Digital Data Repository Specialist
MORE INFO?
Purdue University
https://purr.purdue.edu
Editor's Notes
Let’s take a quick look at our agenda. I’ll provide a quick background on PURR, discuss PURR as a data management planning solution, and talk about the PURR data curation services workflow. Then I’ll give you a look at PURR by walking through the steps necessary to create a dataset publication. Finally, I’ll give an overview of the services that PURR offers to support users.
PURR is an institutional effort made possible by a partnership of the Libraries, the Office of the Vice-President of Research, and Information Technology at Purdue. As such, it has been designated as a Purdue Research Core, mainly a resource available broadly for use by Purdue PIs and their research teams. It is an instance of HUBzero, the open source software platform for creating scientific websites, developed here at Purdue for Purdue researchers. PURR is also one of three institutional repositories hosted by the Libraries, e-Pubs and e-Archives being the other two, that provide services to the entire Purdue research community.
PURR is a hub that provides a research data and collaboration solution for Purdue faculty, graduate students, and staff. Research data includes spreadsheets, images, output from sensors and instruments, transcripts, surveys, software source code and tools, video, and observation logs.
**Provide a real example or two of data to bring this home. Example: a spreadsheet containing geospatial data, lat-longs, coordinates for bounding boxes (perhaps this example occurs later on the project slide).
PURR provides researchers with the tools for meeting the increasingly necessary DMP requirements of the grant application process: DMP consultations, an online DMP overview guide, a DMP self-assessment, a DMP video workshop, DMP boilerplate text for inclusion in grant proposals, and a guide to writing a DMP for the NSF.
PURR is an online collaborative project space where researchers can create a free trial account in order to collect, discuss, and share their research data. A project is a dedicated working space on PURR for you to collaborate and prepare data for publication and curation for a research project or study. This project space can be upgraded by registering a successful grant.
A dataset publication is a collection of files and metadata chosen from a project to be disseminated with a Datacite Digital Object Identifier (DOI). Purdue is a member of Datacite, an international not-for-profit organization that advocates for access to research data on the internet. Of note, Purdue is one of three DOI registrars in the United States of America. A DOI uniquely identifies your data, increasing its citation, ability to be tracked, verifiability, and potential reuse.
Datasets published in PURR are preserved for 10 years, presented on the PURR site purr.purdue.edu, and searchable in the Libraries’ catalogue.
At the end of that ten-year period of initial commitment, a digital archivist and your subject specialist librarian review the dataset to either select or deselect it for inclusion in the Library’s permanent collection. It is also important to note that PURR has been developed in accordance with the ISO 16363 Trusted Digital Repository certification process, an international standard that certifies the reliability of digital repositories.
DMPs are increasingly required by grant funding agencies. All of the listed grant funders require DMPs with varying degrees of specificity and focus. For example, as of January 15th, 2011, the National Science Foundation requires a 2-page Data Management Plan. PURR provides extensive Data Management Planning resources at https://purr.purdue.edu/dmp as well as consultation on Data Management Planning. Included at https://purr.purdue.edu/overview is an NSF-specific DMP guide that I will discuss in a moment.
SEE: http://www.nsf.gov/pubs/2012/nsf12570/nsf12570.htm
• NIH: Release and sharing of NIH-supported studies must be documented; a data sharing plan must be included with any application
• NEH: Requires a 2-page DMP
• NASA Earth Sciences Program: Promotes “full and open sharing of all data with the research and applications community”
• CDC: Requires a statement of intent for the sharing of data; all data are released and/or shared as soon as feasible without compromising privacy, federal/state confidentiality, proprietary interests, or national security
The Overview section of research.hub.purdue.edu contains some example DMPs and a DMP self-assessment questionnaire that lays out some of the questions a researcher needs to ask when building his or her DMP. This section also includes a DMP workshop video that details the PURR DMP materials more thoroughly than I am able to today. The guides and tutorials section is also very useful, as it contains PURR-specific boilerplate text that any researcher can drop directly into their DMP. It also contains a very useful guide for creating a DMP specifically for the NSF. The guide outlines the required parts of an NSF DMP, lists possible questions that should be addressed, and finally lists resources to provide further context. Information relating to PURR’s storage and pricing, as well as PURR’s Knowledge Base of frequently asked questions, can also be found in the DMP Overview section of research.hub.purdue.edu/dmp. Contact Us is the place where you can ask questions or report an issue to one of PURR’s three partners…
***Inserting an actual data citation would be useful.
Datacite Registrars: Office of Scientific and Technical Information (OSTI) & California Digital Library (CDL)
Reliable citation
Potential to increase the impact factor of your data
Increases the likelihood of data review and validation, and finally increases the opportunities for data reuse.
PURR provides a long-term home for the P.I.’s dataset publications. PURR stores and presents dataset publications for 10 years. After this “end of initial commitment” period, the P.I.’s dataset will be reviewed by the subject specialist librarian for your domain and a digital archivist. Together they will determine whether the dataset fits the Libraries collection policy and then select / deselect the dataset for inclusion in the PUL’s permanent collection. The take-away is that all dataset publications that are submitted to and accepted by PURR will be maintained and made accessible by the PUL at purr.purdue.edu and, more importantly, through the Library’s catalogue for at least ten years.
**PURR preservation policy
The key to PURR’s success is collaboration.
Send any PURR-related questions, inquiries, feedback, or referrals my way.
Thank you for your time today. If you have any questions we can tackle them now.