
The state of global research data initiatives: observations from a life on the road

Published in: Education

  1. Data Management: all you need to know. Sarah Jones, Digital Curation Centre, sarah.jones@glasgow.ac.uk, Twitter: @sjDCC. Link to come
  2. What is Research Data Management? Image CC-BY-SA by Janneke Staaks, www.flickr.com/photos/jannekestaaks/14411397343
  3. What is Research Data Management? “The active management and appraisal of data over the lifecycle of scholarly and scientific interest.” Lifecycle stages: Create, Document, Use, Store, Share, Preserve. Data management is part of good research practice.
  4. What is involved in RDM? • Data Management Planning • Data creation • Annotating / documenting data • Analysis, use, versioning • Storage and backup • Publishing papers and data • Preparing for deposit • Archiving and sharing • Licensing • Citing…
  5. What is a data management plan? A brief plan written at the start of a project to define: • how the data will be created • how it will be documented • who will access it • where it will be stored • who will back it up • whether (and how) it will be shared and preserved. DMPs are often submitted as part of grant applications, but are useful whenever researchers are creating data.
  6. Typical coverage of a DMP: 1. Description of data to be collected / created (i.e. content, type, format, volume...) 2. Standards / methodologies for data collection & management 3. Ethics and Intellectual Property (highlight any restrictions on data sharing, e.g. embargoes, confidentiality) 4. Plans for data sharing and access (i.e. how, when, to whom) 5. Strategy for long-term preservation
  7. Why make data available?
  8. Sharing data increases citations! There are benefits for you. Want evidence? • Piwowar, Vision: 9% (microarray data) • Drachen, Dorch, et al.: 25-40% (astronomy) • Gleditsch, et al.: doubling to trebling (international relations). Open Data Citation Advantage: http://sparceurope.org/open-data-citation-advantage
  9. Why manage research data? Image: Azgan Mjeshtri, https://unsplash.com/photos/KgxawsqiAJs
  10. To avoid problems: • Data duplication • Data loss and security breaches • Versioning issues • Inability to reuse data. Save time and effort to make your life easier!
  11. To keep your options open. Decisions you make early on will affect what you can do later: • Choice of file formats • Consent forms • Licence and consortium agreements. Keeping options open avoids having to renegotiate consent or being prevented from reusing data.
  12. Don’t undervalue research data
  13. DMPs can be helpful. “It helped us reflect on potential issues and decide how to address these as a project.” “I find it very useful since, although I have an idea of what data I will collect in my project, this makes me reflect on the best format to present them, where to make them available, etc.” Source: OpenAIRE & FAIR data Expert Group DMP survey. Report, dataset & infographic at: https://doi.org/10.5281/zenodo.1120245
  14. Many global funders ask for DMPs (the list shown is not comprehensive)
  15. How to manage research data? Image: Guille Alvarez, https://unsplash.com/photos/P11Z-nILhCs
  16. Follow RDM basics: • Use common data formats • Use metadata standards and controlled vocabularies • Document your processes • Version your data and code • Store securely • Back up automatically • Deposit in repositories • Get a Persistent Identifier • Licence your data
  17. Choose where to store and back up: • Your own device (laptop, flash drive, server etc.): what if you lose it, or it breaks? • Departmental drives or university servers with automatic backup • “Cloud” storage: do they care as much about your data as you do? The decision will be based on how sensitive your data are, how robust you need the storage to be, and who needs access to the data and when.
  18. One copy = risk of data loss. CC image by momboleum on Flickr
  19. Collaborative platforms, e.g. OSF: https://osf.io
  20. Make data understandable. Metadata should be: • Standardised • Structured • Machine and human readable. Metadata helps to cite and disambiguate data. Documentation aids reuse.
  21. Metadata standards. These can be general, such as Dublin Core, or discipline specific: • Data Documentation Initiative (DDI): social science • Ecological Metadata Language (EML): ecology • Flexible Image Transport System (FITS): astronomy. Search for standards in catalogues like: http://rd-alliance.github.io/metadata-directory
  22. Documentation. Think about what is needed in order to evaluate, understand, and reuse the data. • Why was the data created? • Have you documented what you did and how? • Did you develop code to run analyses? If so, this should be kept and shared too. • It is important to provide wider context for trust.
  23. ReadMe files. We recommend that a ReadMe be a plain text file containing the following: • for each filename, a short description of what data it includes, optionally describing the relationship to the tables, figures, or sections within the accompanying publication • for tabular data: definitions of column headings and row labels; data codes (including missing data); and measurement units • any data processing steps, especially if not described in the publication, that may affect interpretation of results • a description of what associated datasets are stored elsewhere, if applicable • whom to contact with questions. Source: http://datadryad.org/pages/readme Example template: https://www.lib.umn.edu/datamanagement/metadata
  24. Workflow tools, e.g. MyExperiment: www.myexperiment.org/workflows/16.html
  25. Follow good practice: http://biblioguias.cepal.org/gestion-de-datos-de-investigacion
  26. Use available DMP tools. DMPonline offers: • Example plans • Tailored guidance • Plan sharing & visibility controls • Institutional feedback and DMP review • Export to multiple formats • Online helpdesk
  27. How does DMPonline work? It pulls together requirements and guidance, tailored to your context: • DMP requirements from funders, institutions and others • Guidance and examples from funders, universities, research disciplines and others. Workflow: Create, Share, Review, Export, Update…
  28. Training: CODATA schools. Raphael Cobe, NCC; Marcela Alfaro Córdoba, University of Costa Rica
  29. How to share your data? Image CC-BY-NC-ND by talkingplant, www.flickr.com/photos/talkingplant/2256485110
  30. Steps to make data open: 1. Choose your dataset(s): what can you make open? You may need to revisit this step if you encounter problems later. 2. Apply an open licence: determine what IP exists and apply a suitable licence, e.g. CC-BY. 3. Make the data available: provide the data in a suitable format. Use repositories. 4. Make it discoverable: post on the web, register in catalogues, ensure you cite… https://okfn.org
  31. License research data openly. DCC how-to guide: www.dcc.ac.uk/resources/how-guides/license-research-data
  32. Deposit in a data repository: http://databib.org, www.re3data.org. The Re3data catalogue can be searched to find a home for data: www.fosteropenscience.eu/content/re3data-demo
  33. National / domain repositories: www.re3data.org. FAIRsharing portal of databases in life sciences and earth sciences: https://fairsharing.org
  34. Zenodo (www.zenodo.org) is a multi-disciplinary repository that can be used for the long tail of research data: • An OpenAIRE-CERN joint effort • Multidisciplinary repository accepting multiple data types, publications and software • Assigns a Digital Object Identifier (DOI) • Links funding, publications, data & software
  35. Archiving code in Zenodo: get a DOI for each release. https://guides.github.com/activities/citable-code
  36. Citing research data: why? http://ands.org.au/cite-data
  37. How to cite data. Key citation elements: • Author • Publication date • Title • Location (= identifier) • Funder (if applicable). www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
  38. How do you share data effectively? • Use appropriate repositories; this catalogue is a good place to start: http://www.re3data.org • Document and describe it enough for others to understand, use and cite: http://www.dcc.ac.uk/resources/how-guides/cite-datasets • Licence it so others can reuse: www.dcc.ac.uk/resources/how-guides/license-research-data
  39. Thanks! Any questions? Sarah Jones, Digital Curation Centre, sarah.jones@glasgow.ac.uk, Twitter: @sjDCC
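
Slide 21 names Dublin Core as a general-purpose metadata standard. As a rough illustration of what "structured, machine and human readable" metadata can look like, the sketch below writes a few Dublin Core elements as XML using Python's standard library. The wrapper element name `metadata`, the function name `dublin_core_record` and all field values are assumptions made for this example; they do not come from the slides, and a real repository may expect a different serialisation.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialise {element name: value} pairs as simple Dublin Core XML.

    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    root = ET.Element("metadata")
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description, for illustration only.
record_xml = dublin_core_record(
    {
        "title": "Example survey dataset",
        "creator": "Doe, J.",
        "date": "2018",
        "identifier": "https://doi.org/10.1234/example",
        "rights": "CC-BY",
    }
)
print(record_xml)
```

The same dictionary could equally be written out as JSON; XML is used here only because Dublin Core records are commonly exchanged that way.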
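
Slide 23 lists what a ReadMe file should contain. One possible way to make that checklist routine is to generate a plain-text skeleton from a few inputs, as sketched below. The function name `build_readme`, the section headings and every example value are hypothetical; only the list of items to cover is taken from the slide.

```python
from typing import Dict, List


def build_readme(
    files: Dict[str, str],
    columns: Dict[str, str],
    missing_code: str,
    processing_steps: List[str],
    related_datasets: List[str],
    contact: str,
) -> str:
    """Return plain-text ReadMe content covering the items listed on slide 23."""
    lines = ["FILES"]
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    lines.append("")
    lines.append("COLUMNS (tabular data)")
    for heading, definition in columns.items():
        lines.append(f"  {heading}: {definition}")
    lines.append(f"  Missing data code: {missing_code}")
    lines.append("")
    lines.append("PROCESSING STEPS")
    for number, step in enumerate(processing_steps, start=1):
        lines.append(f"  {number}. {step}")
    lines.append("")
    lines.append("RELATED DATASETS STORED ELSEWHERE")
    # Write "none" explicitly so readers know the section was not forgotten.
    for item in related_datasets or ["none"]:
        lines.append(f"  {item}")
    lines.append("")
    lines.append(f"CONTACT: {contact}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
readme_text = build_readme(
    files={"survey.csv": "survey responses underlying Table 1"},
    columns={"age": "respondent age in years", "score": "test score, 0 to 100"},
    missing_code="-99",
    processing_steps=["Removed duplicate respondent rows"],
    related_datasets=[],
    contact="data-owner@example.org",
)
print(readme_text)
```

Writing the result to a file named, say, `README.txt` next to the data keeps the description in plain text, as the slide recommends.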
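
Slide 37 lists the key elements of a data citation: author, publication date, title, location (identifier) and, where applicable, funder. The sketch below assembles those elements into one string. The ordering and punctuation are an assumption made for illustration, not a style prescribed by the slides or by the DCC; `DataCitation`, `format_citation` and the example values are hypothetical, and the publication date is simplified to a year.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    author: str
    publication_year: int  # simplification: the slide says "publication date"
    title: str
    identifier: str  # the "location", e.g. a DOI or other persistent identifier
    funder: Optional[str] = None


def format_citation(c: DataCitation) -> str:
    """Join the citation elements in one assumed order; real styles vary."""
    text = f"{c.author} ({c.publication_year}). {c.title}. {c.identifier}"
    if c.funder:
        text += f". Funder: {c.funder}"
    return text


# Hypothetical dataset, for illustration only.
example = DataCitation(
    author="Doe, J.",
    publication_year=2018,
    title="Example survey dataset",
    identifier="https://doi.org/10.1234/example",
)
print(format_citation(example))
```

Keeping the elements in a small record like this makes it easy to reformat the same citation for whichever style a journal or repository asks for.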
