Slides from Monday 30 July for the course Data in the Scholarly Communications Lifecycle, part of the FORCE11 Scholarly Communications Institute.
Presenter - Natasha Simons
1. Natasha Simons
#AM6 Data in the Scholarly
Communications Lifecycle
FORCE11 Scholarly Communications Institute
Monday 30 July – Friday 3 August 2018
San Diego, USA
2. Monday 30 July
Today’s course outline
• Welcome and intro to instructor
• Who’s who in the zoo - speed data dating
• #AM6 course overview
• Data in the Scholarly Communications Lifecycle
• Introduction to Research Data Management
Link to today’s slides: https://tinyurl.com/ydhmw58z
3. Welcome to #AM6 Data in the Scholarly
Communications Lifecycle!
Here’s what you can expect based on FSCI 2017…
13. Australia…there are no silly questions!
• What’s Vegemite? Tim Tams?
• Are drop bears real?
• How many people get eaten by sharks every year?
• Can you see kangaroos in the main street?
• Do you eat kangaroos?
• Do you have…Starbucks, McDonald’s, Subway etc.?
????????
14. Australian research landscape
Map source - The Australian Trade Commission
Universities
Research institutions
National Collaborative
Research Infrastructure
Strategy (NCRIS)
31. Course overview
Monday 30 July:
• Welcome to #AM6
• Data in the scholarly communications life cycle
• Introduction to Research Data Management
Tuesday 31 July:
• Making your research data FAIR
• Data Management Plans – theory, practice and tools
• Power of open data - stories
Wednesday 1 August:
• Managing working data – are you being FAIR to the future you?
• Hands-on with the Open Science Framework
• Communicating your data through visualization
32. Course overview
Thursday 2 August
• Licensing research data for reuse
• Managing personal and sensitive data
• Describing, publishing and sharing research data
Friday 3 August
• Citing research data, software and related materials
• “Hot topics” from the class
End of course presentation – Friday afternoon
Exercises, class discussions and breaks
Homework: journal your learnings!
33. Guest presenters
Stephanie Simms
California Digital Library
Data Management Plans
Tuesday
Rachael Samberg & Maria Gould
University of California Berkeley
Data rights and licensing
Thursday
Reid Otsuji
University of California San Diego
Open Science Framework
Wednesday
34. Friday “hot topics”
Last year:
• Martin Fenner (Technical Director, DataCite) on data
citation and statistics
• Gaurav Godhwani (Chapter Leader, Datakind and
Technical Lead, Centre for Budget Accountability in
India) on transforming India’s budgets into open linked
data
• Gustavo Durand (project manager at Dataverse
Harvard) on Dataverse
This year? It’s up to you…
• Library data services: models, experiences?
• The role of librarians in data management?
• Building a data-savvy community with Library Carpentry?
• ???
37. Homework
Journal – method to reflect on your study with the intention
of developing your understanding, knowledge and behavior
38. Data in the Scholarly
Communications Lifecycle
39. What is scholarly communication?
Scholarly communication is the process of academics, scholars
and researchers sharing and publishing their research findings
so that they are available to the wider academic community
and beyond.
Source: University of Cambridge
40. Stakeholders, roles & responsibilities
Who are stakeholders in scholarly communications?
What roles and responsibilities does each of the stakeholders
play?
Why is research infrastructure key to scholarly
communications?
41. Changes in scholarly communication
Traditionally scholarly communication has occurred in the
formal literature - in journal articles, conference proceedings,
book chapters and books. However, the landscape is changing
dramatically.
Source: University of Cambridge
43. But there is a problem…
Scholarly communications in crisis
• Cost of scholarly publications outpacing library purchasing budgets
(“the serials crisis”)
• Increasing restrictions on the use and reuse of journal articles
• “Hidden costs” and the oligopoly
• Reduced access to literature
• …and the data? Increasingly requested/mandated by journals, but
does that mean greater access, quality, etc.?
The future: open access and innovation?
44. Data – no longer wasted
Recycled art by Jane Perkins
47. Research lifecycle – data infused
Image source: Bournemouth University
http://blogs.bournemouth.ac.uk/research/tag/rkeo/
Find data
Plan to manage data
Collect, store, analyse data
Publish data
Cite data
48. Data in scholarly communications
http://library.vanderbilt.edu/scholarly/data.php
49. Who knows what this is?
…and why it’s the subject of important research?
51. What is Research Data?
Research data means: data in the form of facts, observations, images, computer
program results, recordings, measurements or experiences on which an argument,
theory, test or hypothesis, or another research output is based. Data may be
numerical, descriptive, visual or tactile. It may be raw, cleaned or processed, and
may be held in any format or media – Australian National Data Service (website)
But this is only one definition
of many….
Photo by rawpixel on Unsplash
52. AGU FAIR def. of research data
By ‘research data’ we mean “the recorded factual material commonly accepted in
the scientific community as necessary to validate research findings, but not any of
the following: preliminary analyses, drafts of scientific papers, plans for future
research, peer reviews, or communications with colleagues. This “recorded”
material excludes physical objects (e.g., laboratory samples). Research data also do
not include: (i) Trade secrets, commercial information, materials necessary to be
held confidential by a researcher until they are published, or similar information
which is protected under law; and (ii) Personnel and medical information and
similar information the disclosure of which would constitute a clearly unwarranted
invasion of personal privacy, such as information that could be used to identify a
particular person in a research study.”, Uniform Guidance A-81, section 200.315,
effective December 26, 2013.
Consolidated Data Guideline Proposal for Journals in the Earth and Space Sciences
AGU FAIR Data: Targeted Adoption Group (TAG) B - Publishers in Earth and Space Sciences Team
53. What is Research Data?
Any definition of research data is likely to depend on the context in which the
question is asked.
http://www.ands.org.au/guides/what-is-research-data
Photo by h heyerlein on Unsplash
54. Your experience
• Have you ever used someone else’s data?
• Have you ever shared your data with someone else?
• Where (if anywhere) have you published your data?
• What do you consider to be the biggest challenge in managing
your data?
55. What’s Research Data Management?
Research Data Management covers the planning, collecting, organising, managing,
storage, security, backing up, preserving, and sharing your data. It ensures that
research data are managed according to legal, statutory, ethical and funding body
requirements. Source: UQ LibGuide
Any research will require some level of data management.
Photo by imgix on Unsplash
56. Why should you care about RDM?
Good data management can:
• Increase the efficiency of your research
• Help guarantee the quality and
authenticity of your data
• Enable the exposure of your research
outcomes through collaboration and
dissemination
• Provide for the reproducibility of
experimental and computational
outcomes
• Facilitate the validation and verification
of results.
Photo by Jaron Nix on Unsplash
57. AGU FAIR data
Today, a research publication is much more than a manuscript on a web site or in print. All
scholarly publications represent a network of interconnected resources and information that
are essential to the integrity, reusability, and value of that output for both scientific and
societal uses. Often, the data, software, experimental protocols and physical samples
connected to a publication provide additional and even greater value in their own right.
In the Earth, space, and environmental sciences, much data represent recordings of events or
the state of the Earth or solar system in time and space that can never be repeated.
Increasingly, these data, models, software, and samples provide essential societal, economic,
and research benefits. Given these connections, we recognize that ensuring the quality,
value, and integrity of the data and other resources connected to scholarly publications are
essential.
Leading principles and practices have been developed over the past few years to meet these
goals. Foremost among these are the FAIR Data Principles….
Commitment to Enabling FAIR Data in the Earth, Space, and Environmental Sciences
58. More publishers require data
“A condition of publication in a
Nature journal is that authors
are required to make materials,
data, code, and associated
protocols promptly available to
readers without undue
qualifications”.
59. More funders require data
Source: Digital Curation Centre http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
60. More government policies on data
https://obamawhitehouse.archives.gov/blog/2013/05/09/landmark-steps-liberate-open-data
Data.gov
61. More institutional policies on data
The University of Sydney RDM Policy -
http://sydney.edu.au/policies/showdoc.aspx?recnum=PDOC2013/337&RendNum=0
- and RDM Procedures -
http://sydney.edu.au/policies/showdoc.aspx?recnum=PDOC2014/366
62. More researchers care about data sharing
Figshare open data survey of researchers 2017:
• 82% aware of open data sets
• 80% willing to reuse open data sets in own research
• 60% routinely share their data (frequently or sometimes)
• 21% have never made a data set openly available
• 74% are now curating their data for sharing
• 77% value a data citation the same as an article
Digital Science (2017): The State of Open Data 2017 Report - Infographic.
figshare. https://doi.org/10.6084/m9.figshare.5519155.v1 pp. 7-11
63. More researchers are sharing their data
More than two thirds of Wiley
researchers reported they are
now sharing their data.
Though this varies
geographically and across
research disciplines we are
seeing that more researchers
are sharing their data and
taking efforts to make it
reproducible.
Wiley Global Data Sharing
Infographic June 2017.
https://authorservices.wiley.com/author-resources/Journal-Authors/licensing-open-access/open-access/data-sharing.html
65. Top 3 challenges in data management
Exercise
Your task:
1. Work in pairs to come up with your top 3 challenges in
managing data (3 mins)
2. Pair with another pair to agree on your top 3 (3 mins)
3. Pair with another quad to agree on your top 3 (3 mins)
4. Present list to class (10 mins)
67. With the exception of third party images or where otherwise indicated, this work is licensed under the Creative
Commons 4.0 International Attribution Licence.
ANDS, Nectar and RDS are supported by the Australian Government through the National Collaborative Research
Infrastructure Strategy Program (NCRIS).
Natasha Simons
Associate Director, Skilled Workforce | Australian Research Data Commons
Industry Fellow | The University of Queensland
T: +61 7 3346 9991 | E: natasha.simons@ands.org.au | W: ands.org.au
ORCID: https://orcid.org/0000-0003-0635-1998 Tw: @n_simons
Editor's Notes
Study by two librarians at the University of Utrecht
Found more than 500 innovations
Expressed as 101 innovations and characterised
Most initiatives led by academics but some publishers e.g. PLOS
Powerful message – don’t just have to sit back until publishers innovate
Figshare as an example of this
Moving away from 20th century tools e.g. PDF
Scholarly communication is frequently defined or depicted as a lifecycle documenting the steps involved in the creation, publication, dissemination and discovery of a piece of scholarly research.