Open data and data sharing present both opportunities and challenges for publishers, according to this perspective from an open access publisher. Publishers are well positioned to help maximize the impact of research by collecting, organizing and distributing knowledge, including data. However, issues remain around data citation, data storage, and linking data to publications. Solutions discussed include innovative article types for data publication, data awards to incentivize sharing, and guidance for sharing human subjects data while protecting privacy. Overall, publishers aim to serve science better by promoting transparency and addressing open data issues.
Iain Hrynaszkiewicz - Research Integrity: Integrity of the published record
1. Open data and the integrity of the published record – an open access publisher’s perspective. JISC research integrity conference, 13th September 2011. Iain Hrynaszkiewicz, Journal Publisher, BioMed Central. iain.hrynaszkiewicz@biomedcentral.com
6. Solution #1: Innovative journals and article types enabling data publication
8. Solution #2: Open Data Award. “We ... recognize researchers who have ... demonstrated leadership in the sharing, standardization, publication, or re-use of biomedical research data.” http://www.biomedcentral.com/researchawards/opendata
10. Solution #1: Integrated (cloud-based) data repository and journal. http://www.gigasciencejournal.com “GigaScience aims to revolutionize data dissemination, organization, understanding, and use. An online open-access open-data journal, we publish 'big-data' studies from the entire spectrum of life and biomedical sciences. To achieve our goals, the journal has a novel publication format: one that links standard manuscript publication with an extensive database that hosts all associated data and provides data analysis tools and cloud-computing resources.”
11. Solution #2: Comprehensive author information on available data repositories http://datacite.org/repolist http://www.biomedcentral.com/info/about/supportingdata
14. Solution #2: Submission integration with the Dryad repository
15. Problem: Ambiguous and suboptimal licensing that restricts data (re)use. “The data should be released in standardized formats without intellectual property constraints.” Conway PH, VanLare JM: Improving Access to Health Care Data: The Open Government Strategy. JAMA 2010; 304(9):1007-1008. http://pantonprinciples.org/ http://www.isitopendata.org/ “[P]eople mis-use copyright licenses on uncopyrightable materials and data sets: the confusion of the legal right of attribution in copyright with the academic and professional norm of citation of one's efforts.” John Wilbanks, VP, Science, Creative Commons, http://bit.ly/djl5Fa August 11, 2010
22. Solution #3: Incentivize, promote and share best practice and standards http://www.biomedcentral.com/bmcresnotes/series/datasharing http://biosharing.org/standards_view
Editor's Notes
The Creative Commons license: Authors/copyright owners irrevocably grant to anyone the right to use, reproduce or disseminate the research article, in its entirety or in part, in perpetuity, provided that: no substantive errors are introduced; authorship attribution is correct; citation details are provided; bibliographic details are unchanged.
The electronic version of the article is authoritative. “Additional files”, not “Supplementary material”. Additional files can be central to the reported findings of the paper.
Efficient online publication processes can facilitate dataset publication. Only a fraction of experimental data sets make it into the literature. Many more datasets have the potential to be useful, but do not warrant a traditional publication. For certain standard types of data, appropriate databases exist (e.g. nucleotide sequences). But what if such databases do not exist, or if further description of the experimental context is required?
Publishers are not best placed to run repositories for long-term preservation of large datasets. Mirrors of publisher content are not able to accept arbitrary amounts of additional data. Long-term preservation presents a challenge with respect to continuity. Redundant international mirrors with independent governance and funding could help to reduce risk. BGI is capable of sequencing ~2000 genomes per day (6 Tb/day ≈ 2 Pb/year).
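The daily-to-yearly figure quoted for BGI can be checked with a short calculation. This is a minimal sketch, assuming a 365-day year, decimal (factor of 1000) prefixes, and the same base unit on both sides of the conversion; the function name is illustrative and does not come from the slides.

```python
DAYS_PER_YEAR = 365    # assumption: non-leap year, sequencing every day
TERA_PER_PETA = 1000   # assumption: decimal (SI) prefixes


def yearly_volume_peta(daily_volume_tera: float) -> float:
    """Convert a daily data volume in tera-units to a yearly volume in peta-units."""
    return daily_volume_tera * DAYS_PER_YEAR / TERA_PER_PETA


# 6 tera-units per day, as quoted in the note above
print(yearly_volume_peta(6))  # 2.19, i.e. roughly 2 peta-units per year
```

Under these assumptions the quoted "2 Pb/year" is a rounded value of about 2.19.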
Bioinformaticists have been rapid adopters of cloud computing (as they were of the web). Cloud computing can reduce the barriers to reproducibility. Publications can include or refer to necessary datasets and the computational tools that can be fired up to carry out/reproduce the analysis. Large datasets can live in the cloud – take analysis to the data, rather than vice versa. Deposited data sets are assigned DOIs, as are data papers.
Accession number system in genomics, for example. Researchers sometimes deposit data as part of institutional or funder requirements, or for personal reasons.
Dryad is a mechanism for enforcement of the joint data archiving policy – a community requirement in ecology/evolutionary biology. As part of a publisher’s service provision to these scientific communities we are implementing integration that enables accepted articles to be associated with data sets in Dryad. Dryad meets criteria for permanent linking to articles by assigning DOIs to data sets.
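To illustrate what permanent linking via a DOI looks like in practice, the sketch below turns a DOI string into a resolvable link through the public doi.org resolver. It is an illustration only, not a description of the Dryad or BioMed Central integration; the function name and the example identifier `10.1234/example.dataset` are hypothetical.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi: str) -> str:
    """Build a persistent link for a DOI, accepting a few common input forms."""
    doi = doi.strip()
    # Strip common prefixes so that only the bare DOI remains.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # A DOI has the shape "10.<registrant>/<suffix>".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical identifier, for illustration only
print(doi_to_url("doi:10.1234/example.dataset"))
```

Because the link goes through the resolver rather than pointing at a specific host, an article can keep citing the same address even if the data set moves.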
Preserving and re-using data maximises its value, but restrictive licensing, intellectual property constraints, etc. are barriers to effective re-use and sharing.