Digital Repositories: Essential Information for Academic Librarians

  2. Outline • Terminology • Institutional Repositories • IRs in Colorado • IR software • Standard identifiers for digital objects in repositories • Digital preservation for IRs • Disciplinary repositories • Data repositories • OAI-PMH • The future
  3. Terminology • Institutional repository (IR) • Disciplinary repository (Subject repository) • Green open-access • Post-print • Author's accepted manuscript (AAM) • SPARC author addendum • Embargo period • Pre-print server • Sherpa Romeo • Dark archive
  4. Institutional Repositories [1] Directory:
  5. Institutional Repositories [2]: Local instances • Colorado / Wyoming Institutional Repositories (selected) • University of Colorado Boulder, University of Colorado Colorado Springs, Anschutz Medical Campus, Colorado School of Mines, Colorado Mesa University, and Colorado State University are still using Digital Collections of Colorado • Wyoming Scholars Repository (Digital Commons) • University of Northern Colorado, University of Denver, Colorado College, and others use the Colorado Alliance's repository service, which is an Islandora implementation • Fort Lewis College has Fort Works, an EPrints implementation
  6. Institutional Repositories [3]: Institutional Repository Software / Hosting • Digital Commons • DSpace • EPrints • Fedora • Islandora • Invenio / TIND • Greenstone • SobekCM
  7. Institutional Repositories [4]: Digital Preservation "The Academic Preservation Trust (APTrust) is committed to the creation and management of a sustainable environment for digital preservation. APTrust’s aggregated repository will solve one of the greatest challenges facing research libraries and their parent institutions – preventing the permanent loss of scholarship and cultural records being produced today." "The Digital Preservation Network (DPN) was formed to ensure that the complete scholarly record is preserved for future generations. DPN uses a federated approach to preservation. The higher education community has created many digital repositories to provide long-term preservation and access. By replicating multiple dark copies of these collections in diverse nodes, DPN protects against the risk of catastrophic loss due to technology, organizational or natural disasters."
  8. figshare
  9. DataCite
  10. Disciplinary Repositories • Directory of disciplinary repositories (Simmons College) = • Some major disciplinary repositories: • SSRN (Social Science Research Network) • RePEc (Research Papers in Economics) • E-LIS (Eprints in Library and Information Science) • PMC (PubMed Central) • AgEcon Search (University of Minnesota)
  11. Disciplinary repository screenshots
  12. Focus: PubMed Central (PMC) “PMC (PubMed Central) launched in 2000 as a free archive for full-text biomedical and life sciences journal articles. PMC serves as a digital counterpart to the NLM extensive print journal collection; it is a repository for journal literature deposited by participating publishers, as well as for author manuscripts that have been submitted in compliance with the NIH Public Access Policy and similar policies of other research funding agencies. Some PMC journals are also MEDLINE journals. For publishers, there are a number of ways to participate and deposit their content in this archive, explained on the NLM Web pages Add a Journal to PMC and PMC Policies. Journals must be in scope according to the NLM Collection Development Manual. Although free access is a requirement for PMC deposit, publishers and individual authors may continue to hold copyright on the material in PMC and publishers can delay the release of their material in PMC for a short period after publication. There are reciprocal links between the full text in PMC and corresponding citations in PubMed. PubMed citations are created for content not already in the MEDLINE database. Some PMC content, such as book reviews, is not cited in PubMed.”
  13. What is the Difference between PubMed Central and PubMed?
  14. Data Repositories Directories of Data Repositories • Data repositories (Simmons College, OA Directory) • Registry of Research Data Repositories • Databib "Databib is a searchable catalog registry / directory/ bibliography of research data repositories."
  15. Focus: Dryad Digital Repository • Works with journals • Requires use of the CC 0 license • Located at • Costs $90 “ is a curated general-purpose repository that makes the data underlying scientific publications discoverable, freely reusable, and citable. Dryad has integrated data submission for a growing list of journals; submission of data from other publications is also welcome” --
  16. Focus: GitHub • A collection of software repositories • Used for sharing code, programs, software • Has paid and free options; free option used for open source “GitHub is the largest code host on the planet with over 19.4 million repositories. Large or small, every repository comes with the same powerful tools. These tools are open to the community for public projects and secure for private projects.”
  17. DMP = Data management plan From the Wikipedia article, "Data management plan" • Description of the data • How / When / Where data will be acquired • How the data will be processed • What file formats the data will be in, naming conventions • Version control • Metadata • Policies for access, sharing, and re-use • Long-term storage and data management • Budget
  18. Review of OAI-PMH • Open Archives Initiative Protocol for Metadata Harvesting • Provides a way to create a "union catalog" of resources in digital repositories • The metadata is indexed in WorldCat (including WorldCat Local, WCL), updated quarterly
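To make the harvesting idea concrete, here is a minimal Python sketch of what an OAI-PMH harvester does: it builds a ListRecords request and pulls Dublin Core titles out of the XML that comes back. The base URL, the sample response, and the function names are hypothetical and exist only for illustration; the verb, the oai_dc metadata prefix, the namespaces, and the status="deleted" header attribute come from the OAI-PMH 2.0 specification. A real harvester would also need to fetch the URL and follow resumption tokens, which this sketch leaves out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return (identifier, title) pairs, skipping records flagged as deleted."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(OAI + "record"):
        header = record.find(OAI + "header")
        if header is None or header.get("status") == "deleted":
            continue
        identifier = header.findtext(OAI + "identifier")
        for title in record.iter(DC + "title"):
            results.append((identifier, title.text))
    return results


# Hypothetical response, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:repository.example.org:2</identifier>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

An aggregator repeats this against many repositories and merges the results, which is how the "union catalog" comes about. Only metadata is collected; the objects themselves stay in each repository.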
  19. Conclusion • Institutional repositories convert libraries into publishers, and this has many long-term legal, ethical, and financial implications. • Repositories exist in a sort of digital version of the Wild West. • Repositories that have strong digital preservation practices and that use and maintain standard identifiers for the digital objects they publish will stand out from others. • Most repositories will contain material of secondary or local-only importance, but a few “gems” will exist here and there. • Libraries are competing with scholarly publishers (Odlyzko, 2013).
  20. Coda “Investigate the possibility of constructing the world’s first all-scholarship repository (ASR). [...] Conversations are currently ongoing on this matter. The Department of Energy has authorized the Los Alamos National Laboratory (LANL) to build the prototype ASR.” SOURCE

Editor's Notes

  1. Title is ambiguous.
  2. An institutional repository is an OA repository that is sponsored by an institution, usually a university or college. Most of its content is open access, but some may be embargoed and some may be dark archived. Green open access refers to author self-archiving of a post-print of a published work (published in a toll-access journal) in an open-access repository. The repository can be institutional or disciplinary. The advantage to the author is that he or she gets to publish in a top toll-access journal and at the same time the content is freely available through the repository. There are many disadvantages to green OA. Because you sign over copyright to the publisher, you need their permission to post the content in the repository. If they grant this permission, they only grant it for the Word version, which is not the version that they copyedit and not the version for which they enhance the images, tables, etc. Many also impose embargoes of six months, one year, or two years before the author can post the document. Some publishers only allow green OA for institutional repositories; that is, disciplinary repositories are excluded. A post-print is the author’s last version of the paper that he or she sends to the journal. It is usually a Word document and incorporates all the changes suggested by peer reviewers. The term author’s accepted manuscript (AAM) is synonymous. On the SPARC author addendum, Wikipedia says: “The form provides a templated request by authors to add to the copyright transfer agreement which the publisher sends to the author upon acceptance of their work for publication. Authors which use the form typically retain the rights to use their own work without restriction, receive attribution, and to self-archive. The form gives the publisher the right to obtain a non-exclusive right to distribute a work for profit and to receive attribution as the journal of first publication.” arXiv is a preprint server.
This tradition started in the particle physics field. In the pre-internet days, because of the long lag time between submitting a manuscript and its eventual publication in a journal, physicists would create mimeographed copies of their manuscripts, or pre-prints, and share them with colleagues via the mail or at conferences. Eventually these became photocopies, and eventually they became available through telnet and gopher. I can remember helping set up a database at Harvard in 1991 or 1992 that was called the Physics Preprint database, and it was metadata for all the preprints. Then the internet came and changed everything. Today the physics preprint database is known as arXiv, and it’s still called a pre-print server, but many people are submitting papers to it and then never submitting them to any journal. So it’s morphed into a type of publisher. Similar initiatives are being started in other fields. The problem is that much of the content is not peer-reviewed. We know that the major publishers make articles available soon after they are accepted, generally using names like “articles in press,” and this is an attempt to compete with pre-print servers. Sherpa Romeo is a free database that collects green OA policy statements for journals. Authors can use it to determine what they can do with their post-prints. A dark archive is one that is generally not accessible at all, and may include embargoed material or material being stored for cooperative preservation.
  3. First we’ll talk about institutional repositories. They are often referred to as IRs. OpenDOAR is a directory of them.
  4. To give some local context, I gathered information about IRs in this region.
  5. Here are some of the principal IR companies. Explain hosted versus software. Some of these are open source. Explain TIND.
  6. There are two cooperatives for digital preservation for institutional repositories. Basically they work by having several other libraries host all your content in a dark archive on their servers, and you do the same in return. Academic Preservation Trust is based at UVA. Its members include: Columbia University, Indiana University, Johns Hopkins University, North Carolina State University, Penn State University, Syracuse University, University of Chicago, University of Cincinnati, University of Connecticut, University of Maryland, University of Miami, University of Michigan, University of North Carolina, University of Notre Dame, University of Virginia, and Virginia Tech. The Digital Preservation Network does not indicate where it is based, but it gives a 434 area code for its telephone number, which is Lynchburg, Virginia, so it looks like Virginia is the hotspot for digital preservation. It has these members: Arizona State University, Brigham Young University, Brown University, California Institute of Technology, Columbia University, Cornell University, Dartmouth College, Duke University, Emory University, Harvard University, Indiana University, Iowa State University, Johns Hopkins University, Kansas State University, Massachusetts Institute of Technology, Michigan State University, New York University, Northwestern University, North Carolina State University, Ohio State University, Pennsylvania State University, Princeton University, Purdue University, Rutgers University, Stanford University, Syracuse University, Texas A&M, Texas Tech University, Tufts University, Tulane University, University of Alabama, University of Arizona, University of Buffalo, University of California San Diego, University of Chicago, University of Florida, University of Illinois at Chicago, University of Illinois at Urbana-Champaign, University of Iowa, University of Kansas, University of Kentucky, University of Maryland, University of Miami, University of Michigan, University of Minnesota, University of Nebraska, University of New Mexico, University of North Carolina, University of Notre Dame, University of Tennessee, University of Texas, University of Utah, University of Virginia, University of Washington, University of Wisconsin, Utah State University, Vanderbilt University, Virginia Polytechnic Institute and State University, Yale University, Texas Digital Library, California Digital Library, John D. Evans Foundation, and American Council on Education.
  7. Figshare is unique because it markets to individual scholars. It also markets to institutions. It’s owned by Digital Science, which is owned by Macmillan Publishers Limited.
  8. There is an organization called DataCite that focuses on citing digital objects. They have something called the “Metadata Store” where you can buy DOIs and assign them to the digital objects in your repository. Increasingly, the quality of a repository will be judged by whether it provides DOIs for its objects and digital preservation for its content. The sponsors of repositories essentially become publishers, and publishers have responsibilities. Publishing is much more than just mounting PDFs or images on the internet; there are many activities that must be carried out to support publishing, if you want to do it right.
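As a small illustration of what a DOI looks like once it has been assigned to a repository object, the Python sketch below splits a DOI into its registrant prefix and item suffix and builds the resolver link that readers would cite. The sample DOI and the function names are hypothetical, and the regular expression is a common heuristic rather than the official DOI syntax definition. The only facts assumed are that DOIs begin with "10.", that a slash separates prefix from suffix, and that https://doi.org/ resolves them.

```python
import re
from urllib.parse import quote

# Heuristic shape check: "10." plus a registrant code, a slash, then a suffix.
# This is an assumption for illustration, not the formal DOI grammar.
DOI_PATTERN = re.compile(r"^(10\.\d{4,9})/(\S+)$")


def split_doi(doi):
    """Split a DOI into (prefix, suffix); raise ValueError if it looks wrong."""
    match = DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError("does not look like a DOI: %r" % doi)
    return match.group(1), match.group(2)


def doi_url(doi):
    """Build the citable resolver link for a DOI."""
    prefix, suffix = split_doi(doi)
    # Percent-encode unusual characters in the suffix but keep slashes.
    return "https://doi.org/" + prefix + "/" + quote(suffix, safe="/")


if __name__ == "__main__":
    sample = "10.1234/repo.item-42"  # hypothetical DOI, not a real object
    print(split_doi(sample))
    print(doi_url(sample))
```

The prefix identifies the organization that registered the DOI, so a repository that mints its own DOIs is also taking on the job of keeping those links working when objects move.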
  9. Now let’s talk about disciplinary repositories. There is one directory of them that I know of; it covers most fields and is hosted on the Simmons College OA wiki. Some of the major subject repositories include these.
  10. Here are screenshots of SSRN and RePEc, which I think is pronounced REE Peck. I don’t completely understand SSRN. It is starting to act more like a business than a repository. Indeed it’s owned by a company called Social Science Electronic Publishing, Inc. It may also do some publishing. It also hosts preprints. It uses the number of downloads as a metric to measure individual researchers. RePEc is sponsored by the Research Division of the Federal Reserve Bank of St. Louis.
  11. The basic difference is that PubMed is a database of metadata, and PMC is a database of full-text scholarly articles. The two databases are often confused. PMC has an HTML “reader” and a classic reader, and in many cases the publisher’s PDFs are also available. Both PubMed and PMC are made available by the National Center for Biotechnology Information (NCBI), which is part of the U.S. National Library of Medicine. Many funding agencies in the biomedical sciences require that research completed using their funding be made freely available, and PMC is one place where this is often done.
  12. Data repositories publish much more than just numerical or statistical data. They also publish genomic data, structured textual data, image data, and more.
  13. Mention the CC0 license. Started in North Carolina with grant funding. One of the ideas is that people can use the published data to generate new research. They can also redo the experiments and see if they get the same results.
  14. It started at the University of Michigan. It doesn’t work well for items that are removed. ResourceSync is a prototype replacement. It aims to synchronize metadata with the objects they describe.
  15. 4th bullet point: I’ve heard the term “publications ghetto” used to refer to institutional repositories, specifically referring to green open access articles, which are Word versions of documents or a PDF derivative of such.
  16. This is an initiative of the National Science Communication Institute. It would be centralized and would make things like OAIster obsolete. In other words, it would centralize all IR content rather than just the metadata.