This document summarizes Todd Carpenter's presentation on designing a roadmap for a new bibliographic information ecosystem. It discusses how MARC has been the lingua franca of bibliographic data for decades and was designed to be compact because computer storage was limited and expensive. MARC dates from the mid-1960s, the same era as FORTRAN, COBOL, and ASCII. There is also a growing movement toward linked bibliographic data. The presentation notes the challenges in moving away from MARC, including the lack of demonstrable benefits from a new system. NISO's Bibliographic Roadmap Initiative aims to identify gaps, engage stakeholders, and provide an open process to help ensure the right approaches are taken to improve services and facilitate adoption.
Future of Bibliographic Systems: Designing a Roadmap to a new Bibliographic Information Ecosystem
1. Whither Bibliographic Data?
Designing a roadmap
to a new
bibliographic information ecosystem
Todd A. Carpenter, Executive Director, NISO
ALA Annual Meeting – AVIAC Session
July 1, 2013
2. Our Dear Old Friend, MARC
01386cam 2200301 a
450000100080000000500170000800800410002503500210006690600450008795500270013201000170015902000150
017604000180019104300120020905000220022108200210024311000550026424503080031926000670062730000250
069444000540071950000290077365000600080265000580086265000730092071000430099399100480103638568531
9951219150001.4881118s1989 nju 000 0 eng 9(DLC) 88029610 a7bcbccorignewd1eocipf19gy-gencatlg
aCIP ver. pv04 12-06-95 a 88029610 a0887389538 aDLCcDLCdDLC an-us---00aZ674.8b.N44
198900a021.6/5/09732192 aNational Information Standards Organization (U.S.)10aInformation retrieval service and
protocol :bAmerican national standard for information retrieval service definition and protocol specification for library
applications /capproved January 15, 1988 by American National Standards Institute ; developed by the National
Information Standards Organization. aNew Brunswick, N.J., U.S.A. :bTransaction Publishers,cc1989. axii, 50 p. ;c26
cm. 0aNational information standards series,x1041-5653 a"ANSI/NISO Z39.50-1988." 0aLibrary information
networksxStandardszUnited States. 0aComputer network protocolsxStandardszUnited States. 0aInformation storage
and retrieval systemsxStandardszUnited States.2 aAmerican National Standards Institute. bc-GenCollhZ674.8i.N44
1989tCopy 1wBOOKS
3. Our Dear Old Friend, MARC
(formatted for your viewing pleasure)
4. MARC Components
Encoding Structure
Z39.2
ISO 2709:2008 -- Format for information exchange
Format structure
Anglo-American Cataloging Rules (2nd Edition) AACR2
Resource Description & Access
Exchange System
Z39.50
SRU/SRW
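The encoding structure named on this slide (Z39.2 / ISO 2709) is visible in the raw record on slide 2: a 24-character leader, followed by a directory of 12-character entries (3-character tag, 4-digit field length, 5-digit starting position). The following is a minimal sketch of reading those two parts, not a full MARC parser. The function and variable names are illustrative, and the leader string assumes the blank positions that the transcript collapsed have been restored.

```python
from typing import List, NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str      # 3-character field tag, e.g. "245"
    length: int   # field length in bytes (4 digits in the directory)
    start: int    # field offset from the base address (5 digits)


def parse_leader(leader: str) -> dict:
    """Read the record length and base address from a 24-character leader."""
    if len(leader) != 24:
        raise ValueError("an ISO 2709 leader is exactly 24 characters")
    return {
        "record_length": int(leader[0:5]),   # positions 00-04
        "base_address": int(leader[12:17]),  # positions 12-16
    }


def parse_directory(directory: str) -> List[DirectoryEntry]:
    """Split a directory string into 12-character entries."""
    usable = len(directory) - len(directory) % 12  # ignore a trailing partial entry
    entries = []
    for i in range(0, usable, 12):
        chunk = directory[i:i + 12]
        entries.append(DirectoryEntry(chunk[0:3], int(chunk[3:7]), int(chunk[7:12])))
    return entries


# Leader from slide 2, with collapsed blanks restored (an assumption).
LEADER = "01386cam  2200301 a 4500"
# First three directory entries, copied from the same record.
DIRECTORY_HEAD = "001000800000005001700008008004100025"

leader_info = parse_leader(LEADER)
entries = parse_directory(DIRECTORY_HEAD)

# The directory sits between the leader (24 chars) and a 1-char terminator.
entry_count = (leader_info["base_address"] - 24 - 1) // 12

print(leader_info)
for entry in entries:
    print(entry)
print("directory entries in the full record:", entry_count)
```

Every field is located by offset arithmetic rather than by markup, which is why the record is compact and also why it is unreadable without the directory.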
14. If you were building a network today
would you string copper everywhere?
15. If you were building a metadata ecosystem,
would you start here?
01386cam 2200301 a
450000100080000000500170000800800410002503500210006690600450008795500270013201000170015902000150
017604000180019104300120020905000220022108200210024311000550026424503080031926000670062730000250
069444000540071950000290077365000600080265000580086265000730092071000430099399100480103638568531
9951219150001.4881118s1989 nju 000 0 eng 9(DLC) 88029610 a7bcbccorignewd1eocipf19gy-gencatlg
aCIP ver. pv04 12-06-95 a 88029610 a0887389538 aDLCcDLCdDLC an-us---00aZ674.8b.N44
198900a021.6/5/09732192 aNational Information Standards Organization (U.S.)10aInformation retrieval service and
protocol :bAmerican national standard for information retrieval service definition and protocol specification for library
applications /capproved January 15, 1988 by American National Standards Institute ; developed by the National
Information Standards Organization. aNew Brunswick, N.J., U.S.A. :bTransaction Publishers,cc1989. axii, 50 p. ;c26
cm. 0aNational information standards series,x1041-5653 a"ANSI/NISO Z39.50-1988." 0aLibrary information
networksxStandardszUnited States. 0aComputer network protocolsxStandardszUnited States. 0aInformation storage
and retrieval systemsxStandardszUnited States.2 aAmerican National Standards Institute. bc-GenCollhZ674.8i.N44
1989tCopy 1wBOOKS
18. MARC is useful.
It is efficient.
It is our lingua franca.
There are many reasons to
retain it.
But wait.....
19.
20.
21. Movement toward linked data
datahub.io - 5107 data stores
id.loc.gov
British National Bibliography (BNB)
VIAF
OCLC WorldCat Linked Data Store
Deutsche Nationalbibliografie (DNB) (Germany)
datos.bne.es (Spain)
W3C Library Linked Data Incubator Group
Many, many more...
23. Organizations will not move away from a legacy system
unless the new system:
a) Is demonstrably cheaper
b) Is demonstrably more effective in producing results
(discovery, use, etc.)
c) Will make the organization demonstrably more efficient
(staff, management, sales, etc.)
OR
d) The legacy system becomes entirely
non-interoperable with other, more important systems
OR
e) The legacy system breaks and cannot be repaired
24. Can we say a new
metadata management system
based on linked data
will be/do one of those things?
25.
26. The point at which most standards
fail is not prior to consensus.
It is in…
Adoption
(or rather, in its
absence)
27. “You would be a fool to
design a system based
on an interchange
protocol.”
- Mark Bide, EDItEUR
34. What have we done?
In-person meeting on April 15-16
in Baltimore
An unconference on bibliographic data exchange
45 in-person
more than 40 more online
more than 200 subsequent viewers
36. The world
makes way for
the man who
knows where
he is going.
- Ralph Waldo Emerson
37. “If you don't know where
you're going, you might not
get there.”
-Yogi Berra
38. Thank you!
Todd Carpenter, Executive Director
tcarpenter@niso.org
National Information Standards Organization (NISO)
3600 Clipper Mill Road, Suite 302
Baltimore, MD 21211 USA
+1 (301) 654-2512
www.niso.org
Editor's Notes
For those of you used to speaking in angle brackets, you'll notice that there isn't one. Frightening?
Z39.50
Data element set (MARC fields and tags) identifies and characterizes the specific pieces of data within a record to support its use and manipulation. Data is primarily defined outside of the format, through content standards or general rule sets (e.g., AACR2, RDA, and others).
MARC was created in the mid-1960s by Henriette Avram at the Library of Congress to create these things. Anyone remember these things?
Why was MARC so efficient? It had to be. 1 KB * 290,000,000 = 290,000 MB. Assuming there are something like 290 million MARC records, or about 290 GB worth of raw MARC data in the world, that would equate to some $766 billion of storage space in 1965. Today, I could go out and buy more hard disk storage than would be necessary to store all of the Library of Congress's collections (not just its catalogue, but everything it holds) for about $2,500.
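The size estimate in this note can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: it uses the WorldCat count of 289,963,654 bibliographic records quoted elsewhere in these notes, the note's figure of about 1 KB per record, and decimal units (1 KB = 1,000 bytes). The names are illustrative, and it makes no attempt to reproduce the 1965 cost figure.

```python
# Record count quoted in the WorldCat statistics in these notes.
RECORD_COUNT = 289_963_654
# Assumption taken from the note: roughly 1 KB per record, decimal units.
BYTES_PER_RECORD = 1_000


def storage_estimate(records: int, bytes_per_record: int = BYTES_PER_RECORD) -> dict:
    """Return the raw data volume in megabytes and gigabytes."""
    total_bytes = records * bytes_per_record
    return {"MB": total_bytes / 1e6, "GB": total_bytes / 1e9}


estimate = storage_estimate(RECORD_COUNT)
print(f"{estimate['MB']:,.0f} MB, about {estimate['GB']:,.0f} GB")
```

With these inputs the total comes to roughly 290,000 MB, or about 290 GB.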
FORTRAN: released in 1957 by IBM. COBOL: drafted by (among others) Grace Hopper (pictured) in 1959. ASCII: first released in 1963. GPS: public release in 1967, but used by the Navy in 1963. First Internet node at UCLA: 1969. Hypertext: 1968, by Douglas Engelbart.
LinkedIn = W2K bug = It was a feature for long-term job security video. By comparison (groups unless otherwise noted): FORTRAN programmers, 2,095; MARC21 (skill), 2,100; XML professionals, 4,140; C++ developers, 14,600; iOS developers, 37,000; Java developers, 156,000.
Metadata - the legacy infrastructure problem. Far too much of our infrastructure was implemented as the systems were first developed. If you were connecting up a world of telephones today, would you use wires? Knowing what we know today, would we build a metadata ecosystem in the same way? The problem is we have more than a billion MARC records. Nearly every library around the world, from the smallest school library to the largest national library and every size, shape, and type of library in between, has a system built upon MARC. Old infrastructure isn't improved - it is maintained. Or it is replaced by something wholly new and many times more efficient. How do you assess the value of the opportunity costs of not doing something? How do you measure the lost sales of undiscovered books? How do you compare that potential value against the real costs of improving your outdated management systems that are "good enough"? How do you measure that potential without investing today in the system of the future? Ebooks provide the community the best opportunity to get around the mistakes of the past. What are the infrastructure needs of an ebook world that make it inherently different from a print world? Unfortunately, too much of our current thinking is tied up in either 1) getting it out the door as quickly as possible (beta-shipping) or 2) replicating our old models and mistakes.
Old infrastructure isn't improved - it is maintained. Or it is replaced by something wholly new and many times more efficient.
There are
WorldCat facts and statistics: 72,000+ libraries from 170 countries; 1.95 billion holdings; 289,963,654 bibliographic records.
Here is just a partial look at how messy that world really is. Although we haven't studied it, I know it is even more complicated when you begin adding data from the other print media, the recording industry, and the television and movie industries. The data environment necessary to describe that information discovery flow is massive, complex, and labyrinthine. I would venture to guess it is also horribly inefficient, fraught with duplication, and to a large extent not interoperable.
In 2009, NISO commissioned a study of the exchange environment of book data. It is, not surprisingly, very, very messy. Since this particular project was focused on the exchange of MARC and ONIX data, other metadata communities are not described here, but they are equally relevant and equally challenged in interoperability terms.
Is the semantic web the way to go? I give it a full-throated “Possibly”.
NISO received a modest amount of funding from the Andrew W. Mellon Foundation in October to launch an initiative to draw together a roadmap to help move us toward an environment that
Ralph Waldo Emerson: “The world makes way for the man who knows where he is going.” Unfortunately, the corollary quote by Yogi Berra is equally true, perhaps more so: “If you don't know where you're going, you might not get there.”