1. Update on Bibliographic Framework
Robin Fay, ATCx3 Consortia, Digital Initiatives
@georgiawebgurl
2. Agenda
The need for Bibframe
FRBR /RDA
Extensible Markup Language (XML)
Semantic Web – metadata is
everywhere
Linked Data
Bibliographic Framework Initiative
(BIBFRAME)
Discussion
3. Getting to know you
Copy cataloging?
Original cataloging?
Catalog content or manage digital libraries, home grown databases
(non-MARC), institutional repositories?
Other technical services work (ordering, maintenance, authority work,
etc.)?
Runs system reports or bibliographic maintenance reports?
Familiar with WEMI?
Familiar with XML or other programming languages?
Knows what a triple (or triplet) is?
Can code MARC in their sleep?
4. Our history (MARC)
MARC is considered a Resource Description Standard
Developed at the Library of Congress during the 1960s by
Henriette Avram
Originally used as a way for LC to disseminate and print cards
more easily & quickly
MARC records largely look like electronic cards and function that
way as well (more difficult for machines to process “strings” of
data)
Designed more for human-readability than machine-readability
ISBD punctuation sometimes used to determine the meaning of a
subfield; indicators often used to provide meaning (e.g., 245 –
Title/statement of responsibility)
5. MARC, Indicators, subfields, ISBD & text
strings – oh my!
245 10 Calm energy : ‡b how people regulate mood with food and
exercise / ‡c Robert E. Thayer.
Labeled parts of the field: MARC tag, 1st indicator, 2nd indicator, delimiter
Tags represent textual
names
They’re divided by
hundreds: e.g., 100, etc.
We haven’t used
everything we could 000-
999
Indicators communicate
information to the system
What do 1st and 2nd
indicator communicate?
Robin Fay, 2010
ISBD
ISBD (International Standard Bibliographic Description):
Standardized punctuation (colons, semicolons, slashes, dashes,
commas, and periods) is used to identify and separate the elements
and areas.
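The anatomy of the 245 field above can be sketched in a few lines of Python. This is only an illustration of how a program might pull apart the display form of a field; the `MarcField` class and `parse_field` function are hypothetical names, not part of any MARC library, and the sketch assumes the first chunk of data is subfield a because its delimiter is not printed.

```python
from dataclasses import dataclass


@dataclass
class MarcField:
    tag: str
    indicator1: str
    indicator2: str
    subfields: list  # list of (code, value) pairs


def parse_field(line: str, delimiter: str = "‡") -> MarcField:
    """Split a display-form MARC field into tag, indicators, and subfields."""
    tag, indicators, data = line.split(" ", 2)
    # Assumption: the first chunk has no printed delimiter and is subfield a.
    first, *rest = data.split(delimiter)
    subfields = [("a", first.strip())]
    for chunk in rest:
        # The character right after the delimiter is the subfield code.
        subfields.append((chunk[0], chunk[1:].strip()))
    return MarcField(tag, indicators[0], indicators[1], subfields)


field = parse_field(
    "245 10 Calm energy : ‡b how people regulate mood with food and "
    "exercise / ‡c Robert E. Thayer."
)
print(field.tag, field.indicator1, field.indicator2)
for code, value in field.subfields:
    print(code, value)
```

Note that the ISBD punctuation (the colon and the slash) stays inside the subfield values: the machine still has to deal with "strings" of data.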
6. Lingering impact of the card catalog
Transition from card catalogs to electronic form of the card catalog
[MARC record] - duplication into an electronic format
“Space” an issue
• Need to save physical space on cards to reduce printing;
catalogers use abbreviations, etc.
• Early electronic records impacted by record and field limits –
remember flat files? Overwrites the whole file with a change!
• Current ILS are relational with field limits disappearing but
the maximum length of a MARC bibliographic record is still 99,999 characters
• Repeatable/split fields if needed, e.g., note fields; reliance on
abbreviations, conserve data
Focus on physical things - barcodes, circulation/patron records, starting
to shift as libraries have collected more digital items
7. Why now-Why RDA? Why FRBR?
The world has changed since we implemented AACR2 – mobile, social,
dynamic, online!
More flexibility in cataloging – more options
Designed with digital items in mind including those in museums and
archives (not just library-centric)
More collaborative cataloging – harvesting from PDFs or other
sources, outsourcing, sharing with each other (Transcription – should
make cataloging easier)
Considers how users use information – ability to bring together
different formats and versions of the same work together more easily
Expands concept of authorship to creator to encompass more roles –
we are all content creators
8. Why now-Why RDA? Why FRBR?
Looking forward to new systems – modern ILS are relational
databases – clusters of tables with fields that can be linked in
different ways
Less reliance on flat file structures – more data in relational
databases with minimal drain on resources
More keyword and fulltext searching, which uses words in
relevance ranking
Breaking apart data into defined concrete bits allows machines to
“think” and build content on the fly; each word in fulltext (minus
stop words – articles, etc.) is typically included with relevance
weighting – the more it appears, the more relevant
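The relevance idea above (drop stop words, then count how often a term appears) can be sketched roughly as follows. This is a minimal illustration, not how any particular ILS ranks results; the stop word list, the function names, and the second sample record are invented for the example.

```python
from collections import Counter

STOP_WORDS = {"a", "an", "and", "of", "the", "with"}  # tiny illustrative list


def term_counts(text: str) -> Counter:
    """Count each word in the text, ignoring punctuation and stop words."""
    words = (w.strip(".,:;/()").lower() for w in text.split())
    return Counter(w for w in words if w and w not in STOP_WORDS)


def rank(query: str, documents: dict) -> list:
    """Return (doc_id, score) pairs, highest score first.

    The score is simply how many times the query terms appear.
    """
    terms = [t.lower() for t in query.split() if t.lower() not in STOP_WORDS]
    scores = {}
    for doc_id, text in documents.items():
        counts = term_counts(text)
        scores[doc_id] = sum(counts[t] for t in terms)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


documents = {
    "rec1": "Calm energy : how people regulate mood with food and exercise",
    "rec2": "Energy policy and energy markets : the economics of energy",  # invented
}
print(rank("energy", documents))
```

Real systems weight terms in more sophisticated ways, but the principle is the same: the more a word appears, the more relevant the record is taken to be.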
9. Where the ILS is going…
More recently ILS software is built on relational
databases
More user functionality - ability to share via social media,
make “book bags” or reading lists
Ability to work with other types of metadata
Continuing to move away from the card catalog format
Focus on authorities and controlled vocabularies; free-
text keyword searching is newer functionality
Moving more towards relational databases (remember
FRBR)
We need to move forward, thus FRBR and RDA.
11. FRBR (Functional Requirements for Bibliographic Records)
The FRBR report itself includes a description of the
conceptual model of the bibliographic universe:
that is, the entities, relationships, and attributes (or as
we’d call them today, the metadata) associated with
each of the entities and relationships, and it proposes
a national level bibliographic record for all of the
various types of materials.
It also reminds us of user tasks associated with the
bibliographic resources described in catalogs,
bibliographies, and other bibliographic tools.
-- Barbara Tillett, 2003
http://www.loc.gov/catdir/cpso/frbreng.pdf
14. RDA
RDA provides more flexibility in describing content,
especially digital content.
RDA emphasizes relationships, transcription, and
more data not less*.
While we typically use it in MARC, RDA can be
expressed in XML or RDF (a semantic web
framework).
(We’ll look at RDA examples in XML in a few)
We try to make MARC
work for us – by creating
machine actionable
fields and $e and $i for
building relationships
15. From University of North Texas Libraries’ catalog
Copyright date is a separate element in RDA
Role explicitly stated
Relationship
explicitly
stated
“pages” spelled out, not abbreviated p.
16. Semantic web is about data
Big data - exactly what it says –
large volumes or groups of data
Linked data – a link that connects
data together (relationships!) – it
can be used to create dynamic
displays (we already have linked
data behavior with our
authorized access points but no
true linked data). 100s link to
Authority records which then can
do things like pull together a list
of materials by an author
Open data is data that is
published with rights encouraging
usage and sharing
Social web
17. Allows us to have a customized experience on the web
using any device that has data and internet capabilities
(smart phones, tablets, laptops, ipods, desktops, etc.)
It allows us to have better search results –
personalized, with better relevance and filtering.
It works for us.
Semantic web – the user
experience (UX)
18. Internet of Things – rise of connected machines and data sharing – Web 4.0?
19. • Types of metadata:
• Descriptive
• Structural
• Administrative
• Many forms of metadata include elements of each of these;
however it is dependent upon the schema.
• A schema is a set of rules covering the elements and
requirements for coding. Examples of common schemas in
the library world include Dublin Core/DCMI, EAD, and others.
Examples of schemas in the semantic web include Dublin
Core/DCMI, FOAF (Friend of a Friend), and many others.
Let’s talk about metadata on the
web in general
20. Metadata in the wild (and some libraries)
Structure – navigation,
files/pages included, etc
Description – a typical
catalog record, plus
information created by users
Administration/administrative
– who created it, how, file
type; rights will sometimes
fall here – DRM (Digital
Rights Management);
provenance – how you
became the
owner/repository for the item
25. Make your own metadata – RIMMF for
training
Free software to
create records for
training
(RDA in Many
Metadata formats)
using a form
RDA to RDF, RDA
to XML and even
MARC !
http://www.marcofquality.com/wiki/rimmf3/
26. What do you see as the challenges of thinking about
bibliographic data in this way?
27. Semantic web & the social web
So, what about the “rest” of the web?
The social web/social media/social networking is about people.
Its focus has been less on standards, controlled vocabularies,
RULES…. After all, here comes everybody…
..but that is not exactly true. In order for blog posts to display
sequentially, in order for discussions to be threaded, there must
be an underlying order – a structure -- rules. Much of the
technology (databases, servers, etc.) already exists – at least
for the initial stages of the semantic web.
Metadata schemas and RDF are being used by many projects, but
the web has only “standards”, no rules.
28. Many terms associated with the semantic
web are used or based upon information
architecture, database, information science,
and library science fields – controlled vocabularies,
structural elements, etc.
•RDF = Resource Description Framework
•RDFS = Resource Description Framework
Schema
•OWL = Web Ontology Language – links ontologies
which are classification systems
•URI = Uniform Resource Identifier – we use these
already
A little semantic web terminology
29. A World Wide Web Consortium (W3C) standard: a simple metadata
data model based on triples (subject-predicate-object); developed in
1999, with the RDF 1.0 specification published in 2004 (RDF 1.1 in 2014).
It reflects the relationship of the items.
(in other words, an entity-relationship model).
It is meant to be neutral – vendor neutral and operating system
independent.
RDF is one way we could express bibliographic data in the future. We
share commonalities between FRBR and RDF already.
Examples
The University of Georgia (resource as subject) is located in
(predicate) Athens, Georgia (object).
The Raven is written by Edgar Allan Poe.
RDF: Resource Description Framework
Typically it is coded in XML.
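The two example statements on this slide can be sketched as triples in Python. This is only an illustration: the `http://example.org/` identifiers are placeholders rather than a real vocabulary, the `Triple` class and helper functions are hypothetical, and the output uses a simple N-Triples-style line (one of several RDF serializations) instead of RDF/XML to keep the example short.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # the "object" of the statement


EX = "http://example.org/"  # placeholder namespace, not a real vocabulary

triples = [
    Triple(EX + "UniversityOfGeorgia", EX + "isLocatedIn", EX + "AthensGeorgia"),
    Triple(EX + "TheRaven", EX + "isWrittenBy", EX + "EdgarAllanPoe"),
]


def to_ntriples(triple: Triple) -> str:
    """Write one triple as an N-Triples-style line: <s> <p> <o> ."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


def objects_of(subject: str, predicate: str) -> list:
    """Answer 'subject, predicate, ?' by scanning the list of triples."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


for t in triples:
    print(to_ntriples(t))
print(objects_of(EX + "TheRaven", EX + "isWrittenBy"))
```

Because every part of the statement is an identifier rather than a text string, a machine can follow the relationship without parsing punctuation.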
30. Some RDA RDF properties
Identifier for the work
Date of work
Language of expression
Media type
Carrier type
Variant title
Preferred name for the person
Date of birth
Variant name for the corporate body
RDA expressed as RDF in XML – likely!
31. The Semantic Web is based upon more precise
utilization of data and is heavily dependent upon
The code
The metadata and its metadata schemas
(rules)
The ability for machines (including devices and
home appliances) to talk to each other and
make sense of that communication
Linking data makes this process easier since we do
not have to re-enter data, we can just link to it. This
can work for libraries, reducing input and also
maintenance.
Linked data may help solve issues
32. Linked data
Linked data is about connecting data and reusing data – rather than
see text in a record, we would just see a link
Linked data is: “about using the Web to connect related data that
wasn't previously linked, or using the Web to lower the barriers to
linking data.”
Benefits – maintenance work! Authority work! Less keying (possibly!)
34. “The thesaurus consists of more than a million terms organized
into five controlled vocabularies: subjects, personal names,
organizations, geographic locations and the titles of creative
works (books, movies, plays, etc).” – NYT Blogs
Linked data may help solve issues
Projects such as the NYT Linked Open Data project and the Virtual International Authority
File (VIAF) project are resources of controlled vocabularies. LCSH has been released
as linked data too!
http://id.loc.gov/
35. Bibliographic Framework Initiative
“…aims to re-envision and, in the long run, implement a
new bibliographic environment for libraries that makes "the
network" central and makes interconnectedness
commonplace.”
(read: is attempting to better position the library world for a
linked data environment)
Primer for BIBFRAME:
http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf
36. Library of Congress
Consultants:
•Zepheira
Partners (Early Experimenters), among others:
•British Library,
•Deutsche Nationalbibliothek,
•George Washington University,
•National Library of Medicine,
•OCLC,
•and Princeton University
Initial participants
37. Officially launched by the Library of Congress in 2011
A new model for bibliographic data that will be the basis
for a new encoding standard that will replace MARC
and will be XML-based.
Consists of the BIBFRAME Model, a
conceptual/practical model that contains 4 high-level
classes, or entities (Work, Instance, Authority, and
Annotation), and the BIBFRAME Vocabulary, which has
a defined set of elements and attributes that describe
resources and their properties.
BIBFrame: BIBliographic FRAMEwork Initiative
38. Instead of bundling everything neatly as a “record” and potentially
duplicating information across multiple records, the BIBFRAME Model
relies heavily on relationships between resources (Work-to-Work
relationships; Work-to-Instance relationships; Work-to-Authority
relationships).
It manages this by using controlled identifiers for things (people, places,
languages, etc). MARC employs some of these ideas already
(geographic codes, language codes) but BIBFRAME seeks to make
these aspects the norm rather than the exception.
In short, the BIBFRAME Model is the library community’s formal entry
point for becoming part of a much larger web of data, where the links
between things are paramount.
(from BIBFRAME FAQs: http://www.loc.gov/bibframe/faqs/)
BIBFRAME
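The "relationships instead of records" idea can be sketched with a few Python classes. This is a loose illustration under stated assumptions: the class and field names are hypothetical and are not the actual BIBFRAME vocabulary, and the `http://example.org/` identifiers are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Authority:
    uri: str
    label: str


@dataclass
class Work:
    uri: str
    title: str
    creators: list = field(default_factory=list)       # Work-to-Authority links (URIs)
    related_works: list = field(default_factory=list)  # Work-to-Work links (URIs)


@dataclass
class Instance:
    uri: str
    instance_of: str  # Work-to-Instance link: the URI of the Work
    carrier: str


poe = Authority("http://example.org/authority/poe", "Poe, Edgar Allan")
raven = Work("http://example.org/work/raven", "The Raven", creators=[poe.uri])
print_ed = Instance("http://example.org/instance/raven-print", raven.uri, "volume")
ebook = Instance("http://example.org/instance/raven-ebook", raven.uri, "online resource")

authorities = {poe.uri: poe}
instances = [print_ed, ebook]


def instances_of(work: Work) -> list:
    """Find every Instance that points back to the given Work."""
    return [i for i in instances if i.instance_of == work.uri]


print([i.carrier for i in instances_of(raven)])
print(authorities[raven.creators[0]].label)
```

The author's name is stored once, on the Authority; the Work and both Instances only point to it, so a correction in one place reaches everything linked to it.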
39. BIBFRAME Model
Core elements of the BIBFRAME model –
similar but not exactly the same as FRBR
WEMI but maps to WEMI
Work: resource reflecting the conceptual
essence of the cataloged item
Instance: resource reflecting a material
embodiment of a BIBFRAME work
Authority: resource reflecting key authority
concepts that have defined relationships to
41. Links replace strings = Reduced maintenance
URIs = authority
264 1 New York: Neal Schuman Publishers, 2014
Instead of writing New York, replace it with
http://id.loc.gov/authorities/names/n92062246
Replace Neal Schuman Publishers with
http://id.loc.gov/authorities/names/n79007751
264 example
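The 264 substitution above can be sketched as a simple lookup. This is a minimal illustration: the two URIs are the ones given on the slide (both written in the id.loc.gov/authorities/ form), while the lookup table and the `link_264` function are hypothetical.

```python
# URIs copied from the slide; the lookup table itself is hypothetical.
AUTHORITY_URIS = {
    "New York": "http://id.loc.gov/authorities/names/n92062246",
    "Neal Schuman Publishers": "http://id.loc.gov/authorities/names/n79007751",
}


def link_264(place: str, publisher: str, date: str) -> dict:
    """Return a 264-like statement with strings swapped for URIs where known."""
    return {
        # Fall back to the plain string when no URI is known.
        "place": AUTHORITY_URIS.get(place, place),
        "publisher": AUTHORITY_URIS.get(publisher, publisher),
        "date": date,
    }


linked = link_264("New York", "Neal Schuman Publishers", "2014")
for element, value in linked.items():
    print(element, value)
```

If the preferred form of a name changes, only the authority behind the URI needs updating, which is where the reduced maintenance comes from.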
65. Click on “Save” at the bottom of the page to
generate a BIBFRAME view
67. Will RDA elements be part of the BIBFRAME vocabulary?
Yes. RDA is an important source of elements in
the vocabulary for BIBFRAME, even though it
generally aims to be independent of any
particular set of cataloging rules. We also
expect community profiles to emerge which
will accommodate additional elements.
(from BIBFRAME FAQs: http://www.loc.gov/bibframe/faqs/)
69. And back to MARC….
We can’t abandon MARC immediately
No current viable replacement that works with current ILSes
(but getting closer with Bibframe!)
We have MILLIONS of records in MARC (need conversion tools or
systems that easily support multi metadata schemas – displaying
materials equally in an understandable way to users >“mapping”)
We need to change the way we think about cataloging (FRBR) and, to
a lesser extent, how we implement cataloging (RDA) – print is no
longer the default material format, and more importantly there is the
metadata that is already out there on the web.
70. Lots of Bibframe resources & links
https://goo.gl/pih0Ui
georgiawebgurl@gmail.com