A discussion of the various subject classifications for published books and ebooks. This slide deck provides insight into the type of subject classification to include, and the depth needed for discovery purposes. Includes BISAC, BIC and Thema subject classifications.
PSP Subject Discovery
1. PSP Subject Discovery Panel
Michael Olenick
Business Analyst
michael.olenick@bowker.com
2. Self-Portrait with Classification
• BIO025000 BIOGRAPHY & AUTOBIOGRAPHY /
Editors, Journalists, Publishers
• LAN025030 LANGUAGE ARTS & DISCIPLINES /
Library & Information Science / Cataloging &
Classification
• BUS070060 BUSINESS & ECONOMICS / Industries /
Media & Communications
• LAN026000 LANGUAGE ARTS & DISCIPLINES / Public
Speaking
• OCC010000 BODY, MIND & SPIRIT / Mindfulness &
Meditation
3. Assign the most specific, non-redundant subject(s) from the list(s) you are using.
4. For example:
BISAC
• ART015030 ART / European
• ART015080 ART / History / Renaissance
• ART016000 ART / Individual Artists / General
• ART035000 ART / Subjects & Themes / Religious
• ARC016000 ARCHITECTURE / Buildings / Religious
Library of Congress
• Michelangelo Buonarroti, 1475-1564--Criticism and interpretation.
• Cappella Sistina (Vatican Palace, Vatican City)
• Bible. Old Testament--Illustrations.
• Mural painting and decoration, Italian--Vatican City.
• Mural painting and decoration, Renaissance--Vatican City.
7. Where My Work Goes…
• www.BooksInPrint.com: libraries, retailers, online
discovery
• ONIX/ASCII: wholesalers, retailers, textbook resellers
• Subject schemas identified
• Source identified (publisher-supplied or Bowker-assigned)
• Partners decide which subjects they will use and which
source they prefer
9. Why Standardized Subjects?
• Uniformity of usage
• One place for all books on topic (standardization of
terminology)
• Specificity depends on the list
• Grouping data together for
– Analytical purposes
– Data retrieval
– Sales
– Shelving
10. Do I Need to Tell You What BISAC Subjects Are?
• North American book industry standard maintained by
BISG
• Issued annually in the Fall and available on BISG site
• 4499 codes/subjects grouped under 54 main sections
• Mappings also available (BIC and Thema)
• Subjects are hierarchical based on text
• Codes are not “smart” except that first three characters
reflect section
• Not as specific as LC subjects but this allows for
statistically significant groupings
12. Introduction to Thema
• First global subject category system for the book trade
• Not a replacement for BISAC (possibly for BIC)
• Maintained by EDItEUR (free download available)
• Variable length, hierarchical codes
13. Introduction to Thema
• First global subject category system for the book trade
• Not a replacement for BISAC (possibly for BIC)
• Maintained by EDItEUR (free download available)
• Variable length, hierarchical codes
• Consists of subject headings…
Subject Heading
DNBS Biography: Sport
SFC Baseball
14. Introduction to Thema
• First global subject category system for the book trade
• Not a replacement for BISAC (possibly for BIC)
• Maintained by EDItEUR (free download available)
• Variable length, hierarchical codes
• Consists of subject headings…
• …modified by qualifiers (not repeated for each subject)
Qualifier (modifies the subject headings below)
1KBB-US-NAKC New York City
Subject Heading
DNBS Biography: Sport
SFC Baseball
15. Introduction to Thema
• First global subject category system for the book trade
• Not a replacement for BISAC (possibly for BIC)
• Maintained by EDItEUR (free download available)
• Variable length, hierarchical codes
• Consists of subject headings…
• …modified by qualifiers (not repeated for each subject)
• Qualifiers used for fiction or nonfiction, adult or juvenile
Qualifier (modifies the subject headings below)
1KBB-US-NAKC New York City
Subject Heading
DNBS Biography: Sport
SFC Baseball
16. More Detail (Attention Optional)
HIERARCHY
Subject Headings
Code Heading
L Law
LA Jurisprudence & general issues
LN Laws of specific jurisdictions & specific areas of law
LNJ Entertainment & media law
LNJD Defamation law, slander & libel
The variable-length codes are hierarchical and determine the order in which the subjects are presented. The text of the subject or qualifier does not contain all parts of the hierarchy and is meant to stand on its own when output.
17. Qualifiers: Six Different Types
Geographical Qualifiers
1D Europe
1DS Southern Europe
1DST Italy
1DST-IT-T Central Italy
1DST-IT-TS Tuscany
1DST-IT-TSF Florence
Language Qualifiers
2H African languages
2HC Niger-Congo languages
Time Period Qualifiers
3K CE period up to c 1500
3KB c 1 CE to c 500 CE
3KBF 1st century, c 1 to c 99
Educational Purpose Qualifiers
4C For all educational levels
4CD For primary education
4CX For adult education
Interest Age & Special Interest Qualifiers
5A Interest age / level
5AN Interest age: from c 12 years
Style Qualifiers
6B Styles (B)
6BA Baroque
6BB Barbizon school
18. What Publishers Should Provide
• At least 1 valid code
• From the latest version of that subject schema
• Most specific subject(s) applicable
• Multiple, unique subjects when necessary
• Avoid subjects more general than others (especially from
the same section)
• All formats & editions consistently classified
• Subjects should not clash with each other or with other
metadata
19. How Many Subjects Does an ISBN Need?
• No correct number
• Do not force multiple subjects
• New information is valuable
• Redundant or general information is not
20. Sample Search Results
Search                   Titles   Pages
general fiction          63126    2254
Asian American fiction   412      15
What audience is this title most relevant to?
Are subjects consistently applied across products?
Where would title be more easily discovered?
21. To Map or Not to Map?
Benefits | Drawbacks
Utilizes previously classified data | Cannot replicate accuracy or detail of direct assignment
Minimizes the number of schemas to be manually assigned | Two things have to be correct rather than one (original assignment and mapping)
Uniformity of usage (this always maps to that) | Even if both things are “correct” you still get results that could be improved
Can serve as a good starting point for fine-tuning | Results are almost never reviewed because that negates the time-saving benefit
Lots of data classified quickly (once in place) | Getting it in place may take more time in the short run
Works well if going from a very specific schema to a more general schema | Works okay with schemas of equal specificity; does not work from general to specific (is not “reversible”)
Helpful for version upgrade (change mapping vs. changing each ISBN) | Mapping must be maintained (both source and target)
22. Links
• BISAC: US industry standard for over 20 years, maintained by the Book Industry Study Group
https://www.bisg.org/bisac/complete-bisac-subject-headings-2015-edition
• BIC: the UK equivalent of BISAC, maintained by Book Industry Communication
http://www.bic.org.uk/7/BIC-Standard-Subject-Categories/
• Thema: an internationalized version of BIC maintained by EDItEUR; not officially a replacement for BIC, but if the uptake is strong, it effectively will be (not a replacement for BISAC)
http://www.editeur.org/151/thema/
Editor's Notes
As a way of telling you about myself and what I do, I thought I’d start with a little attempt at self-classification…which turns out not to be that easy. I’ve known myself pretty much all my life and have been working with subjects for about half of it.
Long-time Bowker employee working on several projects including statistical analysis and subject classification. This is the best I could come up with…I really wanted to avoid BIO / General because later on I’m going to be telling you to avoid general codes…but I don’t really fit anywhere else. I do editorial work to some degree, and Bowker is kind of a publisher…or at least in the publishing industry.
Oversees the assignment of Bowker’s proprietary, LC-style subjects (over 400K ISBNs per year in the Books In Print database of 30 million ISBNs) as well as mappings to industry-standard subjects.
Participant in several industry committees dealing with standardized subjects, such as BISAC and Thema. There is a Publishing subject in the LANGUAGE ARTS & DISCIPLINES section of BISAC but in this case I’m trying to accentuate the business aspect.
Not really comfortable with ….despite….
On a basic level, the object of subject classification is to assign the most specific, non-redundant subject(s) from the subject heading list you are using to allow the user/client/purchaser to access books on that subject (and, in theory, just books on that subject). I don’t want to oversimplify…otherwise there would be no more slides…there’s more to it and it’s easier said than done. I guess it’s like saying all of economics is just supply and demand.
So for BISAC subjects you’re limited to the specificity that list allows. If you’re using LC subjects, you can offer a lot more detail. But the idea is the same in both cases. You’re giving the most specific information you can give, and even multiple codes within the same section are fine if they are giving additional info.
Abridged version of what appears on booksinprint.com just showing subjects.
We use a mapping…all the subjects here are a result of that. I will talk about the benefits and drawbacks of that later on. In this case it works pretty well (like I would show you an example where it didn’t) but I will say that the subject “Mural Painting and Decoration” creates a BISAC subject I would not have assigned manually.
Also LC Class, and Dewey (as received from LC) and publisher BISACs (Individual Artists).
So after all these amazing subjects get assigned, what happens to them?
BISAC, BIC, Bowker, Sears….search by or link by subjects
Why not just keywords…don’t get me started on keywords…keywords vs. subjects is a whole ‘nother presentation.
They lend themselves to uniformity of usage (assuming people follow the subject assignment guidelines for the list they are using).
A subject authority list provides headings that serve as the one place to put all titles on that subject (often with cross-references for variant spellings and phrasing and for cases where a heading has been changed).
There are several terms commonly used for people of a certain age: Aged, Older People, Senior Citizens, the Elderly, Older Persons, Geezers (in the US) – Library of Congress used to use Aged and now uses Older People with references from the others. But the key is that there is one heading designated as the place for all books on that subject.
Or Sistine Chapel vs. Cappella Sistina – goes from 4 to 8 and it doubles!
Sistine Chapel (not statistically significant) vs. Religious Buildings.
The specificity you can achieve depends on the list. LC very specific, BISAC less so – both have their benefits.
Subjects are very useful for grouping data together for analytical purposes as well as data retrieval…and these days to a lesser extent, shelving.
BISAC is the subject language we all speak so I’m sure (or hope) all of you know about BISAC subjects.
For over 20 years, BISAC subjects have been the North American book industry standard for subject classification.
BISAC stands for Book Industry Standards and Communications, the former name for BISG; the subjects retained the BISAC name when BISG adopted its current name.
They are maintained by the BISAC Subject Committee of BISG, which meets monthly, and a new version is issued in the Fall.
4499 codes/subjects (we really should have added just one more last year) grouped under 54 main sections (cross-references are included in the Word and PDF versions).
The subject list is available on the BISG site at no cost for lookup purposes. You need to license the list (or join BISG) in order to download versions in Excel, PDF and Word for unlimited use and incorporation in your company's internal databases.
Here’s a sample of the beginning of the ARC section.
All codes begin with ARC
000000 first
Effectively random codes otherwise
ARC001000 way down the list
Hierarchy based on text
Xrefs within section and outside of section (in this case subject was moved)
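To make the "codes are not smart except for the first three characters" point concrete, here is a minimal Python sketch. It assumes a BISAC code is three letters followed by six digits, which matches every code shown in this deck; the function name `bisac_section` is made up for this example and is not part of any BISG tool.

```python
import re

# Assumed shape, based on the codes shown in this deck: 3 letters + 6 digits.
BISAC_CODE = re.compile(r"^[A-Z]{3}[0-9]{6}$")

def bisac_section(code: str) -> str:
    """Return the three-letter section prefix of a BISAC-shaped code."""
    code = code.strip().upper()
    if not BISAC_CODE.match(code):
        raise ValueError(f"not a BISAC-shaped code: {code!r}")
    # Only the prefix carries meaning; the digits say nothing about hierarchy.
    return code[:3]

codes = ["ART015030", "ART015080", "ARC016000"]
sections = sorted({bisac_section(c) for c in codes})
print(sections)  # ['ARC', 'ART']
```

Anything beyond the section (for example, which heading is a child of which) has to come from the heading text, not from the digits.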
Contributions from various national groups.
Unlike LC or Bowker….
The longer the code is, the farther down it is in the hierarchy and the more specific it is – only the most specific code is needed. This applies to subject codes and qualifier codes. While all codes are available for assignment, some are there more as headers than as codes that would be expected to be assigned.
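The "only the most specific code is needed" rule can be sketched in a few lines of Python. This is an illustration, not an official Thema tool: it assumes that a code's ancestors are exactly its prefixes, which holds for the examples on the slides (L, LN, LNJ, LNJD and 1DST, 1DST-IT-T, 1DST-IT-TS). The function name `most_specific` is hypothetical.

```python
def most_specific(codes):
    """Drop any code that is a prefix (assumed ancestor) of another code in the list."""
    unique = list(dict.fromkeys(codes))  # de-duplicate while keeping the original order
    return [
        c for c in unique
        if not any(other != c and other.startswith(c) for other in unique)
    ]

subjects = ["L", "LN", "LNJD", "SFC"]
print(most_specific(subjects))    # ['LNJD', 'SFC']

qualifiers = ["1DST", "1DST-IT-TSF"]
print(most_specific(qualifiers))  # ['1DST-IT-TSF']
```

Because the original order is preserved, a primary subject placed first stays first after pruning.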
Number indicates the type of qualifier.
Each set of qualifiers has its rules.
E.g., for Geographical qualifiers:
USE: to indicate the geographical scope or applicability of book content (such as the location of a travel guide, the setting of a novel, the jurisdiction to which laws apply, etc.).
DO NOT USE: to indicate the literary tradition of the work or to indicate the location or nationality of the author, publisher, etc. (e.g., literature of Peru).
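Since the leading number indicates the type of qualifier, a lookup on the first character is enough to sort codes into their sets. The sketch below uses the six types from the qualifiers slide; treating any code that starts with a letter as a subject heading is an assumption drawn from the examples in this deck, and `classify` is a made-up name.

```python
# Leading digit -> qualifier type, as listed on the "Six Different Types" slide.
QUALIFIER_TYPES = {
    "1": "Geographical",
    "2": "Language",
    "3": "Time Period",
    "4": "Educational Purpose",
    "5": "Interest Age & Special Interest",
    "6": "Style",
}

def classify(code: str) -> str:
    """Label a Thema-style code as a subject heading or one of the qualifier types."""
    if not code:
        raise ValueError("empty code")
    first = code[0]
    if first in QUALIFIER_TYPES:
        return QUALIFIER_TYPES[first] + " qualifier"
    if first.isalpha():
        return "Subject heading"  # assumption: subject codes start with a letter
    raise ValueError(f"unrecognized code: {code!r}")

for code in ["DNBS", "SFC", "1KBB-US-NAKC", "3KBF", "6BA"]:
    print(code, "->", classify(code))
```

A check like this only identifies which set of rules applies; whether a geographical qualifier is being used for content scope rather than author nationality still needs a human decision.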
There really isn’t a “correct” number of subjects to assign, so the only answer I can think of is “as many as are needed to adequately describe a book without being redundant”.
You do not have to force multiple codes if one will suffice. If a book is about calculus, “MAT005000 MATHEMATICS / Calculus” pretty much covers it.
But if it is a study aid for an Advanced Placement test, that can be conveyed by additional subjects from the STUDY AIDS section.
But something like a sports biography should get a BIO code and a SPO code.
As long as the subjects being assigned are conveying new information, they’re all equally valuable.
But it is generally advisable to put the primary code first because some systems only take one code.
Not sure what it is about publishing that people think it’s easier to find things in a more general category. If the fruit section in a supermarket just piled all the fruit in one pile instead of separating it…or even mixed all the different apples together and you were looking for a Macoun or an Empire…you’d probably shop somewhere else.
Just to be clear…a mapping is not a lookup function…it’s not “I am receiving a code and translating it to a text”.
It’s, “I am assigning a subject from one schema and want to auto-populate another schema”.
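As a rough illustration of that distinction, the Python sketch below auto-populates BISAC codes from LC-style headings that were already assigned. The mapping table is invented for this example and is not Bowker's actual mapping; the headings and codes are borrowed from the Sistine Chapel slide, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping table for illustration only; NOT Bowker's actual mapping.
# Keys come from the more specific (LC-style) list, values are BISAC codes.
LC_TO_BISAC = {
    "Mural painting and decoration, Italian--Vatican City.": "ART015080",
    "Mural painting and decoration, Renaissance--Vatican City.": "ART015080",
    "Cappella Sistina (Vatican Palace, Vatican City)": "ARC016000",
}

def map_subjects(headings, table):
    """Auto-populate target codes from already-assigned source headings."""
    mapped, unmapped = [], []
    for heading in headings:
        code = table.get(heading)
        if code is None:
            unmapped.append(heading)  # needs manual assignment or a new table entry
        elif code not in mapped:
            mapped.append(code)       # many-to-one: keep each target code once
    return mapped, unmapped

def invert(table):
    """Reverse the table to show why the mapping is not reversible."""
    reverse = defaultdict(list)
    for heading, code in table.items():
        reverse[code].append(heading)
    return dict(reverse)

assigned = [
    "Mural painting and decoration, Italian--Vatican City.",
    "Mural painting and decoration, Renaissance--Vatican City.",
    "Bible. Old Testament--Illustrations.",
]
mapped, unmapped = map_subjects(assigned, LC_TO_BISAC)
print(mapped)    # ['ART015080']
print(unmapped)  # ['Bible. Old Testament--Illustrations.']
print(len(invert(LC_TO_BISAC)["ART015080"]))  # 2
```

Going from specific to general collapses two headings into one code without loss of correctness. Going the other way, one code has two candidate headings and nothing in the code says which applies, which is why the mapping is not reversible.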