George Oates introduces herself as the new leader of the Open Library project. She discusses her first steps in the role, including listening to user feedback, streamlining processes, and assessing competition. Oates describes plans to better understand how different parts of the library interrelate and to reach out to other networks. She addresses challenges like improving metadata and engaging more users. Oates envisions connecting library records to more external sources and making contribution easier to help the library grow.
3. First Steps
• Listen to people: Answer help emails
• Meet team in person: Met in San Francisco in June
• Streamline deploys: 1 button!
• Redraw sitemap: Refocus on core
• Dream a little: Ask silly questions, assess competition
Get acclimatized
5. Reach into the network
twitter.com/openlibrary
- we’ve also arranged a little Flickr integration, so if people take photos of books, they can
link them to Open Library records. We’re not using them yet.
- as you may have noticed last night, we also added a link from Internet Archive book pages
into Open Library. We reckon that’s almost doubled our modest traffic. (About 250k unique
IPs per day)
6. Challenges
• Dense library metadata
• Designed for classic institutional
search/retrieve practice
• Data is “dry”, sometimes poor quality
• No insight into the community
• Distributed team: US, India, UK
- so one of the first things I did was start reading and answering enquiries that come to
info@openlibrary (this is a good thing to have new people do for a while)
- I found that some questions repeated themselves and there was a key mismatch in
understanding what Open Library was about, e.g. people would write in asking us to
correct errors, not knowing they were able to do it themselves.
7. There are 4 Agatha Christies in this list, 2 of which appear to the eye to be identical.
Computers have trouble recognising that these Authors are the same woman. It’s easy for
humans to do. How could we build a UI to help people help us to merge these duplicates?
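One way a merge UI could get started is by suggesting likely duplicates for a human to confirm. The following is a minimal sketch, not Open Library's actual matching logic: the normalisation rule, the `name_key` and `candidate_duplicates` helpers, and the sample names are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def name_key(name):
    """Reduce an author name to a rough comparison key (hypothetical rule)."""
    # Strip accents so "Brontë" and "Bronte" compare equal.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then drop punctuation and digits (such as life dates).
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    # Sort the words so "Christie, Agatha" and "Agatha Christie" match.
    return " ".join(sorted(text.split()))


def candidate_duplicates(names):
    """Group names that share a key; only groups of two or more are returned."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Sample data invented for this sketch.
authors = [
    "Agatha Christie",
    "Christie, Agatha",
    "Agatha Christie, 1890-1976",
    "Clay Shirky",
]
print(candidate_duplicates(authors))
```

The sketch only proposes groups; the final decision to merge stays with a person, which is the point of the slide. A rule this crude will also produce false matches (two different authors with the same name) and does nothing useful for names in non-Latin scripts.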
8. What have we got?
• Loads of data: 23 million records
• Small user base: < 20,000
• Small team: 6 people
• Small architecture: 12 servers
• Good framework: infogami, web.py
Certainly there are challenges to trying to make use of a large but shallow dataset, but Open
Library has lots of advantages in terms of a small team & system being able to change
rapidly. This flexibility will hopefully help us.
9. Began experimenting with the data we have to try to see the catalog “landscape”. What do we
already have that we’re not showing to people yet? Look at all these subjects! These
timeframes! How can we make use of them?
10. Look at all these new links! ISBN -> Publisher names -> Show me all the books this publisher
has published... Show me all the subjects related to cheese... Add links and hey presto!
You’re bouncing around the catalog.
11. What if?
• Adjacent books
• Not efficiency, but effectiveness
(conversation broker, records improve
over time) - Shirky
• Not a purchasing engine, but a library
As an exercise, it’s fun to ask what might happen if there were no search box on Open
Library. Could you still use it?
12. Changing the look of the logo will hopefully encourage people to come inside and look
around. Break the conventional “library look” and try to warm it up a little... We are literally
open: the software is open, and all of Open Library’s records are editable, by
anyone.
13. Add a Book?
So, let’s take a look at one of the key UIs on Open Library - How to add a new record. This is
the current form. Basically just a web UI to a pretty dense, librarian-centric form. A lot of the
fields are difficult for non-librarians to complete - a definite barrier to entry for both adding
new records and editing existing things.
14. The idea is to break it into two steps. This is step 1.
The most important thing to do is to make it feel easy to add a record. This first step also
gathers enough info to allow us to do a decent search for any existing records. If we find a
match, we can direct people towards the Edit view of that record. If there’s no match, we
move on...
15. Step 2 is a massive form. There’s no way to hide that basically. All the fields are potentially
useful. What we can do is organize the info a little, so related things (the physical object,
pagination) are grouped together. We’re also going to try adding a tabbed view to try to
soften the blow a little. Also, hopefully, adopting a conversational tone with the form labels
might help direct people a little more about the sort of data we want.
16. It would be awesome if we could start to collect excerpts from books. A personal touch from
people about particular bits they’ve enjoyed and why. Also, these excerpts could be indexed
to help boost books in our search.
17. Links, links, links.... This “networked catalog” is all about how many things we can connect
books to. This is the principle of metadata giving records a sort of “surface tension” to keep
them from sinking into the depths.
18. Those first 3 tabs (About, Excerpts, Links) are about the Work level of our records. We’re
going to try this first version not worrying about exposing this slightly weird metadata-y
thing called Work to visitors, but still attempt to collect data at the Work-y level. There’s a
specific tab just for Editions too, which contains fields mainly about publishing info and the
physical (or virtual!) object itself.
19. Another experiment we’re looking forward to trying is about identifiers. We’re not particularly
concerned about canonical identifiers. Perhaps it’s a waste of time to wait for one, so instead,
we’re going to try and attach as many ID types to our records as we can. (This list is just a
braindump - not active yet.) The idea is that people could add a URL or actual identifier and
Open Library would just do the right thing. A suggestion (after this presentation was
delivered) was that people could ping Open Library with an identifier, not even knowing what
TYPE of ID it is. Perhaps Open Library could help “triangulate” this query towards a book
record. “Record laundering.”
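A rough sketch of the "ping us with an identifier of unknown type" idea is shown below. It is an assumption-laden illustration, not an Open Library feature: `guess_identifier_type` and its return labels are hypothetical, and only ISBN-10, ISBN-13 and plain URLs are recognised. The two checksum rules are the standard ISBN ones.

```python
import re


def is_isbn10(value):
    """ISBN-10: weights 10..1, sum divisible by 11, final 'X' counts as 10."""
    if not re.fullmatch(r"\d{9}[\dX]", value):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(value)
    )
    return total % 11 == 0


def is_isbn13(value):
    """ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"\d{13}", value):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(value))
    return total % 10 == 0


def guess_identifier_type(raw):
    """Return a hypothetical type label for an identifier of unknown type."""
    cleaned = raw.strip().replace("-", "").replace(" ", "").upper()
    if is_isbn13(cleaned):
        return "isbn_13"
    if is_isbn10(cleaned):
        return "isbn_10"
    if raw.strip().lower().startswith(("http://", "https://")):
        return "url"
    return "unknown"


for value in ["0-306-40615-2", "978-0-306-40615-7", "http://flic.kr/p/34WGhL", "abc"]:
    print(value, "->", guess_identifier_type(value))
```

A checksum match is only a hint: a ten-digit number from some other scheme can pass the ISBN-10 test by chance, so a real service would still need to look the value up before "triangulating" to a record.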
20. Key Features
• History
• Activity, life, cause, effect
• Notifications thereof
• List(s)
• More small, ad hoc collections
• Public / private
• Exportable (ad hoc catalogs)
- Planning two features that play off the strengths of the underlying Wiki: History & Lists
- AD HOC (so, BookServer feeds should be expected to be ad hoc. No point in trying to agree
on a hierarchy etc for feeds. Waste of time.)
21. We’re excited about how we might improve the display and linkage from history of our
records. They are another source of connections into and around the catalog, so we should
“activate” them where we can to connect to people, subjects, publishers, even dates. “See
everything that happened on Open Library on May the 4th, 2009.” Version 1 probably won’t be
quite this robust :)
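As a small illustration of slicing history by date, here is a sketch that assumes every history entry carries a timestamp. The entry shape, the record keys and the `events_on` helper are hypothetical and are not taken from Open Library's data model.

```python
from datetime import date, datetime

# Hypothetical edit-history entries; real history data may look quite different.
history = [
    {"record": "/books/example-1", "action": "edit", "at": datetime(2009, 5, 4, 9, 30)},
    {"record": "/authors/example-2", "action": "merge", "at": datetime(2009, 5, 4, 17, 5)},
    {"record": "/books/example-3", "action": "create", "at": datetime(2009, 5, 5, 8, 0)},
]


def events_on(entries, day):
    """Return the entries whose timestamp falls on the given calendar day."""
    return [entry for entry in entries if entry["at"].date() == day]


for entry in events_on(history, date(2009, 5, 4)):
    print(entry["at"].isoformat(), entry["action"], entry["record"])
```

The same filter could be keyed on a person, subject or publisher instead of a date, which is what would turn history into another set of links around the catalog.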
24. Small Collections
http://flic.kr/p/34WGhL
• Catalogues to & from book lovers who may or may not be professional
librarians
• Effective & Personal; Inefficient & Charming, Detailed
• Looking to integrate cool cataloging services like Koha, Delicious Monster -
Anyone??
• It was only last night I met a woman who is cataloguing a business’s library of
some 1,100 books. She had said she was looking on Open Library for a way to
upload a CSV file to us. We should do that, and note it on each edition’s history.
(*Note: Design that CSV and get it online!)
25. History
http://flic.kr/p/6NHecm
- there was some talk about timestamps yesterday. Being able to slice things by time will only
increase in importance as the web gets older, so, I’d suggest putting timestamps on anything
you can think of.
26. Substrate:
any surface on which a plant or animal lives or on
which a material sticks
http://flic.kr/p/4itJcB
27. What if we position library records
like that?
http://flic.kr/p/4itJcB
28. “Build it so anyone can
contribute any amount.”
Clay Shirky
30. http://flic.kr/p/6pmtQL
But, librarians are (very clever) humans too. And everyone who’s responsible for putting
books into a traditional catalogue must work within patterns. Patterns that have grown
semantically remarkable and deeply complex.
31. "But here’s a question for you, let’s say you
have an 856 URL to full text for a serial. And
you know what date ranges it covers. What
sub-field would you put that in? $3 or $z? I
see it in both."
Jonathan Rochkind, Bibliographic Wilderness
http://flic.kr/p/6pmtQL
I’m glad I don’t have to either ask or answer this question.
33. Hic sunt dracones.
http://www.lib.cam.ac.uk/exhibitions/Fantasy_to_Federation/Blaeu.jpg
A detail from a map of the East Indies showing, outlined in pink, the first European
discoveries along the Cape York Peninsula. Early in 1606, towards the northern tip of the
peninsula, Willem Jansz made here what was almost certainly the first landing by Europeans
in Australia. This map first appeared in 1635 and was reprinted unchanged until 1664.
34. Here be dragons.
35. http://www.lib.cam.ac.uk/exhibitions/Fantasy_to_Federation/Bellin1753.jpg
This is one of the few maps in the eighteenth century devoted entirely to Australia. Jacques
Bellin was hydrographer to the French King Louis XV. He has added a hypothetical coastline
joining Australia, New Guinea and Tasmania - a note says that this is included without proof.
It is further suggested that New Zealand might be part of the great southern continent.
36. I wonder if librarians are trying to make catalogs look like this... Highly “accurate”; deeply
organized; the perfect information system...
37. http://flic.kr/p/38TZ
What if a catalog looks like this? Is crystalline?
From the artist of this image, Jared Tarbell: “Lines like crystals form at perpendicular angles
to existing lines. A complex form emerges.
1000 classic computational substrate, color palette stolen from Jackson Pollock: A simple
perpendicular growth rule creates intricate city-like structures. The simple rule, the complex
results, the enormous potential for modification; this has got to be one of my all time favorite
self-discovered algorithms. Lines like crystals grow on a computational substrate.”
38. Deconstruction
http://flickr.com/photos/tupwanders/3356077817/
I’ve learned a wee bit about the history of library metadata... And museum metadata for that
matter.... It seems like the 1960s are a bit of a blight for human understanding, since that’s
the time when we got all excited about computers and their processing power, and seemingly
overwrote a lot of the crafty, poetic description and allusion that was done to describe cultural
works, in favour of the Tetris approach.
What happens if you blow it up?
39. 600
13 $a Marie Antoinette $c Queen, Consort of Louis XVI, King of France $d 1755-1793
650
2 $a Queens $z France $v Biography
1 $a Queens $z France $x Biography
651
2 $a France $x History $y Louis XVI, 1774-1793
1 $a France $x History $y Revolution, 1789-1799
1 $a France $x Queens $x Biography
- I don’t want Open Library to jettison librarianship, or neglect to acknowledge the brilliant
hard work of librarians over the years...
- You could argue that this sort of computer-y librarianship (or any type of “educated
classification”) was (perhaps unintentionally) designed to obscure the personal... the
practical... the human
- How might we adapt or extend (or revert?) this librarians’ work to appeal to a broader
audience?
- Let’s see what happens when you explode Library of Congress Subject Headings. This data
isn’t even in Open Library - we borrowed it from loc.gov then pulled out the dynamite...
40. 600 (people)
13 $a Marie Antoinette $c Queen, Consort of Louis XVI, King of France $d 1755-1793
650 (subjects)
2 $a Queens $z France $v Biography
1 $a Queens $z France $x Biography
651 (places)
2 $a France $x History $y Louis XVI, 1774-1793
1 $a France $x History $y Revolution, 1789-1799
1 $a France $x Queens $x Biography
These numbers are subsections of a thing called a MARC record - MAchine-Readable
Cataloging.
Since librarianship is “diabolically rational”, of course, everything is in its place, whether it’s a
reference to a person, a place, a thing, an author or whatever...
41. (people)
Marie Antoinette, Louis XVI
(subjects)
Queens, France, Biography
(places)
France, History, Louis XVI, 1774-1793, Revolution,
1789-1799, Queens, Biography
So, if we get rid of all that machine readable gumpf, we start to have things that humans can
parse as well...
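The "explosion" can be sketched in a few lines. This is a simplified illustration that treats each heading as a plain string using the `$a`-style subfield notation shown on the slides; it is not a full MARC parser, and `explode_heading` and `unique_terms` are hypothetical helper names.

```python
import re


def explode_heading(field):
    """Split a heading such as '$a Queens $z France' into its plain terms."""
    # Cut the string at each subfield code ("$a", "$z", ...) and keep the text.
    parts = re.split(r"\$[a-z0-9]\s*", field)
    return [part.strip(" ,") for part in parts if part.strip(" ,")]


def unique_terms(fields):
    """Explode several headings and drop repeated terms, keeping first-seen order."""
    seen = []
    for field in fields:
        for term in explode_heading(field):
            if term not in seen:
                seen.append(term)
    return seen


# Headings copied from the slides above.
headings = [
    "$a Queens $z France $v Biography",
    "$a France $x History $y Revolution, 1789-1799",
]
print(unique_terms(headings))
```

The sketch keeps "Revolution, 1789-1799" as one term because it splits only on subfield codes; breaking that into "Revolution" and "1789-1799", as the later slides do, is a further editorial step.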
42. Marie Antoinette, Louis XVI, Queens, France, Biography,
History, 1774-1793, Revolution, 1789-1799
43. Marie Antoinette, Louis XVI, Queens, France, Biography,
History, 1774-1793, Revolution, 1789-1799
Then, make them into links, but retain their interconnection.
44.
45. Subject
• Related subjects
• Books about...
• “Collections”
• Publishing over time
• Related authors
• Information from the network
• If it’s a place, show a map!
46. Subject
• Related subjects
• Books about...
• “Collections”
• Publishing over time
• Related authors
• Information from the network
• If it’s a place, show a map!
openlibrary.org/subjects/places/bordeaux
Give it a URL
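A sketch of how a subject name might be turned into a URL path of the shape shown on the slide. The slug rule (lowercase, runs of other characters collapsed to underscores) and the `subject_path` helper are assumptions for illustration; only the `/subjects/places/bordeaux` pattern comes from the slide.

```python
import re


def subject_path(kind, name):
    """Build a hypothetical path such as /subjects/places/bordeaux."""
    # Lowercase the name and collapse anything that is not a letter or digit.
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"/subjects/{kind}/{slug}"


print(subject_path("places", "Bordeaux"))
print(subject_path("people", "Marie Antoinette"))
```

Once each exploded term has a stable address like this, every occurrence of the term in a record can become a link to it.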
47. I used to use this image to represent contact networks on Flickr, but I think itʼs equally applicable as a visual for what a networked library
catalog might look like. How many things can we connect book records to? Not only identifiers, but blog posts, reviews, subjects, publishers,
booksellers etc etc
49. Connect
- exploring partnerships, connections
- reach into existing networks
- Library Thing, Good Reads, open source systems, etc
- open data, improve API
50. Observe
http://flickr.com/photos/odreiuqzide/3195647925/
- see what people do
- provide tools to let people see what everyone else is doing
- monitor activity, like popular records, top editors, sign ups per day etc
- and ABOVE ALL, participate!!!
51. • Navigation
Enhance Tidbits • Key Processes
• Branding
• Recognition
• Contribution
Gather Small Collections • Curated content
• Original clusters
• Content, content, content
Inhale Web Services • BookServer
• APIs in & out
• Workflow
Streamline Library Catalogs • Updates
• Expansion
• Respect
To summarise, here are the 4 levels of stuff we’re trying to focus on in the coming months...
52. Next Steps
• We’re hiring! SOLR, Sys Admin, Web Dev
• Find money! Want to join forces?
• Release the redesign, and watch what happens...
Short term... Want to come and work on an awesome project playing with the very nature of a
library catalog? Let me know!