IIIF Technology for VRA33, 14 March 2015, Denver, CO
1. @jpstroop
VRA 33, 14 March 2015, Denver, CO
Image API
• The Pixels
• (Just Enough) Technical Metadata
• Server Capabilities
Presentation API
• Metadata Labels and Values
• Ordering and Arrangement of Images and Other Content
• Relationships to Related Resources
IIIF API Specifications
2. @jpstroop
Why Standardize APIs?
(API = Application Programming Interface)
3. @jpstroop
Without Standards We Have Silos
Application A
Server A
Application B
Server B
Application C
Server C
Application D
Server D
4. @jpstroop
Technology Becomes Interchangeable
Application A
Server D
Application B
Server C Server B
Application C
Server A
Application D
5. @jpstroop
Resources Become Shareable
Application A
Server D
Application B
Server C Server B
Application C
Server A
Application D
6. @jpstroop
http(s)://{server}{/prefix}/{id}/info.json
http(s)://{server}{/prefix}/{id}/{region}/{size}/{rotation}/{quality}.{fmt}
Image API
7. @jpstroop
Features
• Metadata Labels and Values
• Ordering and Arrangement of Images and Other Content
• Object Structure and Layout
• Including Links to the Image API
• Relationships to Related Resources
• Attribution and Licensing
Collection
Manifest
Sequence
Canvas
Content
Presentation API
8. @jpstroop
http://www.dlib.indiana.edu/~jenlrile/metadatamap/
9. @jpstroop
Presentation API Structure
Collection
Manifest
Sequence
Canvas
Content
10. @jpstroop
(Shared) Canvas and Content
Collection
Manifest
Sequence
Canvas
Content
11. @jpstroop
(Shared) Canvas
Collection
Manifest
Sequence
Canvas
Content
12. @jpstroop
Content
Transcription
Commentary
Collection
Manifest
Sequence
Canvas
Content
13. @jpstroop
Presentation API Structure
Collection
Manifest
Sequence
Canvas
Content
14. @jpstroop
Sequence
Collection
Manifest
Sequence
Canvas
Content
15. @jpstroop
Manifest
Collection
Manifest
Sequence
Canvas
Content
16. @jpstroop
Collections
Collection
Manifest
Sequence
Canvas
Content
17. @jpstroop
Descriptive
label
Name of the resource
description
Textual summary
thumbnail
Image summary
metadata
Pairs of Label and Value
Metadata Example:
label:"Created", value:"1300"
Collection
Manifest
Sequence
Canvas
Content
Properties
18. @jpstroop
Rights
license
Link to license description
attribution
Text required to be displayed
logo
Image required to be displayed
Linking
service
Additional service endpoint
seeAlso
Semantic metadata resource
related
Resource to display to the user
Collection
Manifest
Sequence
Canvas
Content
Properties
20. @jpstroop
JSON: Ease of Development
Linked Data: Plays Nicely with Others
21. @jpstroop
{
"@context":"http://iiif.io/api/presentation/2/context.json",
"@id":"http://www.example.org/iiif/book1/canvas/p1.json",
"@type":"sc:Canvas",
"label":"p. 1",
"height":1000,
"width":750,
"images": [
{"@type":"oa:Annotation",
// annotation linking image to canvas …
}],
"otherContent": [
{"@type":"sc:AnnotationList",
// reference to list of non-image annotations …
}]
}
“{}s are the new <>s” -- Rob
22. @jpstroop
• Authorization / Authentication
• Search within (text and annotations)
• Discovery of Manifest and Image Identifiers
• CRUD
Future Work / Work in Progress
Editor's Notes
As you've heard already IIIF has published two API specifications:
The Image API: for getting at images and relevant metadata
The Presentation API: images with relevant descriptive properties, in the context of related content, including text transcriptions, annotations, and other related images.
An API is a contract:
Regardless of what is going on behind the scenes on a server, it is going to expose a protocol or data structure that I can expect, rely upon, and make certain assumptions about
Using a car as a metaphor: regardless of what’s going on under the hood, whether it’s a diesel, unleaded, or hybrid engine, we can expect a steering wheel, pedals, and a shift, and as drivers we can move from car to car without difficulty
Why are APIs important? [next]
Without standards we can only have closed systems: servers and clients that understand a particular, unique protocol.
APIs make technologies interchangeable, giving us choices between different technologies in the different roles within our application stack
This allows us to choose:
Best of breed tech (server and client)
Servers that play well in your existing environment/infrastructure
Clients that are most suitable to your resources and/or users
Finally, if it isn’t obvious, this also means we can share resources, as clients can speak to multiple servers; this is the heart of the IIIF vision.
With this in mind, let’s look at the image API.
[Bring up spec briefly: http://iiif.io/api/image/2.0/ ]
We’re not going to work through this line by line; I’m going to give you an overview by means of a demo.
The image API defines a URI syntax for two services:
one for getting images,
one for getting just enough technical metadata to drive a rich client
Going back to the car-API metaphor I was using earlier, we worked very hard to determine what the steering wheel, pedals, shift equivalents are.
There have been other attempts at this in the past, but the results were generally too complicated and too server-specific
We ultimately decided that region, size, rotation, quality, and format are in scope, but that things like color management and format-specific details are out. Those are more along the lines of (continuing the car metaphor), say, the type of engine in the car, or the fuel-to-air ratio; these are important, but not to the average consumer
## Go to live demo, during which, be careful to point out:
While one can carefully craft URIs (as I'll do while demonstrating), it is generally expected and intended that URIs will be built using rich web-clients, some of which we’ll demonstrate a bit later on.
That said, having a tidy persistent URL for citations, annotations, web exhibitions, emailing, and other means of sharing can be quite useful, and they make web caches more efficient
It is required that servers apply each transformation from left to right, i.e. in the order specified by the API
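To make the left-to-right rule concrete, here is a minimal sketch that tracks only the pixel dimensions of the result as region, then size, then rotation are applied. The function name `output_dimensions` is hypothetical, it handles only a subset of the parameter syntax (`full`, `x,y,w,h`, `w,`, `,h`, `pct:n`, and rotations in multiples of 90), and it does no actual image processing.

```python
def output_dimensions(width, height, region="full", size="full", rotation=0):
    """Return (w, h) of the derived image, applying parameters left to right.

    Hypothetical helper for illustration; covers only part of the syntax.
    """
    # 1. Region: crop first, clipping the request to the image bounds.
    if region == "full":
        w, h = width, height
    else:
        x, y, w, h = (int(v) for v in region.split(","))
        w = min(w, width - x)
        h = min(h, height - y)

    # 2. Size: scale the cropped region, not the full image.
    if size == "full":
        pass
    elif size.startswith("pct:"):
        scale = float(size[4:]) / 100
        w, h = round(w * scale), round(h * scale)
    elif size.endswith(","):  # "w," keeps the aspect ratio
        new_w = int(size[:-1])
        w, h = new_w, round(h * new_w / w)
    elif size.startswith(","):  # ",h" keeps the aspect ratio
        new_h = int(size[1:])
        w, h = round(w * new_h / h), new_h
    else:
        raise ValueError(f"size form not covered by this sketch: {size}")

    # 3. Rotation: a quarter turn swaps width and height.
    if rotation % 180 == 90:
        w, h = h, w
    return w, h
```

Because the crop happens before the scaling, `pct:50` here means half of the requested region, not half of the full image.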
Go to live demo:
Engelmann Chromolithography sample
Info:
http://libimages.princeton.edu/loris2/pudl0130%2F8555444%2F02%2F00000007.jp2/info.json
Img:
http://libimages.princeton.edu/loris2/pudl0130%2F8555444%2F02%2F00000007.jp2/full/full/0/default.jpg
Region:
http://libimages.princeton.edu/loris2/pudl0130%2F8555444%2F02%2F00000007.jp2/3930,60,1230,3600/full/0/default.jpg
Size, Qualities, Format
OSd:
http://libimages.princeton.edu/osd-demo/?feedme=pudl0130%2F8555444%2F02%2F00000007.jp2
Scroll, OSd
http://libimages.princeton.edu/osd-demo/?feedme=pudl0123%2F8172070%2F01%2F00000001.jp2
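The demo URLs above follow the URI syntax from the slide. As a rough sketch, a client might assemble them like this; the function names `image_url` and `info_url` and their defaults are assumptions for illustration, not part of the specification.

```python
from urllib.parse import quote


def _base(server, prefix, identifier, scheme):
    # Slashes inside the identifier are percent-encoded (%2F) so they are
    # not mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    root = f"{scheme}://{server}"
    if prefix:
        root += "/" + prefix.strip("/")
    return f"{root}/{encoded_id}"


def image_url(server, prefix, identifier, region="full", size="full",
              rotation="0", quality="default", fmt="jpg", scheme="http"):
    """Build {scheme}://{server}{/prefix}/{id}/{region}/{size}/{rotation}/{quality}.{fmt}"""
    base = _base(server, prefix, identifier, scheme)
    return f"{base}/{region}/{size}/{rotation}/{quality}.{fmt}"


def info_url(server, prefix, identifier, scheme="http"):
    """Build {scheme}://{server}{/prefix}/{id}/info.json"""
    return _base(server, prefix, identifier, scheme) + "/info.json"


# Reproduces the "Region" demo URL from these notes.
region_demo = image_url(
    "libimages.princeton.edu", "loris2",
    "pudl0130/8555444/02/00000007.jp2",
    region="3930,60,1230,3600",
)
```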
Presentation API: What it is:
A bit more complex, but easy to sum up:
When you have a bunch of content that taken in aggregate represents a real-world object, you need to create relationships between those bits of content, those resources, to make an accurate and useful representation.
This is what the Presentation API aims to do, by defining a set of data structures that is focused on user experience:
Enough to drive a rich client
Facilitates ordering/sorting, arranging, transcribing/annotating
Serialized as JSON-LD, a syntax that is friendly to web developers
Native to JavaScript
Most importantly, UI developers don’t need to understand, e.g., metadata semantics to draw a feature rich user interface
Presentation API: Not Yet Another Metadata Standard
Agnostic of content standards
No descriptive metadata semantics
Instead, as I’ve said, it’s an API:
middleware
As with the image API, just enough info to drive a rich web-client.
There are five core parts in the Presentation API
They’re best explained by example, so what we’re going to do is walk our way up this graph, building up an object.
It’s a little easier to talk about Content and Canvas together.
Canvas is the fundamental building block of the model. It represents the notion of a physical unit. You might not have an image; maybe you just know it exists
Following the shared canvas data model and the Canvas metaphor, any content is “painted” onto the Canvas.
You can think of it like a PowerPoint slide
Content falls into two categories: Images and Annotations.
Here we see an image “painted” on to the canvas
And annotations, which can be assigned to regions on the canvas.
It’s important to note that you may not have an image, or you may have multiple images, or may have images of fragments that you need to arrange on a canvas, or you may only have a transcription, or you may have NOTHING but knowledge of this thing’s existence; that’s all OK, and exactly why the abstract notion of canvas is separate from any given representation.
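As a sketch of the separation between canvas and content, the helper below builds a Canvas that may or may not have an image painted onto it. The function name `make_canvas`, the example.org identifiers, and the choice of the level 1 Image API profile are assumptions for illustration; the property names follow the Presentation API 2 JSON shown on the slide.

```python
def make_canvas(canvas_id, label, width, height, image_id=None, service_id=None):
    """Build a Canvas dict; the image annotation is optional by design."""
    canvas = {
        "@id": canvas_id,
        "@type": "sc:Canvas",
        "label": label,
        "height": height,
        "width": width,
        "images": [],  # stays empty when we only know the page exists
    }
    if image_id is not None:
        resource = {"@id": image_id, "@type": "dctypes:Image"}
        if service_id is not None:
            # Link back to the Image API service for this image.
            resource["service"] = {
                "@context": "http://iiif.io/api/image/2/context.json",
                "@id": service_id,
                "profile": "http://iiif.io/api/image/2/level1.json",  # assumed level
            }
        canvas["images"].append({
            "@type": "oa:Annotation",
            "motivation": "sc:painting",  # the image is "painted" on the canvas
            "resource": resource,
            "on": canvas_id,
        })
    return canvas


# A page known to exist but not yet photographed: a canvas with no content.
missing_page = make_canvas("http://www.example.org/iiif/book1/canvas/p2.json",
                           "p. 2", 750, 1000)
```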
Continuing our way up the model; so far we’ve painted a single image onto a canvas.
But, as it turns out, this panel is part of a hexaptych; one of six that go together to form a map of Rome. The Sequence construct allows us to build up these relationships, put them in order, etc.
TODO: TALK ABOUT VIEWER IMPLICATIONS, not just for these panels, but page turners for books; note that Ranges also exist.
The API distinguishes between rtl, ltr, ttb, btt directionality
There are also features for, e.g. indicating that a page should be skipped
Continuous vs. paged viewingHints
Build Filmstrips or reference strips and pages of ordered thumbnails.
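A minimal sketch of how these Sequence features might look in code, assuming canvases are plain dicts. The names `make_sequence` and `pageable_canvases` and the example.org identifiers are hypothetical; the long-form direction values and the `paged`, `continuous`, and `non-paged` hints are the ones used by the Presentation API 2.

```python
# Map the shorthand used in these notes to the API's viewingDirection values.
VIEWING_DIRECTIONS = {
    "ltr": "left-to-right",
    "rtl": "right-to-left",
    "ttb": "top-to-bottom",
    "btt": "bottom-to-top",
}


def make_sequence(sequence_id, canvases, direction="ltr", hint="paged"):
    """Build a Sequence dict that puts canvases in order."""
    return {
        "@id": sequence_id,
        "@type": "sc:Sequence",
        "viewingDirection": VIEWING_DIRECTIONS[direction],
        "viewingHint": hint,  # e.g. "paged" or "continuous"
        "canvases": list(canvases),
    }


def pageable_canvases(sequence):
    """Canvases a page-turner should show, skipping those marked non-paged."""
    return [c for c in sequence["canvases"]
            if c.get("viewingHint") != "non-paged"]
```

A viewer could use the same ordered list to build a filmstrip or a page of thumbnails.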
Next we have Manifests. As its name suggests, the Manifest is the package of all of the content, canvases, sequences that need to be drawn together to represent an object, along with metadata and other properties we’ll talk about momentarily.
All of these constituent parts are either contained in a JSON-LD document (we’ll talk about JSON-LD in a bit) that represents the Manifest, or are referenced via URIs in the Manifest.
The Manifest also contains a set of labels and values that may be used to represent descriptive metadata in a viewer; we’ll get back to that when we look at a sample manifest shortly.
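A rough sketch of the Manifest as a package, assuming the sequences are embedded rather than referenced by URI. The function name `make_manifest` and the example.org identifier are hypothetical; the "Created"/"1300" pair is the metadata example from the slide.

```python
import json


def make_manifest(manifest_id, label, sequences, metadata=None):
    """Wrap sequences plus label/value metadata in a Manifest dict."""
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": manifest_id,
        "@type": "sc:Manifest",
        "label": label,
        # Plain label/value pairs for display; no semantics implied.
        "metadata": [{"label": k, "value": v}
                     for k, v in (metadata or {}).items()],
        "sequences": list(sequences),
    }


manifest = make_manifest(
    "http://www.example.org/iiif/book1/manifest.json",
    "Book 1",
    sequences=[],
    metadata={"Created": "1300"},
)
manifest_json = json.dumps(manifest, indent=2)  # what a server would return
```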
And collections, not surprisingly, are groups of manifests. The only thing to note is that collections are recursive; i.e. they may contain other collections.
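Because collections are recursive, a client typically walks them recursively. The sketch below assumes sub-collections are embedded in the document under `collections` and manifests are listed under `manifests`; in practice either may be given only as a URI that has to be fetched first. The function name `manifest_ids` and the example.org identifiers are hypothetical.

```python
def manifest_ids(collection):
    """Yield the @id of every manifest in a collection, depth first."""
    for manifest in collection.get("manifests", []):
        yield manifest["@id"]
    for sub_collection in collection.get("collections", []):
        # Collections may contain other collections, so recurse.
        yield from manifest_ids(sub_collection)


top = {
    "@id": "http://example.org/iiif/collection/top",
    "@type": "sc:Collection",
    "manifests": [{"@id": "http://example.org/iiif/book1/manifest"}],
    "collections": [{
        "@id": "http://example.org/iiif/collection/maps",
        "@type": "sc:Collection",
        "manifests": [{"@id": "http://example.org/iiif/map1/manifest"}],
    }],
}
all_ids = list(manifest_ids(top))
```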
Moving on, there are a few properties that can be attached to most of the nodes in the model. These take the form of simple key-value pairs and, as I said earlier, there are no content semantics attached. They’re just labels and values; we did not set out to create another metadata standard.
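Since the pairs carry no semantics, a viewer can display them without interpreting them. A minimal sketch, assuming a resource is a plain dict; the function name `display_rows` and the "Attribution" row heading are hypothetical choices for illustration.

```python
def display_rows(resource):
    """Return (label, value) rows for a viewer's info panel, as given."""
    rows = [(pair["label"], pair["value"])
            for pair in resource.get("metadata", [])]
    # Attribution is text that is required to be displayed, when present.
    if "attribution" in resource:
        rows.append(("Attribution", resource["attribution"]))
    return rows


rows = display_rows({
    "label": "Book 1",
    "metadata": [{"label": "Created", "value": "1300"}],
    "attribution": "Provided by Example Library",
})
```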
You can see how these properties are used in Mirador.
Also ranges
Just a quick word about serialization: Like the image API, the Presentation API uses JSON-LD, which is:
Easy for web developers to understand and consume
Without sacrificing the semantics of linked data.
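To illustrate the "easy to consume" point: a JSON-LD document can be read with an ordinary JSON parser, treating keys such as `@id` and `@type` as plain strings. The sketch below uses a trimmed version of the canvas from the slide, with the elided annotation lists left empty as an assumption.

```python
import json

CANVAS_JSON = """
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "http://www.example.org/iiif/book1/canvas/p1.json",
  "@type": "sc:Canvas",
  "label": "p. 1",
  "height": 1000,
  "width": 750,
  "images": [],
  "otherContent": []
}
"""

# No linked-data tooling needed: this is just a dict after parsing.
canvas = json.loads(CANVAS_JSON)
aspect_ratio = canvas["width"] / canvas["height"]
```

A linked-data client can still process the same document through the `@context`; the plain-JSON reading simply ignores it.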
Sample manifest walkthrough if time…
Talk a bit about each, what we mean, scope and current use cases.