E-book metadata Presented by Emily Gibson on 3rd November 2011 www.corbas.co.uk [email_address]
What is an e-book?
What is ... An industry standard maintained by the International Digital Publishing Forum (IDPF). Zip file consisting of: OCF, OPF, OPS. For more information, see http://idpf.org/epub/
Inside an EPUB ... and so on, one html file for each chapter ...
Inside OPF file (metadata)
Metadata
Why is metadata important?
ONIX
ONIX fields
Dublin Core
Dublin Core elements Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata elements:
EPUB 2 and EPUB 3
Linking books via metadata http://www.bookcountry.com/books/Map/Default.aspx
Summary
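The "Inside an EPUB" slide describes an EPUB as a zip file with a fixed layout. As a rough illustration only, the sketch below builds such a skeleton in memory with Python's standard zipfile module. The function names, the OPS file names and the placeholder file contents are assumptions made for this example, not part of the presentation, and the result is not a valid, readable book.

```python
import io
import zipfile

# Minimal container.xml pointing at the package file. The OPS/package.opf
# path follows the folder naming used in the notes; it is an assumption here.
CONTAINER_XML = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OPS/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_skeleton(files):
    """Return the bytes of a zip archive laid out like an EPUB container.

    `files` maps archive paths (for example "OPS/package.opf") to text.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is written first and stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        for path, text in files.items():
            archive.writestr(path, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(epub_bytes):
    """List the archive entries in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


# Placeholder contents only: a real book needs proper OPF, NCX and XHTML.
epub_bytes = build_epub_skeleton({
    "OPS/package.opf": "<package/>",
    "OPS/toc.ncx": "<ncx/>",
    "OPS/chapter01.xhtml": "<html/>",
    "OPS/styles.css": "",
})
entries = list_entries(epub_bytes)
print(entries)
```

Listing the entries of a real EPUB in the same way is a quick check that the mimetype file is present and comes first.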

Virtusales ebook metadata

Editor's Notes

  1. There are a lot of things about metadata that I don't know. And there's a good reason for this: most production artists limit their involvement to the metadata within the EPUB file. Every person involved in the process knows their little niche best. However, as systems begin to reflect the importance of metadata across all aspects of the publishing process, metadata systems need to be integrated and streamlined - so we have a single source for all content. As with all aspects of digital publishing, knowledge integration needs to work across departments.
  2. An e-book can be a PDF, an HTML file, an EPUB, a proprietary format (Kindle), an app (e.g. iPad/phone, Android), a text file (Project Gutenberg), etc. The problem is with multiple versions of the book: you want to link them, but currently there's no way, especially if you are working and archiving within a single-source, multiple-platform output workflow (e.g. XML first). How do you identify a book with its multiple manifestations? DOI or ISTC. A DOI can be assigned to any digital object and resolves to a permanent URI. The ISTC provides sales analysis systems, retail websites, library catalogues and other bibliographic systems with a method of automatically linking together publications of the "same content" and/or "related content", thus improving discoverability of products and efficiencies. An ISTC number is the link between a user's search for a piece of content and the ultimate sale or loan of a publication.
  3. Consists of three specifications: Open Container Format (OCF), which describes the on-disk format; Open Packaging Format (OPF), which describes the metadata and TOC; and Open Publication Structure (OPS), which describes the content formats.
  4. An EPUB contains the following files:
     - mimetype file: a single text file containing the single line "application/epub+zip" (missing in the image above)
     - META-INF folder: contains the container.xml file; if you embed fonts or use DRM, an encryption.xml file will also sit here
     - OPS folder: contains all the XHTML files for the book, a CSS file for the styles and image files if required
     - XHTML files (a stricter variant of HTML): one file for each chapter (forces page breaks) and one for the cover; these use UTF-8 encoding
     - toc.ncx file: for the navigational TOC, includes play order; an XML file
     - package.opf file: a simple text file that describes the contents of the book; consists of metadata, manifest, spine and guide
  5. Multiple Creator elements. Source is the original ISBN for the print book.
  6. It's a way of organising information: think of a library catalogue. "The association of standardized descriptive metadata with networked objects has the potential for substantially improving resource discovery capabilities by enabling field-based (e.g., author, title) searches, permitting indexing of non-textual objects, and allowing access to the surrogate content that is distinct from access to the content of the resource itself." (Weibel and Lagoze, 1997) Quality of metadata is a concern.
  7. "deliver rich product information into the supply chain in a standard form, to wholesalers and distributors, to larger retailers, to data aggregators, and to affiliate companies" "a template for the content and structure of a product record, ONIX has helped to stimulate the introduction of better internal information systems, capable of bringing together all the "metadata" needed for the description and promotion of new and backlist titles. The same core data can also be used to produce advance information sheets, catalogues and other promotional material." Automates: publisher, not retailer, control; lower risk of error; time saving.
  8. - this is likely the content of your database
     - three main standards: ONIX for Books, ONIX for Serials and ONIX for Publications Licenses (licenses for libraries' digital resources)
     - "not a data model" means that it's not meant to be a plan for the architecture of your content
     - ONIX for Books is in version 3.0, which is not backwards compatible with previous versions of ONIX
  9. Remember: not a data model; not enough for discoverability.
  10. "Dublin Core has as its goals:
Simplicity of creation and maintenance: The Dublin Core element set has been kept as small and simple as possible to allow a non-specialist to create simple descriptive records for information resources easily and inexpensively, while providing for effective retrieval of those resources in the networked environment.
Commonly understood semantics: Discovery of information across the vast commons of the Internet is hindered by differences in terminology and descriptive practices from one field of knowledge to the next. The Dublin Core can help the "digital tourist" -- a non-specialist searcher -- find his or her way by supporting a common set of elements, the semantics of which are universally understood and supported. For example, scientists concerned with locating articles by a particular author, and art scholars interested in works by a particular artist, can agree on the importance of a "creator" element. Such convergence on a common, if slightly more generic, element set increases the visibility and accessibility of all resources, both within a given discipline and beyond.
International scope: The Dublin Core Element Set was originally developed in English, but versions are being created in many other languages, including Finnish, Norwegian, Thai, Japanese, French, Portuguese, German, Greek, Indonesian, and Spanish. The DCMI Localization and Internationalization Special Interest Group is coordinating efforts to link these versions in a distributed registry. Although the technical challenges of internationalization on the World Wide Web have not been directly addressed by the Dublin Core development community, the involvement of representatives from virtually every continent has ensured that the development of the standard considers the multilingual and multicultural nature of the electronic information universe.
Extensibility: While balancing the needs for simplicity in describing digital resources with the need for precise retrieval, Dublin Core developers have recognized the importance of providing a mechanism for extending the DC element set for additional resource discovery needs. It is expected that other communities of metadata experts will create and administer additional metadata sets, specialized to the needs of their communities. Metadata elements from these sets could be used in conjunction with Dublin Core metadata to meet the need for interoperability. The DCMI Usage Board is presently working on a model for accomplishing this in the context of "application profiles.""
  11. These can be used inside an EPUB. Of these, three are required: identifier, title and language. Is this good enough for discoverability? What's missing? Genre, for example? Subject area? There is a free-form element called meta that has two attributes, name and content; you could use this for additional data, but it's not consistent enough to use reliably for discovery.
  12. Very rich relationships are possible in metadata now: you can express information about relationships. For example, the link element allows you to reference metadata by URL, e.g. a DOI or the URL for the ONIX feed. The minimum metadata in EPUB 3 is one identifier element, one title element, one language element, plus a meta element giving the last-modified date. The meta element allows you to express the vocabulary (e.g. ONIX, Dublin Core, etc.) and a relationship to another metadata element, as well as the content and name. The example they give is that you could use meta to identify that a piece of content is a video clip and then refine it with another meta tag to give the duration. A good example from the spec: <dc:creator id="creator">Haruki Murakami</dc:creator> <meta refines="#creator" property="role" scheme="marc:relators" id="role">aut</meta>. That says that Haruki Murakami is the creator, and the refinement says that he's the author. That's the new way to do <dc:creator opf:role='aut'>Haruki Murakami</dc:creator>, by the way. But the basis is that you can express relationships in the metadata. One good thing: the new metadata standard should stop that horrible problem of having to create a dummy author with the names of all of the authors. Another example of useful metadata (for this audience): <link rel="onix-record" href="http://example.org/onix/12389347"/>
  13. The metadata database for these books is all keyed individually, and for this particular website, but what about putting a genre field into your epub to make your content more discoverable? http://www.bookcountry.com/books/Map/Default.aspx
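Notes 11 to 13 can be made a little more concrete with a short sketch. It uses Python's standard xml.etree.ElementTree to read an EPUB 3 style metadata block, check that identifier, title and language are present, resolve the refines relationship from the Murakami example, and read a genre. The sample package, its placeholder identifier, title and date, the helper names, and the choice of dc:subject to carry the genre are all assumptions for illustration; the notes do not prescribe a particular element for genre.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Hypothetical package file. Only the creator/role pair comes from the
# notes; every other value is a placeholder invented for this sketch.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Title</dc:title>
    <dc:language>en</dc:language>
    <dc:creator id="creator">Haruki Murakami</dc:creator>
    <meta refines="#creator" property="role" scheme="marc:relators">aut</meta>
    <dc:subject>Literary fiction</dc:subject>
    <meta property="dcterms:modified">2011-11-03T00:00:00Z</meta>
  </metadata>
</package>
"""

REQUIRED = ("identifier", "title", "language")


def read_metadata(opf_text):
    """Collect selected Dublin Core values and any refining meta elements."""
    root = ET.fromstring(opf_text)
    metadata = root.find(f"{{{OPF}}}metadata")
    record = {"refinements": {}}
    for name in REQUIRED + ("creator", "subject"):
        record[name] = [el.text for el in metadata.findall(f"{{{DC}}}{name}")]
    for meta in metadata.findall(f"{{{OPF}}}meta"):
        target = meta.get("refines")
        if target:
            # "#creator" refers to the element whose id is "creator".
            properties = record["refinements"].setdefault(target.lstrip("#"), {})
            properties[meta.get("property")] = meta.text
    return record


def missing_required(record):
    """Return the names of required elements that have no value."""
    return [name for name in REQUIRED if not record[name]]


record = read_metadata(SAMPLE_OPF)
print(record["creator"], record["refinements"], record["subject"])
print(missing_required(record))
```

A check like missing_required could run as part of production, but it only confirms that fields exist; it says nothing about whether the genre or subject values are consistent enough to support discovery.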