1st Workshop of Transfer Information for Innovation · 3rd November 2011 · Valencia. Robbio, Antonella De; Tsakonas, Giannis. "Open Bibliographic Data and E-Lis: marrying good intentions"
1. Open Bibliographic Data and E-LIS: marrying good intentions
Antonella De Robbio, Università degli Studi di Padova, Italy, E-LIS Executive Board
Giannis Tsakonas, University of Patras, Greece, E-LIS Executive Board
First International Workshop for Transfer of Information for Innovation, November 3, 2011, Valencia, Spain
2. Background
- Exchanging bibliographic datasets is a traditional task in the world of
libraries. It mainly takes the form of collaborative cataloguing,
which avoids duplication of effort.
- The openness of bibliographic data concerns a number of organizations
and individuals, such as libraries and library consortia, indexing
services, funding agencies, publishers and many more.
3. Why open bibliographic data?
- Freeing access to bibliographic information.
- Making bibliographic data dynamic entities.
- Identifying quality issues in bibliographic datasets.
- Easing publication of small bibliographic datasets, as well as assembling
them at will.
- Facilitating the uploading of bibliographic data in the Linked Open Data
cloud.
- Advancing collaboration with other bibliographic organizations and
systems.
- Turning bibliographic data into research material, by mapping scholarly
research and activity.
4. Any complications?
- “Closed” attitudes that perceive bibliographic data as static property.
- Large coordinating organizations that follow rigid models or are
reluctant to change.
- Provenance cues that have been lost along the paths of cooperative
schemes, such as WorldCat.
5. Bibliographic data as linked data
- A reminder that openness can:
- facilitate the uploading of bibliographic data to the
Linked Open Data cloud.
- advance the reusability of bibliographic structures, such as
catalogues, taxonomies, vocabularies, etc.
6. Navigation in linked bibliographic data
- VIAF: The Virtual International Authority File
- http://viaf.org/
7. An example: Libris
- Libris released the Swedish National Bibliography
as Linked Data in 2008.
- It used related ontologies, such as FOAF for
individuals, SKOS for subjects, BIBO for parts of
books.
- Focused on availability (see “re-usability”) instead
of a perfect representation of the MARC records.
- Provides external links to Wikipedia, DBpedia, LC
Authorities (names & subjects) and VIAF.
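A minimal sketch of how a record might be described with the vocabularies named above. It is not taken from Libris: the example.org identifiers, the labels and the use of Dublin Core terms for the creator and subject links are assumptions made for illustration. Only the vocabulary namespaces are real.

```python
# Namespace URIs of the vocabularies named on the slide, plus Dublin Core terms
# (the choice of Dublin Core for the linking properties is an assumption).
FOAF = "http://xmlns.com/foaf/0.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BIBO = "http://purl.org/ontology/bibo/"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, invented for this illustration only.
BOOK = "http://example.org/bib/1"
AUTHOR = "http://example.org/auth/1"
SUBJECT = "http://example.org/subj/1"


def describe_book():
    """Return (subject, predicate, object) triples.

    An object is either a URI string or a ("literal", text) tuple.
    """
    return [
        (BOOK, RDF_TYPE, BIBO + "Book"),
        (BOOK, DCT + "creator", AUTHOR),
        (BOOK, DCT + "subject", SUBJECT),
        (AUTHOR, RDF_TYPE, FOAF + "Person"),
        (AUTHOR, FOAF + "name", ("literal", "Example Author")),
        (SUBJECT, RDF_TYPE, SKOS + "Concept"),
        (SUBJECT, SKOS + "prefLabel", ("literal", "Example subject")),
    ]


def to_ntriples(triples):
    """Serialize the triples as N-Triples, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, tuple):
            text = o[1].replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + text + '"'
        else:
            obj = "<" + o + ">"
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(describe_book()))
```

Each vocabulary covers one kind of entity (person, subject, bibliographic item), which is what lets other datasets link to the same identifiers.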
8. An example: Libris - the license
- From the site:* “... We see the investment in Open Data
as a strategic one and one that is needed to ensure
long term sustainability and competition when it
comes to the services needed by libraries and their
users as well as the right to control over their
collections. The license chosen is CC0 which waives
any rights the National Library have over the
National Bibliography and the authority data... ”
* http://bit.ly/qeGJTb
9. How do we make bibliographic data open?
- Today we encounter a few noteworthy initiatives
for the opening of bibliographic data.
- The effort is currently spearheaded by Open
Bibliographic Principles, a “product” of the Open
Knowledge Foundation.
- a hub of software solutions, policy texts,
application guidelines, as well as a
communication node among interested
parties.
11. Open Science, Open Government & Open Bibliographic Data
[Figure: Open Government Data Venn Diagram* by justgrimes. Labels: Open Science (scientific research data, raw data, research output as content); Open Government (public data: environment, public policies, laws…); Open Content; Open Bibliographic Data.]
* http://www.flickr.com/photos/notbrucelee/5241176871/
12. Open Definition
- The Open (Knowledge) Definition sets out principles to define the ‘open’
in open knowledge.
- The term knowledge is used broadly and includes all forms of data,
content such as music, films or books, as well as any other type of
information.
13. Open Definition’s requirements
- Open bibliographic data can be licensed in conformance with the Open
Definition requirements. In a nutshell, the Open Definition states:
- “A piece of content or data is open if anyone is free to use, reuse, and
redistribute it — subject only, at most, to the requirement to attribute and
share-alike.”
14. Which licenses to apply?
- Open or closed licenses?
- Data and content may have separate rights
- We must distinguish between the “Database” and its “Contents”
- homogeneous DB (no need to distinguish “Database” & “Contents”)
- non-homogeneous DB (need to distinguish “Database” & “Contents”)
- Creative Commons (CC) licenses are for content
- Open Data Commons licenses are for data
15. Creative Commons licenses
- CC licenses are expressed in 3 different formats:
- the metadata (machine-readable code),
- the Commons Deed (human-readable code),
- the Legal Code (lawyer-readable code).
- The core suite of Creative Commons licenses has
four key terms:
- Attribution (attribution stacking, usually not
an issue for LIS papers)
- Non-Commercial (What counts as commercial?
A&I tasks)
- Share Alike (reduces interoperability)
- No Derivatives (severely restricts use)
16. CC: six regularly used licenses
- Mixing and matching these conditions produces sixteen possible
combinations, of which six are regularly used.
- The combination of CC tools and the communities that use them is a vast
and growing digital commons, a pool of content that can be copied,
distributed, edited, remixed, and built upon, all within the boundaries
of copyright law.
- The six regularly used licenses:
- Attribution alone (by)
- Attribution + ShareAlike (by-sa)
- Attribution + Noncommercial (by-nc)
- Attribution + NoDerivatives (by-nd)
- Attribution + Noncommercial + ShareAlike (by-nc-sa)
- Attribution + Noncommercial + NoDerivatives (by-nc-nd)
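The arithmetic behind "sixteen combinations, six regular licenses" can be checked with a short sketch. The filtering rule is an assumption drawn from Creative Commons' own account of its license suite: NoDerivatives and ShareAlike are mutually exclusive, and combinations without Attribution are no longer offered. The function names are illustrative.

```python
from itertools import combinations

# The four conditions, in the order used in CC license codes.
CONDITIONS = ("by", "nc", "nd", "sa")


def all_combinations():
    """Every subset of the four conditions, including the empty one: 2**4 = 16."""
    return [
        frozenset(c)
        for r in range(len(CONDITIONS) + 1)
        for c in combinations(CONDITIONS, r)
    ]


def is_regular(combo):
    """Assumed rule: attribution is present, and nd and sa never appear together."""
    return "by" in combo and not {"nd", "sa"} <= combo


def code(combo):
    """Render a combination as a license code such as 'by-nc-sa'."""
    return "-".join(c for c in CONDITIONS if c in combo)


if __name__ == "__main__":
    combos = all_combinations()
    regular = sorted(code(c) for c in combos if is_regular(c))
    print(len(combos), regular)
```

With Attribution fixed, the remaining three conditions give eight subsets, and the two containing both nd and sa drop out, leaving six.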
17. CC PDF Converter
- CC PDF Converter is a free open source program that allows users to
convert documents into PDF files on Microsoft Windows operating
systems, while embedding a Creative Commons license. It uses the
following open source projects:
- Redmon, a port monitor redirector (slightly patched)
- GhostScript, used to create PDF files from the print PostScript output
(with a minor addition)
- libPNG and zlib, to display PNG images and to put license images
into the document
- XMLite, a simple XML parser by Kyung-min Cho, slightly modified
- and SQLite, a lightweight embedded database engine to access the
local license database
- The CC PDF Converter and its source code are licensed under the GPL.
18. CC tools
- Creative Commons Rights Expression Language (CC REL) is a
specification describing how license information may be described
using RDF and how license information may be attached to works.
- Besides licenses, CC also offers a way to release material into the public
domain through CC0, a legal tool for waiving as many rights as legally
possible, worldwide.
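As a rough illustration of attaching license information to a work in RDF, the sketch below emits a Turtle statement using the `cc:license` and `cc:attributionName` properties from the CC REL namespace. The work URI and the helper name `license_statement` are hypothetical. This is one possible serialization, not the only form CC REL allows.

```python
CC_NS = "http://creativecommons.org/ns#"
CC0_URI = "http://creativecommons.org/publicdomain/zero/1.0/"


def license_statement(work_uri, license_uri, attribution_name=None):
    """Return Turtle asserting cc:license (and optionally cc:attributionName)."""
    lines = ["@prefix cc: <%s> ." % CC_NS, ""]
    statement = "<%s> cc:license <%s>" % (work_uri, license_uri)
    if attribution_name is not None:
        # Escape backslashes and quotes so the literal stays valid Turtle.
        escaped = attribution_name.replace("\\", "\\\\").replace('"', '\\"')
        statement += ' ;\n    cc:attributionName "%s"' % escaped
    lines.append(statement + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical dataset URI, used only for this example.
    print(license_statement("http://example.org/dataset/1", CC0_URI))
```

A machine-readable statement like this is what lets harvesters discover the license of a dataset without reading the human-readable deed.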
19. Conformant content licenses
License                                    Domain          By   SA
Creative Commons Attribution               Content         Y    N
Creative Commons Attribution Share-Alike   Content         Y    Y
Creative Commons CCZero                    Content, Data   N    N
GNU Free Documentation License             Content         Y    Y
  (Comment: only conformant subject to certain provisos)
UK PSI Public Sector Information           Content, Data   Y    N
Free Art License                           Content         Y    Y
MirOS License                              Code, Content   Y    N
20. Conformant data licenses
License                                              Domain          By   SA
Open Data Commons Public Domain Dedication
and Licence (PDDL)                                   Data            N    N
  Dedicate to the Public Domain (all rights waived)
Open Data Commons Attribution License                Data            Y    N
  Attribution for data(bases)
Open Data Commons Open Database License (ODbL)       Data            Y    Y
  Attribution-ShareAlike for data(bases)
Creative Commons CCZero                              Content, Data   N    N
  Dedicate to the Public Domain (all rights waived)
21. Non-conformant Licenses
- Creative Commons No-Derivatives licenses (by-nd-*)
violate principle 3, “Reuse”, as they do not allow
works, in part or in whole, to be re-used in
derivative works.
- Creative Commons NonCommercial licenses (by-nc-*)
do not support the Open Knowledge
Definition principle 8, “No Discrimination
Against Fields of Endeavor”, as they exclude
usage in commercial activities.
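The two rules above can be turned into a small check on CC license codes. This is a sketch under the assumption that the nd and nc elements are the only sources of non-conformance for CC licenses. It does not validate the code itself, and the function names are illustrative.

```python
def cc_elements(code):
    """Split a CC license code such as 'by-nc-sa' into its set of elements."""
    return set(code.lower().split("-"))


def open_definition_issues(code):
    """List the principles named on the slide that a CC license code conflicts with."""
    elements = cc_elements(code)
    issues = []
    if "nd" in elements:
        issues.append("principle 3, Reuse")
    if "nc" in elements:
        issues.append("principle 8, No Discrimination Against Fields of Endeavor")
    return issues


def is_conformant(code):
    """True when neither of the two rules flags the code."""
    return not open_definition_issues(code)


if __name__ == "__main__":
    for code in ("by", "by-sa", "by-nd", "by-nc", "by-nc-nd", "by-nc-sa"):
        print(code, is_conformant(code), open_definition_issues(code))
```

Of the six regular CC licenses, only by and by-sa pass, which matches the table of conformant content licenses.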
22. Stars and clouds
- MacKenzie Smith, in the frame of the LODLAM initiative, proposed a
ranking for the openness of data from informational and cultural
organizations, similar to the Linked Data ranking:*
- 4 stars: Public Domain (CC0 / ODC PDDL / Public Domain Mark)
- 3 stars: Attribution License (CC-BY / ODC-BY)
- 2 stars: Attribution License (CC-BY / ODC-BY) - method specified
- 1 star: Attribution Share-Alike License (CC-BY-SA / ODC-ODbL)
* http://bit.ly/kPWKHA
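The ranking can be sketched as a lookup, assuming the four tiers run from four stars (public domain) down to one star (share-alike), which is consistent with the description of ODbL as a one-star license on the next slide. The license identifiers and the `method_specified` flag are illustrative names, not part of the original proposal.

```python
PUBLIC_DOMAIN = {"CC0", "ODC-PDDL", "Public Domain Mark"}
ATTRIBUTION = {"CC-BY", "ODC-BY"}
SHARE_ALIKE = {"CC-BY-SA", "ODC-ODbL"}


def openness_stars(license_id, method_specified=False):
    """Return the assumed star rating, or None for a license outside the ranking.

    method_specified marks an attribution license whose rights holder
    prescribes how the attribution must be given.
    """
    if license_id in PUBLIC_DOMAIN:
        return 4
    if license_id in ATTRIBUTION:
        return 2 if method_specified else 3
    if license_id in SHARE_ALIKE:
        return 1
    return None


if __name__ == "__main__":
    for lic in ("CC0", "ODC-BY", "ODC-ODbL", "CC-BY-NC"):
        print(lic, openness_stars(lic))
```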
23. The case of E-LIS
- In February 2011 the Executive Board (EB) received an invitation from
the OKF to endorse OBD.
- The invitation was discussed in the EB and was integrated into the
Acropolis Strategy document.
- After a thorough discussion, the EB decided in July 2011 to adopt the
ODbL license, a one-star license.
- Therefore E-LIS (meta)data are freely available to anyone, but
attribution and share-alike are required. In practice, anyone can:
- use the metadata for any purpose
- provided that an attribution to E-LIS is given
- and re-distribute the data (as is or in combination with other datasets)
under the same terms.
24. The rationale
- E-LIS gets an attribution whenever the data are used
- protecting and promoting the value-added work of our community
- E-LIS joins a coalition of other share-alike organizations and
institutions.
- the share-alike requirement aims at securing open
redistribution.
- E-LIS is aware that the share-alike requirement may prove a constraint
when its data are combined extensively with other datasets.
- however, it meets the minimum requirement: it is a clear and
explicit statement.
25. Is E-LIS alone in the OB ecology?
- The New Zealand National Library provides the
national bibliography as MARC/MARCXML sets
(approx. 350,000 records) licensed under a
Creative Commons Attribution license.
- From the site:* “... The records were originally created
by the National Library of New Zealand, with a
small number of contributions from the libraries of
New Zealand through Te Puna. Bibliographic
records and book cover images used under license
by the National Library have been excluded from
this dataset release... ”
* http://www.natlib.govt.nz/services/data
26. This and something more
- E-LIS publishes JITA, a taxonomy of documents in
Library and Information Science (currently under
revision), as a Linked Open Dataset.
- JITA is available through DataHub,* alternatively
known as the Comprehensive Knowledge Archive
Network, which is a registry of open knowledge
datasets and projects.
* http://ckan.net/dataset/jita
27. What now?
- check the “ecology” figure again to identify where you stand.
- find the proper tools that you need to move in the direction you
want
- exploit them
- check the open bibliographic data guide* to find related use cases
- don’t forget the applicable national laws
* http://obd.jisc.ac.uk/
28. Thank you for your attention! / ¡Gracias por su atención!
Creative Commons License - Attribution 1.0 Generic