Biblissima’s Choices of Tools and Methodology for Interoperability Purposes
1. Biblissima’s Choices of Tools and Methodology for Interoperability Purposes
Eduard FRUNZEANU
Régis ROBINEAU
Équipex Biblissima
http://biblissima-condorcet.fr
5th HÉLOÏSE WORKSHOP
Madrid, 19-21 October 2015
3. Partners and challenges
Around 40 databases with DATA and IMAGES
• Fields: codicology, catalography, manuscript transmission
• Databases: Esprit des livres (ENC), Codicologia (IRHT), Bibale (IRHT), BnF Archives et Manuscrits, Pinakes (IRHT), Reliures (BnF)
4. Partners and challenges
Around 40 databases with DATA and IMAGES
• Fields: iconography, prosopography, textual corpora
• Databases: Mandragore (BnF), Initiale (IRHT), Bibliothèques Virtuelles Humanistes (CESR), Prosopographie des inventaires (MRSH Caen), BUDE (IRHT), Sermones (CIHAM)
5. Solutions and tools to handle and build
interoperability of DATA & IMAGES
❖ Ontology (based on CIDOC-CRM and FRBRoo)
❖ Thesaurus / Authority File (Ginco / BaseX)
❖ Viewer (Mirador)
❖ Semantic Web Application Framework (CubicWeb)
6. Building the Thesaurus / Authority File
• Thesaurus:
– Types of data:
• geographical names
• iconographical descriptors
• specialised terminology (codicology, palaeography)
• languages, etc.
– Standard / Tool: SKOS / Ginco
• Authority File:
– Types of data:
• persons and corporate bodies
• works
– Standard / Tool: XML-TEI / BaseX
7. Geographical Thesaurus
Types of geographical data:
• descriptors: geographical places identified in miniatures
(historical, disappeared, fictional, non-identified, current)
• places of origin: city or abbey where an item (manuscript or
printed work) was copied / edited / painted
• holding institutions: archives, libraries, museums
Structure and format of geographical data:
• hierarchical thesaurus or flat lists for the descriptors
• places of origin associated with the relevant provinces,
countries and geographical areas
• Country / City / Repository for the holding institutions
8. Starting point for Biblissima’s GeoThesaurus
2 datasets: Mandragore (BnF) & Initiale (IRHT)
Linked Data repositories & methods used for alignment:
• automatic alignment, checked & manually corrected, to
geonames.org, data.bnf.fr (Map Department & Rameau),
dbpedia.org
• manual alignment to specialised repositories: pleiades.stoa.org,
trismegistos.org, bibelwissenschaft.de
SKOS properties used to label the alignment and organise the
thesaurus: prefLabel, altLabel, broader, narrower, exactMatch,
closeMatch, relatedMatch
11. Hierarchy of Biblissima’s GeoThesaurus
I. General notions
II. Political geography (based on feature codes of Geonames)
A. Geographical areas (= Dewey classification)
1. Countries
a) Counties
(1) Cities
2. Ancient cities and provinces
III. Physical geography (based on feature codes of Geonames)
A. Continents B. Islands & Peninsulas C. Deserts & Oases
D. Rivers, Lakes, Seas E. Mountains & Volcanos F. Forests & Parks
IV. Human constructions (based on feature codes of Geonames)
A. Monasteries B. Castles & palaces C. Religious sites
D. Bridges E. Towers & fortresses
V. Fictional places
VI. Non-identified places
VII. Disappeared places
12. Integrating a geographical thesaurus into the
CubicWeb application used to build the
Biblissima portal
Geonames feature codes: MNMT, OBS, RLG
18. Authority File
Data about persons:
- Personal Name Heading
- Alternative Name Forms
- Gender
- Date of birth / death
- Place of birth / death
- Titles / Relators
- Works
- Alignments with linked data repositories: data.bnf.fr, viaf.org
Relationships between persons (to be modelled in the near
future):
- academic (master of / student of)
- genealogical (father of / husband of)
- institutional (friar of / member of)
- intellectual (translator of / copyist of / editor of / illuminator of)
- socio-cultural (dedicatee of / donor of / patron of / sponsor of)
22. Curate and enrich datasets
Operations:
• identify identical items with different graphical forms
• align with other linked data repositories:
– data.bnf.fr
– dbpedia.org
– viaf.org
– geonames.org
• extract complementary information
• dispatch the complementary information to the original datasets
Tools:
• OpenRefine
• GoogleXML
• PHP scripts
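The first operation above, identifying identical items with different graphical forms, can be approximated with a normalisation key. The sketch below is a plain-Python illustration, not the OpenRefine clustering used in the project; the function names and the sample values are assumptions.

```python
import unicodedata
from collections import defaultdict


def normalise(name: str) -> str:
    """Build a comparison key: no diacritics, no case, no punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed
                            if not unicodedata.combining(ch))
    # Replace anything that is not a letter or digit with a space.
    cleaned = "".join(ch if ch.isalnum() else " "
                      for ch in without_marks.casefold())
    # Collapse runs of whitespace.
    return " ".join(cleaned.split())


def group_variants(names):
    """Return only the keys shared by more than one graphical form."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise(name)].append(name)
    return {key: forms for key, forms in groups.items() if len(forms) > 1}


# Sample values chosen for illustration.
candidates = group_variants(
    ["Saint-Denis", "saint denis", "SAINT DENIS", "Châteauroux", "Chateauroux", "Cluny"]
)
```

A key of this kind only catches spelling, case and punctuation variants. Forms such as "Titus Livius" and "Livius" do not share a key and still need to be aligned to a common URI.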
27. SPARQL query to retrieve the personal name
heading for an author via data.bnf.fr/sparql
Alignment of the database graphical form:
Abbo Floriacensis = http://data.bnf.fr/ark:/12148/cb12584637x
28. SPARQL query to retrieve alternative forms of
an author’s name via data.bnf.fr/sparql
Alignment of the database graphical form:
Abbo Floriacensis = http://data.bnf.fr/ark:/12148/cb12584637x
29. SPARQL query to retrieve URIs from other
linked data repositories
Alignment of the database graphical form:
Abbo Floriacensis = http://data.bnf.fr/ark:/12148/cb12584637x
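The queries on slides 27 to 29 were shown as screenshots and are not reproduced in the text. The Python sketch below shows one plausible shape for them. It assumes that the name heading, the alternative forms and the external URIs are exposed as `skos:prefLabel`, `skos:altLabel` and `skos:exactMatch` / `skos:closeMatch` on the aligned URI; that mapping is an assumption, not something the slides confirm. No network call is made; the code only builds the query text and the request URL.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.bnf.fr/sparql"  # endpoint named on the slides

# Assumed mapping from the three slides to SKOS properties.
ALLOWED_PROPERTIES = {"prefLabel", "altLabel", "exactMatch", "closeMatch"}


def build_query(uri: str, prop: str) -> str:
    """Return a SPARQL query fetching one SKOS property of an aligned URI."""
    if prop not in ALLOWED_PROPERTIES:
        raise ValueError(f"unsupported property: {prop}")
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in uri for ch in '<>"{}|^`\\ '):
        raise ValueError(f"invalid URI: {uri!r}")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?value WHERE {\n"
        f"  <{uri}> skos:{prop} ?value .\n"
        "}"
    )


def request_url(query: str) -> str:
    """Encode the query as a GET request, per the SPARQL protocol."""
    return ENDPOINT + "?" + urlencode({"query": query})


abbo = "http://data.bnf.fr/ark:/12148/cb12584637x"  # URI from the slides
heading_query = build_query(abbo, "prefLabel")       # slide 27
variants_query = build_query(abbo, "altLabel")       # slide 28
external_query = build_query(abbo, "exactMatch")     # slide 29
```

The result format would normally be negotiated with an `Accept` header such as `application/sparql-results+json` when the URL is actually requested.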
41. PHP script
• Input (CSV): list of places aligned with data.bnf.fr URIs
Get lat/long (SPARQL query in the loop):
SELECT ?concept ?spatialThing ?long ?lat
WHERE {
  ?concept skos:closeMatch <".$uri."> .
  ?concept foaf:focus ?spatialThing .
  ?spatialThing geo:long ?long .
  ?spatialThing geo:lat ?lat .
}
• Output (CSV): source data enriched with
latitude/longitude coordinates
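The input, loop and output steps of this script can be sketched as follows. This is a Python illustration of the same flow, not the project's PHP code. The `lookup` callable stands in for the HTTP call to the SPARQL endpoint, and the column name `uri` is an assumption about the CSV layout.

```python
import csv
import io

PREFIXES = (
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
    "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n"
)


def coordinates_query(uri: str) -> str:
    """Same graph pattern as the query on the slide, for one URI."""
    return (
        PREFIXES
        + "SELECT ?concept ?spatialThing ?long ?lat WHERE {\n"
        + f"  ?concept skos:closeMatch <{uri}> .\n"
        + "  ?concept foaf:focus ?spatialThing .\n"
        + "  ?spatialThing geo:long ?long .\n"
        + "  ?spatialThing geo:lat ?lat .\n"
        + "}"
    )


def enrich(source, lookup, uri_column="uri") -> str:
    """Copy the CSV from `source`, appending lat/long columns.

    `lookup` takes a SPARQL query string and returns (lat, long) or None;
    in a real run it would send the query to the endpoint.
    """
    reader = csv.DictReader(source)
    fieldnames = list(reader.fieldnames or []) + ["lat", "long"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        result = lookup(coordinates_query(row[uri_column]))
        # Leave the cells empty when the endpoint has no coordinates.
        row["lat"], row["long"] = result if result else ("", "")
        writer.writerow(row)
    return out.getvalue()
```

Passing the lookup as a parameter keeps the loop testable without network access: a stub that returns fixed values can replace the endpoint call.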
42. Using technical solutions to build a prototype
based on Initiale & Mandragore data
http://demos.biblissima-condorcet.fr/prototype/
One of the prototype’s main objectives: to build interoperability between two datasets
from the iconographical databases Initiale (IRHT) and Mandragore (BnF)
43. Aligning different forms of a name
• Titus Livius / Database Mandragore (BnF) http://mandragore.bnf.fr
http://data.bnf.fr/ark:/12148/cb11886799m
44. Aligning different forms of a name
• Livius / Database Initiale (IRHT) http://initiale.irht.cnrs.fr
http://data.bnf.fr/ark:/12148/cb11886799m
45. Find relevant results for data from two
different datasets in the same interface
Initiale
Mandragore
46. Find relevant results in a web search engine
by searching the URI
BnF, Biblissima = Titus Livius
47. New visualisation tools to enhance research
Introducing Mirador:
• IIIF-compatible web viewer (Shared Canvas / OA)
• Zoom, compare, annotate, share
• multi-window workspace, cross-repository
(interoperability)
iiif.io / projectmirador.org
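Zooming in an IIIF-compatible viewer relies on the IIIF Image API, in which each image region is addressed by a URL of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The Python sketch below builds such a URL. The server address and identifier are placeholders, and the defaults follow version 2 of the API (`full` size, `default` quality); other versions use different keywords.

```python
from urllib.parse import quote


def region_xywh(x: int, y: int, width: int, height: int) -> str:
    """Pixel region, counted from the top-left corner of the full image."""
    return f"{x},{y},{width},{height}"


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "full", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Assemble an IIIF Image API request URL.

    The identifier is percent-encoded so that any slash it contains is not
    read as a path separator.
    """
    encoded_id = quote(identifier, safe="")
    return (f"{base.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Placeholder server and identifier, for illustration only.
detail = iiif_image_url(
    "https://example.org/iiif",
    "ms-17708/f210r",
    region=region_xywh(100, 200, 300, 400),
    size="600,",  # scale the region to a width of 600 pixels
)
```

Because the region and size are part of the URL, any compliant viewer can request the same detail from any compliant image server, which is what makes the cross-repository workspace possible.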
48. Autograph handwriting and personal identity
Mirador and its potential uses:
• Trace, identify and index an author’s personal annotations. Ex.:
Marginalia by Florus of Lyon on St Petersburg, National Library of Russia,
Lat.F.papyr. I.1, b (annotated in Mirador)
49. Autograph handwriting and personal identity
Mirador and its potential uses:
• Create a database of autographs in order to better identify
scribes and writers
Note about an autograph letter by Jean Hervin in the manuscript
Paris, BnF Français 17708, f. 210r
50. Autograph handwriting and personal identity
BnF Français 17708, f. 210r, annotated in Mirador viewer:
http://demos.biblissima-condorcet.fr/mirador/?json=56241541e4b01190df3c263d
51. Stylistic features and personal identity
Mirador and its potential uses:
Compare stylistic features to better identify artists (e.g. Willem
Vrelant in Initiale)
Paris, Bibliothèque Sainte-Geneviève,
ms. 0811, f. 005
Surrender of Valenciennes to Herman,
Count of Mons
Attribution: Willem Vrelant (entourage)
Paris, Bibliothèque Sainte-Geneviève,
ms. 0809, f. 317
Siege of Mainz by the Romans
Attribution: Willem Vrelant (entourage)
53. Restoring the relationship between an artistic
work and its textual context
The interpretation of a miniature is dependent on the original
textual context.
However, there are many damaged manuscripts:
• around 280 notes about cut miniatures in Initiale database
• very few of these cut miniatures have been located
Example: Châteauroux BM, ms. 5, Grandes Chroniques de France
http://demos.biblissima-condorcet.fr/chateauroux/
54. What’s next?
• Integrate all partner databases using a common XML
format to simplify the ingestion of data into the CubicWeb
application
• Integrate the Mirador viewer within the web portal (with
new functionalities)
• Enhance the search engine and navigation within the portal
• Propose new visual representations of data