3. WW1 Discovery Project
• Exemplar illustrating principles
of the JISC Discovery initiative
• Discovery is about advocating
‘open’ and ‘aggregating’
• Make digital content more
discoverable by people and
machines
• Building WW1 aggregation API
and discovery layer
www.discovery.ac.uk
4. WW1 Discovery: How?
• Aggregate data from
existing APIs – IWM
and NMM
• Help others with
example APIs – BL,
MCR Archives, Welsh
Voices, LSE
• Formats: SOLR, RSS,
OpenSearch, OAI-PMH,
CSV
8. Some Challenges
• Difficulties merging data
– varied content
• Lack of content
– images
– geo-data
• Content licenses not open
• Lack of APIs
9. Linking Lives
• Linking Lives is a JISC-funded project to create
an end-user interface based on Linked Data
• A biographical interface, providing information
about individuals that is taken from a variety
of sources
• It enables us to place archival descriptions
within a much broader context
16. Event: Birth of Skinner, Beverley, 1938-1999, artist and
Death of Skinner, Beverley, 1938-1999, artist
17. Martha Beatrice Webb
Image
Life dates: 1858-1943
Epithet: social reformer and historian
Family name: Webb
Place of birth: Gloucester, England
Place of death: Liphook, Hampshire, England
Works
• Our Partnership
• My Apprenticeship
• The case for the factory acts
• Beatrice Webb’s diaries; edited by Margaret Cole
• The Diary
Knows
• http://dbpedia.org/page/George_Bernard_Shaw
• http://dbpedia.org/page/Sidney_Webb,_1st_Baron_Passfield
Biographical Notes
from: Beatrice Webb letters
Beatrice Webb (1858 - 1943). Fabian Socialist, social reformer, writer,
historian, diarist. Wife, collaborator and assistant of Sidney Webb,
later Lord Passfield. Together they contributed to the radical ideology
first of the Liberal Party and later of the Labour Party.
from: Beatrice Webb, A summer holiday in Scotland, 1884.
Beatrice Webb (1858-1943), nee Potter, social reformer and diarist.
Married to Sidney Webb, pioneers of social science. She was involved
in many spheres of political and social activity including the Labour
Party, Fabianism, social observation, investigations into poverty,
development of socialism, the foundation of the National Health
Service and post war welfare state, the London School of
21. Hub data inconsistencies
• Winston Leonard Churchill
• Sir Winston Leonard Spencer Churchill
• Churchill, Sir, Winston Leonard Spencer, 1874-1965,
knight, prime minister and historian
• Churchill, Winston Leonard, 1874-1965, prime
minister
• Churchill, Sir Winston, 1874-1965, knight,
statesman and historian
22. We can start to say things that go beyond what is known within our
own space…and if we use the same URIs we can link data sources
much more easily
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
<is the same as>
http://viaf.org/viaf/86607236/
<is the same as>
http://bnb.data.bl.uk/doc/person/WebbBeatrice1858-1943
26. Conclusions
• APIs offer useful tools & lightweight approach
• WW1: creating APIs for institutions
• Locah: Linked Data for over 200 institutions
• Linked Data makes use of the Web architecture
• Uses HTTP URIs to represent resources
• Navigate through things via URIs – Web of Data
• Can make APIs more ‘Webby’
27. Conclusions
• Linked Data opens data up to wide variety of
uses
• Can de-reference classes and properties
• Need to think more about the end user
• Need more tools
• Need to collaborate
WW1 Discovery website. This is the home for our WW1 API project, which looked specifically at lightweight APIs. Say what API stands for (Application Programming Interface): a kind of back door for machines to get access to the content of websites and services.
About the WW1 Discovery project
Using APIs and helping institutions set up APIs.
Challenges include the lack of APIs available to aggregate data.
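The aggregation step can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the source names, field names and the `Record` schema are all hypothetical, and it assumes each feed has already been fetched and parsed into a list of dictionaries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Common schema that every source is mapped onto (hypothetical)."""
    source: str
    title: str
    image_url: Optional[str] = None


# Hypothetical mappings: common field name -> field name used by each source.
FIELD_MAPS = {
    "source_a": {"title": "title", "image_url": "thumbnail"},
    "source_b": {"title": "name", "image_url": "img"},
}


def normalise(source: str, raw: dict) -> Record:
    """Map one raw item from a source onto the common Record schema."""
    mapping = FIELD_MAPS[source]
    return Record(
        source=source,
        title=str(raw.get(mapping["title"], "")).strip(),
        # Missing or empty image fields become None (the 'lack of content' case).
        image_url=raw.get(mapping["image_url"]) or None,
    )


def aggregate(feeds: dict) -> list:
    """Merge already-parsed feeds ({source: [raw items]}) into one list."""
    merged = []
    for source, items in feeds.items():
        for raw in items:
            merged.append(normalise(source, raw))
    return merged
```

Even in this toy form the challenges from the slide show up: every new source needs its own field mapping, and fields such as images are simply absent for some records.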
Linking Lives takes a different approach – using Linked Data rather than merging data through APIs.
The core data comes from the Archives Hub, the UK aggregator of archival descriptions.
The Archives Hub provides classic search/retrieve of ISAD(G) archival finding aids.
Descriptions are often rich in content – often extensive scope & content and biographical history. Controlled index terms – subjects and names. Every description has an HTTP URI.
Often multi-level descriptions.
The Locah project transformed Archives Hub EAD (XML) descriptions into RDF. The homepage provides access to an RDF dump, a SPARQL endpoint and the XSLT stylesheet.
Entities such as repositories have a number of triples describing relationships, including spatial data.
Triples relating to individuals. Include modelling of birth and death dates as events.
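As a rough sketch of what modelling birth and death as events means, the function below gives each event its own URI and hangs the type and date off that event rather than off the person. The `ex:` predicates and the event URI pattern are placeholders invented for illustration; the notes do not say which vocabulary is used.

```python
def life_event_triples(person_uri, birth_year=None, death_year=None):
    """Return (subject, predicate, object) tuples modelling births and deaths as events.

    The 'ex:' predicate names and the '<person>/<kind>' event URI pattern
    are hypothetical, not taken from the project.
    """
    triples = []
    for kind, year in (("birth", birth_year), ("death", death_year)):
        if year is None:
            continue  # many names have no life dates at all
        event_uri = f"{person_uri}/{kind}"
        triples.append((person_uri, f"ex:{kind}", event_uri))
        triples.append((event_uri, "rdf:type", f"ex:{kind.capitalize()}"))
        triples.append((event_uri, "ex:date", str(year)))
    return triples
```

Treating the event as a resource in its own right means that further statements, such as a place of birth, can be attached to it later without changing the description of the person.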
Mock-up of the Linking Lives interface shows the way data is brought together.
External data is key to Linked Data. We link to VIAF and through that to DBpedia. We are looking at linking to the BNB.
But matching strings is not easy, e.g. matching subjects in the Hub with subjects in LCSH.
Our solution for matching names in the Hub to names in VIAF was quite manual, so it cannot be used when we scale up. Many names are hard to match, for example because they may not have life dates.
Names are often entered into the Hub in different ways, despite the use of Rules.
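To illustrate why these variants are awkward to match, here is a deliberately crude sketch that reduces a name string to a comparison key of surname, first forename and life dates. It is an assumption-laden illustration rather than the matching method used in the project: the title list is incomplete, and it assumes that in the inverted "Surname, Forenames, dates, epithet" form the forenames come before the dates.

```python
import re

TITLES = {"sir", "dame", "lord", "lady"}  # illustrative, far from complete
DATES = re.compile(r"\b(\d{4})-(\d{4})\b")


def match_key(name):
    """Reduce a name string to (surname, first forename, life dates or None)."""
    dates = DATES.search(name)
    life = (int(dates.group(1)), int(dates.group(2))) if dates else None
    if "," in name:
        # Inverted form: "Surname, [Title,] Forenames, dates, epithet"
        parts = [p.strip() for p in name.split(",")]
        surname = parts[0]
        forenames = []
        for part in parts[1:]:
            if DATES.search(part):
                break  # everything after the dates is treated as epithet
            forenames.extend(w for w in part.split() if w.lower() not in TITLES)
    else:
        # Natural order: "[Title] Forenames Surname"
        words = [w for w in name.split() if w.lower() not in TITLES]
        if not words:
            return ("", "", life)
        surname, forenames = words[-1], words[:-1]
    first = forenames[0].lower() if forenames else ""
    return (surname.lower(), first, life)


def compatible(a, b):
    """True if two name strings could plausibly refer to the same person."""
    ka, kb = match_key(a), match_key(b)
    same_name = ka[:2] == kb[:2]
    dates_ok = ka[2] is None or kb[2] is None or ka[2] == kb[2]
    return same_name and dates_ok
```

With this key the five Churchill variants on the slide are all judged compatible, but only because missing life dates are treated as a wildcard, which is exactly where false matches creep in once the data scales up. An inverted name with an epithet but no dates would also be misparsed, since the epithet would be read as forenames.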
The power of linked data is the way we can match up data from different datasets. We could consider using the same URIs, e.g. VIAF URIs for names.
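One way to picture the effect of "is the same as" links is to group every URI that is connected, directly or indirectly, into one cluster per real-world person. The sketch below does this with a simple union-find over pairs of URIs; it is an illustration only, working on plain Python tuples rather than on real RDF, and the function name is hypothetical.

```python
def cluster_same_as(links):
    """Group URIs connected by 'same as' links into sets (union-find)."""
    parent = {}

    def find(uri):
        # Follow parent pointers to the representative URI of the cluster.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


# The three URIs for Beatrice Webb shown on the slide.
HUB = "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer"
VIAF = "http://viaf.org/viaf/86607236/"
BNB = "http://bnb.data.bl.uk/doc/person/WebbBeatrice1858-1943"

webb_clusters = cluster_same_as([(HUB, VIAF), (VIAF, BNB)])
```

Here the Hub URI and the BNB URI end up in the same cluster even though no link connects them directly, because both are linked to the VIAF URI. If every dataset simply used the VIAF URI in the first place, this step would not be needed.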
One of the challenges of doing Linked Data is the plethora of vocabularies. It is hard to decide which we should use.
A shared RDF model could really help, although we would still have to deal with different domain models (ontologies). E.g. BL data and Locah data are quite close but use different models, so it is not a ‘perfect fit’, and other data may be modelled more differently still. A common ‘meta model’ helps: otherwise we are pulling in metadata from umpteen different metadata formats (because not everyone is using RDF), e.g. TEI, MODS, EAD and MARC data.
Something like the BL model could be used as the basis for other bibliographic data.
The way websites work is the way Web APIs should work, but this is not always the case: API responses often don’t have links in them, so you have to know how to create the links yourself. Such APIs don’t use the power of the Web as a whole. An API response is often a chunk of XML with numbers in it, and you have to know what to do with them.
“There are substantial benefits to participating in the existing network of URIs, including linking, bookmarking, caching, and indexing by search engines, and there are substantial costs to creating a new identification system that has the same properties as URIs.”
Linked data offers a great deal but maybe not thinking enough about end-users yet.
The legislation.gov.uk site is a good example of a site using linked data effectively.
It provides effective searches…but these benefit from an API on top of the data…
So it is an example of the benefits of utilising both linked data and APIs.
It is probably not constructive to think in terms of APIs vs. Linked Data.