Linked Data – the Future for Open Repositories?


Kultivate Linked Data Workshop
London, UK
12th December 2011

Adrian Stevenson
UKOLN, University of Bath, UK (until end Dec 2011)
Mimas, Libraries and Archives Team, University of Manchester, UK (from Jan 2012)
LOCAH and Linking Lives Projects

• Linked Open Copac and Archives Hub
   – Funded by #JiscEXPO 2/10 ‘Expose’ call
      • 1 year project. Started August 2010
   – Partners & Consultants:
      • UKOLN, Mimas, Eduserv, Talis, OCLC, Ed Summers
   – http://blogs.ukoln.ac.uk/locah/


• Linking Lives
   – JISC funded for ‘Mimas Enhancements’
   – 11 month project. Started Sept 2011
   – http://archiveshub.ac.uk/linkinglives/
Archives Hub and Copac
• UK National Data Services based at Mimas
• Archives Hub is an aggregation of archival
  descriptions from archive repositories
  across the UK
  – http://archiveshub.ac.uk
• Copac provides access to the merged library
  catalogues of libraries throughout the
  UK, including all national libraries
  – http://copac.ac.uk
LOCAH Outputs
• Expose Archives Hub & Copac
  data as Linked Data
• Create a prototype visualisation
• Report on opportunities and
  barriers
How do we expose the Linked Data?

1.   Model our ‘things’ into RDF
2.   Transform the existing data into RDF/XML
3.   Enhance the data
4.   Load the RDF/XML into a triple store
5.   Create Linked Data Views
6.   Document the process, opportunities and
     barriers on LOCAH Blog
Modelling ‘things’ into RDF
• Archives Hub data in ‘Encoded Archival
  Description’ EAD XML form
  – http://www.loc.gov/ead/

• Copac data in ‘Metadata Object Description
  Schema’ MODS XML form
  – http://www.loc.gov/standards/mods/

• Take a step back from the data format
  – What is EAD or MODS document “saying” about
    “things”?
  – What questions do we want to answer about those
    “things”?
URI Patterns
• Need to decide on patterns for URIs we generate
• Following guidance from W3C ‘Cool URIs for the Semantic
  Web’ and UK Cabinet Office ‘Designing URI Sets for the UK
  Public Sector’
   http://data.archiveshub.ac.uk/id/findingaid/gb1086skinner
      ‘thing’ URI
      redirects to …
   http://data.archiveshub.ac.uk/doc/findingaid/gb1086skinner
      document URI


    http://www.w3.org/TR/cooluris/
    http://www.cabinetoffice.gov.uk/resource-library/designing-uri-sets-uk-public-sector
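
A minimal sketch of the ‘thing’ URI to document URI pattern above, assuming the redirect is an HTTP 303 See Other as described in the W3C ‘Cool URIs’ guidance. The function names are illustrative and are not taken from the LOCAH code.

```python
from urllib.parse import urlsplit, urlunsplit


def document_uri(thing_uri):
    """Map a 'thing' URI (/id/...) to its document URI (/doc/...)."""
    parts = urlsplit(thing_uri)
    if not parts.path.startswith("/id/"):
        raise ValueError(f"not a 'thing' URI: {thing_uri}")
    doc_path = "/doc/" + parts.path[len("/id/"):]
    return urlunsplit(
        (parts.scheme, parts.netloc, doc_path, parts.query, parts.fragment)
    )


def redirect_for(thing_uri):
    """Return the (status, headers) a server could send for a 'thing' URI.

    Assumption: 303 See Other, following 'Cool URIs for the Semantic Web'.
    """
    return 303, {"Location": document_uri(thing_uri)}


status, headers = redirect_for(
    "http://data.archiveshub.ac.uk/id/findingaid/gb1086skinner"
)
print(status, headers["Location"])
```

Keeping the mapping in one function means the /id/ and /doc/ patterns cannot drift apart as new resource types are added.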
Vocabularies

• Using existing RDF vocabularies
  – DC, SKOS, FOAF, BIBO, WGS84 Geo, Lexvo, ORE,
    LODE, Event and Time Ontologies
• Define additional RDF terms where required
  – hub:ArchivalResource
  – copac:Creator
• It can be hard to know where to find and how
  to use vocabularies and ontologies
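
As a rough illustration of mixing existing vocabularies with locally defined terms, the sketch below expands prefixed names into full URIs. The DC Terms, SKOS and FOAF namespaces are the published ones; the `hub` namespace shown here is a placeholder assumed for the example, not the project's real one.

```python
# Prefix table: published namespaces plus one locally defined vocabulary.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    # Hypothetical namespace, for illustration only.
    "hub": "http://example.org/hub/def/",
}


def expand(curie):
    """Expand a prefixed name such as 'foaf:Person' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


print(expand("foaf:Person"))
print(expand("hub:ArchivalResource"))
```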
Archives Hub Model
[Diagram: Archives Hub RDF model]
  Entities shown: Finding Aid, EAD Document, Repository (Agent), Place, Postcode Unit, Archival Resource, Biographical History, Level, Language, Creation, Temporal Entity, Extent, Agent, Person, Family, Organisation, Concept, Concept Scheme, Object, Book, Birth, Death, Genre, Function
  Relationships shown: maintainedBy/maintains, administeredBy/administers, in, hasPart/partOf, encodedAs/encodes, accessProvidedBy/providesAccessTo, hasBiogHist/isBiogHistFor, topic/page, level, language, origination, product of, at time, associatedWith, extent, inScheme, representedBy, foaf:focus, Is-a, participates in
Copac Model
Transforming into RDF/XML

• Transform EAD and MODS to RDF/XML based
  on our models
  – Hub: XSLT Stylesheet
  – Copac: created in-house Java transformation
    program


• Load RDF/XML into a triple store
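
The Hub transform was an XSLT stylesheet and the Copac one a Java program; neither is reproduced here. Purely as a sketch of the same idea in Python, the snippet below turns a heavily simplified, EAD-like fragment into RDF/XML. The sample record, the choice of dcterms:title, and the reuse of the finding aid URI pattern are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT_NS = "http://purl.org/dc/terms/"

# Simplified, EAD-like input (illustrative only; real EAD is far richer).
EAD_SAMPLE = """
<ead>
  <archdesc level="fonds">
    <did>
      <unitid>gb1086skinner</unitid>
      <unittitle>Example collection title</unittitle>
    </did>
  </archdesc>
</ead>
"""


def ead_to_rdfxml(ead_xml, base="http://data.archiveshub.ac.uk/id/findingaid/"):
    """Build a one-resource RDF/XML document from an EAD-like record."""
    ead = ET.fromstring(ead_xml.strip())
    unitid = ead.findtext("./archdesc/did/unitid").strip()
    title = ead.findtext("./archdesc/did/unittitle").strip()

    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dcterms", DCT_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": base + unitid}
    )
    ET.SubElement(desc, f"{{{DCT_NS}}}title").text = title
    return ET.tostring(root, encoding="unicode")


print(ead_to_rdfxml(EAD_SAMPLE))
```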
We’re Linking Data!
• If something is identified, it can be linked to
• We take items from our datasets and link them
  to items from other datasets


[Diagram: datasets being linked: BBC, Copac, VIAF, DBPedia, GeoNames, Archives Hub]
Enhancing our data
• Already have some links:
   – Time - reference.data.gov.uk URIs
   – Location - UK Postcodes URIs and Ordnance
     Survey URIs
   – Names - Virtual International Authority File
      • VIAF matches and links widely-used authority
        files - http://viaf.org/
   – Names - DBPedia
• Also looking at:
   – Subjects - Library of Congress Subject Headings
     and DBPedia
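
One way to picture the name linking is as a lookup that emits a link triple whenever a local name matches an authority entry. The sketch below assumes owl:sameAs as the linking property and uses an in-memory dictionary in place of a real VIAF lookup; the VIAF identifier is a placeholder, not a real record.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(name):
    """Lower-case a name and collapse commas and whitespace for matching."""
    return " ".join(name.lower().replace(",", " ").split())


def link_names(local_people, authority_index):
    """Return (subject, predicate, object) link triples for matched names."""
    triples = []
    for uri, name in local_people.items():
        match = authority_index.get(normalise(name))
        if match:
            triples.append((uri, OWL_SAME_AS, match))
    return triples


local_people = {
    "http://data.archiveshub.ac.uk/id/person/nra/"
    "webbmarthabeatrice1858-1943socialreformer": "Webb, Martha Beatrice",
}
authority_index = {
    # Placeholder identifier, not a real VIAF record.
    "webb martha beatrice": "http://viaf.org/viaf/PLACEHOLDER",
}

for triple in link_names(local_people, authority_index):
    print(triple)
```

Exact string matching like this is only a starting point; ‘dirty’ name data usually needs human review before a link is asserted.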
http://data.archiveshub.ac.uk/
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
http://data.copac.ac.uk/ (to be released very soon!)
Visualisation Prototype
Using Timemap
 – Google Maps and Simile
 – http://code.google.com/p/timemap/

Early stages with this
Will give location and
‘extent’ of archive.
Will link through to
Archives Hub
Linking Lives Project




               http://archiveshub.ac.uk/linkinglives/
BBC Music




http://www.bbc.co.uk/music/artists/f0ed72a3-ae8f-4cf7-b51d-2696a2330230
Key Benefit of Linked Data
• API based mashups work against a fixed
  set of data sources
   • Hand crafted by humans
   • Don’t integrate well
• Linked Data promises an unbound global
  data space
   • Easy dataset integration
   • Generic ‘mesh-up’ tools
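
A small sketch of why shared identifiers make integration easy: if two sources describe the same URI, merging them is just a set union of their triples. The URIs and values below are hypothetical example data.

```python
# Hypothetical shared identifier used by both sources.
PERSON = "http://example.org/id/person/1"

archive_triples = {
    (PERSON, "http://xmlns.com/foaf/0.1/name", "Example Person"),
}
library_triples = {
    (PERSON, "http://xmlns.com/foaf/0.1/page", "http://example.org/library/page/1"),
}


def describe(uri, *graphs):
    """Merge any number of triple sets and list what is said about a URI."""
    merged = set().union(*graphs)
    return sorted((p, o) for s, p, o in merged if s == uri)


for predicate, obj in describe(PERSON, archive_triples, library_triples):
    print(predicate, obj)
```

No source-specific code is needed in `describe`, which is the point of the ‘generic tools’ bullet: a third dataset can be added as one more argument.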
Challenges
Data Modelling
• Steep learning curve
  – RDF terminology “confusing”
  – Lack of archival examples
• Complexity
  – Archival description is hierarchical and
    multi-level
• ‘Dirty’ Data
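
To illustrate the multi-level point, the sketch below walks partOf relations upwards so that a lower-level unit can be read together with the levels above it. The unit identifiers are made up for the example.

```python
def context_chain(unit, part_of):
    """Follow partOf links from a unit up to the top-level description."""
    chain = [unit]
    seen = {unit}
    while chain[-1] in part_of:
        parent = part_of[chain[-1]]
        if parent in seen:
            raise ValueError("cycle in partOf relations")
        chain.append(parent)
        seen.add(parent)
    return chain


# Hypothetical hierarchy: item -> series -> fonds.
part_of = {"item-1": "series-a", "series-a": "fonds-x"}
print(context_chain("item-1", part_of))
```

A consumer that fetches only `item-1` and never follows the partOf links loses the context the higher levels provide.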
Linking Subjects
Linking Places
Scalability / Provenance




• Same issue with attribution
• Solutions: Named graphs? Quads?
• Best Practice

Example by Bradley Allen, Elsevier at LOD LAM Summit, SF, USA, June 2011
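
A minimal sketch of the quads idea, assuming provenance is recorded by adding a fourth element (a graph name) to each triple. The short identifiers and graph names are illustrative placeholders, not real data.

```python
from collections import defaultdict

# Hypothetical example data: the fourth element names the source graph.
QUADS = [
    ("ex:webb", "foaf:name", "Beatrice Webb", "graph:archiveshub"),
    ("ex:webb", "foaf:name", "Beatrice Webb", "graph:other-source"),
    ("ex:webb", "ex:born", "1858", "graph:other-source"),
]


def group_by_graph(quads):
    """Split quads into one set of triples per named graph."""
    graphs = defaultdict(set)
    for s, p, o, g in quads:
        graphs[g].add((s, p, o))
    return dict(graphs)


def sources_for(triple, quads):
    """List the named graphs that assert a given triple."""
    return sorted(g for s, p, o, g in quads if (s, p, o) == triple)


print(sources_for(("ex:webb", "foaf:name", "Beatrice Webb"), QUADS))
```

With plain triples the two name statements would collapse into one and the question "who said this?" could not be answered.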
Licensing

• Ownership of data often not clear
• Difficult to track attribution and
  provenance
• CC0 for Archives Hub and Copac
  test datasets
Sustainability
• Can you rely on data sources long-term?
• Ed Summers at the Library of Congress
  created http://lcsh.info
• Linked Data interface for LOC subject
  headings
• People started using it
Library of Congress Subject Headings
Linked Data the Future for Open Repositories?
• Enables ‘straightforward’ integration of
  wide variety of data sources
• Repository data can ‘work harder’
• New channels into your data
• Researchers are more likely to discover
  sources
• ‘Hidden’ collections become of the Web
Attribution and CC License
• Sections of this presentation adapted from materials created
  by other members of the LOCAH & Linking Lives Projects
• This presentation is available under Creative Commons Non
  Commercial-Share Alike:

  http://creativecommons.org/licenses/by-nc/2.0/uk/

Editor's Notes

  1. Copac is a union catalogue. Both are successful JISC services that have been running for many years now. LOCAH is a research project; we will have to see whether it goes into service with a Linked Data interface.
  2. Encoded Archival Description (EAD) is an XML standard for encoding archival finding aids. The Metadata Object Description Schema (MODS) is an XML-based schema for a bibliographic element set that may be used for a variety of purposes, particularly library applications. “Things” include concepts and abstractions as well as material objects. We want location: archives are physical things, so location is important. We also wanted event data, partly steered by the visualisation prototype, and ‘extent’ data (number of boxes).
  3. 303 redirects and content negotiation, from ‘Cool URIs for the Semantic Web’.
  4. Open Data Commons Public Domain Dedication. Creative Commons CC0 license.
  5. In hypertext web sites it is generally considered rather bad etiquette not to link to related external material. The value of your own information is very much a function of what it links to, as well as the inherent value of the information within the web page. So it is also in the Semantic Web. Remember, this is about machines linking: machines need identifiers; humans generally know when something is a place or when it is a person. BBC + DBPedia + GeoNames + Archives Hub + Copac + VIAF = the Web as an exploratory space. Users are very interested in related materials, according to Terry Catapano at SAA 2011. LD can really help with this.
  6. Can get XSLT stylesheet here too!
  7. Note that it is a machine-readable interface as well as the human interface. We currently have a few hundred records in LOCAH. There are 25,000 EAD records on the Hub service. We’re intending to put about 2,000 up for the Linking Lives Project.
  8. More aggregation
  9. Data can be integrated from many different sources. Users are very interested in related materials, according to Terry Catapano at SAA 2011. LD can really help with this.
  10. Steep learning curve: RDF and Linked Data modelling terminology; lack of archive domain examples (though you now have LOCAH!); a certain level of expertise needed. Dirty data: ‘Joe Bloggs and others’ rather than just a name, or access points that do not have rules or a source associated with them; extent data highly variable. Complexity: “lower level” units are interpreted in the context of the higher levels of description, and are arguably “incomplete” without the contextual data. Relations are asserted, e.g. member-of/component-of, but there is no requirement or expectation that data consumers will follow the links describing the relations.
  11. Ex