Family History and Linked Data
Free UK Genealogy Open Data
Conference, 30 January 2016
Richard Light
Lights
Kerridges
Kerridges + Lights!
Kerridge and Light
• … and Weissbeck
• relatively uncommon names
• How can FreeBMD and FreeCEN help?
Other people were here first …
• Lots of Kerridge research
• Lights actually feature in a book: Common
People (Alison Light)
Kerridge
Light
Pooling results
• Do we want to do it? (Not everyone does …)
• If so, how can it be done?
• How do you say that you’re both talking about
the same person?
Current FreeUKGen search facilities
• BMD search is sophisticated and flexible
• Only one result type: people who match
• Census search has same approach, with links
to individual households
BMD search
Register search
Census search
Limitations of current search
• Limit of 3000 hits per BMD search
• Difficult to get to household info
• Result pages can’t be bookmarked
– http://www.freecen.org.uk/cgi/search.pl
• Main problem: searches all return HTML!
Getting machine-processible data
• Save FreeBMD HTML results page
• Copy table of results
• Paste into spreadsheet
• Save as CSV file
• Convert to XML and load into Modes
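The last two steps above (CSV file, then XML) can be sketched in Python. This is only an illustration: the column headings, the sample row and the `<entries>`/`<entry>` element names are invented for the example, and do not reflect the real FreeBMD column layout or the XML schema that Modes expects.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str) -> ET.Element:
    """Convert CSV text with a header row into an <entries> element."""
    root = ET.Element("entries")
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = ET.SubElement(root, "entry")
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row: ignore them.
                continue
            # Turn a heading such as "Given name" into the tag "given_name".
            tag = column.strip().lower().replace(" ", "_")
            ET.SubElement(entry, tag).text = (value or "").strip()
    return root


# Invented sample data, standing in for a saved results table.
SAMPLE = """Type,Surname,Given name,District,Year,Quarter,Volume,Page
Births,LIGHT,Example,Exampleton,1880,Mar,1a,100
"""

if __name__ == "__main__":
    print(ET.tostring(csv_to_xml(SAMPLE), encoding="unicode"))
```

Deriving the element names from the CSV headings keeps the sketch independent of any particular column layout; a real conversion would map them onto the target schema instead.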
BMD data in Modes
Limitations
• Imprecision
– temporal, e.g. BMD entries are registered ‘after the event’ and
grouped by quarter
– geographical: BMD specifies only the District; Census goes down
to Parish
– names: variations in spelling
– copying/transcription errors
• Incompleteness
– overseas births/deaths
– non-registration
– transcription backlog
Encoding a BMD entry as XML
Indexed search, e.g. places
Inference of birth data
Speculative matching death -> birth
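One way to make the death -> birth inference concrete is to turn a death entry into a window of possible birth dates. The sketch below is not the method used in the talk. It assumes the death entry carries an age at death in completed years, and it assumes (as the hypothetical `max_lag_days` parameter) that a death may have occurred up to six weeks before the start of the quarter in which it was registered.

```python
from datetime import date, timedelta

# Registration quarters are named after their last month.
QUARTER_START_MONTH = {"Mar": 1, "Jun": 4, "Sep": 7, "Dec": 10}
QUARTER_END = {"Mar": (3, 31), "Jun": (6, 30), "Sep": (9, 30), "Dec": (12, 31)}


def years_before(d: date, years: int) -> date:
    """Return the same calendar day `years` years earlier."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use the 28th.
        return d.replace(year=d.year - years, day=28)


def birth_window(reg_year: int, quarter: str, age_at_death: int,
                 max_lag_days: int = 42) -> tuple[date, date]:
    """Earliest and latest possible birth date for a death entry."""
    quarter_start = date(reg_year, QUARTER_START_MONTH[quarter], 1)
    quarter_end = date(reg_year, *QUARTER_END[quarter])
    earliest_death = quarter_start - timedelta(days=max_lag_days)
    # Someone who died aged N was born more than N and at most N+1 years
    # before the date of death.
    earliest_birth = years_before(earliest_death, age_at_death + 1) + timedelta(days=1)
    latest_birth = years_before(quarter_end, age_at_death)
    return earliest_birth, latest_birth


if __name__ == "__main__":
    print(birth_window(1900, "Mar", 70))
```

Birth entries whose registration quarter falls inside (or shortly after) this window are candidates for a speculative match; the window narrows the search but cannot confirm identity by itself.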
Working with census data
• Initial efforts ‘broke’ FreeCEN!
• Data had to be loaded from a full dump
• Loaded all Districts, Pieces and Households
• Selectively loaded Light and Kerridge records
• Then loaded all people registered in one of
these Light or Kerridge households
• Shows up Lights/Kerridges as servants, in
institutions, etc.
Districts
Pieces
Households
Census data: co-contextuality
• Each ‘household’ records relationships
between people
• Binary links between ‘Head’ and others, but
other family relationships can be inferred
• Nothing like the completeness of FreeBMD,
but more can be done with the data that is
there
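The inference mentioned above can be sketched as follows: each person is recorded only in relation to the Head, but two people recorded as Son or Daughter of the same Head are probably siblings, and the Head's spouse is a possible parent of both. The relationship labels and the names in the sample household are invented for the example; real census transcriptions use many more labels and abbreviations.

```python
from itertools import combinations

SPOUSE_LABELS = {"Wife", "Husband"}
CHILD_LABELS = {"Son", "Daughter"}


def infer_relationships(household):
    """Infer non-Head links from (name, relationship_to_head) pairs.

    Returns (person_a, relationship, person_b) triples. Every result is
    a hypothesis: step-children and second marriages are not visible in
    the 'relationship to Head' column.
    """
    spouses = [name for name, rel in household if rel in SPOUSE_LABELS]
    children = [name for name, rel in household if rel in CHILD_LABELS]
    inferred = []
    for a, b in combinations(children, 2):
        inferred.append((a, "probable sibling of", b))
    for spouse in spouses:
        for child in children:
            inferred.append((spouse, "possible parent of", child))
    return inferred


if __name__ == "__main__":
    sample = [
        ("Person A", "Head"),
        ("Person B", "Wife"),
        ("Person C", "Son"),
        ("Person D", "Daughter"),
        ("Person E", "Servant"),
    ]
    for triple in infer_relationships(sample):
        print(triple)
```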
Household summaries
Occupations - Kerridge
Occupations of Kerridges (>1)
• KERRIDGE Scholar
• KERRIDGE -
• KERRIDGE Ag Labr
• KERRIDGE Agricultural Labourer(Em'ee)
• KERRIDGE Farmer's Son
• KERRIDGE Farm Labourer(Em'ee)
• KERRIDGE Farmer(Em'er)
• KERRIDGE Labourer(Em'ee)
• KERRIDGE Domestic Servant
• KERRIDGE Farm Labr
• KERRIDGE Agricultural Laborer(Em'ee)
• KERRIDGE Brickmaker(Em'ee)
• KERRIDGE Farm Labourer (Em'ee)
• KERRIDGE Retired Ag Labr
Occupations - Light
Occupations of Lights (>1)
• LIGHT Scholar
• LIGHT Ag Lab
• LIGHT Ag Laborer
• LIGHT Labourer
• LIGHT Copper Miner
• LIGHT Female Servant
• LIGHT Miner
• LIGHT Pauper
• LIGHT Sawyer
• LIGHT Tin Miner(Em'ee)
• LIGHT -
• LIGHT Butcher(Em'ee)
• LIGHT Coal Miner(Em'ee)
• LIGHT Cordwainer
• LIGHT Gardener
• LIGHT General Servant
• LIGHT Independent
• LIGHT Mariner
• LIGHT Milliner
• LIGHT Miner Copper
Cross-linking census data to BMD
• Census records include place of birth and age
• Can use same inference techniques to match
against BMD data
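As a rough sketch of this cross-linking: an age recorded at a census implies one of two birth years, and birth entries with the same surname registered in or near those years are candidate matches. The dictionary keys and the `slack` parameter are assumptions made for the example, not FreeCEN or FreeBMD field names, and the birthplace comparison is left out because it would need a place-to-District lookup.

```python
def census_birth_years(census_year: int, age: int) -> tuple[int, int]:
    """Birth years implied by an age in completed years at a census.

    The censuses were taken in spring or early summer, so a person
    aged N was born in either (year - N - 1) or (year - N).
    """
    return census_year - age - 1, census_year - age


def candidate_births(person: dict, births: list, slack: int = 1) -> list:
    """Return birth entries that could belong to a census person.

    `slack` widens the window by that many years on each side, to allow
    for misreported ages and for births registered in a later quarter.
    """
    low, high = census_birth_years(person["census_year"], person["age"])
    surname = person["surname"].strip().upper()
    return [
        b for b in births
        if b["surname"].strip().upper() == surname
        and low - slack <= b["year"] <= high + slack
    ]


if __name__ == "__main__":
    # Invented records, for illustration only.
    person = {"surname": "Light", "age": 30, "census_year": 1881}
    births = [
        {"surname": "LIGHT", "year": 1850, "quarter": "Jun"},
        {"surname": "LIGHT", "year": 1860, "quarter": "Mar"},
        {"surname": "KERRIDGE", "year": 1850, "quarter": "Jun"},
    ]
    print(candidate_births(person, births))
```

The result is a list of candidates, not an assertion of identity; spelling variants of the surname would also need handling before this was useful on real data.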
An Open Data FreeUKGen API …
• … could be HTTP-based; RESTful
• would support a wide variety of information
needs
• would deliver a variety of machine-processible
formats
• would allow re-use of the data
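To give a feel for what a client of such an API might look like, here is a minimal sketch that builds (but does not send) an HTTP request. No such API exists in the source: the base URL, the resource path and the query parameter names are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; FreeUKGen does not publish an API at this address.
BASE_URL = "https://api.example.org/freeukgen"


def build_search_request(resource: str, media_type: str = "application/json",
                         **filters) -> Request:
    """Build a GET request for a resource, asking for one data format.

    `resource` is a path such as "bmd/births"; `filters` become query
    parameters. The desired format is requested in the Accept header,
    so the same URL can serve JSON, XML, CSV and so on.
    """
    query = urlencode(sorted(filters.items()))
    url = f"{BASE_URL}/{resource}" + (f"?{query}" if query else "")
    return Request(url, headers={"Accept": media_type})


if __name__ == "__main__":
    request = build_search_request("bmd/births", surname="Kerridge", year=1880)
    print(request.full_url)
    print(request.get_header("Accept"))
```

Because each search is an ordinary URL, results could be bookmarked and shared, which the current search pages do not allow.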
The problem of identity
• All my data files use invented primary keys for
people, places, … which are only significant
within my database
• In general, how do we assert that two
statements are about the same person?
• None of these is sufficient on its own:
– Name
– Date of birth/death
– Place of birth/death
Linked Data
• One step beyond Open Data
• Combines idea of machine-processible data
with a persistent identity for each concept
• Uses content negotiation to return RDF, XML,
JSON, … for each URL
• Allows programmatic access to data;
processing chains (‘follow your nose’)
• Requires suitably open licensing
Linked Data example: Wordsworth
Trust
Museum catalogue data as RDF
Everything comes from the same URL
http://collections.wordsworth.org.uk/Object/WTcoll/id/GRMDC.C144.9
By default, return HTML:
http://collections.wordsworth.org.uk/Object/WTcoll/id/html/GRMDC.C144.9
When RDF is requested (in the Accept header), redirect to a variant URL:
http://collections.wordsworth.org.uk/Object/WTcoll/id/rdf/GRMDC.C144.9
Can support lots of variant formats, e.g. XML, JSON, …
This approach relies on a technique called Content Negotiation
Linked Data URLs are unique; persistent; dereferenceable
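The redirect decision described above can be sketched as a small function that reads the Accept header and picks a variant URL. The `/html/` and `/rdf/` URL patterns come from the slide; the mapping from media types to variants (in particular `application/rdf+xml` for RDF) is an assumption, and this is not the Wordsworth Trust's actual server code.

```python
BASE = "http://collections.wordsworth.org.uk/Object/WTcoll/id"

# Assumed mapping from requested media type to the variant path segment.
VARIANTS = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
}


def parse_accept(header: str) -> list:
    """Return the media types in an Accept header, most preferred first."""
    preferences = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # Default weight when no q parameter is given.
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))
    # sorted() is stable, so equal weights keep their header order.
    ranked = sorted(preferences, key=lambda pref: -pref[1])
    return [media_type for media_type, quality in ranked if quality > 0]


def variant_url(object_id: str, accept: str = "*/*") -> str:
    """Choose the variant URL to redirect to; HTML is the default."""
    for media_type in parse_accept(accept):
        if media_type in VARIANTS:
            return f"{BASE}/{VARIANTS[media_type]}/{object_id}"
    return f"{BASE}/html/{object_id}"


if __name__ == "__main__":
    print(variant_url("GRMDC.C144.9"))
    print(variant_url("GRMDC.C144.9", "application/rdf+xml"))
```

A real server would answer a request for the canonical URL with a redirect (commonly a 303 See Other) to the chosen variant; supporting a further format means adding one entry to the mapping.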
What FreeUKGen resources could we
publish as Linked Data?
• Can only assign identifiers to data we have
– BMD registration events
– Census return events
– Pieces, Districts etc.
• Can’t assign identifiers to people
• Problem: current database update strategy
generates identifiers afresh each time
– Conflicts with need for persistent identifiers
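One possible way round the regenerated-identifier problem is to derive each identifier from the content of the record rather than from its position in the database, so that a full reload produces the same identifier again. The sketch below illustrates that idea only; the choice of fields, the normalisation and the use of a truncated SHA-1 digest are all assumptions, not a FreeUKGen design.

```python
import hashlib


def persistent_id(event_type: str, year: int, quarter: str, district: str,
                  volume: str, page: str, name: str) -> str:
    """Derive a repeatable identifier from the fields that cite an entry.

    The same field values always give the same identifier, however often
    the database is rebuilt. Case and spacing differences are normalised
    away before hashing.
    """
    fields = [event_type, str(year), quarter, district, volume, str(page), name]
    key = "|".join(" ".join(field.split()).upper() for field in fields)
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    # Invented entry, for illustration only.
    print(persistent_id("birth", 1880, "Mar", "Exampleton", "1a", "100",
                        "LIGHT Example"))
```

The trade-off is that correcting a transcription error in any of these fields changes the identifier, so a real scheme would also need to record which old identifiers a corrected entry replaces.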
Potential Linked Data projects
• Produce authorities which can be integrated
into current approach:
– Geographical units: Districts, Parishes, Pieces,
named places. Link to GeoNames, OS Gazetteer
– Occupations: potential for useful groupings (e.g.
Ag Lab and variants). Link to SIC, SHIC?
• Generate persistent identifiers for the primary
references published by FreeUKGen
– e.g. a page within the BMD index
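The occupation grouping suggested above (Ag Lab and variants) can be sketched with a normalisation step followed by a pattern match. The occupation strings in the example are taken from the Kerridge and Light census lists earlier in this deck; the pattern, the group label and the decision to treat "Farm Labourer" as a variant of "Ag Lab" are assumptions made for the illustration.

```python
import re

# Matches e.g. "ag lab", "ag labr", "agricultural laborer", "farm labourer",
# optionally preceded by "retired".
AG_LAB = re.compile(r"^(?:retired )?(?:ag|agricultural|farm) lab(?:r|orer|ourer)?$")


def normalise(occupation: str) -> str:
    """Lower-case, drop the (Em'ee)/(Em'er) status suffix, tidy spaces."""
    text = re.sub(r"\(em'e[er]\)", "", occupation.lower())
    return " ".join(text.split())


def occupation_group(occupation: str) -> str:
    """Return a group label for known variants, else the tidied string."""
    text = normalise(occupation)
    if AG_LAB.match(text):
        return "agricultural labourer"
    return text


if __name__ == "__main__":
    for raw in ["Ag Labr", "Agricultural Laborer(Em'ee)",
                "Farm Labourer (Em'ee)", "Retired Ag Labr",
                "Labourer(Em'ee)", "Farmer's Son"]:
        print(f"{raw!r} -> {occupation_group(raw)!r}")
```

An authority file would replace the single pattern with a maintained table of variants, each linked to a term in an external classification.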
Let the computer work harder!
• Current approach makes very little use of the
computer as a data-processing tool
• FreeUKGen resources as Open Data would
support new types of research and simplify
e.g. Single Name Studies
• FreeUKGen resources as Linked Data would
give the community a common frame of
reference for its work
Cultural Heritage Linked Data
Thank you!
Richard Light
FreeUKGen Trustee
@richardofsussex
richard@light.demon.co.uk

Open data and Free UK Genealogy
