A Publication Approach to
Linked Data in Archaeology
Eric C. Kansa
UC Berkeley / OpenContext.org
Unless otherwise indicated, this work is licensed under a Creative Commons Attribution
3.0 License <http://creativecommons.org/licenses/by/3.0/>
• Started in 2007
• Open access / open data
publishing for archaeology
• Archiving by California
Digital Library
• Referenced by NSF and
NEH for grant data
management
My Precious Data
?
Data Sharing as Publication
• Several projects studying
editorial + publishing
workflows
• Current Funding: ACLS,
NEH, Sloan, EOL
Web of Data
Cross-discipline Connections
Open Context links with
humanities data (CIDOC,
Pleiades, British Museum), and
natural sciences (EOL, UBERON)
Pelagios API
EOL Computable Data
Challenge
(Ben Arbuckle, Sarah Kansa,
Eric Kansa)
EOL Computable Data
Challenge
1. 15 different sites
2. 34 zooarchaeologists
3. Publishing: decoding, cleanup,
metadata documentation
4. Linked Data annotation (EOL,
UBERON, biometrics)
5. Collaborative analysis
6. Reuse itself studied by
DIPIR.org (U. Michigan
ISchool)
Data Publishing
Google / Open Refine
1. Check consistency
2. Edit functions
3. All changes logged, can be
rolled back
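The three capabilities listed above can be illustrated with a small Python sketch. This is not how Open Refine is implemented; the `EditLog` class and the `taxon` column are hypothetical names used only to show what "check consistency, edit, log, roll back" means in practice.

```python
from collections import Counter


class EditLog:
    """Hypothetical in-memory table whose edits are logged and reversible."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]
        self.history = []  # entries: (row_index, column, old_value, new_value)

    def value_counts(self, column):
        # Consistency check: every distinct value with its frequency,
        # which makes spelling and case variants easy to spot.
        return Counter(row[column] for row in self.rows)

    def replace(self, column, old, new):
        # Edit function: rewrite matching cells and log each change.
        for i, row in enumerate(self.rows):
            if row[column] == old:
                self.history.append((i, column, old, new))
                row[column] = new

    def rollback(self, steps=1):
        # Undo the most recent logged cell changes, newest first.
        for _ in range(min(steps, len(self.history))):
            i, column, old, _new = self.history.pop()
            self.rows[i][column] = old


table = EditLog([{"taxon": "Sheep"}, {"taxon": "sheep"}, {"taxon": "Sheep"}])
print(table.value_counts("taxon"))        # shows the inconsistent spellings
table.replace("taxon", "sheep", "Sheep")  # logged, so it can be undone
```

Calling `table.rollback()` afterwards restores the original lowercase value, which is the property that makes cleanup safe to attempt.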
Bibliography
• Bibliographic references
expressed as Linked Data
(modeled after S. Heath)
• Associates publication
citation with Open Access
variants
Why UBERON?
1. Expresses relevant expert knowledge,
tremendous effort. Why ignore or
duplicate this effort?
2. Anatomic entities related to
embryology, genetic networks. New
research opportunities for zooarch?
3. Zooarchaeology gains stakeholders
(biometric data of wide interest)
“Ovis aries”
http://eol.org/pages/311906/
Code: 14
Domestic
sheep
Code: 70
Code: 16
Ovis aries
Code: 15
Sheep
O. aries
Schaf
Sh.
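The labels above all point at one shared concept. A minimal sketch of that annotation step in Python follows, assuming a simple lookup keyed by dataset and local label. The dataset names (`project_a` and so on) and the pairing of labels with datasets are hypothetical; only the labels themselves and the EOL URI come from the slide.

```python
EOL_OVIS_ARIES = "http://eol.org/pages/311906/"

# (dataset, local label) -> shared concept URI.
# Numeric codes only mean something inside one dataset, so the dataset
# is part of the key. Dataset names here are hypothetical.
ANNOTATIONS = {
    ("project_a", "14"): EOL_OVIS_ARIES,
    ("project_b", "70"): EOL_OVIS_ARIES,
    ("project_c", "Schaf"): EOL_OVIS_ARIES,
    ("project_d", "O. aries"): EOL_OVIS_ARIES,
}


def annotate(dataset, label):
    """Return the linked-data URI for a local label, or None if unmapped."""
    return ANNOTATIONS.get((dataset, label.strip()))


print(annotate("project_a", "14"))
print(annotate("project_c", "Schaf"))
```

Keeping the mapping separate from the data leaves each contributor's original coding untouched while still letting records from different projects be compared.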
“Distal epiphysis unfused”
http://opencontext.org/vocabularies/open-context-zooarch/zoo-0058
dist.
unfused
d. uf.
30
uf. dist.,
f. prox.
Distal epiph.
unfused
Distal end unf.
Sheep/Goat Distal Femur Fusion
[Chart: percentage of unfused vs. fused specimens (0% to 100%) by site: Karain B Cave (N=53), Pınarbaşı (N=3), Çukuriçi Höyük (N=13), Suberde (N=0), Domuztepe (N=28), Ulucak (N=15)]
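Once every dataset's fusion terms are annotated with a shared concept, a cross-site comparison of this kind reduces to a grouped tally. The Python sketch below assumes records arrive as `(site, concept)` pairs. The unfused URI is the one shown on the slides; the `FUSED` value is a placeholder because the slides do not give that URI, and the function name is hypothetical.

```python
from collections import defaultdict

UNFUSED = "http://opencontext.org/vocabularies/open-context-zooarch/zoo-0058"
FUSED = "FUSED-CONCEPT-URI"  # placeholder: the actual URI is not given here


def fusion_percentages(records):
    """Map each site to its percentage of unfused and fused specimens.

    records: iterable of (site, concept) pairs. Specimens annotated with
    any other concept are ignored, and sites with no relevant specimens
    are left out of the result.
    """
    counts = defaultdict(lambda: {"unfused": 0, "fused": 0})
    for site, concept in records:
        if concept == UNFUSED:
            counts[site]["unfused"] += 1
        elif concept == FUSED:
            counts[site]["fused"] += 1
    result = {}
    for site, c in counts.items():
        total = c["unfused"] + c["fused"]
        if total:
            result[site] = {k: 100.0 * v / total for k, v in c.items()}
    return result
```

A site with N=0 relevant specimens simply does not appear in the output, so a plotting step would need to add it back as an empty bar.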
DIPIR: Data Documentation Practices
“I use an Excel spreadsheet…which I … inherited from my research
advisers. …my dissertation advisor was still recording data for each
specimen on paper when I was in graduate school so that's what I
started …then quickly, I was like, ‘This is ridiculous.’… I just started
using an Excel spreadsheet that has sort of slowly gotten bigger and
bigger over time with more variables or columns…I've added …color
coding…I also use…a very sort of primitive numerical coding system,
again, that I inherited from my research advisers…So, this little book
that goes with me of codes which is sort of odd, but …we all know
that a 14 is a sheep.” (CCU13)
A long way to go before we
get usable, intelligible data
CC-BY (Eduardo Otubo)
http://www.flickr.com/photos/otubo/5091378744
SPARQL endpoint easy to break (too big of a graph
to query).
Needed a work-around, so I also use the normal
(“plain web”) index to query the British Museum.
(1) Keyword
search for
relevant term.
(2) Scrape results
(blech!) for item
identifiers
(“objectid”
parameter in
URLs)
(3) Use ObjectIDs
in SPARQL queries
(limits size of
graph queried, so
server doesn’t
die).
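Step (2) above can be sketched in Python with the standard library alone. This is an assumption-laden illustration, not the code used for Open Context: it assumes the search results page links to items through `href` attributes carrying an `objectid` query parameter, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class ObjectIdCollector(HTMLParser):
    """Collect unique 'objectid' query parameters from <a href="..."> links."""

    def __init__(self):
        super().__init__()
        self.object_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        query = parse_qs(urlparse(href).query)
        # Match the parameter name case-insensitively, since the exact
        # capitalisation used by the site is not certain.
        for name, values in query.items():
            if name.lower() == "objectid":
                for value in values:
                    if value not in self.object_ids:
                        self.object_ids.append(value)


def extract_object_ids(html):
    """Return item identifiers found in a search-results HTML page."""
    collector = ObjectIdCollector()
    collector.feed(html)
    return collector.object_ids
```

The returned identifiers can then be fed one at a time into a narrowly scoped SPARQL query, which is what keeps the size of the queried graph small.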
SELECT ?s ?oPart ?oThes ?oLab
WHERE
{
  ?s <http://collection.britishmuseum.org/id/crm/bm-extensions/codex_id> '$objectID';
     <http://collection.britishmuseum.org/id/crm/P46F.is_composed_of> ?oPart.
  ?oPart <http://collection.britishmuseum.org/id/crm/P45F.consists_of> ?oThes.
  ?oThes <http://www.w3.org/2004/02/skos/core#prefLabel> ?oLab.
} LIMIT 10
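Filling the object ID placeholder and preparing the request might look like the Python sketch below. Several things here are assumptions: that object IDs are purely numeric (used as a guard against malformed input), that the endpoint accepts the query as a `query` GET parameter, and the function names themselves. No endpoint URL is hard-coded because the slides do not state one.

```python
from string import Template
from urllib.parse import urlencode

CRM = "http://collection.britishmuseum.org/id/crm/"

# Same query as on the slide, with the long URIs kept on one line each.
QUERY = Template("""SELECT ?s ?oPart ?oThes ?oLab
WHERE {
  ?s <${crm}bm-extensions/codex_id> '$object_id' ;
     <${crm}P46F.is_composed_of> ?oPart .
  ?oPart <${crm}P45F.consists_of> ?oThes .
  ?oThes <http://www.w3.org/2004/02/skos/core#prefLabel> ?oLab .
} LIMIT 10""")


def build_query(object_id):
    """Return the SPARQL text for one object ID (assumed to be numeric)."""
    if not str(object_id).isdigit():
        raise ValueError("object ID is expected to be numeric")
    return QUERY.substitute(crm=CRM, object_id=object_id)


def build_request_url(endpoint, object_id):
    """Return a GET URL that passes the query as a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": build_query(object_id)})
```

Because each request names a single object, the server only has to match a small part of the graph, which is the point of the workaround.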
Why is linked
data important?
1. Improve data quality, expert
curation of concepts +
vocabularies
2. Develop ties with other
research communities (can
feedback to collect new /
different data)
3. Increasingly sophisticated
open source tools, support
services
4. Part of the Web, not just on
the Web
… but
participating
in Linked Data
requires
effort!
Image Credit: Copyright Newline Cinema
One does not simply
share usable data…
Data are challenging
1. “Raw data” often problematic,
even with documentation (10X
effort needed with decoded data)
2. Tension between modeling needs
and familiarity with tools (Excel)
3. More work needed modeling
research methods (esp. sampling,
see DIPIR.org outcomes)
4. You’re never going to be done!

#LAWDI Open Context, publishing linked data in archaeology
