A Hint of Mint
Peter Sefton, Duncan Dickinson
Funding by ANDS
Background: The ReDBox application
Motivation
Repository metadata is all strings Photo by http://www.flickr.com/photos/easement/
"Aggregated Assault"
Some real IR metadata, aggregated using dc:type:
Journal Article (184)
PeerReviewed (105)
Article (75)
Thesis (66)
Book chapter (65)
NonPeerReviewed (62)
Conference Paper (35)
Journal Articles (Refereed Article) (30)
c1 (28)
techreport (27)
Full-text link or file (26)
Conference or Workshop Item (DEST Category E) (21)
PhD Doctorate (20)
Article (DEST Category C) (19)
journal article (18)
Book Chapter (17)
text (14)
Book Section (10)
Report (9)
Conference Publications (Full Written Paper - Refereed) (8)
Conference or Workshop Item (8)
e1 (8)
Book Chapters (7)
b1 (7)
Book Chapter (DEST Category B) (5)
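As a rough illustration of why these strings are hard to aggregate, here is a minimal sketch that folds a few of the variants above onto a shared label before counting. The `CANONICAL` table and the decision to treat "Article" as a journal article are assumptions made for this example, not a vocabulary used by The Mint.

```python
from collections import Counter

# Hypothetical mapping from normalised free-text labels to one controlled term.
# These groupings are illustrative assumptions, not The Mint's vocabulary.
CANONICAL = {
    "journal article": "Journal Article",
    "article": "Journal Article",
    "book chapter": "Book Chapter",
    "book chapters": "Book Chapter",
    "book section": "Book Chapter",
}


def normalise(label):
    """Lower-case and collapse whitespace, then look up a canonical term."""
    key = " ".join(label.lower().split())
    # Labels with no known mapping are passed through unchanged.
    return CANONICAL.get(key, label)


def aggregate(counts):
    """Sum record counts per canonical term."""
    totals = Counter()
    for label, n in counts.items():
        totals[normalise(label)] += n
    return totals


# A subset of the dc:type values and counts shown on the slide.
raw = {
    "Journal Article": 184,
    "Article": 75,
    "journal article": 18,
    "Book chapter": 65,
    "Book Chapter": 17,
    "Book Section": 10,
    "Book Chapters": 7,
    "Thesis": 66,
}
totals = aggregate(raw)
```

Even this small table needs a human decision for every row, which is the practical cost of metadata that is "all strings".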
I'm tagged but what's my identity?
RSPCA name: Wayne
At the vet: Bootsy Sefton
Local council: Bootsy Sefton ID 555-888-888
RSPCA ID: 555-555-555 (owner <name-withheld>)
RDFID tag: 555-777-777
At the park: Bootsy
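One way to picture the problem is a small registry that ties several locally issued identifiers to a single minted URI. The sketch below is only an illustration under stated assumptions: the `IdentityRegistry` class, the scheme names, and the `http://example.org/mint/` base are all hypothetical and are not The Mint's actual API.

```python
import uuid


class IdentityRegistry:
    """Maps (scheme, local id) pairs onto one minted URI per entity."""

    def __init__(self, base="http://example.org/mint/"):  # hypothetical base URI
        self.base = base
        self._by_local = {}  # (scheme, local_id) -> URI
        self._aliases = {}   # URI -> set of (scheme, local_id)

    def mint(self):
        """Create a new opaque URI for an entity."""
        uri = self.base + uuid.uuid4().hex
        self._aliases[uri] = set()
        return uri

    def link(self, uri, scheme, local_id):
        """Record that a locally issued identifier refers to this URI."""
        key = (scheme, local_id)
        existing = self._by_local.get(key)
        if existing is not None and existing != uri:
            raise ValueError(f"{key} is already linked to {existing}")
        self._by_local[key] = uri
        self._aliases.setdefault(uri, set()).add(key)

    def resolve(self, scheme, local_id):
        """Return the URI for a local identifier, or None if unknown."""
        return self._by_local.get((scheme, local_id))

    def aliases(self, uri):
        """List every local identifier linked to a URI."""
        return sorted(self._aliases.get(uri, set()))


registry = IdentityRegistry()
dog = registry.mint()
registry.link(dog, "council", "555-888-888")
registry.link(dog, "rspca", "555-555-555")
registry.link(dog, "tag", "555-777-777")
```

With the links in place, a lookup by any of the three identifiers returns the same URI, so systems that only know their own ID can still agree on which entity they mean.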
The Mint's Mission: URIs for (lost) dogs?
The Mint is a practical tool
The Mint's mission
Mint features
What can you put in it?
Behind the scenes - lookup
Mint Features: The data day spa*
Data love going to the day spa
Day Spa example:  Name matching
The compulsory architecture diagram
Links
Acknowledgements
Thanks   Questions?

Editor's Notes

  1. It's not just parties that have multiple IDs. Take this example of the range of ways different IRs in Australian universities fill out the resource type in OAI-PMH.
  2. Parties typically have multiple IDs issued by multiple parties. We have to accept this and work with it, and try to provide matching services where we can (within the limits of privacy legislation).
  3. Not really. But 
  4. TBL's linked data was practical advice about how to start building the semantic web. If you stop and think for too long you can end up paralysed by complexity, such as whether a given URI is for or about the dog. The bottom line is that if people can't copy and paste URIs from their browser, or have the system assign URIs for them in the background, then we will not build the semantic web.
  5. I wanted to note that while we built data cleansing features into the Mint from the beginning, these were the hardest for our partner university to make happen. Library staff need to fight for IT resources, and getting the library resources to do data cleaning lined up takes time and commitment.
  6. The team built a name matching system which can import data from the IR and match it against authoritative people-records from the research office system. It uses simple text searches in Solr/Lucene to find publications that might be by a particular researcher, but a human has to inspect every record. There is a similar system developed in Australia called NicNames but it does not have the import and export or APIs needed for The Mint.
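
The name matching workflow in note 6 can be sketched roughly as follows. This is an assumption-laden illustration, not the team's implementation: a plain in-memory comparison stands in for the Solr/Lucene text search, and the variant rules, record layout, and sample names are all invented for the example. Every hit is only a candidate and is flagged for human review, as the note describes.

```python
def name_variants(given, family):
    """Common ways one researcher's name might appear in IR records."""
    initial = given[0]
    return {
        f"{given} {family}",
        f"{family}, {given}",
        f"{initial}. {family}",
        f"{family}, {initial}.",
        f"{family}, {initial}",
    }


def candidates(researcher, records):
    """Find records whose creator strings match any variant of the name.

    Stand-in for a text search: a real deployment would query a search
    index rather than loop over records in memory.
    """
    variants = {v.lower() for v in name_variants(*researcher)}
    hits = []
    for rec in records:
        creators = [c.lower().strip() for c in rec["creators"]]
        matched = sorted(v for v in variants if v in creators)
        if matched:
            # Never auto-accept: a person confirms or rejects each candidate.
            hits.append({"id": rec["id"],
                         "matched_on": matched,
                         "status": "needs-review"})
    return hits


# Invented sample records; the ids and names are hypothetical.
records = [
    {"id": "ir:1", "creators": ["Smith, J.", "Jones, A."]},
    {"id": "ir:2", "creators": ["Jane Smith"]},
    {"id": "ir:3", "creators": ["Smith, John"]},
]
review = candidates(("Jane", "Smith"), records)
```

Record `ir:1` shows why the human step matters: "Smith, J." matches the researcher's variants, but it could equally belong to the John Smith in `ir:3`.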