The Other Side of Linked Data:
Managing Metadata Aggregation
ALCTS Metadata Interest Group
ALA Midwinter 2014
Where Are We Now?
• Major projects so far focused on exposing
selected portions of their data for
‘experimentation’
– Who’s using this data?
– Can LOD for libraries succeed on that basis?
• LOD is not just outputs; it needs actual use to
inform practice
– A more complete view of the environment and
workflow should help
Outline
• Limitations of the traditional database strategy
– Including records, normalization, de-duplication, etc.
• Components of a fuller view
– Workflow
– Inputs, outputs
– Data cache and services
– Need for automated orchestration
– The maintenance conundrum
Substituting a Cache for a Database
• Supports multiple streams of data
• Allows detailed provenance to be carried over
time
• Separates services from data storage
• Allows more extensive automation (and
orchestration of services)
• Focuses valuable human effort where it’s
needed: analysis, design and implementation
of improvement services
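
A minimal sketch of what such a cache might look like, assuming each statement is a subject/property/value assertion that carries its own provenance. The class names, field names, and sample identifiers are hypothetical and are not taken from the presenters' system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Statement:
    """One subject-property-value assertion plus where it came from."""
    subject: str
    prop: str
    value: str
    source: str            # hypothetical provenance field: who supplied it
    asserted_at: datetime  # hypothetical provenance field: when it arrived


class StatementCache:
    """Holds statements from many data streams; services live outside it."""

    def __init__(self):
        self._statements = []

    def add(self, subject, prop, value, source):
        # Every statement is stamped with its source and arrival time.
        stmt = Statement(subject, prop, value, source, datetime.now(timezone.utc))
        self._statements.append(stmt)
        return stmt

    def about(self, subject):
        # Everything known about one described resource, from any stream.
        return [s for s in self._statements if s.subject == subject]

    def from_source(self, source):
        # Everything one provider has contributed.
        return [s for s in self._statements if s.source == source]


cache = StatementCache()
cache.add("ex:book1", "dc:title", "Moby Dick", source="library-a")
cache.add("ex:book1", "dc:title", "Moby-Dick; or, The Whale", source="library-b")
cache.add("ex:book2", "dc:creator", "Austen, Jane", source="library-a")
```

Two sources disagree about the title of `ex:book1`, and the cache simply keeps both statements. Deciding which one to expose is left to a separate service rather than to the storage layer.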
Workflow
• Obtain data (possibly as ‘records’)
• Store data as statements in cache
• Evaluate data by source or collection
• Improve data using specific services, as
determined by evaluation
• Publish improved data
• [Rinse, repeat]
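
The steps above can be sketched as a small pipeline. This is an illustration only: the sample records, the provider names, and the single whitespace check are assumptions made for the example, not the evaluation tools the presenters used.

```python
from collections import Counter


def obtain():
    # Hypothetical harvested 'records', each tagged with its source.
    return [
        {"id": "ex:1", "source": "provider-a", "title": "Walden ", "date": "1854"},
        {"id": "ex:2", "source": "provider-a", "title": "Emma", "date": "1815"},
        {"id": "ex:3", "source": "provider-b", "title": "Ulysses", "date": "1922"},
    ]


def store(records):
    # Break each record into (subject, property, value, source) statements.
    return [
        (rec["id"], prop, value, rec["source"])
        for rec in records
        for prop, value in rec.items()
        if prop not in ("id", "source")
    ]


def evaluate(statements):
    # Count, per source, the values that carry stray whitespace.
    problems = Counter()
    for _subject, _prop, value, source in statements:
        if value != value.strip():
            problems[source] += 1
    return problems


def improve(statements, flagged):
    # Apply the trimming service only to sources the evaluation flagged.
    return [
        (subj, prop, value.strip() if source in flagged else value, source)
        for subj, prop, value, source in statements
    ]


def publish(statements):
    # Reassemble statements into record-like dictionaries for output.
    out = {}
    for subject, prop, value, _source in statements:
        out.setdefault(subject, {})[prop] = value
    return out


statements = store(obtain())
report = evaluate(statements)
improved = improve(statements, flagged=set(report))
published = publish(improved)
```

The evaluation result, not a fixed rule, decides which sources get the improvement service, and running the same steps on the next harvest is the "rinse, repeat" part.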
[Diagram legend: Yellow = data we use now; Green = data we’re adding]
[Diagram legend: Yellow = data we share now; Orange = data we propose to share; Green = data categories we can share]
Developing and Defining Services
• Small, single-purpose services are easier to
develop and maintain
– What services you need are determined by goals,
evaluation results, etc.
– ‘Orchestration’ of services applies them to specific
kinds of data, in order
– Services can be described, and linked, to expose
who, what, when and how to downstream users
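
One way to picture orchestration is as a plan that maps each kind of data to an ordered list of small services, with a log entry for every change. The service names, the `PLAN` mapping, and the log fields below are hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime, timezone


def trim_whitespace(value):
    return value.strip()


def collapse_spaces(value):
    return " ".join(value.split())


def normalize_year(value):
    # Keep a leading four-digit year ("1851." -> "1851"); otherwise leave as is.
    head = value.strip()[:4]
    return head if len(head) == 4 and head.isdigit() else value


# Hypothetical orchestration plan: which services run on which property, in order.
PLAN = {
    "title": [trim_whitespace, collapse_spaces],
    "date": [trim_whitespace, normalize_year],
}


def orchestrate(prop, value, plan=PLAN):
    """Apply the planned services in order; return the result and a change log."""
    log = []
    for service in plan.get(prop, []):
        new_value = service(value)
        if new_value != value:
            log.append({
                "service": service.__name__,                      # who
                "before": value, "after": new_value,              # what
                "when": datetime.now(timezone.utc).isoformat(),   # when
            })
        value = new_value
    return value, log


title, title_log = orchestrate("title", "  Moby   Dick ")
year, year_log = orchestrate("date", "1851. ")
```

Because each service does one thing, adding a new one means writing a small function and adding it to the plan. The log is what a downstream user would consult to see how a value came to be.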
Developing Automated Interaction
• Rule: Use humans for things requiring human
understanding and decision making
– Use machines for everything else
– A manual process for something a machine can do as
well or better is a failure
• Improvement services can be granular, invoked in
prescribed order, and report results for later use
– Continuous improvement is necessary to respond to
continuous change
Data Maintenance
• Improved data returns as statements to the data
cache, with provenance attached
• The statement strategy keeps new data from overwriting
‘improved’ data
• Each new statement adds to what is known about a
described resource
• Statements can be cherry-picked and exposed to others as
statements or records, in ‘flavors’ or as ‘everything we
have’
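
A rough sketch of the maintenance idea, assuming an append-only list of statements and a ‘flavor’ defined as an ordered preference among provenance labels. The function names and the provenance labels are hypothetical.

```python
def add_statement(cache, subject, prop, value, provenance):
    """Append a statement; nothing already in the cache is changed or removed."""
    cache.append(
        {"subject": subject, "prop": prop, "value": value, "provenance": provenance}
    )


def everything(cache, subject):
    # The 'everything we have' view: all statements about a resource.
    return [s for s in cache if s["subject"] == subject]


def flavor(cache, subject, preferred):
    """Build a record-like view, picking per property the statement whose
    provenance ranks highest in `preferred`; other provenance is ignored."""
    rank = {name: i for i, name in enumerate(preferred)}
    best, record = {}, {}
    for s in everything(cache, subject):
        r = rank.get(s["provenance"])
        if r is None:
            continue
        if s["prop"] not in best or r < best[s["prop"]]:
            best[s["prop"]] = r
            record[s["prop"]] = s["value"]
    return record


cache = []
add_statement(cache, "ex:1", "title", "walden", "harvest:provider-a")
add_statement(cache, "ex:1", "title", "Walden", "service:fix-capitalization")
# A later re-harvest brings the original value back; the improved one survives.
add_statement(cache, "ex:1", "title", "walden", "harvest:provider-a")

improved_view = flavor(
    cache, "ex:1", ["service:fix-capitalization", "harvest:provider-a"]
)
raw_view = flavor(cache, "ex:1", ["harvest:provider-a"])
```

The re-harvested value sits alongside the improved one instead of replacing it, so the same cache can serve both a raw view and an improved view.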
Contact Information
Diane Hillmann
metadata.maven@gmail.com
Gordon Dunsire
gordon@gordondunsire.com
Jon Phipps
jonphipps@gmail.com
The First MetadataMobile

Editor's Notes

  1. If LOD exists in multiple versions, and nobody uses it, does it make noise?
  2. Evaluation using a statistical analysis tool, from: Naomi Dushay, Diane I. Hillmann, "Analyzing Metadata for Effective Use and Re-Use." http://dcpapers.dublincore.org/pubs/article/view/744
  3. Revised diagram from: Jon Phipps, Diane I. Hillmann, Gordon Paynter, "Orchestrating metadata enhancement services: Introducing Lenny." http://dcpapers.dublincore.org/pubs/article/view/803 Note that XForms in this context means ‘Transforms’; the diagram was made well before there was an XForms standard that means something specific.