DSNotify - Detecting and Fixing Broken Links in Linked Data Sets

WebS ’09 @ DEXA 2009
Linz, 02/09/2009
Bernhard Haslhofer and Niko Popitsch
Summary




<mo:MusicGroup rdf:about="/music/artists/084308bd-1654-436f-ba03-df6697104e19#artist">
 <foaf:name>Green Day</foaf:name>
 <owl:sameAs rdf:resource="http://dbpedia.org/resource/Green_Day" />
 <mo:image rdf:resource="/music/images/artists/7col_in/084308bd-1654-436f-ba03-df6697104e19.jpg" />
 <foaf:page rdf:resource="/music/artists/084308bd-1654-436f-ba03-df6697104e19.html" />
 <mo:musicbrainz rdf:resource="http://musicbrainz.org/artist/084308bd-1654-436f-ba03-df6697104e19.html" />
 <mo:homepage rdf:resource="http://www.greenday.com/" />
 <mo:fanpage rdf:resource="http://www.greendayvideos.com/" />
 <mo:fanpage rdf:resource="http://www.greenday.net" />
 <mo:imdb rdf:resource="http://www.imdb.com/name/nm1554564/" />
 <mo:myspace rdf:resource="http://www.myspace.com/greenday" />
  ...
...
<rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
      <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Green Day
      is an American rock trio formed in 1987. The band has consisted of Billie Joe Armstrong
      (vocals, guitar), Mike Dirnt, and Tré Cool for the majority of its existence...
      </dbpprop:abstract>
</rdf:Description>
...
<rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
      <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="de">Green Day
      [gɹiːn deɪ] ist eine US-amerikanische Punk-Rock-Band, mit der Anfang der 1990er das
      Punk-Revival begann. Die Band wurde 1987 von Billie Joe Armstrong und Mike Dirnt zusammen
      mit dem Schlagzeuger John Kiffmeyer alias Al Sobrante als The Sweet Children....
      </dbpprop:abstract>
</rdf:Description>
...
...but...




Some numbers...

        •     Events between DBpedia 3.2 (10/2008) and 3.3
              (05/2009)
             •     # resources created: 29449

             •     # resources removed: 4789

             •     # resources moved: 729




Link Integrity...
        •     is a qualitative property that is given when all links
              within and between a set of data sources are valid and
              deliver the result intended by the link creator.

        •     cf. referential integrity in RDBMS

        •     demands a solution that
             •     detects broken links between resources

             •     provides support for fixing broken links


Types of broken links
        •     Removed link targets
             •     e.g., resource deleted or server no longer available

        •     Moved link targets
             •     available at another Web location

             •     e.g., reorganization of Web resources

        •     Modified link targets


The DSNotify Approach
        •     periodically monitor items (resources) in a specific
              Linked Data source

        •     extract a descriptive feature vector for each item

        •     store item + feature vector in index

        •     use feature vectors to detect if items have been
              removed or moved to another location

        •     if moved, add relationship between “old” and “new”
              item
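
The monitoring steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the DSNotify implementation: the feature choice (a set of lower-cased tokens), the names `extract_features`, `monitor`, and `diff_snapshots`, the in-memory dict used as the index, and the example.org data are all hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

FeatureVector = FrozenSet[str]


def extract_features(description: str) -> FeatureVector:
    """Hypothetical feature extraction: the set of lower-cased tokens."""
    return frozenset(description.lower().split())


def monitor(source: Dict[str, str]) -> Dict[str, FeatureVector]:
    """One monitoring pass: map every item URI to its feature vector."""
    return {uri: extract_features(text) for uri, text in source.items()}


def diff_snapshots(
    old_index: Dict[str, FeatureVector],
    new_index: Dict[str, FeatureVector],
) -> Tuple[Set[str], Set[str]]:
    """Return (URIs that disappeared, URIs that newly appeared)."""
    disappeared = set(old_index) - set(new_index)
    appeared = set(new_index) - set(old_index)
    return disappeared, appeared


# Two made-up snapshots of the same data source at different times.
snapshot_t1 = {
    "http://example.org/resource/Green_Day": "Green Day American rock trio",
}
snapshot_t2 = {
    "http://example.org/resource/band/Green_Day": "Green Day American rock trio",
}

index_t1 = monitor(snapshot_t1)
index_t2 = monitor(snapshot_t2)
disappeared, appeared = diff_snapshots(index_t1, index_t2)
```

A disappeared item whose feature vector closely matches that of a newly appeared item is a candidate for a move; otherwise it is a candidate for a removal.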

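Architecture
        [Figure: DSNotify architecture. Components (marked * on the slide): Monitor (feature extraction), Move Detector (heuristic), Decider, LOD source updater. Data stores: Event Log and the indices II, RII, and AII. Other elements: the monitored LOD sources, a LOD source linked to them via owl:sameAs, a LOD "consuming" application, and a user. Labeled interactions: monitor, querying, decision making, notifications, update.]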


Index Interaction
        •     indices: Item Index (II), Archived Item Index (AII), Removed Item Index (RII)

        •     t1
             •     II: http://dbpedia.org/resource/Green_Day (band)

        •     t2
             •     RII: http://dbpedia.org/resource/Green_Day (band)

        •     t3
             •     II: http://dbpedia.org/resource/band/Green_Day

             •     AII: http://dbpedia.org/resource/Green_Day (band)

        •     t4
             •     II: http://dbpedia.org/resource/band/Alternative/Green_Day

             •     AII: http://dbpedia.org/resource/band/Green_Day

             •     AII: http://dbpedia.org/resource/Green_Day (band)
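
One way to read the t1 to t4 sequence on this slide is as a small state machine over the three indices. The sketch below replays it; the `Indices` class, its method names, and the `moved_to` map are hypothetical and only illustrate the transitions, they are not taken from DSNotify.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Indices:
    item: Set[str] = field(default_factory=set)              # II: items currently observed
    removed: Set[str] = field(default_factory=set)           # RII: items that disappeared
    archived: List[str] = field(default_factory=list)        # AII: old locations of moved items
    moved_to: Dict[str, str] = field(default_factory=dict)   # old URI -> new URI (assumption)

    def add(self, uri: str) -> None:
        self.item.add(uri)

    def mark_removed(self, uri: str) -> None:
        self.item.discard(uri)
        self.removed.add(uri)

    def mark_moved(self, old_uri: str, new_uri: str) -> None:
        # The old location leaves II/RII and is archived; the new one is indexed.
        self.item.discard(old_uri)
        self.removed.discard(old_uri)
        self.archived.append(old_uri)
        self.moved_to[old_uri] = new_uri
        self.item.add(new_uri)

    def resolve(self, uri: str) -> str:
        """Follow old -> new relationships to the current location."""
        while uri in self.moved_to:
            uri = self.moved_to[uri]
        return uri


BASE = "http://dbpedia.org/resource/"
idx = Indices()
idx.add(BASE + "Green_Day (band)")                                            # t1
idx.mark_removed(BASE + "Green_Day (band)")                                   # t2
idx.mark_moved(BASE + "Green_Day (band)", BASE + "band/Green_Day")            # t3
idx.mark_moved(BASE + "band/Green_Day", BASE + "band/Alternative/Green_Day")  # t4
```

Keeping the chain of old locations makes it possible to answer a lookup for any earlier URI with the item's current location.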




Move Detection

        •     is a semi-automatic process

        •     calculate similarity between items based on their
              feature vectors using domain-specific heuristics

        •     probability > given threshold: automatic decision

        •     probability < given threshold: ask expert user
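
A minimal sketch of the threshold rule above. The slides do not name the similarity measure, so Jaccard similarity over token sets stands in for the domain-specific heuristic here; the function names, the labels "auto-move" and "ask-expert", and the example data are assumptions. A score exactly equal to the threshold is routed to the expert, which the slide leaves open.

```python
from typing import Dict, FrozenSet, Optional, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two feature sets (stand-in heuristic)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def decide(
    removed_features: FrozenSet[str],
    candidates: Dict[str, FrozenSet[str]],
    threshold: float,
) -> Tuple[str, Optional[str], float]:
    """Pick the most similar new item and decide how to handle it."""
    best_uri: Optional[str] = None
    best_score = 0.0
    for uri, features in candidates.items():
        score = jaccard(removed_features, features)
        if score > best_score:
            best_uri, best_score = uri, score
    if best_score > threshold:
        return ("auto-move", best_uri, best_score)
    return ("ask-expert", best_uri, best_score)


# Made-up feature vectors of newly appeared items.
candidates = {
    "http://example.org/band/Green_Day": frozenset({"green", "day", "rock", "trio"}),
    "http://example.org/Green_Party": frozenset({"green", "party"}),
}

clear_case = decide(frozenset({"green", "day", "rock", "trio"}), candidates, 0.8)
unclear_case = decide(
    frozenset({"green", "day", "album"}),
    {"http://example.org/Green_Party": frozenset({"green", "party"})},
    0.8,
)
```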



DSNotify HTTP Interface

        •     GET http://<server>:<port>/<dsnotify>/item/<uri>
             •      find out what happened with an item

        •     GET http://<server>:<port>/<dsnotify>/eventChoice
             •      retrieve pending event choices (move / remove)

        •     ...
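
The two request URLs above could be assembled as sketched below. The slides do not say how `<uri>` is encoded in the path, so percent-encoding it into a single path segment is an assumption, as are the host `localhost`, the port `8080`, and the context name `dsnotify`. No request is sent; the sketch only builds the strings.

```python
from urllib.parse import quote


def item_url(server: str, port: int, context: str, item_uri: str) -> str:
    """URL for asking what happened to an item.

    Assumption: the item URI is percent-encoded into one path segment.
    """
    return f"http://{server}:{port}/{context}/item/{quote(item_uri, safe='')}"


def event_choice_url(server: str, port: int, context: str) -> str:
    """URL for retrieving pending event choices (move / remove)."""
    return f"http://{server}:{port}/{context}/eventChoice"


# Hypothetical deployment values.
lookup = item_url("localhost", 8080, "dsnotify", "http://dbpedia.org/resource/Green_Day")
pending = event_choice_url("localhost", 8080, "dsnotify")
```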



Evaluation Plan
        •     timeline of snapshots: t-n (DBpedia 2.0), ..., t-2 (DBpedia 3.0), t-1 (DBpedia 3.1), t0 (DBpedia 3.2)

        •     diff between consecutive snapshots

        •     manual classification of each diff into moves (mv) and removals (rm)

Status / Future Work

        •     1st prototype (infrastructure) ready

        •     annotated test-data set based on DBpedia available

        •     Currently working on:
             •     system for simulating past modifications in DBpedia

             •     the DSNotify evaluation



Fixing Your Web since 2009
Backup




Evaluation Plan

        •     Monitor simulated DBpedia evolution (t-n to t0)

        •     Precision / recall of automatic move detection
             •     with different similarity thresholds

             •     with different heuristics and feature vectors
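
The precision / recall measurement over different thresholds might look like the sketch below. The scored candidate pairs and the manually classified ground truth are made-up data, the function names are hypothetical, and returning 1.0 for an empty denominator is a convention chosen here rather than something the slides specify.

```python
from typing import Dict, Iterable, Set, Tuple

Move = Tuple[str, str]  # (old URI, new URI)


def precision_recall(detected: Set[Move], actual: Set[Move]) -> Tuple[float, float]:
    """Precision and recall of detected moves against manually classified moves."""
    true_positives = len(detected & actual)
    # Convention (assumption): an empty denominator yields 1.0.
    precision = true_positives / len(detected) if detected else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


def sweep(
    scored: Dict[Move, float],
    actual: Set[Move],
    thresholds: Iterable[float],
) -> Dict[float, Tuple[float, float]]:
    """Evaluate automatic move detection at several similarity thresholds."""
    results = {}
    for threshold in thresholds:
        detected = {pair for pair, score in scored.items() if score > threshold}
        results[threshold] = precision_recall(detected, actual)
    return results


# Made-up similarity scores for candidate (old, new) pairs.
scored = {("a", "a2"): 0.9, ("b", "b2"): 0.6, ("c", "x"): 0.7}
# Made-up ground truth from manual classification.
actual = {("a", "a2"), ("b", "b2")}

results = sweep(scored, actual, [0.5, 0.65, 0.8])
```

In this toy data, raising the threshold trades recall for precision, which is the trade-off the evaluation is meant to quantify.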




Linked Data / Web of Data

        •     Data management paradigm on the basis of Web
              technologies

        •     HTTP, URI, and RDF/S are the key technologies

        •     Applications (not Web browsers) are data consumers

        •     Links between resources play a major role



DSNotify - Detecting and Fixing Broken Links in Linked Data Sets

  • 1. DSNotify - Detecting and Fixing Broken Links in Linked Data Sets WebS ’09 @ DEXA 2009 Linz, 02/09/2009 Bernhard Haslhofer and Niko Popitsch Bernhard Haslhofer, Niko Popitsch
  • 6. <mo:MusicGroup rdf:about="/music/artists/084308bd-1654-436f-ba03-df6697104e19#artist">
        <foaf:name>Green Day</foaf:name>
        <owl:sameAs rdf:resource="http://dbpedia.org/resource/Green_Day" />
        <mo:image rdf:resource="/music/images/artists/7col_in/084308bd-1654-436f-ba03-df6697104e19.jpg" />
        <foaf:page rdf:resource="/music/artists/084308bd-1654-436f-ba03-df6697104e19.html" />
        <mo:musicbrainz rdf:resource="http://musicbrainz.org/artist/084308bd-1654-436f-ba03-df6697104e19.html" />
        <mo:homepage rdf:resource="http://www.greenday.com/" />
        <mo:fanpage rdf:resource="http://www.greendayvideos.com/" />
        <mo:fanpage rdf:resource="http://www.greenday.net" />
        <mo:imdb rdf:resource="http://www.imdb.com/name/nm1554564/" />
        <mo:myspace rdf:resource="http://www.myspace.com/greenday" />
        ...
  • 7. ...
      <rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
        <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Green Day
        is an American rock trio formed in 1987. The band has consisted of Billie Joe Armstrong
        (vocals, guitar), Mike Dirnt, and Tré Cool for the majority of its existence...
        </dbpprop:abstract>
      </rdf:Description>
      ...
      <rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
        <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="de">Green Day
        [gɹiːn deɪ] ist eine US-amerikanische Punk-Rock-Band, mit der Anfang der 1990er das
        Punk-Revival begann. Die Band wurde 1987 von Billie Joe Armstrong und Mike Dirnt zusammen
        mit dem Schlagzeuger John Kiffmeyer alias Al Sobrante als The Sweet Children....
        </dbpprop:abstract>
      </rdf:Description>
      ...
  • 9. Some numbers...
      • Events between DBpedia 3.2 (10/2008) and 3.3 (05/2009)
          • # resources created: 29449
          • # resources removed: 4789
          • # resources moved: 729
  • 11. Link Integrity...
      • is a qualitative property that is given when all links within and between a set of data sources are valid and deliver the result intended by the link creator.
          • cf. referential integrity in RDBMS
      • demands a solution that
          • detects broken links between resources
          • provides support for fixing broken links
  • 12. Types of broken links
      • Removed link targets
          • e.g., resource deleted, server not available anymore, etc.
      • Moved link targets
          • available at another Web location
          • e.g., reorganization of Web resources
      • Modified link targets
  • 13. The DSNotify Approach
      • periodically monitor items (resources) in a specific Linked Data source
      • extract a descriptive feature vector for each item
      • store item + feature vector in an index
      • use feature vectors to detect whether items have been removed or moved to another location
      • if moved, add a relationship between the “old” and the “new” item
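The monitoring steps on this slide could be sketched roughly as follows. This is only an illustration, not the DSNotify implementation: the feature vector is assumed here to be the set of (predicate, object) pairs of an item, and the names `build_index` and `diff_snapshots` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Triple = Tuple[str, str, str]               # (subject, predicate, object)
FeatureVector = FrozenSet[Tuple[str, str]]  # assumption: set of (predicate, object) pairs


def build_index(triples: Iterable[Triple]) -> Dict[str, FeatureVector]:
    """Map each subject URI to a feature vector built from its statements."""
    collected: Dict[str, set] = {}
    for subject, predicate, obj in triples:
        collected.setdefault(subject, set()).add((predicate, obj))
    return {uri: frozenset(features) for uri, features in collected.items()}


def diff_snapshots(
    old: Dict[str, FeatureVector], new: Dict[str, FeatureVector]
) -> Tuple[List[str], List[str]]:
    """Return (URIs that disappeared, URIs that newly appeared) between two monitoring runs."""
    disappeared = sorted(old.keys() - new.keys())
    appeared = sorted(new.keys() - old.keys())
    return disappeared, appeared


# Two hypothetical monitoring runs over the same source.
run_1 = build_index([
    ("http://example.org/Green_Day", "foaf:name", "Green Day"),
    ("http://example.org/Other", "foaf:name", "Other"),
])
run_2 = build_index([
    ("http://example.org/band/Green_Day", "foaf:name", "Green Day"),
])
disappeared, appeared = diff_snapshots(run_1, run_2)
```

Each disappeared URI is a candidate for either a removal or a move; comparing its stored feature vector with those of the newly appeared URIs is what allows the two cases to be told apart.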
  • 14. Architecture
      [Diagram: DSNotify sits between the LOD sources (linked via owl:sameAs) and an LOD “consuming” application. Components shown: Monitor (feature extraction), Indices (II, RII, AII), Move Detector (heuristic), Decider (decision making, involving the user), Event Log (notifications, querying), and LOD source updater.]
  • 15. Index Interaction
      [Diagram: timeline t1 to t4 across the Item Index (II), Archived Item Index (AII), and Removed Item Index (RII). Entries shown: http://dbpedia.org/resource/Green_Day (band) at t1 and t2; additionally http://dbpedia.org/resource/band/Green_Day at t3; additionally http://dbpedia.org/resource/band/Alternative/Green_Day at t4.]
  • 16. Move Detection
      • is a semi-automatic process
      • calculate similarity between items based on their feature vectors using domain-specific heuristics
      • probability > given threshold: automatic decision
      • probability < given threshold: ask expert user
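A minimal sketch of the threshold rule on this slide, under stated assumptions: Jaccard similarity over feature sets stands in for the domain-specific heuristic (the slides do not name one), the default threshold of 0.8 is arbitrary, and the function names are hypothetical. The slide does not say what happens when the score equals the threshold; this sketch sends that case to the expert user.

```python
from typing import Dict, FrozenSet, Optional, Tuple


def jaccard(a: FrozenSet, b: FrozenSet) -> float:
    """Similarity in [0, 1]: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def decide_move(
    removed_features: FrozenSet,
    candidates: Dict[str, FrozenSet],
    threshold: float = 0.8,  # assumed value, not taken from the slides
) -> Tuple[str, Optional[str], float]:
    """Pick the most similar new item and decide automatically or defer to the user."""
    best_uri: Optional[str] = None
    best_score = 0.0
    for uri, features in candidates.items():
        score = jaccard(removed_features, features)
        if score > best_score:
            best_uri, best_score = uri, score
    if best_score > threshold:
        return ("moved", best_uri, best_score)   # automatic decision
    return ("ask_user", best_uri, best_score)    # expert user decides: move or remove


old_item = frozenset({"name=Green Day", "type=MusicGroup", "formed=1987"})
new_items = {
    "http://example.org/band/Green_Day": frozenset(
        {"name=Green Day", "type=MusicGroup", "formed=1987", "genre=Punk"}
    ),
    "http://example.org/band/Other": frozenset({"name=Other", "type=MusicGroup"}),
}
automatic = decide_move(old_item, new_items, threshold=0.7)
deferred = decide_move(old_item, new_items, threshold=0.8)
```

With these inputs the best candidate shares three of four features, so it is accepted automatically at a threshold of 0.7 and deferred to the user at 0.8.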
  • 17. DSNotify HTTP Interface
      • GET http://<server>:<port>/<dsnotify>/item/<uri>
          • find out what happened with an item
      • GET http://<server>:<port>/<dsnotify>/eventChoice
          • retrieve pending event choices (move / remove)
      • ...
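A client could assemble the two request URLs from this slide as sketched below. The slide does not say how `<uri>` is embedded in the path, so percent-encoding it is an assumption here, as are the helper names and the example host, port, and context path.

```python
from urllib.parse import quote


def item_url(server: str, port: int, context: str, item_uri: str) -> str:
    """URL for asking what happened to an item (item URI percent-encoded by assumption)."""
    return f"http://{server}:{port}/{context}/item/{quote(item_uri, safe='')}"


def event_choice_url(server: str, port: int, context: str) -> str:
    """URL for retrieving pending event choices (move / remove)."""
    return f"http://{server}:{port}/{context}/eventChoice"


# Hypothetical deployment values.
lookup = item_url("localhost", 8080, "dsnotify", "http://dbpedia.org/resource/Green_Day")
pending = event_choice_url("localhost", 8080, "dsnotify")
```

Either URL would then be fetched with a plain HTTP GET; the response format is not described in the slides.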
  • 18. Evaluation Plan
      [Diagram: timeline t-n ... t-2, t-1, t0 mapped to DBpedia 2.0, 3.0, 3.1, and 3.2. A diff is computed between releases, and each diff is manually classified into moves (mv) and removals (rm).]
  • 19. Status / Future Work
      • 1st prototype (infrastructure) ready
      • annotated test-data set based on DBpedia available
      • Currently working on:
          • system for simulating past modifications in DBpedia
          • the DSNotify evaluation
  • 20. Fixing Your Web since 2009
  • 22. Evaluation Plan
      • Monitor simulated DBpedia evolution (t-n to t0)
      • Precision / recall of automatic move detection
          • with different similarity thresholds
          • with different heuristics and feature vectors
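The planned measurement could be computed along these lines. This is a sketch with made-up data: `scored_pairs` maps hypothetical (old URI, new URI) candidates to similarity scores, `actual_moves` stands for the manually classified moves, and treating an empty prediction set as precision 1.0 is a convention chosen here, not something the slides specify.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (old URI, new URI)


def precision_recall(predicted: Set[Pair], actual: Set[Pair]) -> Tuple[float, float]:
    """Precision = TP / predicted, recall = TP / actual (1.0 when the denominator is empty)."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


def sweep_thresholds(
    scored_pairs: Dict[Pair, float],
    actual_moves: Set[Pair],
    thresholds: Iterable[float],
) -> Dict[float, Tuple[float, float]]:
    """Evaluate automatic move detection at each similarity threshold."""
    results = {}
    for threshold in thresholds:
        predicted = {pair for pair, score in scored_pairs.items() if score > threshold}
        results[threshold] = precision_recall(predicted, actual_moves)
    return results


scored_pairs = {("a", "a2"): 0.9, ("b", "b2"): 0.6, ("c", "x"): 0.7}
actual_moves = {("a", "a2"), ("b", "b2")}
results = sweep_thresholds(scored_pairs, actual_moves, [0.5, 0.8])
```

In this toy data, lowering the threshold to 0.5 finds both real moves but also accepts one wrong pair, while 0.8 accepts only one correct pair; that trade-off is what a sweep over thresholds makes visible.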
  • 23. Linked Data / Web of Data
      • Data management paradigm on the basis of Web technologies
      • HTTP, URI, and RDF/S are the key technologies
      • Applications (not Web browsers) are data consumers
      • Links between resources play a major role