Commonsense Knowledge in Wikidata
Filip Ilievski - Pedro Szekely - Daniel Schwabe
submitted to the Wikidata workshop @ ISWC’20
1.1 billion edges
84 million nodes
(May 2020)
‘sister’ of Wikipedia
Q: pictures of animals with female grammatical gender
in German but male grammatical gender in French
Common sense
the basic ability to perceive, understand, and judge things that
are shared by nearly all people and can be reasonably
expected of nearly all people without need for debate
Research questions
Q1: Does Wikidata contain relevant commonsense knowledge?
Q2: If so, is this complementary to other commonsense knowledge sources?
Principles of Commonsense Knowledge
P1: Concepts, not entities
Houses have rooms
Versailles Palace has 700 rooms
P2: Commonness
Container used for storage
Noma subclass of aphthous stomatitis
P3: General-domain knowledge
wheel is part of a car
cholesterol has component cell membrane
Filter for P1 (concepts, not entities):
Keep nodes with lowercase alphanumeric characters
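The P1 filter can be sketched as a simple label check. This is a minimal sketch, not the authors' implementation: it assumes the check runs on a node's English label and that single spaces between words are allowed, since the slide only says "lowercase alphanumeric characters". The function name and the sample labels are illustrative.

```python
import re

# Assumption: a "concept" label uses only lowercase ASCII letters and digits,
# with single spaces between words. The slide does not specify these details.
_CONCEPT_LABEL = re.compile(r"[a-z0-9]+(?: [a-z0-9]+)*")


def looks_like_concept(label: str) -> bool:
    """Return True if the label passes the lowercase-alphanumeric heuristic."""
    return _CONCEPT_LABEL.fullmatch(label) is not None


# Labels taken from the slide examples.
labels = ["house", "Versailles Palace", "wheel", "aphthous stomatitis"]
concepts = [label for label in labels if looks_like_concept(label)]
```

Note that a rare concept such as "aphthous stomatitis" still passes this check, which is why the commonness principle (P2) needs its own filter.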
Filter for P2 (commonness):
Frequent words ~ common concepts
Usage stats on a large (independent!) corpus
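The P2 idea (frequent words approximate common concepts) could look roughly like the sketch below. Everything specific here is an assumption: the toy corpus, the `min_count` threshold, and the rule that a multi-word label counts as common only if every word is frequent. The slides do not name the corpus or the cutoff.

```python
from collections import Counter


def word_counts(sentences):
    """Count lowercase whitespace-separated tokens in an (independent) corpus."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def is_common(label, counts, min_count):
    """Assumed rule: every word of the label occurs at least min_count times."""
    words = label.split()
    return bool(words) and all(counts[w] >= min_count for w in words)


# Toy stand-in for a large corpus; a real run would use external usage statistics.
toy_corpus = [
    "the container is used for storage",
    "a container holds things",
    "storage of a container",
]
counts = word_counts(toy_corpus)
common = [label for label in ["container", "storage", "noma"]
          if is_common(label, counts, min_count=2)]
```

Using a corpus that is independent of Wikidata matters because Wikidata's own statement counts reflect editor activity rather than everyday usage.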
After step 1 & 2:
414 relations
421k edges
Filter for P3 (general-domain knowledge):
Take the top 50 relations (97.4% of all edges)
Annotate: domain-specific?
Annotate: map to ConceptNet relations
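Selecting the most frequent relations and reporting the share of edges they cover can be sketched as follows. The edge list is a toy example built from the slide's sample statements, and `top_k_coverage` is a hypothetical helper, not code from the authors.

```python
from collections import Counter


def top_k_coverage(edges, k):
    """Return the k most frequent relations and the fraction of edges they cover."""
    if not edges:
        raise ValueError("edges must not be empty")
    relation_counts = Counter(relation for _, relation, _ in edges)
    top = relation_counts.most_common(k)
    covered = sum(count for _, count in top)
    return [relation for relation, _ in top], covered / len(edges)


# Toy (subject, relation, object) edges; the real input would be the
# filtered graph left after the P1 and P2 steps.
edges = [
    ("wheel", "part of", "car"),
    ("door", "part of", "house"),
    ("house", "has part", "room"),
    ("cholesterol", "cell component", "cell membrane"),
]
top_relations, coverage = top_k_coverage(edges, k=1)
```

On the real data the slide reports that k = 50 covers 97.4% of all edges, so manual annotation of only those 50 relations reaches almost the whole filtered graph.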
Domain-specific relations
cell component
strand orientation
molecular function
biological process
decays to
property constraint
Mapping general-domain relations to ConceptNet
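A relation mapping of this kind can be represented as a lookup table. The entries below are illustrative guesses based on relation names (ConceptNet does define IsA, PartOf, HasA and UsedFor), not the authors' actual mapping table, which was shown on the slide as an image and is not reproduced here.

```python
# Assumption: illustrative Wikidata-label -> ConceptNet-relation pairs only.
WD_TO_CONCEPTNET = {
    "subclass of": "IsA",
    "part of": "PartOf",
    "has part": "HasA",
    "use": "UsedFor",
}


def map_edges(edges, mapping):
    """Rewrite relations through the mapping; keep unmapped edges separately."""
    mapped, unmapped = [], []
    for subject, relation, obj in edges:
        if relation in mapping:
            mapped.append((subject, mapping[relation], obj))
        else:
            unmapped.append((subject, relation, obj))
    return mapped, unmapped


sample_edges = [
    ("wheel", "part of", "car"),
    ("container", "use", "storage"),
    ("cholesterol", "cell component", "cell membrane"),
]
mapped, unmapped = map_edges(sample_edges, WD_TO_CONCEPTNET)
```

Keeping the unmapped edges in their own list makes it easy to review which relations were dropped as domain-specific.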
How much common sense is there in WD?
Has it been growing over time?
Is WD’s commonsense knowledge novel?
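One simple way to approach the novelty question is to measure how many mapped Wikidata-CS triples also appear in another commonsense graph. The sketch below assumes exact matching on (subject, relation, object) strings, which is a simplification; the triples are toy data and the function name is hypothetical.

```python
def overlap_fraction(candidate_edges, reference_edges):
    """Fraction of candidate triples that also occur in the reference graph."""
    candidate = set(candidate_edges)
    if not candidate:
        return 0.0
    return len(candidate & set(reference_edges)) / len(candidate)


# Toy triples; real inputs would be Wikidata-CS and, for example, ConceptNet.
wikidata_cs = [
    ("wheel", "PartOf", "car"),
    ("container", "UsedFor", "storage"),
    ("house", "HasA", "room"),
]
reference_kg = [
    ("wheel", "PartOf", "car"),
    ("dog", "IsA", "animal"),
]
shared = overlap_fraction(wikidata_cs, reference_kg)
```

A low fraction on real data would be consistent with the conclusion that Wikidata-CS has very little overlap with existing commonsense KGs, although exact string matching tends to underestimate overlap.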
Discussion
1. Integrating Wikidata-CS with ConceptNet and other sources
2. Generalizing over instance-level knowledge
a. birthplace of people -> functional property
3. Missing knowledge types
a. typical/expected quantities (chairs have 4 legs, spiders have 8)
b. agent goals (compete in order to win)
c. symbolism (red - danger)
Conclusions
Common concepts & general relations allow us to distill Wikidata-CS
Wikidata contains some commonsense knowledge (0.01%)
Very little overlap with existing commonsense KGs
Future work:
1. enrich common sense coverage of Wikidata
2. integrate commonsense knowledge across sources
Thanks!
