This document discusses how communities curate knowledge and how ontologists can help. It describes how a community accumulates "facts" and how its members persuade each other of them through shared tasks and communication. The author developed an ontology and a semantic enrichment of Wikipedia deletion discussions that organizes arguments according to the community's criteria: notability, sources, maintenance, and bias. In a user test, most participants preferred the alternate interface built on the ontology because it structured the information and gave an overview of the key arguments for each criterion. The process involves understanding a community, developing a model of its process, building a computer support system, and testing and refining that system.
4. Knowledge
• Statements believed to be true
• Arrived at through interpreting evidence
"Body of truths or facts accumulated in the
course of time" – dictionary.com
9. My answers
• How do communities curate knowledge?
– A community has mechanisms for accumulating
and persuading each other of "facts".
• How can information technology help?
– Use knowledge representation systems.
Structure evidence the community uses to
persuade each other.
10. Which knowledge should be included
in Wikipedia?
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.
Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors.” [Demo] In WikiSym 2012.
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.
19. Problem: Newcomers are confused about
Wikipedia's standards
o "Emsworth Cricket Club is one of the oldest cricket
clubs in the world, and this really is worth a
mention. Especially on a website, where pointless
people … gets (sic) a mention."
o "Why just because it is a small team and not major
does it not deserve it’s (sic) own page on here?"
20. Use community criteria to summarize
discussions
[Figure: pipeline. Original Discussion -> (Ontology, Semantic Enrichment) -> Semantically Enriched RDFa -> (Querying) -> Queryable User Interface With Barchart]
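The last step of the pipeline, going from factor-tagged comments to a bar chart, can be sketched in a few lines. This is a minimal illustration, not the system shown in the talk: the sample comments are invented, and the names `annotated_comments`, `count_by_factor`, and `render_barchart` are hypothetical.

```python
from collections import Counter

# The four decision factors plus the catch-all, as named in the slides.
FACTORS = ["Notability", "Sources", "Maintenance", "Bias", "Other"]

# Invented sample data: (comment text, factor tag) pairs.
annotated_comments = [
    ("Covered in two national newspapers.", "Notability"),
    ("No independent references are cited.", "Sources"),
    ("The club is mentioned in a county history.", "Notability"),
    ("Reads like an advertisement.", "Bias"),
    ("Needs cleanup but can be fixed.", "Maintenance"),
]


def count_by_factor(comments):
    """Count comments per factor, keeping factors with zero comments."""
    counts = Counter(tag for _, tag in comments)
    return {factor: counts.get(factor, 0) for factor in FACTORS}


def render_barchart(counts):
    """Render one text bar per factor, e.g. 'Notability  ## 2'."""
    width = max(len(factor) for factor in counts)
    lines = []
    for factor, n in counts.items():
        lines.append(f"{factor:<{width}} {'#' * n} {n}")
    return "\n".join(lines)


counts = count_by_factor(annotated_comments)
chart = render_barchart(counts)
print(chart)
```

Keeping empty factors in the output is a design choice here: a factor nobody argued about is itself informative to someone reading the summary.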
21. Determine the ontology
o Content analysis of a corpus
o Compare two different annotation approaches
o Iterative annotation
• Multiple annotators
• Refine to get good inter-annotator agreement
• 4 rounds of annotation
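The slides do not say which agreement measure was used. As one common option, the sketch below computes Cohen's kappa for two annotators who labeled the same items; the choice of kappa, the function name `cohens_kappa`, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Invented labels for four comments.
annotator_1 = ["Notability", "Sources", "Notability", "Bias"]
annotator_2 = ["Notability", "Sources", "Sources", "Bias"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In an iterative setup like the one described, a score such as this would be recomputed after each annotation round, with the category definitions refined in between.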
22. About the corpus
o 72 discussions started on 1 day.
Each discussion has
• 3 to 33 messages
• 2 to 15 participants
o In total, 741 messages contributed by 244 users.
Each message has
• 3 to 350+ words
o 98 printed A4 sheets
23. 2 types of annotation
o 1. Walton’s Argumentation Schemes
(Walton, Reed, and Macagno 2008)
• Informal argumentation
(philosophical & computational argumentation)
• Identify & prevent errors in reasoning (fallacies)
• 60 patterns
o 2. Factors Analysis
(Ashley 1991)
• Case-based reasoning
• E.g. factors for deciding cases in trade secret law,
favoring either party (the plaintiff or the defendant).
24. For the ontology, we chose decision factors
o 1. Walton’s Argumentation Schemes
(Walton, Reed, and Macagno 2008)
• Most appropriate for writing support
• 15 categories + 2 non-argumentative categories
• Detailed analysis of content
o 2. Decision Factors
(drawing on Ashley 1991)
• Close to the community rules & policies
• 4 categories + 1 catchall
• Good domain coverage
25. Factor Example (used to justify "keep")
Notability: Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources: Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance: …this article is savable but at its current state, needs a lot of improvement.
Bias: It is by no means spam (it does not promote the products).
Other: I'm advocating a blanket "hangon" for all articles on newly-drafted players
4 Decision Factors + "Other"
Source: Jodi Schneider, Alexandre Passant & Stefan Decker, "Deletion Discussions in Wikipedia: Decision Factors and Outcomes"
26. Decision factors articulate values/criteria
o 4 Factors in Deletion Discussions cover
• 91% of comments
• 70% of discussions
o We argue that the best way to avoid deletion is for
readers to understand these criteria.
35. PU* - Perceived usefulness
PE* - Perceived ease of use
DC - Decision completeness
PF - Perceived effort
IC* - Information completeness
Statistical Significance
PU* p < .001
PE* p .001
IC* p .039
37. Results: 84% prefer our system
“Information is structured and I can quickly get an
overview of the key arguments.”
“The ability to navigate the comments made it a bit
easier to filter my mind set and to come to a
conclusion.”
“It offers the structure needed to consider each factor
separately, thus making the decision easier. Also, the
number of comments per factor offers a quick
indication of the relevance and the deepness of the
decision.”
16 of 19 survey respondents. The user test had 20 participants;
1 participant did not take the final survey.
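The 84% figure uses the 19 survey respondents as its denominator, not the 20 test participants. A quick check of the arithmetic (variable names are illustrative only):

```python
participants = 20        # people who took the user test
skipped_survey = 1       # did not take the final survey
preferred = 16           # respondents who preferred the alternate interface

respondents = participants - skipped_survey
share = preferred / respondents
print(f"{preferred}/{respondents} = {share:.1%}")
```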
38. Summary
o How do communities curate knowledge?
• By discussing and applying community standards.
• In Wikipedia, 4 questions are used to evaluate borderline
articles:
o Notability – Is the topic appropriate for our encyclopedia?
o Sources – Is the article well-sourced?
o Maintenance – Can we maintain this article?
o Bias – Is the article neutral? POV appropriately weighted?
o How can information technology help?
• Organize evidence based on the criteria communities use.
• In Wikipedia, we developed an alternate interface for
deletion discussions.
39. Summary: Our process
o Get to know a community and its needs
(Ethnography)
o Develop a model for their process
(Annotation to inform ontology development)
o Build a computer support system
(Web standards: RDF/OWL, SPARQL)
o Test & refine the system
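To give a feel for the "build" step, the sketch below stores annotations as subject-predicate-object triples and filters them with a wildcard pattern, which is roughly the role a SPARQL triple pattern plays over RDF data. It is a plain-Python stand-in rather than the RDF/OWL and SPARQL stack named above, and every identifier in it (`ex:hasFactor`, `ex:recommends`, `ex:comment1`, `match`, `comments_with_factor`) is hypothetical.

```python
# Each annotation is a (subject, predicate, object) tuple, mirroring an RDF triple.
# All identifiers below are invented for illustration.
triples = [
    ("ex:comment1", "ex:hasFactor", "ex:Notability"),
    ("ex:comment1", "ex:recommends", "keep"),
    ("ex:comment2", "ex:hasFactor", "ex:Sources"),
    ("ex:comment2", "ex:recommends", "delete"),
    ("ex:comment3", "ex:hasFactor", "ex:Notability"),
    ("ex:comment3", "ex:recommends", "delete"),
]


def match(pattern, store):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        triple
        for triple in store
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


def comments_with_factor(factor, store):
    """List the comments annotated with the given factor."""
    return sorted(s for s, _, _ in match((None, "ex:hasFactor", factor), store))


notability_comments = comments_with_factor("ex:Notability", triples)
print(notability_comments)
```

Once annotations are in this shape, grouping comments by criterion is a single query per factor.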
44. Evidence + Rule -> Conclusion
“Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”
CSCW 2013
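One way to make the Evidence + Rule -> Conclusion pattern concrete is a small record type. This is a sketch only: the class name `Argument` and its fields are assumptions, the evidence sentence is invented, and the rule is taken from the Notability example quoted earlier in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    """One argument from a discussion, split into its three parts."""
    evidence: str    # what the commenter points to
    rule: str        # the community standard being applied
    conclusion: str  # the recommended outcome, e.g. "keep" or "delete"

    def summary(self) -> str:
        return f"{self.evidence} + {self.rule} -> {self.conclusion}"


example = Argument(
    evidence="The subject has an entry in another encyclopedia.",  # invented
    rule=(
        "Anyone covered by another encyclopedic reference is considered "
        "notable enough for inclusion in Wikipedia."
    ),
    conclusion="keep",
)
print(example.summary())
```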
Editor's Notes
Identify and explicitly represent arguments, and in particular
successful arguments that are persuasive to a given audience.
Explain community expectations (how to be convincing)
Support making & auditing decisions
Retain new users
Technically started or relisted
Corpus is https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Log/2011_January_29
Categories (Walton’s argumentation schemes) vs. process (factors analysis)
Very few content standards need to be clearly communicated to readers to bring significant benefit: 69.5% of discussions and 91% of comments are well-represented by just four factors, namely Notability, Sources, Maintenance, and Bias. The best way to avoid deletion is for readers to understand these criteria.
20 novice participants used both systems
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“summarise and, at the same time, evaluate which factor should be considered determinant for the final decision”