
Folksonomies: In General and in Libraries

5,385 views

Workshop on "Folksonomies" held at the National Library Board in Singapore, 2009.

Published in: Education

  1. 1. Workshop on FOLKSONOMIES Singapore, February 9, 2009
  2. 2. We are very proud to hold a workshop in the “informational city” of Singapore.
  3. 3. Speakers
       • Wolfgang G. Stock: Professor, Head of Dept. for Information Science, Heinrich-Heine-University Düsseldorf, Germany. Lectures on Information Retrieval, Knowledge Representation, Informetrics and Information Market. Main research areas: Folksonomies, Emotional Information Retrieval and Informetrics of LIS Journals.
       • Isabella Peters: Researcher, Dept. for Information Science, Heinrich-Heine-University Düsseldorf, Germany. Lectures on Web 2.0 Services and Information Retrieval. Main research area: Folksonomies in Knowledge Representation and Information Retrieval.
  4. 4. Social Network on our Workshop: http://taggingworkshop.ning.com/
       • literature, slides, links
       • Start discussions! Start a forum! Invite more people! Blog!, etc.
  5. 5. Agenda
       1. Folksonomies – Indexing without rules
       2. Folksonomies in information services and library catalogues
       Short Break
       3. Folksonomies and knowledge representation
       4. Folksonomies and information retrieval
       Short Break
       5. Tag gardening for folksonomy enrichment and maintenance
       6. Find “more like me!”. The social function of a folksonomy
  6. 6. Lesson 1 Folksonomies – Indexing without Rules
  7. 7. Indexing without Rules
       • “Anything goes”
       • “Against method”, 1975 (Paul K. Feyerabend, Austro-American philosopher)
       • Tagging
       • no rules
       • no methods – or even against methods
       • indexing a single document
           • synonyms – why not? (New York – NY – Big Apple – …)
           • homonyms – never heard! (not: Java [Programming Language] – Java [Island], but Java)
           • translations – why not? (Singapore – Singapur – …)
           • typing errors – nobody is perfect (Syngapur)
           • hierarchical relations (hyponymy) – why not? (Düsseldorf – North Rhine-Westphalia – Germany)
           • hierarchical relations (meronymy) – why not? (tree – branch – leaf)
  8. 8. Indexing
  9. 9. Prosumer
       • “Prosumer”
       • introduced 1980 by Alvin Toffler (American economist) in “The Third Wave”
       • prosumerism: characteristic property of the knowledge society
       [Figure labels: Producer, Consumer, Prosumer]
  10. 10. Tri-partite System
       • document (resource)
       • prosumer (user)
       • tag
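The tri-partite system can be pictured as a set of (user, tag, document) assignments. The following Python sketch is only an illustration: the names `TagAssignment`, `tags_by_document` and `personomy` are hypothetical and do not come from the workshop material. It shows how document-level tag counts and a single user's personomy might be derived from such triples.

```python
from collections import defaultdict
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One tagging act: a user attaches a tag to a document (hypothetical type)."""
    user: str
    tag: str
    document: str


def tags_by_document(assignments):
    """Map each document to {tag: number of distinct users who assigned it}."""
    users = defaultdict(set)
    for a in assignments:
        users[(a.document, a.tag)].add(a.user)
    result = defaultdict(dict)
    for (document, tag), who in users.items():
        result[document][tag] = len(who)
    return dict(result)


def personomy(assignments, user):
    """Return the set of all tags a single user has assigned."""
    return {a.tag for a in assignments if a.user == user}


# Made-up sample data for illustration only.
sample = [
    TagAssignment("alice", "singapore", "doc1"),
    TagAssignment("bob", "singapore", "doc1"),
    TagAssignment("alice", "library", "doc1"),
]
print(tags_by_document(sample))
print(personomy(sample, "alice"))
```

Counting distinct users per tag only makes sense where several users may tag the same document with the same tag; where each tag is stored once per document, every count would simply be 1.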
  11. 11. Cognitive Indexing Processes Source: Sinha (2005)
  12. 12. Library 2.0
  13. 15. Network Economy: Positive Feedback Loop
       • New users come along.
       • The value of the network increases.
       • The number of users of the network increases.
       • “success breeds success”
       • only one standard (in a technological area)
       [Figure: number of active (tagging) users over time, with the critical mass marked]
  14. 16. [Figure: number of active (tagging) users over time, with the critical mass marked. Labels: “Entry”, “Combat Area”, “Take off”, “Positive Feedback”, “Saturation”. Red: competitor; yellow: your Library 2.0 service]
  15. 17. How to Become a Standard (Part 1)
       • marketing
           • product
               • invite the catalogue’s users to tag
               • make it easy!
               • no additional password
           • price
               • “price” = user’s time
               • Save the time of the user! (Ranganathan): time for tagging – time for searching
           • place
               • add the folksonomy to a well-known service (e.g., your library catalogue)
  16. 18. How to Become a Standard (Part 2)
       • marketing (continued)
           • promotion
               • advertising / public relations
               • communicate the benefits: “Search with your own tags!” – “Create your personomy!” – “Share your knowledge!” – “Find more – and better – resources!” – “Find other users similar to your interests!” – …
               • awards: “Tagger of the Week” – “Super-Poster” – “Best tagger award” / prizes (e.g., books)
           • personnel
               • especially at the entry phase: your staff has to tag
               • for promotion
               • always: software specialists
           • processes
               • process management: knowledge representation tasks (e.g., tag gardening)
               • process management: information retrieval tasks (e.g., relevance ranking)
  17. 19. Lesson 2 Folksonomies in Information Services and Library Catalogues
  18. 20. Narrow Folksonomies
       • only one tagger (the content creator)
       • no multiple tagging
       • example: YouTube
       [Figure label: Tags]
  19. 21. Extended Narrow Folksonomies
       • more than one tagger
       • no multiple tagging
       • example: Flickr
       [Figure labels: Tags, Add Tags Option] Source: Vander Wal (2005)
  20. 22. Broad Folksonomies
       • more than one tagger
       • multiple tagging
       • example: Delicious
       [Figure label: Tags] Source: Vander Wal (2005)
  21. 23. Tagging of OPACs
       Two possibilities:
       1) tagging of resources within the library’s website
       2) tagging of resources outside the library’s firewall
  22. 24. Tagging of OPACs: Within Library’s Website: PennTags http://tags.library.upenn.edu/
  23. 25. Tagging of OPACs: Within Library’s Website: Ann Arbor District Library http://www.aadl.org/catalog
  24. 26. Tagging of OPACs: Within Library’s Website: University Library Hildesheim http://www.uni-hildesheim.de/mybib/all_tags
  25. 27. Tagging of OPACs: Within Library’s Website
       • advantages:
           • user behaviour can be directly observed and exploited for own applications
           • the knowledge organization system (KOS) in use can profit from user behaviour and user language
           • users will be “attracted” to the library
           • library will appear “trendy”
  26. 28. Tagging of OPACs: Within Library’s Website
       • disadvantages:
           • development and implementation (costs and manpower) of the tagging service have to be borne by the library
           • if only users may tag: librarians may lose their work motivation or may have a feeling of uselessness
           • “lock-in” effect of users
  27. 29. Tagging of Resources Outside the Library’s Firewall: HEIDI http://katalog.ub.uni-heidelberg.de/cgi-bin/search.cgi
  28. 30. Tagging of Resources Outside the Library’s Firewall: LibraryThing http://www.librarything.com/search
  29. 31. Tagging of Resources Outside the Library’s Firewall: BibSonomy http://www.bibsonomy.org/
  30. 32. Tagging of Resources Outside the Library’s Firewall
       • advantages:
           • development and implementation (costs and manpower) of the tagging service do not have to be borne by the library
           • the library may profit from the “know-how” of the provider of the tagging system
           • users may profit from tagging activities of hundreds of other users (no lock-in)
           • library appears “trendy”
  31. 33. Tagging of Resources Outside the Library’s Firewall
       • disadvantages:
           • user behaviour cannot be observed or exploited
           • your users support another tagging service
           • the KOS in use cannot profit from user behaviour
  32. 34. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • during indexing
           • the degree of restriction (if any) placed on the number and/or combination of tags that a tagger may assign to a given resource;
           • the degree of restriction (if any) placed on the tagger’s choice and form of tags;
           • the provision (if any) of context-sensitive suggestions for tags, or for facets that the tagger may wish to consider;
  33. 35. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • during indexing
           • the provision of access (if any) to structured vocabularies of tags;
           • the provision of access (if any) to lists or clouds of most frequently- or recently-assigned tags;
           • the provision of online access to the full content of resources.
  34. 36. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • during retrieval
           • the degree of restriction (if any) placed on the number and/or combination of tags that a searcher may use in a given query;
           • the degree of restriction (if any) placed on the searcher’s choice and form of tags;
           • the provision (if any) of context-sensitive suggestions for tags, or for facets that the searcher may wish to consider;
  35. 37. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • during retrieval
           • the provision of access (if any) to structured vocabularies of tags;
           • the provision of access (if any) to lists or clouds of most frequently- or recently-searched tags;
           • the extent to which tag search is integrated into the existing OPAC search.
  36. 38. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • for system design and user environment
           • to engender a sense of community among library users in separate and remote locations;
           • to allow library users to identify other individuals with whom they share interests;
           • to engender a sense of empowerment among library users who may not otherwise participate in or contribute to library activities;
  37. 39. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • for system design and user environment
           • to encourage library users to engage with the resources that they tag, and thereby to allow users to come to a deeper understanding of those resources and of the contexts in which they were produced;
           • to improve the effectiveness of retrieval of records and discovery of resources;
           • to improve the effectiveness of personal rediscovery of resources;
  38. 40. Social OPAC – Issues to Consider in Advance
       • according to Furner (2007)
       • for system design and user environment
           • to allow library users to determine which kinds of resources and/or topics are currently popular, newsworthy, or receiving attention;
           • to improve the entertainment value of, and thereby the level of user satisfaction with, the search experience;
           • to reduce the costs normally incurred in manually cataloging, indexing, or classifying the resources in a collection;
  39. 41. Social OPAC – Issues to Consider in Advance
       • according to Arch (2007)
       • for system design and user environment
           • how to handle spam or spagging;
           • how to handle linguistic variations, synonyms, homonyms, etc.;
           • “ramp-up problem”: who will provide the first content?
               • subject specialist at your library
               • forwarding links to users who are interested in the topic
               • ...
  40. 42. Social OPAC: Benefits
       • according to Furner (2007)
       • How do the users benefit?
           • they participate in the activities of a community of like-minded people;
           • they identify other individuals with whom they share interests;
           • they contribute to the activities of the library;
  41. 43. Social OPAC: Benefits
       • according to Furner (2007)
       • How do the users benefit?
           • they engage with the resources being tagged and/or with the records that describe them;
           • they contribute to improvements in the effectiveness of other users’ searches;
           • they bookmark resources to which repeated personal access is foreseen;
  42. 44. Social OPAC: Benefits
       • according to Furner (2007)
       • How do the users benefit?
           • they determine which kinds of resources and/or topics are currently receiving attention;
           • they pass the time in a manner that provides entertainment;
           • they share their knowledge of the content of resources with others;
  43. 45. Social OPAC: Benefits
       • according to Furner (2007)
       • How do the users benefit?
           • they demonstrate the extent of their knowledge of the content of resources;
           • they instantly recognize the “aboutness” of the resource via the tags;
           • they benefit from the receipt of any concrete incentives supplied by the implementing institution in return for tagging efforts.
  44. 46. Short Break
  45. 47. Lesson 3 Folksonomies and Knowledge Representation
  46. 48. Collective Intelligence
       • Collective Intelligence
           • “Wisdom of the Crowds” (Surowiecki)
           • “Hive Minds” (Kroski) – “Vox populi” (Galton) – “Crowdsourcing”
           • no discussions, diversity of opinions, decentralisation
           • users tag a document independently from each other
           • statistical aggregation of data
       • Collaborative Intelligence
           • discussions and consensus
           • prototype service: Wikipedia (but: 90 + 9 + 1 – rule)
       • “Madness of the Crowds”
           • e.g., soccer fans – hooligans
           • no diversity of opinion – no independence – no decentralisation – no (statistical) aggregation
  47. 49. Power Law Tag Distribution: $f(x) = \dfrac{C}{x^{a}}$
       [Figure: tag distribution with axes “Users” and “Tags”; labels: 80/20-Rule, Power Tags, Long Tail] Source: http://del.icio.us
  48. 50. Inverse-logistic Tag Distribution: $f(x) = e^{-C'(x-1)^{b}}$
       [Figure: tag distribution with axes “Users” and “Tags”; labels: Long Trunk, Power Tags, Long Tail] Source: http://del.icio.us
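To get a feeling for how the two curve shapes differ, the two formulas from the slides above can be evaluated for the first few tag ranks. This is a minimal Python sketch under stated assumptions: `x` is taken to be the rank of a tag (starting at 1), and the parameter values in the example are made up for illustration rather than fitted to any data.

```python
import math


def power_law(x, c, a):
    """Power law: f(x) = C / x^a, with x the tag rank (x >= 1)."""
    return c / x ** a


def inverse_logistic(x, c_prime, b):
    """Inverse-logistic: f(x) = exp(-C' * (x - 1)^b), with x the tag rank (x >= 1)."""
    return math.exp(-c_prime * (x - 1) ** b)


# Illustrative parameters only (assumptions, not taken from the slides).
for rank in range(1, 6):
    print(rank, power_law(rank, c=1.0, a=1.0), inverse_logistic(rank, c_prime=0.1, b=3))
```

With these example parameters the power law drops sharply right after the first rank, while the inverse-logistic curve stays close to its maximum for the first ranks before falling off, which corresponds to the “long trunk” label on the slide.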
  49. 51. Document-specific Tag Distributions: distributions of the top 10 tags in a broad folksonomy (sample: Delicious); N = 650 bookmarks (minimum of 100 different tagging users). Source: Reher (2008); unpublished
  50. 52. [Figure: Power Tags marked in a Power Law Distribution and in an Inverse-logistic Distribution]
  51. 53. Tagging Behaviour
       • 1 … 3 tags per document and user
       • motivations for tagging
           • future (own) retrieval
           • contribution and sharing
           • attract attention
           • play and competition
           • self presentation
           • opinion expression
       • factors which influence tagging
           • conformity
           • the role of recommendation
       Source: Sen et al. (2006); Source: Rader & Wash (2008)
  52. 54. Sentiment Tags
       • negative tags: “awful” – “foolish”, …
       • positive tags: “amazing” – “useful”, …
       • applicable for sentiment analysis of documents
       Source: Yanbe et al. (2007); service: Hatena Bookmarks
  53. 55. Documents which Provoke Emotions: tagging using scroll-bars. Source: Schmidt and Stock (2009)
  54. 56. Tag Types
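  55. 57. Discrimination Power of Tags (tags in concrete document vs. tags in folksonomy)
       • frequent in the concrete document (“power tags”), rare in the folksonomy: strong discrimination
       • frequent in the concrete document (“power tags”), frequent in the folksonomy: low discrimination
       • rare in the concrete document (“long tail”), rare in the folksonomy: low discrimination
       • rare in the concrete document (“long tail”), frequent in the folksonomy: very low discrimination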
  56. 58. Benefits of Indexing with Folksonomies
       • authentic user language – solution of the “vocabulary problem”
       • actuality
       • multiple interpretations – many perspectives – bridging the semantic gap
       • increase access to information resources
       • follow “desire lines” of users
       • cheap indexing method – shared indexing
       • the more taggers, the better the system becomes – network effects
       • capable of indexing mass information on the Web
       • resources for development of knowledge organization systems
       • mass quality “control”
       • searching – browsing – serendipity
       • neologisms
       • identify communities and “small worlds”
       • collaborative recommender system
       • make people sensitive to information indexing
  57. 59. Disadvantages of Indexing with Folksonomies
       • absence of controlled vocabulary
       • different basic levels (in the sense of Eleanor Rosch)
       • different interests – loss of context information
       • language merging
       • hidden paradigmatic relations
       • merging of formal (bibliographical) and aboutness tags
       • no specific fields
       • tags make evaluations (“stupid”)
       • spam-tags
       • syncategoremata (user-specific tags, “me”)
       • performative tags (“to do”, “to read”)
       • other misleading keywords
  58. 60. Lesson 4 Folksonomies and Information Retrieval
  59. 61. Knowledge Representation and Information Retrieval
       • two sides of the same coin
       • Immanuel Kant (German philosopher): Thoughts without content are empty, intuitions without concepts are blind. ...
       • Knowledge Representation without Information Retrieval is empty.
       • Information Retrieval without Knowledge Representation is blind.
       [Figure: feedback loop between Knowledge Representation and Information Retrieval]
  60. 62. Information Linguistics
       • “cleaning tags up”
       • but: only additionally to raw tags
       • important basic tasks:
           • language identification
           • word identification (problems: “informationscience”, “information_science”, …)
           • detection and correction of typing errors
           • context-specific tags (“me”)
           • identification of named entities
           • word form conflation (using, e.g., Porter stemmer)
           • decompounding, phrases
           • homonymy – synonymy
       • “higher” tasks:
           • semantic relations
           • translation
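The word identification task named on this slide (“informationscience”, “information_science”) can be sketched in a few lines of Python. This is a hedged illustration, not the method used by the authors: the function names and the tiny vocabulary are hypothetical, and the dictionary-based splitting is just one simple way to separate run-together words. In line with “only additionally to raw tags”, the cleaned form would be stored next to the raw tag, not instead of it.

```python
import re


def _split(token, vocabulary):
    """Return a list of known words that concatenate to token, or None if there is no such split."""
    if token == "":
        return []
    # Try the longest known prefix first, then recurse on the remainder.
    for i in range(len(token), 0, -1):
        head = token[:i]
        if head in vocabulary:
            rest = _split(token[i:], vocabulary)
            if rest is not None:
                return [head] + rest
    return None


def normalize_tag(raw_tag, vocabulary):
    """Lowercase a raw tag, unify separators, and split run-together known words."""
    tag = raw_tag.strip().lower()
    tag = re.sub(r"[_\-+.]+", " ", tag)  # treat _, -, + and . as word separators
    words = []
    for token in tag.split():
        parts = _split(token, vocabulary)
        words.extend(parts if parts is not None else [token])
    return " ".join(words)


# Hypothetical mini vocabulary for illustration.
vocab = {"information", "science", "library"}
for raw in ["informationscience", "Information_Science", "folksonomy"]:
    print(raw, "->", normalize_tag(raw, vocab))
```

A tag that cannot be fully split into known words is left untouched, so unknown terms and neologisms survive the clean-up.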
  61. 63. Relevance Ranking: State of the Art
       • Interestingness Ranking (Yahoo! / Flickr)
           • number of tags to the document
           • number of users who tagged the document
           • number of users who retrieved the document
           • time (the older the document the less relevant)
           • relevance of metadata
       • Personalized Interestingness Ranking
           • user preferences (e.g. favorites)
           • user’s residence
  62. 64. Relevance Ranking of Tagged Documents Source: Peters and Stock (2007)
  63. 65. Retrieval Status Value – Factor 1: Tags
       • tag frequency or TF*IDF or TF*ITF
           • index tags (only in broad folksonomies)
           • search tags
       • tag evaluation (feedback of users: “Is this tag useful for finding this document?”)
       • more than one search argument: vector space model
       • time (new tags in platform: higher weight)
       • Super Poster (term tagged by super poster: higher weight)
       • Power Tag (higher weight)
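As an illustration of the TF*IDF option for index tags, the following Python sketch computes a weight per (document, tag) pair from a list of tag assignments. It rests on assumptions that are not spelled out on the slide: tag frequency is taken as the number of distinct users who assigned the tag to the document, and IDF uses the common form log(N / n_t), where N is the number of tagged documents and n_t the number of documents carrying tag t. The function name `tag_tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tag_tf_idf(assignments):
    """Compute TF*IDF weights from (user, tag, document) triples.

    TF  = number of distinct users who assigned the tag to the document.
    IDF = log(N / n_t), N = number of documents, n_t = documents carrying the tag.
    """
    # set() removes repeated identical triples, so each user counts once per tag and document.
    tf = Counter((doc, tag) for _user, tag, doc in set(assignments))
    documents = {doc for doc, _tag in tf}
    df = Counter(tag for _doc, tag in tf)
    n = len(documents)
    return {(doc, tag): count * math.log(n / df[tag]) for (doc, tag), count in tf.items()}


# Made-up sample data for illustration only.
sample = [
    ("u1", "web", "d1"),
    ("u2", "web", "d1"),
    ("u1", "web", "d2"),
    ("u3", "library", "d2"),
]
for key, weight in sorted(tag_tf_idf(sample).items()):
    print(key, round(weight, 3))
```

In this toy data the tag “web” appears on every document and therefore gets weight 0, while “library” keeps a positive weight; a tag that occurs everywhere does not help to tell documents apart.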
  64. 66. Retrieval Status Value – Factor 2: Collaboration
       • click rates of a document
       • number of tagging users
       • number of comments
       • linked documents: PageRank
  65. 67. Retrieval Status Value – Factor 3: Prosumer
       • performative document weight
       • sentiment weight
       • rating weight
  66. 68. Short Break
  67. 69. Lesson 5 Tag Gardening for Folksonomy Enrichment and Maintenance
  68. 70. The Folksonomy Tag Garden
  69. 71. Goal of Tag Gardening: Emergent Semantics
  70. 72. <ul><li>removing “bad tags”: spelling variants (plural vs. singular, conflation of multi-word tags) and spam through “pesticides” </li></ul><ul><li>achieved by type-ahead functionality during indexing, editing functionalities for tags afterwards the application (remove, change, etc.), Natural Language Processing of index tags and search tags, indexing and retrieval tutorials or guidelines for users, authorised users as pesticides </li></ul><ul><li>in order to enhance recall and a consistent indexing vocabulary </li></ul><ul><li>simplest form of tag gardening because of </li></ul><ul><li>neglecting semantics of tags </li></ul>Weeding
  71. 73. Seeding <ul><li>extending the folksonomy with rarely used “baby tags”, as high-frequency tags do not sufficiently discriminate resources </li></ul><ul><li>achieved by displaying an inverse tag cloud during indexing, by particular “green house” areas where the seedlings may develop and grow, or by discrete tag suggestions during indexing </li></ul><ul><li>in order to enhance precision and the expressiveness of the folksonomy </li></ul>
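One possible reading of an "inverse tag cloud" is a display in which the rarest tags get the largest font, so that baby tags become visible. The sketch below assumes a simple linear scaling between two font sizes; the function name, the size limits, and the toy counts are illustrative assumptions, not taken from the slides.

```python
def inverse_tag_cloud(tag_counts, min_size=10, max_size=30):
    """Return {tag: font size}, giving the rarest tags the largest font."""
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    sizes = {}
    for tag, count in tag_counts.items():
        if hi == lo:
            # all tags equally frequent: use a middle size
            sizes[tag] = (min_size + max_size) // 2
        else:
            # share is 1.0 for the rarest tag and 0.0 for the most frequent one
            share = (hi - count) / (hi - lo)
            sizes[tag] = round(min_size + share * (max_size - min_size))
    return sizes

# Hypothetical tag frequencies
sizes = inverse_tag_cloud({"city": 50, "skyline": 10, "merlion": 2})
```

With these counts the rare tag "merlion" is drawn largest and the frequent tag "city" smallest.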
  72. 74. Landscape Architecture <ul><li>shaping the folksonomy into “flower beds”, distinguishing similar looking “plants”, assigning their “species”, branding each species with labels and giving additional information regarding their application (e.g., cooking, healing, etc.) </li></ul><ul><li>achieved by conflation of multi-language tags, summarization of synonyms, division of homonyms, establishment of semantic relations by comparison with KOS (after indexing) </li></ul><ul><li>in order to enhance precision and the expressiveness of the folksonomy by adding semantics, for query expansion during retrieval via semantic relations, for enhanced navigation within the folksonomy, as basis for semantic-oriented displays </li></ul>
  73. 75. Fertilizing <ul><li>combination of folksonomies and KOS during indexing and retrieval </li></ul><ul><li>achieved by semantic-oriented tag suggestions during indexing and retrieval (tag suggestions not based on tag popularity, to avoid a self-fulfilling “success breeds success” effect) or field-based tagging which stimulates semantically richer index tags and search tags </li></ul><ul><li>in order to enhance precision and recall and the expressiveness of the folksonomy by adding semantics, for query expansion during retrieval via semantic relations, for enhanced indexing functionalities, for enhanced navigation within the folksonomy, as basis for semantic-oriented displays </li></ul>
  74. 76. Emergent Semantics <ul><ul><ul><li>folksonomies have no explicit structure; there are no visible paradigmatic semantic relations </li></ul></ul></ul><ul><ul><ul><li>document-specific co-occurring tags are linked by syntagmatic relations </li></ul></ul></ul><ul><ul><ul><li>task: to identify paradigmatic relations and to use them in a controlled vocabulary </li></ul></ul></ul>Synonyms Is_a
  75. 77. Hidden Paradigmatic Relations (figure labels: (geographical) meronymy, synonymy, hyponymy) Source: http://www.flickr.com
  76. 78. Hidden Paradigmatic Relations. Flickr Landscape Photos <ul><li>Flickr landscape photos: N=491; analysable tags: 3,618; tags per photo: 7.4 </li></ul><ul><li>Possible document-specific relations (tag-pairs, co-occurrences): 16,098 </li></ul><ul><li>Document-specific relations </li></ul><ul><li>Equivalence (sum) 3.54% </li></ul><ul><ul><li>Synonymy 0.56% </li></ul></ul><ul><ul><li>Abbreviation 0.12% </li></ul></ul><ul><ul><li>Quasi-Synonymy 0.21% </li></ul></ul><ul><ul><li>Translation 2.65% </li></ul></ul><ul><li>Hyponymy (IS-A relation) (sum) 0.29% </li></ul><ul><ul><li>Taxonomy 0.23% </li></ul></ul><ul><ul><li>Simple hyponymy 0.06% </li></ul></ul><ul><li>Meronymy (IS-PART-OF relation) (sum) 10.02% </li></ul><ul><ul><li>Geographical meronymy (administrative) 4.94% </li></ul></ul><ul><ul><li>Geographical meronymy (not administrative) 3.91% </li></ul></ul><ul><ul><li>Element-collection-relation 0.21% </li></ul></ul><ul><ul><li>Component-complex-relation 0.84% </li></ul></ul><ul><ul><li>Segment-time-bond event-relation 0.11% </li></ul></ul><ul><ul><li>Other meronymy 0.01% </li></ul></ul><ul><li>Instance 0.23% </li></ul><ul><li>All relations 14.08% </li></ul>Source: own research project
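As a quick consistency check, the percentages on this slide can be added up per relation family. The grouping of rows into families follows the "(sum)" rows on the slide; the variable names are only illustrative.

```python
# Percentages as given on the slide, grouped by relation family
equivalence = [0.56, 0.12, 0.21, 2.65]   # synonymy, abbreviation, quasi-synonymy, translation
hyponymy = [0.23, 0.06]                  # taxonomy, simple hyponymy
meronymy = [4.94, 3.91, 0.21, 0.84, 0.11, 0.01]
instance = [0.23]

group_sums = [round(sum(g), 2) for g in (equivalence, hyponymy, meronymy, instance)]
all_relations = round(sum(group_sums), 2)

# Average number of analysable tags per photo
tags_per_photo = round(3618 / 491, 1)
```

The family sums come out as 3.54, 0.29, 10.02 and 0.23, which add up to the 14.08% reported for all relations, and 3,618 tags over 491 photos gives the stated 7.4 tags per photo.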
  77. 79. From Tag Gardening to Collaborative KOS Development <ul><li>community members as gardeners </li></ul><ul><li>tagging </li></ul><ul><li>evaluation of tags </li></ul><ul><li>field-specific tagging </li></ul><ul><li>additional: professional chief-gardener </li></ul><ul><li>KOS development </li></ul><ul><li>new concepts / new words for known concepts </li></ul><ul><li>relations between concepts </li></ul>
  78. 80. Maintenance of KOS and Folksonomy Source : Christiaens (2006) Folksonomy KOS Tag Gardening new terms – new relations
  79. 81. From Tag Clouds to Tag Clusters <ul><li>tag cloud </li></ul><ul><li>alphabetical arrangement </li></ul><ul><li>font size = „importance“ (but mostly no concrete data) </li></ul><ul><li>no relations between tags </li></ul><ul><li>tag cluster </li></ul><ul><li>tags located in a network </li></ul><ul><li>tuneable granularity (threshold value of similarity) </li></ul><ul><li>relations between tags </li></ul><ul><li>processes: </li></ul><ul><li>- calculation of similarity (Jaccard-Sneath, …) </li></ul><ul><li>- cluster algorithms </li></ul>Source: Knautz (2008)
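The two processes named on the slide, similarity calculation and clustering with a tuneable threshold, might be sketched as follows. Tags are compared by the Jaccard similarity of the document sets they are attached to, and tags linked at or above the threshold are grouped into connected components. The component-based grouping is one simple choice made here for illustration; the slide only says "cluster algorithms". All names and data are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def tag_clusters(tag_docs, threshold):
    """Group tags whose document sets have Jaccard similarity >= threshold.

    tag_docs: {tag: set of ids of the documents carrying the tag}
    Tags are linked pairwise; clusters are the connected components.
    """
    neighbours = {tag: set() for tag in tag_docs}
    for t1, t2 in combinations(tag_docs, 2):
        if jaccard(tag_docs[t1], tag_docs[t2]) >= threshold:
            neighbours[t1].add(t2)
            neighbours[t2].add(t1)
    clusters, seen = [], set()
    for tag in tag_docs:
        if tag in seen:
            continue
        # depth-first walk over the similarity links
        stack, component = [tag], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(component)
    return clusters

tag_docs = {
    "singapore": {1, 2, 3, 4},
    "merlion": {1, 2, 3},
    "library": {5, 6},
    "catalogue": {5, 6, 7},
}
coarse = tag_clusters(tag_docs, threshold=0.5)  # two clusters of two tags each
fine = tag_clusters(tag_docs, threshold=0.9)    # every tag on its own
```

Raising the threshold makes the clusters finer, which corresponds to the tuneable granularity mentioned above.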
  80. 82. Lesson 6 Find „More like me!“ The Social Function of a Folksonomy
  81. 83. Users – Tags - Documents thematically linked shared users thematically linked shared documents
  82. 84. Shared Documents & Thematically Linked Users <ul><li>more like this ... </li></ul><ul><li> similar documents </li></ul><ul><li>detection of documents </li></ul><ul><li>more like me ... </li></ul><ul><li>similar users </li></ul><ul><li>detection of communities </li></ul>thematically linked shared documents
  83. 85. More like me! Or: More like This User! <ul><li>starting point: single user (ego) </li></ul><ul><li>processing </li></ul><ul><ul><li>(1) tag-specific similarity </li></ul></ul><ul><ul><ul><li>all tags of ego: a(t) </li></ul></ul></ul><ul><ul><ul><li>all tags of another user B: b(t) </li></ul></ul></ul><ul><ul><ul><li>common tags of ego and another user B: g(t) </li></ul></ul></ul><ul><ul><li>(2) document-specific similarity </li></ul></ul><ul><ul><ul><li>all tagged documents of ego: a(d) </li></ul></ul></ul><ul><ul><ul><li>all tagged documents of another user B: b(d) </li></ul></ul></ul><ul><ul><ul><li>common tagged documents of ego and another user B: g(d) </li></ul></ul></ul><ul><ul><li>calculation of similarity </li></ul></ul><ul><ul><ul><li>tag-specific: Jaccard-Sneath: Sim(tag; Ego,B) = g(t) / [a(t) + b(t) – g(t)] </li></ul></ul></ul><ul><ul><ul><li>document-specific: Jaccard-Sneath: Sim(doc; Ego,B) = g(d) / [a(d) + b(d) – g(d)] </li></ul></ul></ul><ul><ul><ul><li>ranking of users B_i by similarity to ego (say, top 10 tag-specific and top 10 document-specific users) </li></ul></ul></ul><ul><ul><ul><li>merging of both lists (exclusion of duplicates) </li></ul></ul></ul><ul><ul><ul><li>cluster analysis (k-nearest neighbours, single linkage, complete linkage, group average linkage) </li></ul></ul></ul><ul><ul><li>result presentation: social network of ego in the centre </li></ul></ul>
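The similarity, ranking and merging steps of this slide can be sketched in a few lines. The Jaccard-Sneath formula is taken from the slide; the data layout, the function names and the toy users are assumptions for illustration, and the final cluster analysis step is left out here.

```python
def jaccard_sneath(a, b):
    """Sim = g / (a + b - g), with g the number of common elements."""
    g = len(a & b)
    denom = len(a) + len(b) - g
    return g / denom if denom else 0.0

def more_like_me(ego, users, top_n=10):
    """Return users similar to ego, merged from two rankings.

    users: {name: {"tags": set of tags, "docs": set of document ids}}
    """
    others = [u for u in users if u != ego]

    def ranked(field):
        # rank all other users by similarity to ego on the given field
        return sorted(
            others,
            key=lambda u: jaccard_sneath(users[ego][field], users[u][field]),
            reverse=True,
        )[:top_n]

    by_tag = ranked("tags")   # top tag-specific users
    by_doc = ranked("docs")   # top document-specific users
    # merge both lists, dropping duplicates but keeping order
    return list(dict.fromkeys(by_tag + by_doc))

# Hypothetical users
users = {
    "ego": {"tags": {"folksonomy", "tagging", "library", "kos"}, "docs": {1, 2, 3}},
    "B1": {"tags": {"folksonomy", "tagging", "library"}, "docs": {9}},
    "B2": {"tags": {"cooking"}, "docs": {1, 2, 3, 4}},
    "B3": {"tags": {"folksonomy", "travel"}, "docs": {1, 8}},
}
candidates = more_like_me("ego", users, top_n=1)
```

With `top_n=1`, B1 is the closest user by tags and B2 the closest by documents, so the merged candidate list contains both.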
  84. 86. More like me! Or: More like This User! single linkage clustering (fictitious example) Sim(tag) = 0.21 Sim(doc) = 0.25 Sim(tag) = 0.65 Sim(doc) = 0.55 Sim(tag) = 0.33 Sim(doc) = 0.29 Sim(tag) = 0.17 Sim(doc) = 0.23 Sim(tag) = 0.08 Sim(doc) = 0.11 Sim(tag) = 0.15 Sim(doc) = 0.17 Sim(tag) = 0.45 Sim(doc) = 0.36
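A small sketch of single linkage clustering on user similarities, in the spirit of the fictitious example above. The user names and the assignment of similarity values to pairs are invented here, since the figure itself is not recoverable from the transcript.

```python
def single_linkage(sim, n_clusters):
    """Agglomerative single-linkage clustering on pairwise similarities.

    sim: {frozenset({u, v}): similarity}; missing pairs count as 0.0.
    Repeatedly merges the two clusters whose most similar members are
    closest, until n_clusters (>= 1) clusters remain.
    """
    users = sorted({u for pair in sim for u in pair})
    clusters = [{u} for u in users]

    def link(c1, c2):
        # single linkage: similarity of the most similar cross-cluster pair
        return max(sim.get(frozenset({u, v}), 0.0) for u in c1 for v in c2)

    while len(clusters) > n_clusters:
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters

# Invented pairwise similarities between ego and four other users
sim = {
    frozenset({"ego", "B1"}): 0.65,
    frozenset({"ego", "B2"}): 0.33,
    frozenset({"B1", "B2"}): 0.45,
    frozenset({"B3", "B4"}): 0.21,
    frozenset({"ego", "B3"}): 0.08,
}
clusters = single_linkage(sim, n_clusters=2)
```

Here ego first merges with B1, then B2 joins through its link to B1, and B3 and B4 end up as a second group.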
  85. 87. The Social Function of a Folksonomy <ul><li>objectives: </li></ul><ul><li>recommendation of other users with similar interests </li></ul><ul><li>hints for forming a virtual community </li></ul><ul><li>and – perhaps – for forming a (real) social group </li></ul>
  86. 88. Final Discussion Folksonomies in Library Catalogues – Lessons Learned
  87. 89. Lessons Learned <ul><li>Folksonomies – Indexing without rules </li></ul><ul><li>tagging: anything goes – against methods </li></ul><ul><li>actor: prosumer </li></ul><ul><li>tri-partite system: document – prosumer – tag </li></ul><ul><li>folksonomy behaves like a network good </li></ul><ul><ul><li>only one standard </li></ul></ul><ul><ul><li>“ success breeds success” </li></ul></ul><ul><li>essential: marketing </li></ul>
  88. 90. Lessons Learned <ul><li>Folksonomies in information services and library catalogues </li></ul><ul><li>folksonomy types </li></ul><ul><ul><li>narrow folksonomy (only one tagger per document – no multiple tagging) </li></ul></ul><ul><ul><li>extended narrow folksonomy (more than one tagger per document – no multiple tagging) </li></ul></ul><ul><ul><li>broad folksonomy (more than one tagger per document – multiple tags) </li></ul></ul><ul><li>“ best” solution for library catalogues </li></ul><ul><ul><li>broad folksonomy or </li></ul></ul><ul><ul><li>extended narrow folksonomy (only usable if search tags can be processed) </li></ul></ul><ul><li>platform </li></ul><ul><ul><li>own platform (example: PennTags) </li></ul></ul><ul><ul><li>third party platform (example: LibraryThing) </li></ul></ul>
  89. 91. Lessons Learned <ul><li>Folksonomies and knowledge representation </li></ul><ul><li>collective intelligence (diversity of opinions, independence of taggers, decentralisation, statistical aggregation of data) </li></ul><ul><li>document-specific tag distributions </li></ul><ul><ul><li>power law </li></ul></ul><ul><ul><li>inverse-logistic distribution </li></ul></ul><ul><ul><li>Power Tags </li></ul></ul><ul><li>tagging behaviour of the users </li></ul><ul><ul><li>1 … 3 tags per document and user </li></ul></ul><ul><ul><li>conformity </li></ul></ul><ul><ul><li>recommendation (very problematic) </li></ul></ul><ul><li>sentiment tags (positive – negative) </li></ul><ul><li>documents which provoke emotions </li></ul>
  90. 92. Lessons Learned <ul><li>Folksonomies and information retrieval </li></ul><ul><li>information linguistics (natural language processing) </li></ul><ul><ul><li>additional to raw tags: “cleaning” of tags </li></ul></ul><ul><ul><li>important tasks: language identification, error detection, word form conflation </li></ul></ul><ul><li>relevance ranking criteria (calculation of retrieval status values) </li></ul><ul><ul><li>tags (TF*IDF, tag evaluation, super posters, time, power tags) </li></ul></ul><ul><ul><li>“ collaboration” (click rates, number of tagging users, number of comments, PageRank) </li></ul></ul><ul><ul><li>prosumer (performative tags, sentiment tags, rating) </li></ul></ul>
  91. 93. Lessons Learned <ul><li>Tag gardening for folksonomy enrichment and maintenance </li></ul><ul><li>“ weeding”: information linguistics (NLP): core tasks </li></ul><ul><li>“ seeding”: baby tags, inverse tag cloud </li></ul><ul><li>“ landscape architecture”: combination of folksonomy and KOS (after indexing) </li></ul><ul><li>“ fertilizing”: using KOS during indexing and retrieval </li></ul><ul><li>emergent semantics: identification of (hidden) paradigmatic relations (e.g., synonymy and hierarchy) </li></ul>
  92. 94. Lessons Learned <ul><li>Find „more like me!“. The social function of a folksonomy </li></ul><ul><li>a new function: “More like me!” </li></ul><ul><li>recommendation of other users with similar interests </li></ul><ul><li>helpful for community building </li></ul>
  93. 95. We would like to thank you very much for attending this Workshop. Greetings from Düsseldorf, Germany!
