
Data Science with Human in the Loop @Faculty of Science #Leiden University

Software systems are becoming ever more intelligent and more useful, but the way we interact with these machines too often reveals that they don’t actually understand people. Knowledge Representation and the Semantic Web focus on the scientific challenges of providing human knowledge in machine-readable form. However, various types of human knowledge cannot yet be captured by machines, especially across wide ranges of real-world tasks and contexts. The key scientific challenge is to capture human knowledge in a way that is scalable and adequate to real-world needs. Human Computation has begun to study scientifically how human intelligence at scale can be used to methodologically improve machine-based knowledge and data management. My research focuses on understanding human computation in order to improve how machine-based systems acquire, capture and harness human knowledge and thus become even more intelligent. In this talk I will show how the CrowdTruth framework (http://crowdtruth.org) facilitates data collection, processing and analytics of human computation knowledge.

Some project links:
- http://controcurator.org/
- http://crowdtruth.org/
- http://diveproject.beeldengeluid.nl/
- http://vu-amsterdam-web-media-group.github.io/linkflows/

Published in: Technology

  1. Cognitive Computing with Human in the Loop. Lora Aroyo, Web & Media Group, VU; IBM Center for Advanced Studies (CAS). Harnessing User Semantics at Scale. http://lora-aroyo.org · http://slideshare.net/laroyo · @laroyo
  2. Who am I … Vrije Universiteit Amsterdam: computer science professor heading web & media group; Amsterdam Data Science. IBM Center for Advanced Studies, Amsterdam: research associate leading cognitive computing & crowdsourcing team. Columbia University, NY: visiting scholar, computer science, NLP, Computer Vision; Columbia Data Science. Tagasauris Inc, NY: Chief of Science.
  3. VU Web & Media Group … Tobias Kuhn, Davide Ceolin, Victor de Boer, Jan Wielemaker, 10 PhD Students, Lora Aroyo
  4. VU Web & Media Group … Tobias Kuhn, Davide Ceolin, Victor de Boer, Jan Wielemaker, 10 PhD Students, Lora Aroyo. Intelligent & Interactive Information Systems: enriching metadata & content of digital collections; content analysis for entity extraction; modeling provenance in digital collections; tracking changes over time; augmenting online multimedia; text & video summarization; interactive product placement, hotspots; assessing quality of web data; bias, controversy, opinions, perspectives; uncertainty, ambiguity; trust, privacy
  5. software systems becoming ever more intelligent … but they don’t actually understand people
  6. not all human knowledge can yet be captured by machines for wide ranges of real-world contexts. Knowledge Representation aims at human knowledge in machine-readable form
  7. all the information machines have is all the information there is
  8. there is always something else …
  9. key scientific challenge: capturing human knowledge at scale and adequate to real-world needs
  10. Human Computation: how human intelligence at scale can be used to improve machine-based knowledge
  11. understanding human computation: improving how machine-based systems acquire, capture & harness human knowledge
  12. … understanding the data: variety of meanings, multitude of perspectives, abundance of sources, endless applications
  13. … understanding the crowds: volunteers, enthusiasts, visitors on-site, visitors online, paid crowds, in-house experts. understand who are the different crowds; what can they do for your collection
  14. http://crowdtruth.org/ framework that facilitates data collection, processing & analytics of human computation knowledge
  15. “best collective decisions are result of disagreement, not consensus or compromise” (James Surowiecki)
  16. disagreement = signal
  17. http://crowdtruth.org/ disagreement is signal for the natural ambiguity of language and diversity & perspectives of human interpretation
  18. http://controcurator.org/
  19. (no text on slide)
  20. (no text on slide)
  21. DIVE+: Interactive Exploration & Discovery in Context; building automatic storylines (narratives); aggregated views over the collection; collecting perspectives from crowds & niches. http://diveproject.beeldengeluid.nl/
  22. VOTE for DIVE: https://summit2017.lodlam.net/2017/04/12/dive-explorative-search-for-digital-humanities/
  23. VU – IBM CAS Team
  24. Victor de Boer, Lora Aroyo, Oana Inel, Chiel van den Akker, Susan Legêne
  25. Carlos Martinez Ortiz, Werner Helmich, Berber Hagedoorn, Sabrina Sauer
  26. Liliana Melgar, Johan Oomen, Jaap Blom
  27. (no text on slide)
  28. Cognitive Computing with Human in the Loop. Lora Aroyo, Web & Media Group, VU; IBM Center for Advanced Studies (CAS). Harnessing User Semantics at Scale
  29. https://www.rijksmuseum.nl/en/rijksstudio Crowds for Co-creation Data
  30. … by user-driven augmentations of existing online collections
  31. (no text on slide)
  32. Nichesourcing with Experts http://annotate.accurator.nl
  33. niches of people with the right expertise to contribute specific information
  34. Train Lay Crowds to be Experts: training the general crowd to be a niche; game in which players can carry out expert annotation tasks with some assistance
  35. http://spotvogel.vroegevogels.vara.nl Volunteer crowds for continuous gaming
  36. Paid Crowds for Video Analysis. CrowdTruth.org
  37. Paid Crowds for Text Analysis. CrowdTruth.org
  38. Paid Crowds for Image Analysis. CrowdTruth.org
  39. Crowdsourcing Challenges. Challenge 1: Typically undertaken in isolation. Challenge 2: Difficult to estimate & control the time to complete. Challenge 3: Difficult to assess & compare quality. Challenge 4: Demands continuous promotional effort. Challenge 5: Active learning (human-in-the-loop) needs different expertise. Challenge 6: Challenging for institutions to incorporate crowdsourcing results into their existing content infrastructure
  40. measure & assess: ensure impact; be aware of the channel, e.g. Wikipedia, Wikimedia, Facebook
  41. measure & assess: monitor progress. 6 months; 2 years; 340,551 tags; 36,981 tags; 137,421 matches; 602 items; 1,782 items; 555 registered players; 2,017 users (taggers); thousands of anonymous players; 12,279 visits (3+ min online); 44,362 pageviews. Source: Riste Gligorov, Michiel Hildebrand, Jacco van Ossenbruggen, Guus Schreiber, Lora Aroyo (2011). On the role of user-generated metadata in audio visual collections. International Conference on Knowledge Capture (K-CAP '11), pages 145-152
  42. measure & assess: evaluate content, compare crowds. user vocabulary: 8% in professional vocabulary; 23% in Dutch lexicon; 89% found on Google; locations (7%); engeland; persons (31%); objects (57%); 88% of the tags useful for specific genres
  43. http://crowdtruth.org/ disagreement signals ambiguity: if people disagree then it will be more difficult for a machine to classify that example
  44. (no text on slide)
  45. (no text on slide)
  46. http://mediasuite.clariah.nl/
  47. (no text on slide)
  48. (no text on slide)
  49. (no text on slide)
  50. (no text on slide)
  51. (no text on slide)
  52. (no text on slide)
  53. 1998: from DVDs to data science
  54. 1998, 2006: 1 million dollar prize for best algorithm
  55. 1998, 2006, 2007: Netflix switches to streaming
  56. 1998, 2006, 2007, 2009: Team BellKor wins Netflix Prize
  57. 1998, 2006, 2007, 2009: Team BellKor wins Netflix Prize
  58. 2011, 2017: From Jeopardy to real-world problems
  59. data is at the centre of every process
  60. data is essential to evolve with users
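
Slides 16, 17 and 43 treat disagreement among annotators as a signal of ambiguity rather than as noise. A minimal Python sketch of that idea follows. It is not the CrowdTruth implementation or its API: the function names, the cosine-style score and the example labels ("treats", "causes", "prevents") are all assumptions made for illustration. Each unit's crowd judgments are summed into a label-count vector, and a label's score is the cosine between that vector and the one-hot vector for the label.

```python
from collections import Counter
from math import sqrt


def unit_vector(judgments):
    """Sum the workers' label choices for one unit into a label -> count vector.

    `judgments` is a list with one entry per worker; each entry is the list
    of labels that worker selected (hypothetical input format).
    """
    counts = Counter()
    for labels in judgments:
        counts.update(set(labels))  # count each label at most once per worker
    return counts


def label_score(vector, label):
    """Cosine between the unit vector and the one-hot vector for `label`."""
    norm = sqrt(sum(c * c for c in vector.values()))
    return vector[label] / norm if norm else 0.0


def clarity(vector):
    """Highest label score for the unit; lower values suggest more ambiguity."""
    return max((label_score(vector, label) for label in vector), default=0.0)


# Hypothetical judgments from four workers on two units.
clear_unit = [["treats"], ["treats"], ["treats"], ["treats"]]
ambiguous_unit = [["treats"], ["causes"], ["treats", "prevents"], ["causes"]]

clear_score = clarity(unit_vector(clear_unit))
ambiguous_score = clarity(unit_vector(ambiguous_unit))
print(clear_score, ambiguous_score)
```

Under these assumptions the unanimous unit scores 1.0 and the contested unit scores lower, so units could be ranked by this value to find the examples a classifier is likely to struggle with.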
