
Humans in a loop: Jupyter notebooks as a front-end for AI

JupyterCon NY 2017-08-24

Paco Nathan reviews use cases where Jupyter provides a front-end to AI as the means for keeping "humans in the loop". This talk introduces *active learning* and the "human-in-the-loop" design pattern for managing how people and machines collaborate in AI workflows, including several case studies.

The talk also explores how O'Reilly Media leverages AI in Media, and in particular some of our use cases for active learning, such as disambiguation in content discovery. We're using Jupyter as a way to manage active learning ML pipelines, where the machines generally run automated until they hit an edge case and refer the judgement back to human experts. In turn, the experts train the ML pipelines purely through examples, not feature engineering, model parameters, etc.

Jupyter notebooks serve as one part configuration file, one part data sample, one part structured log, and one part data visualization tool. O'Reilly has released an open source project on GitHub called `nbtransom`, which builds atop `nbformat` and `pandas` for our active learning use cases.
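To make the "shared document" idea concrete, here is a minimal sketch of a machine collaborator appending its results to a notebook that a person can later open. It uses plain JSON in the nbformat v4 shape for illustration; the actual `nbtransom` package builds atop the `nbformat` library, and the cell fields and scores here are simplified, made-up values.

```python
import json

# A machine "collaborator" appends its model evaluation to a shared
# notebook, treating the notebook as one part structured log.
# Plain dicts in the nbformat v4 shape, for illustration only.

def new_notebook():
    return {"nbformat": 4, "nbformat_minor": 2, "metadata": {}, "cells": []}

def add_cell(nb, cell_type, source):
    nb["cells"].append({"cell_type": cell_type, "metadata": {}, "source": source})

nb = new_notebook()
add_cell(nb, "markdown", "## model evaluation")
add_cell(nb, "code", 'scores = {"precision": 0.97, "recall": 0.91}')

# round-trip through JSON, as a human expert (or another process)
# would when opening the same document
shared = json.loads(json.dumps(nb))
print(len(shared["cells"]))  # 2
```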

This work anticipates upcoming work on collaborative documents in JupyterLab, based on Google Drive. In other words, the machines and the people become collaborators on shared documents.



  1. Humans in a loop: Jupyter notebooks as a front-end for AI
     Paco Nathan @pacoid
     Dir, Learning Group @ O'Reilly Media
     JupyterCon 2017-08-24
  2. Framing
     Imagine having a mostly-automated system where people and machines
     collaborate together… It may sound a bit sci-fi, though it is arguably
     commonplace. One challenge is whether we can advance beyond rote tasks.
     Instead of simply running code libraries, can machines make difficult
     decisions and exercise judgement in complex situations? Can we build
     systems in which people who aren't AI experts can "teach" machines to
     perform complex work based on examples, not code?
  3. Research questions
     ▪ How do we personalize learning experiences across ebooks, videos,
       conferences, computable content, live online courses, case studies,
       expert AMAs, etc.?
     ▪ How do we help experts (by definition, really busy people) share
       knowledge with their peers in industry?
     ▪ How do we manage the role of editors at human scale, while technology
       and delivery media evolve rapidly?
     ▪ How do we help organizations learn and transform continuously?
  5. UX for content discovery:
     ▪ partly generated + curated by humans
     ▪ partly generated + curated by AI apps
  6. Background: helping machines learn
  7. Machine learning
     supervised ML:
     ▪ take a dataset where each element has a label
     ▪ train models on a portion of the data to predict the labels, test on
       the remainder
     ▪ deep learning is a popular example, though only if you have lots of
       labeled training data available
  8. Machine learning
     unsupervised ML:
     ▪ run lots of unlabeled data through an algorithm to detect "structure"
       or an embedding
     ▪ for example, clustering algorithms such as K-means
     ▪ unsupervised approaches for AI are an open research question
  9. Active learning
     a special case of semi-supervised ML:
     ▪ send difficult calls / edge cases to experts; let algorithms handle
       routine decisions
     ▪ works well in use cases which have lots of inexpensive, unlabeled data
     ▪ e.g., an abundance of content to be classified, where the cost of
       labeling is the main expense
  10. Active learning
      "Real-World Active Learning: Applications and Strategies for
      Human-in-the-Loop Machine Learning"
      Ted Cuzzillo, O'Reilly Media, 2015-02-05
      radar.oreilly.com/2015/02/human-in-the-loop-machine-learning.html
      Develop a policy for how human experts select exemplars:
      ▪ bias toward labels most likely to influence the classifier
      ▪ bias toward ensemble disagreement
      ▪ bias toward denser regions of training data
  11. Active learning
      "Data preparation in the age of deep learning"
      Lukas Biewald (CrowdFlower), O'Reilly Data Show, 2017-05-04
      oreilly.com/ideas/data-preparation-in-the-age-of-deep-learning
      Send human workers the cases where machine learning algorithms signal
      uncertainty (low probability scores), or where your ensemble of ML
      algorithms signals disagreement.
  12. Design pattern: human-in-the-loop
      "Building a business that combines human experts and data science"
      Eric Colson (Stitch Fix), O'Reilly Data Show, 2016-01-28
      oreilly.com/ideas/building-a-business-that-combines-human-experts-and-data-science-2
      "what machines can't do are things around cognition, things that have
      to do with ambient information, or appreciation of aesthetics, or even
      the ability to relate to another human"
  13. Design pattern: human-in-the-loop
      "Strategies for integrating people and machine learning in online
      systems"
      Jason Laska (Clara Labs), The AI Conf, 2017-06-29
      safaribooksonline.com/library/view/oreilly-artificial-intelligence/9781491976289/video311857.html
      How to create a two-sided marketplace where machines and people compete
      on a spectrum of relative expertise and capabilities.
  14. Design pattern: human-in-the-loop
      "Building human-assisted AI applications"
      Adam Marcus (B12), O'Reilly Data Show, 2016-08-25
      oreilly.com/ideas/building-human-assisted-ai-applications
      Orchestra: a platform for building human-assisted AI applications,
      e.g., to create business websites.
      https://github.com/b12io/orchestra
      Example: http://www.coloradopicked.com/
  15. Design pattern: flash teams
      "Expert Crowdsourcing with Flash Teams"
      Daniela Retelny, et al., Stanford HCI
      hci.stanford.edu/publications/2014/flashteams/flashteams-uist2014.pdf
      "A flash team is a linked set of modular tasks that draw upon paid
      experts from the crowd, often three to six at a time, on demand"
      http://stanfordhci.github.io/flash-teams/
  16. Weak supervision / data programming
      "Creating large training data sets quickly"
      Alex Ratner (Stanford), O'Reilly Data Show, 2017-06-08
      oreilly.com/ideas/creating-large-training-data-sets-quickly
      Snorkel: "weak supervision" and "data programming" as another instance
      of human-in-the-loop ML.
      github.com/HazyResearch/snorkel
      conferences.oreilly.com/strata/strata-ny/public/schedule/detail/61849
  17. Reinforcement learning
      "Reinforcement learning explained"
      Junling Hu (AI Frontiers), O'Reilly Radar, 2016-12-08
      oreilly.com/ideas/reinforcement-learning-explained
      Learning to act based on long-term payoffs, often via rewards and
      punishments for an actor within a simulation.
  18. Background: helping people learn
  19. AI in Media
      ▪ content which can be represented as text can be parsed by NLP, then
        manipulated by available AI tooling
      ▪ labeled images get really interesting
      ▪ text or images within a context have inherent structure
      ▪ representation of that kind of structure is rare in the Media
        vertical, so far
  20. AI in Media
      (screenshots of pipeline output for a video transcript about Docker:
      a part-of-speech-tagged parse in JSON, ranked key phrases such as
      "virtualization tool", "docker", and "hypervisor", the raw transcript
      with a confidence score, related-term rankings for kubernetes, coreos,
      etcd, etc., and ontology triples describing orm:Docker)
  21. Which parts do people or machines do best?
      Team goal: maintain structural correspondence between the layers.
      Big win for AI: inferences across the graph.
      Human scale: primary structure, control points, testability.
      Machine-generated data products: ~80% of the graph.
  22. Ontology
      ▪ provides context which deep learning lacks
      ▪ aka a "knowledge graph": a computable thesaurus
      ▪ maps the semantics of business relationships
      ▪ S/V/O: "nouns", some "verbs", a few "adjectives"
      ▪ difficult work and a relatively expensive investment, with
        potentially high ROI
      ▪ conversational interfaces (e.g., Google Assistant) improve UX by
        importing ontologies
  23. Ontology
  24. Problem: disambiguating contexts
  25. Disambiguating contexts
      Overlapping contexts pose hard problems in natural language
      understanding. Not much tooling is available, and the problem runs
      counter to the correlation emphasis of big data.
  26. Disambiguating contexts
      Suppose someone publishes a book which uses the term `IOS`: are they
      talking about an operating system for an Apple iPhone, or about an
      operating system for a Cisco router? We handle lots of content about
      both, and disambiguating those contexts is important for good UX in
      personalized learning. In other words, how do machines help people
      distinguish that content within search? This is potentially a good
      case for deep learning, except for the lack of labeled data at scale.
  27. Disambiguation in content discovery
      Consider searching for the term `react` on Google. The first page of
      results spans:
      ▪ acting coaches
      ▪ video games
      ▪ student engagement
      ▪ children's charities
      ▪ UI web components
      ▪ surveys
  28. Active learning through Jupyter
      Jupyter notebooks are used to manage ML pipelines for disambiguation,
      where machines and people collaborate:
      ▪ notebooks as one part configuration file, one part data sample, one
        part structured log, one part data visualization tool
      ▪ ML based on examples, not feature engineering, model parameters, etc.
  29. Active learning through Jupyter
      1. Experts use notebooks to provide examples of book chapters, video
         segments, etc., for each key phrase that has overlapping contexts
      2. Machines build ensemble ML models based on those examples, updating
         notebooks with model evaluation
      3. Machines attempt to annotate labels for millions of pieces of
         content, e.g., `AlphaGo`, `Golang`, versus a mundane use of the
         verb `go`
      4. Disambiguation can run mostly automated, in parallel at scale,
         through integration with Apache Spark
      5. In cases where ensembles disagree, ML pipelines defer to human
         experts who make judgement calls, providing further examples
      6. New examples go into training ML pipelines to build better models
      7. Rinse, lather, repeat
  30. Active learning through Jupyter
      (architecture diagram: browser, SSH tunnel, Jupyter kernel,
      ML pipelines)
  31. Active learning through Jupyter
      ▪ Jupyter notebooks allow human experts to access the internals of a
        mostly automated ML pipeline, rapidly
      ▪ stated another way, both the machines and the people become
        collaborators on shared documents
      ▪ anticipates upcoming collaborative document features in JupyterLab
  32. Active learning through Jupyter
      The open source nbtransom package:
      ▪ https://github.com/ceteri/nbtransom
      ▪ https://pypi.python.org/pypi/nbtransom
      ▪ based on use of nbformat and pandas
      ▪ notebook as both a Python data store and analytics
      ▪ a custom "pretty-print" helps with use of Git for diffs, commits, etc.
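On the last point, a stable pretty-print matters because re-serializing a notebook must not reorder keys or reflow whitespace, or Git diffs fill with noise. `nbtransom` ships its own pretty-printer; this sketch shows the idea with plain `json` and a made-up fragment of notebook metadata.

```python
import json

# Deterministic serialization: sorted keys plus fixed indentation mean
# that two semantically identical notebooks serialize byte-for-byte the
# same, so Git diffs show only real content changes.
def pretty(nb_dict):
    return json.dumps(nb_dict, indent=1, sort_keys=True) + "\n"

a = {"cells": [], "metadata": {}, "nbformat": 4}
b = {"nbformat": 4, "metadata": {}, "cells": []}  # same content, new key order

print(pretty(a) == pretty(b))  # True
```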
  33. Nuances
      ▪ No Free Lunch theorem: it's better to err on the side of fewer false
        positives / more false negatives for this use case
      ▪ bias toward exemplars most likely to influence the classifier
      ▪ potentially, the "experts" may be Customer Service staff who review
        edge cases within search results or recommended content, as an
        integral part of our UI, then re-train the ML pipelines through
        examples
  34. Summary: people and machines at work
  35. Human-in-the-loop as management strategy
      Personal op-ed: the "game" isn't to replace people; instead it's about
      leveraging AI to augment staff, so organizations can retain people with
      valuable domain expertise, making their contributions and expertise
      even more vital.
  36. Why we'll never run out of jobs
  37. Acknowledgements
      Many thanks to people who've helped with this work:
      ▪ Taylor Martin
      ▪ Eszti Schoell
      ▪ Fernando Perez
      ▪ Andrew Odewahn
      ▪ Scott Murray
      ▪ John Allwine
      ▪ Pascal Honscher
  38. Strata Data: NY, Sep 25-28; SG, Dec 4-7; SJ, Mar 5-8, 2018;
      UK, May 21-24, 2018
      The AI Conf: SF, Sep 17-20; NY, Apr 29-May 2, 2018
      OSCON (returns to Portland!): PDX, Jul 16-19, 2018
  39. Get Started with NLP in Python; Just Enough Math; Building Data
      Science Teams; Hylbert-Speys; How Do You Learn?
      Updates, reviews, conference summaries… liber118.com/pxn/