Membase, ShareThis, AOL - Phillips, Mukerji, Jackson - Hadoop World 2010

Better Ad, Offer and Content Targeting with Membase and Hadoop

James Phillips, Membase
Manu Mukerji, ShareThis
Ben Jackson, AOL

Learn more about Hadoop @ http://www.cloudera.com/hadoop/

Published in: Technology

  Better Ad, Offer and Content Targeting with Membase and Hadoop
    • James Phillips, Co-founder, Membase
    • Manu Mukerji, Architect, ShareThis
    • Ben Jackson, Chief Architect, Aol.
  What is Membase?
  Membase Server
    • Data Manager
      • Pure key-value store
      • Built-in Memcached
      • CP behavior (AP option)
    • Cluster Manager
      • Configuration manager
      • Replication supervisor
      • Rebalance orchestration
    • NodeCode Manager (in development)
      • Protocol extenders
      • Real-time aggregation
      • Index management
    • Diagram labels: Memcached protocol, NodeCode Manager (in development), Data Manager, Cluster Manager
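Because the slide says clients reach the server through the Memcached protocol, a short sketch of how a key-value `set` and `get` are framed in the Memcached text protocol may help. This is only an illustration of the wire format, not code from the talk. The helper names (`encode_set`, `encode_get`, `parse_get_response`) and the `cookie:abc` key are assumptions made for this example, and no network connection is opened.

```python
def encode_set(key, value, flags=0, exptime=0):
    """Build a text-protocol 'set' command for a bytes value."""
    header = "set %s %d %d %d\r\n" % (key, flags, exptime, len(value))
    return header.encode("ascii") + value + b"\r\n"


def encode_get(key):
    """Build a text-protocol 'get' command for one key."""
    return ("get %s\r\n" % key).encode("ascii")


def parse_get_response(data):
    """Parse a single-key 'get' reply; return the value, or None on a miss."""
    if data == b"END\r\n":
        return None
    header, rest = data.split(b"\r\n", 1)
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply: %r" % header)
    length = int(parts[3])  # byte count announced in the VALUE line
    return rest[:length]


# Hypothetical key and payload, for illustration only.
request = encode_set("cookie:abc", b"hello")
reply = b"VALUE cookie:abc 0 5\r\nhello\r\nEND\r\n"
value = parse_get_response(reply)
```

A real client would send these bytes over a TCP socket and would usually rely on an existing Memcached client library rather than hand-written framing.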
  Membase Clustering
    • Data and I/O spread across nodes
    • Peer-to-peer replication
    • Hot cluster maintenance with zero downtime
  Targeting at ShareThis
  About ShareThis
    • Largest integrated sharing network
    • We make sharing simple, engaging & valuable
    • Powerful social analytics & audience monetization
    • 450 million consumers per month
    • ~850 thousand sites
    • 50+ social channels
  This is how it works
    • Diagram labels: Sharing Behavior, Log Files, Search Keywords, Page Views, HDFS, Map/Reduce, Content Analysis, Taxonomy, Ad Server, User, Membase
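One plausible reading of the diagram is a batch step that reduces raw events into a per-user interest profile, publishes it to a key-value store, and lets the ad server read it with a single lookup at request time. The sketch below illustrates that reading only and is not ShareThis's implementation. The event fields, the `TAXONOMY` mapping, the `profile:` key prefix, and the in-memory `dict` standing in for Membase are all assumptions.

```python
import json
from collections import Counter, defaultdict

# Hypothetical keyword -> category taxonomy (assumption for illustration).
TAXONOMY = {"hadoop": "technology", "recipe": "food", "playoffs": "sports"}


def map_event(event):
    """Map step: emit (user, category) for each recognized keyword."""
    for word in event["keywords"]:
        category = TAXONOMY.get(word.lower())
        if category is not None:
            yield event["user"], category


def reduce_profiles(events):
    """Reduce step: count categories per user."""
    profiles = defaultdict(Counter)
    for event in events:
        for user, category in map_event(event):
            profiles[user][category] += 1
    return profiles


def publish(profiles, store):
    """Write one JSON document per user into the key-value store."""
    for user, counts in profiles.items():
        store["profile:" + user] = json.dumps(counts, sort_keys=True)


def top_category(store, user):
    """Ad-server side: one key lookup per request."""
    raw = store.get("profile:" + user)
    if raw is None:
        return None
    counts = json.loads(raw)
    return max(counts, key=counts.get)


events = [
    {"user": "u1", "keywords": ["Hadoop", "recipe"]},
    {"user": "u1", "keywords": ["hadoop"]},
    {"user": "u2", "keywords": ["playoffs"]},
]
store = {}  # stand-in for Membase
publish(reduce_profiles(events), store)
```

Keeping the heavy aggregation in the batch layer and the serving path down to one key read is the design point the diagram appears to make.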
  ShareThis Ad Products
  Targeting at Aol.
  About AOL
    • Internet content and advertising
    • Over 80 O&O (owned-and-operated) internet brands
    • Advertising.com third-party ad network
    • Reach 90% of US
    • Sells ad-serving technology (ADTECH)
  How AOL uses Hadoop
    • Diagram labels: Cookie Classification, Events, Reporting, Cookie Profiling, Model Building, Scoring, Database-Supported LAMP Reporting, Ad Servers, Hadoop Cluster
  Real Time Cookie Scoring?
    • Diagram labels: Cookie Classification, Campaign Insights, Ad Servers, Events, Node Code, Membase/MPI Cluster, Flume, Model Creation, To HDFS, Reporting System, No DB, Hadoop Cluster
    • Cookie Scoring and Report Generation each appear twice in the diagram
  The Components
    • HDFS/MapReduce
      • Stores months of historical data
      • Long-running model-building jobs
      • Supports research and development
    • Flume
      • A reliable, open source solution for streaming event data from multiple sources to multiple sinks
    • Membase/MPI cluster
      • Stores recent data (a few weeks)
      • Allows access to how a cookie scores for each model
      • MPI performs very fast in-memory processing on the data, given real-time updates to cookie classification
  Node Code Next Year
    • Data is to be stored keyed by AOL user cookie
      • All events stored keyed by user cookie (impressions, clicks, …)
      • Enriching data (known demographic and behavioral data)
    • Ideal but not yet available
      • Sets stored for each key
      • Atomic insert/delete
      • Membase Node Code is coming
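The slide lists sets per key with atomic insert/delete as not yet available on the server. One way a client might approximate them in the meantime is a read-modify-write loop guarded by Memcached-style check-and-set (`gets`/`cas`). The talk does not say AOL did this, so the sketch below is only an assumption-laden illustration. `CasStore` is a hypothetical in-memory stand-in, not a Membase API, and it simplifies real `cas` behavior by letting a write with version 0 create a missing key.

```python
import json


class CasStore:
    """In-memory stand-in for a store with gets/cas semantics (hypothetical)."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, version); (None, 0) if the key is missing."""
        return self._data.get(key, (None, 0))

    def cas(self, key, value, version):
        """Store value only if the version is unchanged since gets()."""
        _, current = self._data.get(key, (None, 0))
        if current != version:
            return False
        self._data[key] = (value, current + 1)
        return True


def update_set(store, key, add=(), remove=(), max_retries=10):
    """Read-modify-write a JSON-encoded set, retrying on CAS conflicts."""
    for _ in range(max_retries):
        raw, version = store.gets(key)
        members = set(json.loads(raw)) if raw is not None else set()
        members |= set(add)
        members -= set(remove)
        if store.cas(key, json.dumps(sorted(members)), version):
            return members
    raise RuntimeError("too many concurrent updates for " + key)


# Hypothetical cookie key and event ids, for illustration only.
store = CasStore()
update_set(store, "cookie:abc", add=["click:42", "imp:7"])
result = update_set(store, "cookie:abc", remove=["imp:7"])
```

The retry loop costs an extra round trip per update, which is one reason server-side set operations would be preferable.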
  MPI
    • Parallel processing environment
    • Used heavily in high-performance scientific computing
    • Parallel reductions outside of Hadoop MapReduce
    • MPI and Membase on the same cluster
    • Local access to Membase data for fast analysis
    • Cookie scoring and reporting via MPI
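The pattern described here, local access to data followed by a parallel reduction, can be sketched without an MPI installation by simulating the ranks in one process. In a real MPI program each shard would live on its own node and the final merge would be a collective reduction such as `MPI_Reduce`. The shard contents and the function names below are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def local_aggregate(local_records):
    """Runs on one rank: count cookies per model using only node-local data."""
    counts = Counter()
    for _cookie, model in local_records:
        counts[model] += 1
    return counts


def parallel_reduce(partials):
    """Stand-in for the collective reduction: merge small per-rank results."""
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical per-node shards of (cookie, model) pairs.
shards = [
    [("c1", "autos"), ("c2", "travel")],
    [("c3", "autos")],
    [("c4", "travel"), ("c5", "autos")],
]
partials = [local_aggregate(shard) for shard in shards]  # one result per rank
totals = parallel_reduce(partials)
```

Only the small per-rank counters cross the network in this scheme, and the raw records never leave the node that stores them.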
  MPI processing using traditional Memcached API
    • Memcached has no facility to get the list of available keys (this is outside of the traditional Memcached model)
    • Need to store the list of cookie keys in special index locations in Memcached
    • All MPI nodes talk to all Memcached nodes
    • Large communication overhead
    • Diagram labels: Memcache Cluster, MPI Cluster
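The workaround of keeping the key list in special index locations might look roughly like the sketch below. It assumes a fixed number of index entries named `index:0` to `index:3` and uses a plain `dict` as a stand-in for the Memcached cluster. The naming scheme and the shard count are illustrative assumptions, not details from the talk.

```python
import json
import zlib

INDEX_SHARDS = 4  # assumed number of index entries


def index_key(key):
    """Pick the index entry that records this key (deterministic hash)."""
    return "index:%d" % (zlib.crc32(key.encode("utf-8")) % INDEX_SHARDS)


def put(store, key, value):
    """Store a value and record its key in the matching index entry."""
    store[key] = value
    idx = index_key(key)
    keys = set(json.loads(store.get(idx, "[]")))
    keys.add(key)
    store[idx] = json.dumps(sorted(keys))


def all_keys(store):
    """Enumerate keys by reading every index entry (no native key listing)."""
    found = []
    for shard in range(INDEX_SHARDS):
        found.extend(json.loads(store.get("index:%d" % shard, "[]")))
    return sorted(found)


store = {}  # stand-in for the Memcached cluster
for cookie in ["cookie:a", "cookie:b", "cookie:c"]:
    put(store, cookie, "profile-data")
keys = all_keys(store)
```

The index update here is a non-atomic read-modify-write, so concurrent writers could lose entries. Every writer also pays an extra read and write per key, which is part of the overhead the slide points to.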
  The fast way w/Membase streaming
    • Not yet fully baked
      • Currently only available via Python
      • MPI is traditionally C/C++
    • Each MPI instance talks to the local Membase daemon
      • No external communication needed
      • Process all data in that local instance
    • Future exploration needed
  Membase POC: Part 1
    • Test the speed of the Membase implementation of Memcached
      • How quickly can ad servers access data associated with a user cookie?
      • Goal: 5 millisecond latency for all requests
      • Can Membase handle concurrent reads and writes?
    • Testbed: 3 Memcached servers with replication factor 1
    • Results
      • Pushed ~14 GB of data onto the servers
      • Tried reading data while pushing up to 20k keys/sec
      • Found <1 to 3 millisecond response times for queries, easily meeting our needs
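A latency check of this kind can be sketched as a small harness that times each lookup and compares the slowest one with the 5 millisecond goal. This is not the test code used in the POC. The `get` callable, the key names, and the in-memory `dict` used in the usage lines are assumptions, and a real run would pass a client's lookup method while a separate writer pushes keys.

```python
import time

GOAL_MS = 5.0  # latency goal stated on the slide


def measure_latencies(get, keys):
    """Time each lookup and return the latencies in milliseconds."""
    latencies = []
    for key in keys:
        start = time.perf_counter()
        get(key)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, 99th percentile, max, and whether all meet the goal."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "median_ms": ordered[len(ordered) // 2],
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
        "meets_goal": ordered[-1] <= GOAL_MS,
    }


# Usage against an in-memory stand-in (hypothetical keys).
store = {"cookie:%d" % i: "profile" for i in range(1000)}
samples = measure_latencies(store.get, list(store))
report = summarize(samples)
```

Because the stated goal covers all requests, the sketch judges the maximum latency rather than the average.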
  Membase POC: Part 2 (ongoing)
    • Goals
      • Integrating MPI/Membase
      • Very fast processing of recent server data
      • Is this faster than similar processing on Hadoop?
    • Test bed
      • 10 servers with 128 GB of RAM each
    • Flow
      • Push cookie profiles into Membase from HDFS
      • Waiting on Node Code for a fully realized streaming implementation
      • Run simple aggregation job in MPI
      • Use MPI for parallel reductions
      • Stream data using TAP
      • Do most of the computation locally, doing the bare minimum in parallel reductions
  Have Questions?
    • James Phillips, james@membase.com
    • Manu Mukerji, manu@sharethis.com
    • Ben Jackson, ben.jackson@teamaol.com
