O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
Fans just wanna have fun!
@petricek
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese an...
Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the sprea...
Every year:
hundreds of millions tickets sold
worth Billions of Dollars
!
Tickets to the Fans!
(not scalpers)!
!
Recommendations!
!
Who are my fans? !
Quality evaluation
Data collection
Recommendation Strategies
Javascript request URL:http://pixelserver:8080/pixel?
evt=page_load
imp=e4beb325-310e-493a-88cd-3da709ef509a
impp=2d60a5ee...
http://pixelserver/api/pixel?…
Number of requests over time
something big went on sale
~10 x
Quality evaluation
Data collection
Recommendation Strategies
Ticket Alert
~100m
users
tens of thousands
artists
~100m
users
thousands
of artists
Availability
Recommendability
Online
Recommendations
Top Sellers
Offline aggregate over many days
Hot right now
30min
sliding window
count rank “hot right now”filter
After viewing
fans eventually bought
Personalized
recommendations
100m
users
tens of thousands
artists
Recommender!
Service
Service endpoints
Public endpoints
Ticket!
Availability !
Service
Event Feed
Recommender!
Service
Pixel!
Service
100ms SLA
async
Storm
$ docker-compose up!
:-)
Quality
Expert feedback
Availability
A/B testing
Stealth testing
hdfs bolttee bolt
Recommender!
service
replay!
bolt
hdfs bolt
recB
recA
(live)
clickA
impA
CTR
recA
0
0.25
0.5
0.75
1
0
0.25
0.5
0.75
1
recB
Quality evaluation
Data collection
Recommendation Strategies
Contextual bandits
Exploration v Exploitation
More context
(user, artist, venue, event metadata)
Next steps
!
Tickets to the Fans!
(not scalpers)!
!
Recommendations!
!
Who are my fans? !
$$$
$$$
$
$
Counting unique users
O(N*log(N))
O(N)
HyperLogLog
O(N)
HLL
$$$HLL
+
+
+
reduce
HLL
HLL
HLL
HLL
HLL
HLL
HLL
HLL
HLL
HLL
HLL
HLL
map
HLL
Unique female users
——————————
Unique users total
Unique users with family
——————————
Unique users total
Unique high incom...
SF Philharmonic
Phish
Meshuggagh Backstreet Boys
Chvrchs 15% Asian K. Michelle 57% African American
Ramon Ayala 92% Hispanic Tom Petty 96% Caucasian
Millennials Grits and Biscuits
Gen X Depeche Mode
Baby Boomers Chicago Musical
Tens of thousands
models
Next
!
Tickets to the Fans!
(not scalpers)!
!
Recommendations!
!
Who are my fans? !
0
500
1000
Price
|+
vowpal wabbit
JNI
Storm
vw
vw
vw
vw
?
In-cart conversion
4 x
Arms Race
Next steps
!
Tickets to the Fans!
(not scalpers)!
!
Recommendations!
!
Who are my fans? !
Decades of
granular and
longitudinal data
Now what?
Recommend
the best Seat
for you
linkedin.com/in/petricek
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/storm-
hadoop-spark
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
How 30 Years of Ticket Transaction Data Helps you Discover New Shows!
Próximos SlideShares
Carregando em…5
×

How 30 Years of Ticket Transaction Data Helps you Discover New Shows!

378 visualizações

Publicada em

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1Q3kAZc.

Vaclav Petricek discusses how to train models as well as architect and build a scalable system powered by Storm, Hadoop, Spark, Spring Boot and Vowpal Wabbit that meets SLAs measured in tens of milliseconds. Filmed at qconnewyork.com.

Vaclav Petricek is a Director of Data Science at Ticketmaster and Live Nation Entertainment in Hollywood, prior to that he lead Machine Learning at Santa Monica-based eHarmony. He has been responsible for building real-time recommendation systems, optimization and large machine learning as well as people analytics.

Publicada em: Tecnologia
  • Seja o primeiro a comentar

How 30 Years of Ticket Transaction Data Helps you Discover New Shows!

  1. 1. Fans just wanna have fun! @petricek
  2. 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /storm-hadoop-spark
  3. 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  4. 4. Every year: hundreds of millions tickets sold worth Billions of Dollars
  5. 5. ! Tickets to the Fans! (not scalpers)! ! Recommendations! ! Who are my fans? !
  6. 6. Quality evaluation Data collection Recommendation Strategies
  7. 7. Javascript request URL:http://pixelserver:8080/pixel? evt=page_load imp=e4beb325-310e-493a-88cd-3da709ef509a impp=2d60a5ee-ac02-46df-8d0e-fbbe976361ef bid=bj6LvMJg-NnIOXp7tScUAHp-vcLsATCZoDvp08idDaIZo-4Iok1ppkDXZvA3RUWwoSV9IiRB pid=lILEaTVwqW8iwByWQzh0Iuk3w4kxXAhBzoFx2-PBdBriuqOKeRousDZgIgrVZQ mid=lILEaTVwqW8iwByWQzh0Iuk3w4kxXAhBzoFx2-PBdBriuqOKeRousDZgIgrVZQ sid=lS9oSQ72XqeVxPX0axzEjwFMJQ5RfAbYOdA5G_IssvgmOl4fiQHQ5gO0Lg_FDQNZ5hirrvm pgt=Artist%3A+On+Sale%3A+List dmn=TM_US aid=1687536 mkt=35 tsu=1433631937802 ref=http%3A%2F%2Fwww.ticketmaster.com%2F pgn=1 tml=tm_homeA_b_10001_1 fbt=not_authorized Of Monsters and Men 1px 1px“pixel”
  8. 8. http://pixelserver/api/pixel?…
  9. 9. Number of requests over time something big went on sale ~10 x
  10. 10. Quality evaluation Data collection Recommendation Strategies
  11. 11. Ticket Alert
  12. 12. ~100m users tens of thousands artists
  13. 13. ~100m users thousands of artists Availability Recommendability
  14. 14. Online Recommendations
  15. 15. Top Sellers Offline aggregate over many days
  16. 16. Hot right now 30min sliding window count rank “hot right now”filter
  17. 17. After viewing fans eventually bought
  18. 18. Personalized recommendations
  19. 19. 100m users tens of thousands artists Recommender! Service
  20. 20. Service endpoints Public endpoints
  21. 21. Ticket! Availability ! Service Event Feed Recommender! Service Pixel! Service 100ms SLA async Storm
  22. 22. $ docker-compose up! :-)
  23. 23. Quality
  24. 24. Expert feedback
  25. 25. Availability
  26. 26. A/B testing
  27. 27. Stealth testing
  28. 28. hdfs bolttee bolt Recommender! service replay! bolt hdfs bolt
  29. 29. recB recA (live) clickA impA CTR recA 0 0.25 0.5 0.75 1 0 0.25 0.5 0.75 1 recB
  30. 30. Quality evaluation Data collection Recommendation Strategies
  31. 31. Contextual bandits Exploration v Exploitation More context (user, artist, venue, event metadata) Next steps
  32. 32. ! Tickets to the Fans! (not scalpers)! ! Recommendations! ! Who are my fans? !
  33. 33. $$$ $$$ $ $
  34. 34. Counting unique users O(N*log(N)) O(N) HyperLogLog O(N)
  35. 35. HLL $$$HLL + + + reduce HLL HLL HLL HLL HLL HLL HLL HLL HLL HLL HLL HLL map HLL
  36. 36. Unique female users —————————— Unique users total Unique users with family —————————— Unique users total Unique high income users ——————————— Unique users total $$$HLL = HLL HLL HLL HLL = = HLL
  37. 37. SF Philharmonic Phish Meshuggagh Backstreet Boys
  38. 38. Chvrchs 15% Asian K. Michelle 57% African American Ramon Ayala 92% Hispanic Tom Petty 96% Caucasian
  39. 39. Millennials Grits and Biscuits Gen X Depeche Mode Baby Boomers Chicago Musical
  40. 40. Tens of thousands models Next
  41. 41. ! Tickets to the Fans! (not scalpers)! ! Recommendations! ! Who are my fans? !
  42. 42. 0 500 1000 Price
  43. 43. |+
  44. 44. vowpal wabbit JNI Storm
  45. 45. vw vw vw vw ?
  46. 46. In-cart conversion 4 x
  47. 47. Arms Race Next steps
  48. 48. ! Tickets to the Fans! (not scalpers)! ! Recommendations! ! Who are my fans? !
  49. 49. Decades of granular and longitudinal data Now what?
  50. 50. Recommend the best Seat for you
  51. 51. linkedin.com/in/petricek
  52. 52. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/storm- hadoop-spark

×