AUA Data Science Meetup

Senior Software Engineer at Netflix
Jun 5, 2015

  1. DAVID GEVORKYAN @davidgev davidgevorkyan
  2. GRADUATED AUA IN 2008
  3. WHAT IS BIG DATA?
  4. FASHIONABLE TERM?
  5. 80% OF DATA EXISTING IN ANY ENTERPRISE IS UNSTRUCTURED DATA. STRUCTURED DATA, SEMI-STRUCTURED, UNSTRUCTURED DATA, RDBMS, Data Warehousing
  6. 90% OF THE DATA IN THE WORLD TODAY HAS BEEN CREATED IN THE LAST TWO YEARS ALONE. Source: http://www.intel.com/content/www/us/en/communications/internet-minute-infographic.html
  7. 4 V’S OF BIG DATA: VOLUME (large amount of data), VARIETY (sensors, video, audio, email, social), VELOCITY (speed of data generation), VERACITY (authenticity and/or accuracy)
  8. SOLUTIONS REQUIRED: forces you to change the way you • COLLECT • TRANSPORT • STORE • MANAGE • ANALYZE • VISUALIZE
  9. WHAT IS DATA SCIENCE?
  10. DATA SCIENCE != STATISTICAL ANALYSIS. IT IS SCIENCE AND “ART” OF … • EXPLORING THE UNKNOWN ABOUT DATA “make discoveries while swimming in the data” • REFINING THE RESULTS FOR ACCURACY • DERIVING ACTIONABLE INSIGHT • CREATING DATA-DRIVEN PRODUCTS
  11. WHO ARE DATA SCIENTISTS?
  12. WHO ARE DATA SCIENTISTS? Drew Conway, 2010
  13. BIG DATA SCIENCE TOOLS?
  14. • Scala, Java, Python, R… (bonus: Clojure, Haskell, Erlang) • Hadoop, HDFS, MapReduce… (bonus: Spark, Storm, Tez) • Scalding, HBase, Pig, Hive… (bonus: Shark, Titan, Giraph) • Flume, Sqoop, ETL, Webscrapers… (bonus: Hume) • SQL, RDBMS, DW, OLAP… (bonus: SOLR, ElasticSearch) • Knime, Weka, RapidMiner… (bonus: SciPy, NumPy, Pandas) • D3.js, Kibana, ggplot2, Tableau… (bonus: Shiny, Flare, Datameer) • SPSS, Matlab, SAS… (the enterprise man) • NoSQL, MongoDB, Cassandra, CouchDB • And Yes! … MS-Excel: the most used, most underrated DS tool
  15. GOAL?
  16. • Revenue, revenue, revenue • Improve the customer experience • Increase operational efficiency • GE: Optimize maintenance intervals for industrial products • Google: Refine search and ad-serving algorithms • Zynga: Optimize the game experience for both long-term engagement and revenue • Netflix: Movie recommendations • Kaplan: Uncover effective learning strategies • eHarmony: Create happy relationships
  17. WHO ARE WE?
  18. TRADITIONAL METHODS DO NOT WORK ANYMORE…
  19. EHARMONY CREATES THE HAPPIEST, MOST PASSIONATE AND MOST FULFILLING RELATIONSHIPS* *ACCORDING TO A RECENT STUDY
  20. 438 MARRIAGES PER DAY
  21. THE DIFFERENCE?
  22. THE DIFFERENCE? Compatibility Matching System®: COMPATIBILITY MATCHING, AFFINITY MATCHING, MATCH DISTRIBUTION
  23. THE DIFFERENCE? Compatibility Matching System®: COMPATIBILITY MATCHING, AFFINITY MATCHING, MATCH DISTRIBUTION
  24. UNIDIRECTIONAL USER DEFINED CRITERIA: Nicolette
  25. UNIDIRECTIONAL USER DEFINED CRITERIA, BIDIRECTIONAL: Leo, Ian, Steve, Nicolette
  26. UNIDIRECTIONAL USER DEFINED CRITERIA: Leo, Ian, Steve, Nicolette, BIDIRECTIONAL
  27. 150 questions: Personality, Values, Attributes, Beliefs
  28. Intellect, Energy, Sociability, Ambition, Kindness, Curiosity, Humor, Spirituality
  29. COMPATIBILITY MATCHING: USER DEFINED CRITERIA, COMPATIBILITY MODELS, MONGODB, VOLDEMORT
  30. MONGODB DATA STORE NEEDS: POWERFUL INDEXING MODELS, FAST MULTI-ATTRIBUTE SEARCHES, EASY TO MAINTAIN, 60M+ QUERIES per day
  31. MONGODB WINS: AUTO SCALING, BUILT-IN SHARDING, AUTO BALANCING, MMS
  32. VOLDEMORT? THAT NAME SOUNDS FAMILIAR
  33. VOLDEMORT DATA STORE NEEDS: CRUD OPERATIONS, VARIED TRANSACTION SIZES, BILLION+ POTENTIAL MATCHES per day
  34. VOLDEMORT WINS: AUTO REPLICATION, AUTO PARTITIONING, PLUGGABLE SERIALIZATION
  35. AFFINITY MATCHING. Compatibility Matching System®: COMPATIBILITY MATCHING, AFFINITY MATCHING, MATCH DISTRIBUTION
  36. 65 30 3000 miles
  37. [Chart: Comm probability (PROB) vs. Distance in Miles; x-axis ticks 0, 1, 3, 7, 15, 63, 255, 1023, 4095]
  38. [Chart: Comm probability (PROB) vs. Height difference in cm; x-axis from -29 to 56; annotation: 4-8 in]
  39. WORDS TO USE
  40. WORDS TO USE
  41. SOME INSIGHT
  42. DATA NEEDS FOR AFFINITY: 50M+ REGISTERED USERS, 103 ATTRIBUTES, 107 DAILY MATCHES, 250M+ PHOTOS, 4B+ QUESTIONNAIRES ANSWERED
  43. COMMUNICATION AGGREGATES: EVENT LISTENER SERVICE, USER ACTIVITY SERVICE, ~5MS RESPONSE TIMES, 10K EVENTS PER SECOND, USER SERVICE, HOURLY, DAILY, TOTAL
  44. OFFLINE BATCH JOBS: USER SERVICE, MAP-SIDE JOINS (TB), SCORING, 1+GB Compressed Protocol Buffers, PAIRINGS SERVICE, 750M Compressed Protocol Buffers, BILLION+ POTENTIAL MATCHES
  45. AMAZON EMR, AWS DIRECT CONNECT, 256 NODES, 50TB STORAGE, IN-HOUSE SEAMICRO, DATA RETRIEVAL LATENCY, LOW OPERATIONAL COST, LOW POWER CONSUMPTION, PREDICTABLE COMPLETION TIMES
  46. MODEL RETRAINING: distcp Protocol Buffers from Offline Jobs
  47. MATCH DISTRIBUTION. Compatibility Matching System®: COMPATIBILITY MATCHING, AFFINITY MATCHING, MATCH DISTRIBUTION
  48. Delivering the right matches at the right time to as many people as possible across the entire network
  49. THANK YOU. QUESTIONS?
  50. CREDITS: Visual Elements From The Noun Project, http://thenounproject.com
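
The x-axis ticks on the slide 37 chart (0, 1, 3, 7, 15, 63, 255, 1023, 4095) are all of the form 2^k - 1, which suggests, though the slide does not confirm it, that distances were grouped into power-of-two buckets. The sketch below shows one way such a per-bucket communication probability could be computed. The bucketing rule, the function names, and the observations are assumptions made for illustration; none of the numbers are eHarmony data.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def distance_bucket(miles: float) -> int:
    """Smallest value of the form 2**k - 1 that is >= the distance (assumed rule)."""
    whole = math.ceil(miles)
    if whole < 0:
        raise ValueError("distance must be non-negative")
    return (1 << whole.bit_length()) - 1


def comm_probability(observations: Iterable[Tuple[float, bool]]) -> Dict[int, float]:
    """Fraction of matches in each distance bucket that led to communication."""
    seen = defaultdict(int)
    communicated = defaultdict(int)
    for miles, did_communicate in observations:
        bucket = distance_bucket(miles)
        seen[bucket] += 1
        communicated[bucket] += int(did_communicate)
    return {bucket: communicated[bucket] / seen[bucket] for bucket in sorted(seen)}


# Invented observations: (distance in miles, whether the pair communicated).
toy = [(0, True), (2, True), (3, False), (100, False), (120, True), (120, False), (3000, False)]
for bucket, prob in comm_probability(toy).items():
    print(f"<= {bucket} miles: {prob:.2f}")
```

Buckets that double in width keep short distances at fine resolution while still covering cross-country distances in a handful of bins.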
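
Slide 43 lists an event listener feeding hourly, daily, and total communication aggregates. A small in-memory sketch of that roll-up is shown below. The class name, the key layout, and the use of UTC bucketing are assumptions; the slides do not describe the actual service design or storage.

```python
from collections import Counter
from datetime import datetime, timezone


class CommunicationAggregates:
    """Keeps per-user event counts at hourly, daily, and all-time granularity."""

    def __init__(self) -> None:
        self.hourly = Counter()  # key: (user_id, "YYYY-MM-DDTHH")
        self.daily = Counter()   # key: (user_id, "YYYY-MM-DD")
        self.total = Counter()   # key: user_id

    def on_event(self, user_id: str, ts: datetime) -> None:
        """Record one communication event; ts must be timezone-aware."""
        ts = ts.astimezone(timezone.utc)  # bucket in UTC (an assumption)
        self.hourly[(user_id, ts.strftime("%Y-%m-%dT%H"))] += 1
        self.daily[(user_id, ts.strftime("%Y-%m-%d"))] += 1
        self.total[user_id] += 1


# Hypothetical events for one user.
agg = CommunicationAggregates()
for hour, minute, day in [(10, 5, 5), (10, 40, 5), (11, 15, 5), (9, 0, 6)]:
    agg.on_event("u1", datetime(2015, 6, day, hour, minute, tzinfo=timezone.utc))

print(agg.hourly[("u1", "2015-06-05T10")])  # 2
print(agg.daily[("u1", "2015-06-05")])      # 3
print(agg.total["u1"])                      # 4
```

Updating all three counters on every event keeps reads cheap, which fits the low response times the slide mentions, at the cost of three writes per event.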
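
Slide 44 names map-side joins as part of the offline batch jobs. In a map-side join, the smaller dataset is loaded into memory on every mapper and the larger dataset is streamed past it, so no shuffle or reduce phase is needed. The sketch below shows the idea in plain Python rather than Hadoop. Which dataset is held in memory, the record shapes, and the function name are all assumptions for illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

UserRecord = Dict[str, int]


def map_side_join(
    pairs: Iterable[Tuple[str, str]],
    users: Dict[str, UserRecord],
) -> Iterator[Tuple[str, str, UserRecord, UserRecord]]:
    """Stream candidate pairs and attach both user records from an in-memory table.

    `users` plays the role of the small side that each mapper would hold in memory;
    `pairs` is the large side, read one record at a time.
    """
    for left_id, right_id in pairs:
        left = users.get(left_id)
        right = users.get(right_id)
        if left is None or right is None:
            continue  # inner-join semantics: skip pairs with a missing user
        yield left_id, right_id, left, right


# Hypothetical inputs.
users = {"u1": {"age": 30}, "u2": {"age": 33}, "u3": {"age": 29}}
pairs = [("u1", "u2"), ("u1", "u9"), ("u2", "u3")]

for row in map_side_join(pairs, users):
    print(row)
```

The joined rows could then be passed directly to a scoring step in the same map task, which is one reason this pattern suits large pairwise workloads.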