Snowplow Meetup
London, February 2017

Snowplow at the heart of Busuu's data & analytics infrastructure


Presented at Snowplow London Meetup, 8 February 2017

Bruce Pannaman, data scientist at Busuu, talked about why they are using Snowplow to validate and enrich data, enable one source of truth across different data sources, cope with peaks and troughs in the data stream, and easily integrate with third party systems such as Intercom, a customer messaging platform. One of Busuu’s future projects is to load multiple A/B tests into the apps and monitor their results in real time.

Published in: Data & Analytics


  1. Snowplow Meetup, London, February 2017
  2. busuu is the world’s leading social network for language learning
      Language courses + Social network
      • Access to native speakers
      • Peer to peer text corrections
      • High quality courses in 12 languages
      • Beginner to advanced intermediate level
  3. How does busuu work?
      Most important vocabulary, key grammar, practice with native speakers, faster fluency.
      busuu is a complete self-study and language practice environment.
  4. What sort of data do we use? (slide footer: busuu 2016)
      ● Front end tracking data
      ● Progress data
      ● Backend db data
      ● Third party data
  5. Why did we look at using Snowplow?
  6. Problems
      ● “My data says X, why does yours say Y?”
      ● CloudWatch alert!
      ● “Why can’t I find the results of my A/B test till tomorrow again?”
      ● “Oh my god, do we really have to put yet another tracker in?”
  7. Scalability
  8. Batch vs Real time
  9. Reconciliation
  10. Then we thought... Can we use the Snowplow framework for more than just analytics?
  11. Too many SDKs and trackers
  12. Snowplow delivery: How do we get Snowplow to deliver the events to everybody/thing that needs it, instead of adding more trackers to the frontend?
  13. Tech Stack
  14. Data Collection Phase (diagram labels: Events, Events, Backend Data, API Calls, Yet to be done, Scala Stream Collector)
  15. Processing (diagram: Raw Data, Stream Enrich, Enriched Data)
  16. Processing
      Validation
      ● Customised busuu event schemas
      ● Different based on environment
      Enrichments
      ● IP lookup
      ● Forex conversion
  17. Distribution (diagram labels: Results back to App/Site, Machine Learning Models, Yet to be done)
  18. Plug & Play Integrations
      ● One source of truth
      ● Scalability
      ● Third party systems can be added very quickly
  19. Lambda?
      ● Parse through each field of enriched data looking for custom schema name
      ● One Lambda function per type of data and per integration
      ● Relay required data to third party service through REST API or given Python client
  20. Problem areas
  21. Main implementation bugbears
      1. Strict Multi Platform Schemas
      2. Offline mode delay
      3. Device vs Collector Timestamps
  22. Future Improvements
  23. Future projects
      ● Live A/B test trains & results
      ● Live machine learning results in app
      ● Automated alerting on complex company metrics
  24. Thanks! Bruce Pannaman
  25. Frontend event data
      ● Track and find issues in user behavior
      ● Insight into product usage
      ● A/B Testing
      ● CRM cohorting
      ● In-app message cohorting
  26. Progress Data
      ● What has a user learnt?
      ● How is our content performing?
      ● What is their language level?
      ● Vocabulary lists
  27. Backend Data
      ● What are the user’s attributes?
      ● Social relationships (friends)
      ● Writing exercises and comments
  28. Third Party
      ● Payments
      ● CRM performance
      ● App store metadata (review etc.)
      ● PPC data
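
Slide 16 mentions custom event schemas that differ by environment. The slides do not show how this was configured, so the following is only a minimal sketch of the idea. The event name, field names and environment names are hypothetical, and a plain required-field and type check stands in for the JSON Schema validation that Snowplow actually performs.

```python
from typing import Any

# Hypothetical per-environment schemas: field name -> expected Python type.
# Real Snowplow validation uses JSON Schema documents; this is a simplification.
SCHEMAS: dict[str, dict[str, dict[str, type]]] = {
    "production": {
        "lesson_completed": {"user_id": str, "lesson_id": str, "score": int},
    },
    "development": {
        # Assumed difference: the development schema also requires a build tag.
        "lesson_completed": {
            "user_id": str, "lesson_id": str, "score": int, "build": str,
        },
    },
}


def validate(event_name: str, payload: dict[str, Any], environment: str) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    schema = SCHEMAS.get(environment, {}).get(event_name)
    if schema is None:
        return [f"no schema for {event_name!r} in {environment!r}"]
    errors = []
    for field, expected in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    # Strict schemas also reject fields that the schema does not declare.
    for field in payload:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors
```

The same payload can pass in one environment and fail in another, which is the kind of strictness slide 21 lists as a pain point when several platforms must agree on one schema.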
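
Slide 19 describes Lambda functions that parse each field of the enriched data looking for a custom schema name and relay the matching data to a third party. A rough sketch of that routing follows. It assumes the enriched events arrive as tab-separated lines on a Kinesis stream and that custom data is embedded as self-describing JSON with `schema` and `data` keys. The names `extract_payloads`, `make_handler` and `relay` are hypothetical, and the REST call is left to whatever callable is passed in as `relay`.

```python
import base64
import json
from typing import Any, Callable, Iterator


def find_self_describing(node: Any) -> Iterator[dict]:
    """Yield every nested dict that has both a 'schema' and a 'data' key."""
    if isinstance(node, dict):
        if isinstance(node.get("schema"), str) and "data" in node:
            yield node
        for value in node.values():
            yield from find_self_describing(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_self_describing(item)


def extract_payloads(enriched_line: str, schema_name: str) -> list[Any]:
    """Scan each tab-separated field for JSON carrying the given schema name."""
    payloads = []
    for field in enriched_line.rstrip("\n").split("\t"):
        if not field.startswith("{"):
            continue  # only JSON-object fields can carry self-describing data
        try:
            parsed = json.loads(field)
        except json.JSONDecodeError:
            continue
        for entity in find_self_describing(parsed):
            # Iglu schema URIs look like iglu:vendor/name/format/version.
            parts = entity["schema"].split("/")
            if len(parts) >= 2 and parts[1] == schema_name:
                payloads.append(entity["data"])
    return payloads


def make_handler(schema_name: str, relay: Callable[[Any], None]):
    """Build one Lambda-style handler per data type and per integration."""
    def handler(event: dict, context: Any) -> int:
        sent = 0
        for record in event.get("Records", []):
            # Kinesis delivers record payloads base64-encoded.
            line = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
            for payload in extract_payloads(line, schema_name):
                relay(payload)  # e.g. a REST call to the third-party service
                sent += 1
        return sent
    return handler
```

Building one handler per schema name and per integration mirrors the "one Lambda function per type of data and per integration" point: adding a new third-party system means adding a new `relay`, not a new frontend tracker.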
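
Slide 21 lists offline mode delay and device vs collector timestamps as implementation problems. The slides do not say how busuu resolved them. One common approach, which Snowplow's own derived timestamp follows as far as I know, is to subtract the time the event spent waiting on the device from the collector timestamp, so that a wrong device clock cancels out. The sketch below assumes the three timestamps are already parsed; the fallback for inconsistent device timestamps is my own assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def derived_timestamp(
    dvce_created: Optional[datetime],
    dvce_sent: Optional[datetime],
    collector: datetime,
) -> datetime:
    """Estimate when the event happened, expressed on the collector's clock."""
    # Assumption: fall back to the collector time when device data is
    # missing or inconsistent (sent before created).
    if dvce_created is None or dvce_sent is None or dvce_sent < dvce_created:
        return collector
    # Both device timestamps share the same clock error, so their
    # difference is the true time spent queued on the device.
    return collector - (dvce_sent - dvce_created)


# Device clock is wrong, and the event waited 30 minutes offline.
created = datetime(2017, 2, 8, 12, 0, tzinfo=timezone.utc)
sent = created + timedelta(minutes=30)
received = datetime(2017, 2, 8, 10, 31, tzinfo=timezone.utc)
print(derived_timestamp(created, sent, received))  # 2017-02-08 10:01:00+00:00
```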
