Performance metrics for a social network

Performance metrics for a social network.
Presentation on Fashiolista's usage of New Relic, StatsD/Graphite and PgFouine to stay on top of load times.

See the blog post at

Published in: Technology

  1. Performance metrics for a social network
  2. About Me
     • Thierry Schellenbach
     • Founder/CTO Fashiolista
     • Author of Django Facebook
     • Github/tschellenbach
     • Blog: mellowmorning.com
     • @tschellenbach
  3. Global Fashion Discovery
  4. 5.000.000+ / 8.000.000+
  5. Growth
     • 2nd largest fashion community
     • 1 million members
     • 17 million loves/month
     • 94 million non-bot pageviews
  6. Powered By
     • Django/Python
     • PostgreSQL
     • Solr
     • Redis
     • Celery
     • AWS/Ubuntu
     • Nginx/Gunicorn/Supervisor
  7. Sexy Metrics-Driven Optimization
     Hard because:
     • All content is personalized
     • Activity is clustered around a few users (>100k followers)
     • Individual users are insanely active (7 hours in a day is normal)
     • Social network, can't easily shard data
  8. Speed is a Feature
  9. Metrics across the board
     • Development
       – Spot things early on, wrong usage of ORM etc.
     • System Health
       – Is my DB healthy, my Redis cluster etc.
     • Page level
       – Why is my page slow
       – What is the average speed of the components (DB, Redis, Solr etc.)
  10. Tools we use
      • Development
        – Debug toolbar: cache calls, Graphite timings, queries and their explains
        – Integration tests: duplicate query detection
      • System Health
        – Cloudwatch
        – Munin
        – Nagios
        – DB slow log
        – Redis slow log
      • Page Level
        – New Relic
        – Graphite
        – PgFouine
  11. Development
      [Screenshot labels: StatsD, Duplicates, Cache Calls]
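The duplicate query detection mentioned for integration tests can be sketched roughly as follows. This is a minimal illustration, not Fashiolista's implementation: `find_duplicate_queries` and the sample statements are hypothetical, and it assumes the test harness can hand over the list of SQL strings executed while rendering one page.

```python
from collections import Counter


def find_duplicate_queries(queries, threshold=2):
    """Return {sql: count} for statements executed at least `threshold` times."""
    # Collapse whitespace so formatting differences do not hide duplicates.
    counts = Counter(" ".join(sql.split()) for sql in queries)
    return {sql: n for sql, n in counts.items() if n >= threshold}


# Hypothetical queries captured while rendering a single page in a test.
captured = [
    "SELECT * FROM auth_user WHERE id = 1",
    "SELECT * FROM love WHERE user_id = 1",
    "SELECT * FROM auth_user  WHERE id = 1",
]
duplicates = find_duplicate_queries(captured)
print(duplicates)
```

A test could then fail whenever `duplicates` is non-empty, which flags ORM misuse before it reaches production.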
  12. Today's Presentation
      • New Relic
        – Dashboard, high level insights
      • Graphite
        – Stash all data, query it any way you want
        – Tool, not a dashboard
      • PgFouine
        – Understand what keeps your DB busy
  13. New Relic
      • Frontend -> App -> Components (DB, Solr, etc.)
      • Breaks page performance down into its components
      • Tracks deploys and compares before and after
  14. Are you Supported?
      • Supported languages: Ruby, Java, .NET, PHP, Python
      • Python setup
        – pip install newrelic
        – Edit the .ini
        – Add the WSGI middleware
        – Wait for magic
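To make the "add the WSGI middleware" step less magical, here is a minimal sketch of what a timing middleware does in general. It is not New Relic's agent: `TimingMiddleware` and the `record` callback are hypothetical names used only for illustration.

```python
import time


class TimingMiddleware:
    """Wrap a WSGI app and report how long each request takes."""

    def __init__(self, app, record):
        self.app = app
        self.record = record  # callable(path, elapsed_ms); hypothetical hook

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        result = self.app(environ, start_response)
        try:
            # Consume the body so the timing covers response generation.
            body = list(result)
        finally:
            if hasattr(result, "close"):
                result.close()
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.record(environ.get("PATH_INFO", ""), elapsed_ms)
        return body


def demo_app(environ, start_response):
    """Tiny WSGI app used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


seen = []
wrapped = TimingMiddleware(demo_app, lambda path, ms: seen.append((path, ms)))
response = wrapped({"PATH_INFO": "/style/someuser/"}, lambda status, headers: None)
```

Because the wrapper sits between the server and the application, every request is measured without touching view code, which is why the setup needs only a few steps.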
  15. End user load times
      • Drill down all the way to database calls
      • The purple line is our app, the rest frontend
      [Chart labels: Frontend (97%), App]
  16. Global page loads
  17. Page Level
      • Average frontend performance per page
      • Click to view App level breakdown
      [Screenshot annotations: "Page. Not URL.", "To App Level"]
  18. Drill down / App overview
      [Screenshot labels: History, Memcached, DB Query]
  19. Database
      • See which tables are under most load
      • See which pages cause the load
      • Development over time
  20. Deploys
  21. Deploys part Two
      [Screenshot labels: Response Time Pre & Post, Memory Utilization]
  22. Background Task
      [Screenshot label: Number of Task calls (sample)]
  23. Graphite Insights
      • New Relic has the overview, Graphite the detail
      • Open Source!
      • Throw data at it via UDP
      • Popularized by Etsy (see mellowmorning.com for link)
  24. It's Complicated
  25. Tracks Everything
  26. Setup
      • Track using StatsD
        – Support for PHP, Python, Ruby, Node, Java
      • Hierarchy (Python example)
      • get.<app>.<view>.<component>

          with request.timings('get.user.profile_page.sql'):
              print('database query here')
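A timing helper like the one on the slide can be sketched in a few lines. This is an illustration under stated assumptions, not the `request.timings` implementation used at Fashiolista: the `timings`, `format_timer` and `send_udp` names are hypothetical, and the address assumes a StatsD daemon on localhost at its default port 8125. The payload uses StatsD's timer format, `<bucket>:<value>|ms`.

```python
import socket
import time
from contextlib import contextmanager

# Assumption: a StatsD daemon listening locally on its default port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_timer(bucket, elapsed_ms):
    """Build a StatsD timer payload: <bucket>:<milliseconds>|ms."""
    return f"{bucket}:{int(round(elapsed_ms))}|ms".encode("ascii")


def send_udp(payload, addr=STATSD_ADDR):
    """Fire-and-forget: UDP needs no connection and never waits for a reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, addr)
    finally:
        sock.close()


@contextmanager
def timings(bucket, send=send_udp):
    """Time the wrapped block and report it under a hierarchical bucket name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        send(format_timer(bucket, elapsed_ms))


# Collect payloads in a list instead of hitting the network for this demo.
sent = []
with timings("get.user.profile_page.sql", send=sent.append):
    time.sleep(0.01)  # stand-in for a database query
```

Keeping the bucket name in the `get.<app>.<view>.<component>` shape is what makes the wildcard queries in Graphite possible later on.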
  27. Data tool / Not a dashboard
      • Wildcards
        – get.<app>.<view>.*.upper_90
        – get.<app>.*.redis.zadd.upper_90
        – limit(sortByMaxima(get.<app>.<view>.*.upper_90),4)
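Wildcard targets like these can also be queried programmatically through Graphite's render endpoint. The sketch below only builds the URL; the host name and the `render_url` helper are hypothetical, and the `user.profile_page` path is borrowed from the StatsD example for illustration.

```python
from urllib.parse import urlencode

GRAPHITE_URL = "http://graphite.example.com"  # hypothetical host


def render_url(target, since="-1h", fmt="json"):
    """Build a Graphite render API URL for one target expression."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{GRAPHITE_URL}/render?{query}"


# The four components of one view with the highest 90th percentile peaks.
target = "limit(sortByMaxima(get.user.profile_page.*.upper_90),4)"
url = render_url(target)
print(url)
```

Fetching that URL returns the matching series as JSON, which is what makes Graphite usable as a data tool rather than only as a set of fixed graphs.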
  28. /style/<user>/ performance
      [Graph labels: Memcached Slowdown, ZADD, Set Many]
  29. Including functional parts of pages
      • The "More like this" part is tracked
      • Solr & Redis Cache
  30. What we Track
      • Load time per bit of functionality
      • Database calls per DB
      • 90th percentile load times
      • Task broker roundtrip times
      • Facebook API calls
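Why track the 90th percentile rather than the average? A small sketch shows the difference. It uses a nearest-rank calculation as an approximation of what an `upper_90` value represents; StatsD's own rounding may differ slightly, and the sample timings are invented.

```python
def upper_percentile(timings_ms, pct=90):
    """Largest value among the fastest `pct` percent of timings (nearest rank)."""
    if not timings_ms:
        raise ValueError("need at least one timing")
    ordered = sorted(timings_ms)
    keep = -(-pct * len(ordered) // 100)  # ceil(pct * n / 100) using integers
    return ordered[max(keep, 1) - 1]


# Invented sample: nine normal requests and one pathological outlier.
samples = [12, 15, 11, 14, 13, 16, 12, 15, 14, 950]
mean = sum(samples) / len(samples)
p90 = upper_percentile(samples)
print(mean, p90)
```

Here the single 950 ms request drags the mean far above what most users experienced, while the 90th percentile stays close to typical load times.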
  31. PgFouine
      • Run on samples of all queries (say 5m)
      • Not just slow queries
      • Repeating a simple query many times is also wrong, PgFouine finds it
      • See Instagram's fabric snippet
      • https://gist.github.com/2307647
  32. PgFouine Continued
      • Queries that took up the most time (N)
        – Spots issues with many small queries
      • Normalized
      • Compare multiple reports
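The idea behind the "most time (N)" report can be sketched as: normalize each logged query by stripping its literals, then sum durations per normalized form. This is a simplified illustration, not PgFouine's code; `normalize`, `top_queries` and the sample log are hypothetical, and the regular expressions cover only simple string and numeric literals.

```python
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^']|'')*'")     # quoted strings, '' as escaped quote
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone integers and decimals


def normalize(sql):
    """Replace literals with placeholders so similar queries group together."""
    sql = _STRING.sub("''", sql)
    sql = _NUMBER.sub("0", sql)
    return " ".join(sql.split())


def top_queries(log, n=3):
    """log: iterable of (sql, duration_ms). Returns (sql, count, total_ms) rows."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, duration_ms in log:
        entry = totals[normalize(sql)]
        entry[0] += 1
        entry[1] += duration_ms
    ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(sql, count, total) for sql, (count, total) in ranked[:n]]


# Invented log sample: three cheap lookups outweigh one "slow" query.
log = [
    ("SELECT * FROM auth_user WHERE id = 1", 2.0),
    ("SELECT * FROM auth_user WHERE id = 2", 2.0),
    ("SELECT * FROM auth_user WHERE id = 3", 2.0),
    ("SELECT * FROM love WHERE created > '2012-01-01'", 5.0),
]
report = top_queries(log)
```

None of the `auth_user` lookups would appear in a slow query log, yet together they cost more than the single 5 ms query, which is the kind of issue this report surfaces.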
  33. PgFouine Tips
      • My colleague wrote a fast C++ version
      • github.com/WoLpH/pg_query_analyser
      Also look at:
      • pg_stat_statements
      • pgBadger
  34. Concluding
      • New Relic
        – Dashboard, high level insights
      • Graphite
        – Stash all data, query it any way you want
        – Tool, not a dashboard
      • PgFouine
        – Understand what keeps your DB busy
  35. Q&A
      • We're searching for Django developers & Linux system administrators! Fashiolista.com/jobs
      • Open source projects: Github.com/tschellenbach
      • Try Django Facebook!