Disqus talks about how they scale their Python web application to over 500 million visitors a month.
Video is available here: http://pycon.blip.tv/file/4880330/
1. DISQUS
Python at 500 million visitors
Jason Yan (@jasonyan)
David Cramer (@zeeg)
Got feedback? Use hashtag #sckrw
Sunday, March 13, 2011
2. Agenda
• What is DISQUS?
• An Overview of the Infrastructure
• Iterative Development and Deployment
• Why We Love Python
3. What is DISQUS?
dis·cuss • dĭ-skŭs'
We are a comment system with an
emphasis on connecting communities
http://disqus.com/about/
6. Startup-ish
• Founded just about 4 years ago
• 16 employees, 8 engineers
• Traffic increasing 15-20% a month
• Flat organizational structure, every
engineer is a product manager
• Fast turnaround, new feature launches
every week (sometimes daily)
7. Traffic
[Chart: number of visitors, March 2008 through March 2011; y-axis from 0M to 500M]
8. DjangoCon 2010
• 17,000 requests/
second peak
• 450,000 websites
• 15 million profiles
• 75 million
comments
• 250 million visitors
9. Six Months Later
(DjangoCon 2010 → six months later)
• 17,000 → 25,000 requests/second peak
• 450,000 → 700,000 websites
• 15 million → 30 million profiles
• 75 million → 170 million comments
• 250 million → 500 million visitors
10. Six Months Later
• September 2010: 250 million uniques
• March 2011: 500 million uniques
• Handling over 2x the traffic
11. Six Months Later
• September 2010: ~100 servers
• March 2011: ~100 servers
• Scale diagonally
12. Scaling Diagonally
• We still rent hardware, so there is no
“commodity hardware”
• Cheaper to upgrade
• Everything is redundant
• Partition data where you need to, scale
partitions vertically
• Upgrade hardware (more RAM, more
drives, more cores)
• Python apps tend to be CPU bound
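One way to picture "partition data where you need to" is a small router that maps a key to a fixed partition, so that each partition can then be upgraded on its own. This is a minimal sketch and not DISQUS's actual scheme: the partition names, the `partition_for` helper, and the choice of hashing a forum name are all assumptions made for illustration.

```python
import hashlib

# Hypothetical partition names; the slides do not say how data is split.
PARTITIONS = ["db-part-0", "db-part-1", "db-part-2", "db-part-3"]


def partition_for(key, partitions=PARTITIONS):
    """Map a key (for example a forum short name) to one partition.

    A stable hash is used instead of the built-in hash(), which is
    randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return partitions[int(digest, 16) % len(partitions)]


if __name__ == "__main__":
    for forum in ("techcrunch", "engadget", "cnn"):
        print(forum, "->", partition_for(forum))
```

Because the mapping is deterministic, every web server sends the same key to the same partition without coordination. Changing the number of partitions would remap most keys, which is one reason to grow an existing partition's hardware first.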
13. Infrastructure
• 35% Web Servers
(Apache + mod_wsgi)
• 15% Utility Servers
(Python scripts, background workers)
• 20% Databases
(PostgreSQL, Redis, Membase)
• 20% Load Balancing / High Availability
(HAProxy + Heartbeat)
• 10% Caching servers
(Memcached, Varnish)
• Half of our servers run Python
14. Python Web Servers
• Use what you’re comfortable with
• Apache + mod_wsgi vs nginx + uWSGI
[Chart: mod_wsgi vs uWSGI, comparing min/avg/max requests per second and memory use]
• Bottleneck is in the application
15. Background Workers
• Lots of tasks that don’t need to be done in
web application process:
• Crawling URLs
• Updating avatars
• Email notifications
• Analytics
• Counters
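The idea of moving such tasks out of the web process can be sketched with a queue and a worker. This is only an illustration under stated assumptions: an in-process `queue.Queue` and a thread stand in for a real message broker and separate worker machines, and `send_notification` and `handle_request` are hypothetical names.

```python
import queue
import threading

jobs = queue.Queue()
sent = []  # stands in for a real side effect such as sending email


def send_notification(comment_id):
    # Hypothetical task body; a real one would talk to a mail server.
    sent.append(comment_id)


def worker():
    # Pull jobs until a None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        func, args = job
        try:
            func(*args)
        finally:
            jobs.task_done()


def handle_request(comment_id):
    # The "web" side only enqueues the work and returns immediately.
    jobs.put((send_notification, (comment_id,)))
    return "202 Accepted"


def run_demo():
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    responses = [handle_request(i) for i in range(3)]
    jobs.put(None)  # tell the worker to stop
    jobs.join()     # wait until every queued job is processed
    thread.join()
    return responses
```

The request handler never waits on the slow work, so response time stays independent of how long a notification takes to send.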
16. Background Workers (cont’d)
• Most jobs are I/O bound
• Slow external calls
• Twitter is slow
• Facebook is slow
• Could parallelize with multiple processes,
but...
17. Background Workers (cont’d)
• Waste of memory
• Use non-blocking I/O
• Celery 2.2 adds support for gevent/
eventlet
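To see why non-blocking I/O helps with I/O-bound jobs, here is a minimal sketch in which three slow calls wait concurrently inside one process. It uses the standard library's `asyncio` rather than the gevent/eventlet pools named on the slide, purely so the example runs without extra packages; `fetch_profile` is a hypothetical function and `asyncio.sleep` simulates the slow external service.

```python
import asyncio
import time


async def fetch_profile(service, delay):
    # asyncio.sleep simulates a slow external API call (hypothetical).
    await asyncio.sleep(delay)
    return f"{service}: done"


async def fetch_all():
    # All three "calls" wait at the same time inside a single process.
    return await asyncio.gather(
        fetch_profile("twitter", 0.2),
        fetch_profile("facebook", 0.2),
        fetch_profile("avatar-host", 0.2),
    )


def run_demo():
    start = time.perf_counter()
    results = asyncio.run(fetch_all())
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Run one after another, the three calls would take about 0.6 seconds; run concurrently they take roughly as long as the slowest one, without the memory cost of three separate processes.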
23. Which means...
• Largest Django-powered web application
• We fork, and even sometimes monkey
patch to make it scale to our needs
• Fortunately, we don’t have to do too
much (Yay, Django!)
• Unfortunately, we can’t use the whole of
the Django internal components (and if
we do, we do it in atypical ways)
25. Iterating Quickly
• Abstracting our application environment
• Fewer dependencies locally
• Rely on CI for dependency coverage
• Heavy use of open source packages
• No NIH syndrome
• Deploy frequently, 3-7 times a day
• Lots of branches, but master is “stable”
• Realtime reporting on exceptions, metrics
• Our test suite is the main blocker (slow)
27. Gargoyle
Deploy features to portions of a user base at a
time to ensure smooth, measurable releases
Being users of our product, we actively use
early versions of features before public release
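A percentage-based rollout of this kind can be sketched as follows. This is not Gargoyle's API; `is_active` and its parameters are hypothetical, and the hashing approach is an assumption about one common way to pick a stable portion of a user base.

```python
import hashlib


def is_active(feature, user_id, percent):
    """Return True if the feature is on for this user.

    Each (feature, user) pair hashes to a stable bucket from 0 to 99;
    the feature is on when the bucket is below `percent`.
    """
    token = f"{feature}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(token).hexdigest(), 16) % 100
    return bucket < percent


if __name__ == "__main__":
    enabled = sum(is_active("new-editor", uid, 10) for uid in range(10000))
    print("users with the feature enabled:", enabled)
```

Because a user's bucket never changes, raising `percent` only adds users to the rollout and never removes anyone who already has the feature.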
28. The Deployment Problem
• Make some changes locally
• Run a subset of the test suite
• Push your commits
• CI server begins running tests
• ....
30. Rinse and Repeat
• 30 minutes later tests fail, start over
• Finally, deploy to a subset of servers
• Open Sentry (our exception logger)
• Monitor Graphite
• Deploy to 35 servers (~8 minutes)
• Full rollback in < 30 seconds
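The deploy-to-a-subset, watch, then deploy-everywhere loop above can be sketched as a small function. Everything here is hypothetical: `staged_deploy`, the `deploy`, `rollback`, and `healthy` callables, and the canary count stand in for real tooling and for a human watching Sentry and Graphite.

```python
def staged_deploy(servers, deploy, rollback, healthy, canary_count=2):
    """Deploy to a few servers first, then to the rest if checks pass.

    `deploy`, `rollback`, and `healthy` are caller-supplied callables;
    all of them are hypothetical stand-ins for real tooling.
    """
    canaries, rest = servers[:canary_count], servers[canary_count:]
    done = []
    for group in (canaries, rest):
        for server in group:
            deploy(server)
            done.append(server)
        if not healthy(done):
            # Undo everything touched so far, newest first.
            for server in reversed(done):
                rollback(server)
            return False
    return True
```

A failed check after the canary group means the remaining servers are never touched, which keeps both the blast radius and the rollback small.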
33. Testing Code
• Test suite takes around 25 minutes usually
• “Stuck” with Hudson (or Jenkins)
• Most tightly integrated plugins are
geared towards Java developers
• Which framework do we use?
• unittest(2), nose, doctests, LETTUCE?
• We use unittest and nose
• Need to report code coverage, speed of
tests, pylint (or pyflakes)
35. Love-ish
• Many of us started with PHP or Rails
• Clean syntax, clear standards
• All languages need PEP8.py and
PyFlakes
• Interpreted, fast... enough
• Very easy to learn
• We all started by learning Django first,
then Python
36. Haters Gonna Hate
If you could choose one thing in
Python to hate on...
38. What can we do?
• Too many forks, too many frameworks
• We need fewer clones, and more combined
effort
• Improving existing Python solutions
• More Python solutions for existing
products
41. References
• Sentry (our exception tracking tool)
http://github.com/dcramer/django-sentry
• Gargoyle (feature switches)
https://github.com/disqus/gargoyle
• Django DB Utils (collection of db helpers for Django)
https://github.com/disqus/django-db-utils
• Jenkins CI
http://jenkins-ci.org/
code.disqus.com
Editor's Notes
Hi. I'm Jason (and I'm David), and we're from Disqus.
For those of you who are not familiar with us, DISQUS is a comment system that focuses on connecting communities. We power discussions on such sites as CNN, IGN, and more recently Engadget and TechCrunch. Our company was founded back in 2007 by my co-founder, Daniel Ha, and me, when we started working out of our dorm room. Our decision to use Django came down primarily to our dislike for PHP, which we were previously using. Since then, we've grown Disqus to over 250 million visitors a month.
Show of hands: how many of you know what DISQUS is?
We've peaked at over 17,000 requests per second to Django, and we currently power comments on nearly half a million websites, which accounts for more than 15 million profiles who have left over 75 million comments.