Tapjoy faced challenges with rapidly scaling their mobile ad network due to the costs of maintaining test environments. They developed Tapinabox to automate the provisioning of test environments using Amazon EC2 Spot Instances, allowing tests to run in parallel across many servers at a lower cost. By utilizing Spot Instances for their continuous integration testing with Jenkins, Tapjoy was able to scale out testing without slowing down their development process. The use of Spot Instances reduced costs and improved the quality of their software through increased testing capabilities.
3. Who Am I?
• Engineer turned DevOps junkie
• Multiple Areas of Focus
  • Performance
  • Cost Management & Reduction
  • Efficiency
  • As Needed Firefighter
Friday, November 15, 13
4. Tapjoy’s Challenge
• Premier Mobile Ad Network Across iOS & Android
• Global Network (435 Million Monthly Reach)
• Diverse user base (54% Male, 46% Female)
• Billions of requests per day
• Growth requires iteration, experimentation, and massive scale
• Small mistakes are magnified to millions of consumers instantly
6. What Everyone Sees
1. Develop a killer product, hire more engineers
2. Expand product as fast as possible, gain market share
3. Iterate, grow, refine, scale...
4. #Profit
7. What Everyone Forgets to Mention
• Engineering is a game of tradeoffs. Fast growth comes at a price.
• Testing and QA are often the first to lag behind.
• Building quality test beds for engineers can be complicated.
8. Food for Thought
• Where are the bottlenecks in your development cycle?
• How do you simulate production?
10. The Mindset Required
• Engineer application-level redundancy and fault tolerance.
• Spread yourself across many zones, potentially even many regions.
• Identify areas of required, not preferred, persistence.
• Understand how your “neighbors” are utilizing your preferred instance type.
11. Always Be Testing
• Lots of engineering means...
  • Lots of code pushes
  • Lots of pull requests
  • Lots of automated tests to be run frequently to prevent regressions
• Full test suites can take > 1 hour when run serially.
• This slows down the review process when every change waits at least an hour for automated sign-off.
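The effect of running the suite in parallel can be sketched with a little arithmetic. The example below is a minimal illustration, not Tapjoy's actual scheduler: it assigns tests to workers greedily (longest test first, onto the least-loaded worker) and reports the resulting wall-clock time. The test names, the per-test durations, and the function names `shard_tests` and `wall_clock` are all hypothetical.

```python
import heapq


def shard_tests(durations, workers):
    """Greedy longest-first partitioning of tests onto workers.

    durations: dict of test name -> expected runtime in minutes (hypothetical data).
    Returns one (total_minutes, [test names]) tuple per worker.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    # Min-heap keyed by current load; the worker index breaks ties.
    heap = [(0.0, i) for i in range(workers)]
    heapq.heapify(heap)
    shards = [[] for _ in range(workers)]
    # Place the longest tests first, each onto the least-loaded worker.
    for name, minutes in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        shards[i].append(name)
        heapq.heappush(heap, (load + minutes, i))
    loads = {i: load for load, i in heap}
    return [(loads[i], shards[i]) for i in range(workers)]


def wall_clock(durations, workers):
    """Wall-clock minutes for the whole suite: the load of the busiest worker."""
    return max(load for load, _ in shard_tests(durations, workers))


# Hypothetical suite: 14 spec files of 5 minutes each (70 minutes serially).
suite = {f"spec_{n:02d}": 5.0 for n in range(14)}
print(wall_clock(suite, 1))  # serial run
print(wall_clock(suite, 7))  # spread over 7 workers
```

With these made-up numbers the serial run takes 70 minutes and the seven-worker run takes 10, which is the kind of reduction that brings automated sign-off back in step with code review.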
12. Scale Horizontally With Spot Instances
• Jenkins + Spot Instances
• https://github.com/bwall/ec2-plugin
• Go wide during business hours, scale back in the evenings. Automatically kicks online at 06:00 ET.
• Workers scale horizontally to support dozens of simultaneous regression tests spread out over dozens of workers
• Jenkins automatically guards against spot termination
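The "go wide during business hours, scale back in the evenings" policy could be expressed as a small scheduling function. This is only a sketch under stated assumptions: the 06:00 ET start comes from the slide, while the 19:00 scale-down hour, the worker counts, and the name `desired_workers` are hypothetical. The slide does not say how weekends are handled, so they are not modelled here.

```python
from datetime import datetime

SCALE_UP_HOUR = 6      # 06:00 ET, from the slide
SCALE_DOWN_HOUR = 19   # assumed end of business hours (hypothetical)
PEAK_WORKERS = 24      # assumed size of the "dozens" of workers (hypothetical)
OFF_PEAK_WORKERS = 2   # assumed overnight floor (hypothetical)


def desired_workers(now_et: datetime) -> int:
    """Target worker count for a wall-clock time already expressed in US Eastern."""
    if SCALE_UP_HOUR <= now_et.hour < SCALE_DOWN_HOUR:
        return PEAK_WORKERS
    return OFF_PEAK_WORKERS


print(desired_workers(datetime(2013, 11, 15, 5, 59)))  # just before scale-up
print(desired_workers(datetime(2013, 11, 15, 6, 0)))   # business hours begin
```

A periodic job could compare this target with the number of running workers and request or release Spot Instances to close the gap.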
13. Tapinabox
• Born from Tapjoy Hackathon
• Quarterly hackathons to promote engineering growth and creativity.
• Question Posed: “What if spinning up a fresh QA was as easy as launching an app server on Heroku?”
• Goals
  • Behaves, operates, and performs like production.
  • Should be dead-simple to use. “Push a button, get a server.”
  • Save money through improved quality and low cost to operate.
  • End-To-End Product. No mocks, no stubs.
14. Tapinabox Cont.
• Fully automated build & deployment process tied to a developer's GitHub account.
• Simple web interface for creating, managing, and editing instances.
• Quick collaboration with Product Managers, remote QA, and partner testing.
• Fully run on Spot Instances.
17. How We Built Tapinabox
• Ruby on Rails
• AWS CLI
• Chef
• Amazon EC2 Spot Instances (zone & region agnostic)
• Tapjoy Slugs: Custom slug-deployments based on FPM https://github.com/jordansissel/fpm
18. Pointers & Lessons Learned
• Spot Instances can take a while to provision, and the workflow differs from the traditional on-demand model.
• Very important to find quiet zones / regions based on your workload.
• Pricing sometimes takes a back seat to reliability. What instance types have contention?
• Guard against termination with adequate pricing, but don’t try to prevent it. Automation is key. The price will eventually fall.
• Pick the right tool for the job. Don’t get greedy!
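One way to look for "quiet" zones is to rank candidate pools by how often their historical spot price exceeded the bid you are willing to pay, and only then by average price. The sketch below is an illustration under assumptions rather than the method used for Tapinabox: `rank_pools` is a hypothetical helper, and the zones, instance type, and prices are made-up samples. Real price samples could be pulled with the AWS CLI command `aws ec2 describe-spot-price-history`.

```python
from statistics import mean


def rank_pools(price_history, max_bid):
    """Rank (zone, instance_type) pools, quietest first.

    price_history: dict mapping (zone, instance_type) -> list of observed
    spot prices in USD/hour (hypothetical sample data).
    Pools are ordered by the share of samples above max_bid (a rough
    contention signal), then by mean price, so reliability outranks cost.
    """
    ranked = []
    for pool, prices in price_history.items():
        if not prices:
            continue  # no data, so the pool cannot be judged
        over_bid = sum(1 for p in prices if p > max_bid) / len(prices)
        ranked.append((over_bid, mean(prices), pool))
    ranked.sort()
    return [pool for _, _, pool in ranked]


# Hypothetical samples: 1a is cheapest most of the time but spikes often.
history = {
    ("us-east-1a", "m1.large"): [0.03, 0.03, 0.50, 0.60],
    ("us-east-1b", "m1.large"): [0.05, 0.05, 0.05, 0.06],
    ("us-west-2a", "m1.large"): [0.04, 0.04, 0.04, 0.30],
}
print(rank_pools(history, max_bid=0.10))
```

In this sample the zone with the lowest typical price ranks last because it spikes above the bid half the time, which mirrors the point that pricing sometimes takes a back seat to reliability.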
19. Kudos
• Tapjoy Automation Team
  • John Russell
  • Chris Gerber
  • JLo
  • Adam Bell
  • Hugh Barrigan
21. Please give us your feedback on this presentation
CPN207
As a thank you, we will select prize winners daily for completed surveys!
Thank You