2. What am I talking about today?
What was OFA? Why is this relevant?
• Who did it?
• What did they build?
How did they do that?
• Technologies and Tradeoffs
• Services vs. Software
What did they learn from building something so big?
3. Full Disclosure
I work for AWS
AWS does not endorse political candidates
Yes, I talk too much
4. So here’s the Idea
~30th biggest E-commerce operation, globally
~200 distinct new applications, many mobile
Hundreds of new, untested analytical approaches
Processing hundreds of TB of data on thousands of servers
Spikes of hundreds of thousands of concurrent users
FUN FUN FUN
5. a few constraints…
Critically compressed budget
Less than a year to execute
Volunteer and near-volunteer development team
Core systems will be used for a single critical day
Constitutionally-mandated completion date
17. Cloud Computing Benefits
No Up-Front Capital Expense
Low Cost
Pay Only for What You Use
Self-Service Infrastructure
Easily Scale Up and Down
Improve Agility & Time-to-Market
Deploy
26. Technology Choice
Each technology choice, with its expected tradeoff:
• Polyglot Development: More Complex Ops
• Cloud Hosting: Less Infra Control, performance
• Diverse, App-centered Databases: More Complex Ops, Fragility, Data Corruption
• SOA, queue-based system integrations: Dev Complexity, slower system performance
27. Technology Choice
Each technology choice, with its expected tradeoff and its upside:
• Polyglot Development
Expected Tradeoff: More Complex Ops
Upside: Build as little as possible, rev-1 faster, reuse dev skills
• Cloud Hosting
Expected Tradeoff: Less Infra Control, performance
Upside: Scale, Speed, Cost
• Diverse, App-centered Databases
Expected Tradeoff: More Complex Ops, Fragility, Data Corruption
Upside: Heterogeneous Resilience, right tools for the job
• SOA, queue-based system integrations
Expected Tradeoff: Dev Complexity, slower system performance
Upside: Scalability, serviceability, operational flexibility, and substantially faster in aggregate
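The "SOA, queue-based system integrations" choice can be illustrated with a minimal sketch. It is not the campaign's code: Python's in-process `queue.Queue` stands in for a hosted queue service, and `run_pipeline` and the event names are hypothetical. The point is only that the producer and the consumer share nothing except the queue.

```python
import queue
import threading


def run_pipeline(events):
    """Send events from a producer to a consumer through a queue.

    Assumption: queue.Queue is a stand-in for a hosted message queue;
    the upper-casing is a placeholder for real processing.
    """
    q = queue.Queue()
    processed = []
    sentinel = object()  # marks the end of the stream

    def consumer():
        # The consumer knows only the queue, never the producer.
        while True:
            item = q.get()
            if item is sentinel:
                break
            processed.append(item.upper())

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer enqueues work and moves on without waiting for results.
    for event in events:
        q.put(event)
    q.put(sentinel)

    worker.join()
    return processed


if __name__ == "__main__":
    print(run_pipeline(["donation", "signup"]))
```

Because either side can be replaced, scaled, or restarted without touching the other, the extra development complexity buys the operational flexibility listed under Upside.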
31. This applies to lots of services!
ELB
ElastiCache
RDS
CloudSearch
Route53
S3
CloudFront
DynamoDB
You can mostly do these on your own…
But do you have extra: focus, expertise, time, research, money, risk-tolerance, staff, dedication to innovate, operations coverage, scalability in design...
35. They had this built for the previous 3 months, all on the East Coast.
36. They had this built for the previous 3 months, all on the East Coast.
We built this part in 9 hours to be safe.
AWS + Puppet + Netflix Asgard + CloudOpt + DevOps = Cross-Continent Fault Tolerance On-Demand
39. So what did they learn?
Game Day: Practice failures so you know what to do.
Loose-Coupling: Ops easy, scale easy, test easy, fix easy…
Fail-Forward: features, quality, and focus are all critical.
HA in Depth: S3 static pages, de-coupled UI, jekyll/hyde
Cloud works.
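The "HA in Depth" and "Game Day" points can be sketched together: serve a pre-built static page when the dynamic backend fails, and rehearse that failure on purpose. This is an illustrative sketch only; `render_page`, `healthy_backend`, and `failing_backend` are hypothetical names, and the static HTML string stands in for a page hosted on S3.

```python
def render_page(fetch_dynamic, static_html):
    """Return the dynamic page, or the static fallback if the backend fails."""
    try:
        return fetch_dynamic()
    except Exception:
        # Any backend failure degrades to the static copy instead of an error.
        return static_html


def healthy_backend():
    return "<h1>Live results</h1>"


def failing_backend():
    # Game Day style rehearsal: simulate the outage rather than wait for it.
    raise ConnectionError("backend unavailable (simulated)")


if __name__ == "__main__":
    fallback = "<h1>Static copy</h1>"
    print(render_page(healthy_backend, fallback))
    print(render_page(failing_backend, fallback))
```

Running the failure path deliberately is the Game Day idea in miniature: the fallback is exercised before it is needed.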
43. AMAZON REDSHIFT
Redshift runs on HS type instances
HS1.8XL: 128 GB RAM, 16 cores, 16 TB of compressed data, 2 GB/sec read throughput
HS1.XL: 16 GB RAM, 2 cores, 2 TB of compressed data
44. Extra Large Node (HS1.XL)
Single node
Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL)
Cluster 2-100 Nodes (32 TB – 1.6 PB)
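The cluster capacity ranges on this slide follow from multiplying the node count by the compressed storage per node. A small sketch of that arithmetic, using only the figures quoted on the slides (the function and dictionary names are hypothetical, and only multi-node clusters are modelled):

```python
# Compressed storage per node in terabytes, as quoted on the slide.
STORAGE_TB_PER_NODE = {"HS1.XL": 2, "HS1.8XL": 16}

# Allowed multi-node cluster sizes (min, max), as quoted on the slide.
CLUSTER_NODE_RANGE = {"HS1.XL": (2, 32), "HS1.8XL": (2, 100)}


def cluster_capacity_tb(node_type, nodes):
    """Return total compressed capacity in terabytes for a multi-node cluster."""
    low, high = CLUSTER_NODE_RANGE[node_type]
    if not low <= nodes <= high:
        raise ValueError(f"{node_type} clusters take {low}-{high} nodes, got {nodes}")
    return STORAGE_TB_PER_NODE[node_type] * nodes


if __name__ == "__main__":
    for node_type, (low, high) in CLUSTER_NODE_RANGE.items():
        smallest = cluster_capacity_tb(node_type, low)
        largest = cluster_capacity_tb(node_type, high)
        print(f"{node_type}: {smallest} TB to {largest} TB")
```

The largest HS1.8XL cluster works out to 1600 terabytes, which is the 1.6 petabytes quoted on the slide.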