[Meet Amazon Web Services in Pangyo] Data Analytics on AWS, as Seen Through Obama for America

Obama For America on AWS

Younjin Jeong
Solutions Architect

What am I talking about today?
What was OFA? Why is this relevant?
• Who did it?
• What did they build?

How did they do that?
• Technologies and Tradeoffs
• Services vs. Software

What did they learn from building something so big?

Full Disclosure
I work for AWS
AWS does not endorse political candidates
Yes, I talk too much

So here’s the Idea
~30th biggest E-commerce operation, globally
~200 distinct new applications, many mobile
Hundreds of new, untested analytical approaches
Processing hundreds of TB of data on thousands of servers
Spikes of hundreds of thousands of concurrent users

FUN FUN FUN

a few constraints…
Critically compressed budget
Less than a year to execute
Volunteer and near-volunteer development team
Core systems will be used for a single critical day
Constitutionally-mandated completion date

NOT
NOT

Built by guys and gals like these: Obama For America

Business as usual..

…for a technology startup

Election Day – OFA Headquarters

So they built it all, and it worked

The old approach, even from Amazon 

The old approach... might have some problems.

Cloud Computing Benefits
• No Up-Front Capital Expense
• Low Cost
• Pay Only for What You Use
• Self-Service Infrastructure
• Easily Scale Up and Down
• Improve Agility & Time-to-Market

Deploy

OFA’s Infrastructure

awsofa.info

Ingredients
Ubuntu nginx boundary Unity jQuery SQLServer hbase
NewRelic EC2 node.js Cybersource hive ElasticSearch
Ruby Twilio EE S3 ELB boto Magento PHP EMR SES
Route53 SimpleDB Campfire nagios Paypal CentOS
CloudSearch levelDB mongoDB python securitygroups
Ushahidi PostgreSQL Github apache bootstrap SNS
cloudformation Jekyll RoR EBS FPS VPC Mashery
Vertica RDS Optimizely MySQL puppet tsunamiUDP R
asgard cloudwatch ElastiCache cloudopt SQS cloudinit
DirectConnect BSD rsync STS Objective-C dynamoDB

Data Stores

Development Frameworks

Sites

Communications
Ad Targeting
Ops Tools
Analytics
Apps

Micro-targeting
Micro-listening
Reporting
Registrations
Volunteer Coordination
Etc, etc, etc.

Technology Choice | Expected Tradeoff
Polyglot Development | More Complex Ops
Cloud Hosting | Less Infra Control, performance
Diverse, App-centered Databases | More Complex Ops, Fragility, Data Corruption
SOA, queue-based system integrations | Dev Complexity, slower system performance

Technology Choice | Expected Tradeoff | Upside
Polyglot Development | More Complex Ops | Build as little as possible, rev-1 faster, reuse dev skills
Cloud Hosting | Less Infra Control, performance | Scale, Speed, Cost
Diverse, App-centered Databases | More Complex Ops, Fragility, Data Corruption | Heterogeneous Resilience, right tools for the job
SOA, queue-based system integrations | Dev Complexity, slower system performance | Scalability, serviceability, operational flexibility, and substantially faster in aggregate
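
To make the "SOA, queue-based system integrations" row concrete, here is a minimal sketch of queue-based decoupling. It is an assumption-laden illustration, not OFA's code: an in-process `queue.Queue` stands in for a hosted queue such as SQS (which appears in the ingredients list), and `run_pipeline` and the event names are hypothetical.

```python
import queue
import threading


def run_pipeline(events, worker_count=2):
    """Push events through a queue so producers and consumers never call each other directly."""
    work_queue = queue.Queue()
    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            event = work_queue.get()
            try:
                if event is None:  # sentinel: no more work for this worker
                    return
                with lock:
                    # Hypothetical "processing" step: upper-case the event name.
                    processed.append(event.upper())
            finally:
                work_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    # The producer only knows about the queue, not about the workers.
    for event in events:
        work_queue.put(event)
    for _ in threads:
        work_queue.put(None)

    work_queue.join()
    for thread in threads:
        thread.join()
    return sorted(processed)


if __name__ == "__main__":
    print(run_pipeline(["donate", "register", "volunteer"]))
```

The producer and the workers share only the queue, so either side can be scaled, replaced, or restarted without touching the other. The cost is the extra hop, which is the "slower system performance" tradeoff named in the table.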

This applies to lots of services!
ELB
ElastiCache
RDS
CloudSearch
Route53
S3
CloudFront
DynamoDB

You can mostly do these on your own…

But do you have extra: focus, expertise, time, research, money, risk-tolerance, staff, dedication to innovate, operations coverage, scalability in design...

Looks pretty simple.

Inserts 7.5m records in DynamoDB, in 8 minutes
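
For a sense of scale, the figure above can be turned into a sustained write rate. The sketch below is back-of-the-envelope arithmetic only. The slide does not say how the inserts were issued, so the use of DynamoDB's `BatchWriteItem` limit of 25 items per request is an assumption for illustration, and `chunked` is a hypothetical helper.

```python
RECORDS = 7_500_000          # from the slide
DURATION_SECONDS = 8 * 60    # from the slide
BATCH_LIMIT = 25             # BatchWriteItem item limit per request (assumed write path)


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Sustained item-level write rate implied by the slide.
writes_per_second = RECORDS / DURATION_SECONDS

# Number of batch requests if every request carried the full 25 items.
batches_needed = -(-RECORDS // BATCH_LIMIT)  # ceiling division
batches_per_second = batches_needed / DURATION_SECONDS

if __name__ == "__main__":
    print(f"{writes_per_second:,.0f} item writes/sec")
    print(f"{batches_needed:,} batch requests, {batches_per_second:,.0f} requests/sec")
```

That works out to roughly 15,600 item writes per second held for eight minutes.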

One thing that is difficult to prepare for…

They had this built for the previous 3
months, all on the East Coast.

We built this
part in 9 hours
to be safe.

AWS +
Puppet +
Netflix Asgard +
CloudOpt +
DevOps =

Cross-Continent Fault-Tolerance On-Demand

Replication across the continent..

http://tsunami-udp.sourceforge.net/

So what did they learn?
Game Day: Practice failures so you know what to do.
Loose-Coupling: Ops easy, scale easy, test easy, fix easy…
Fail-Forward: features, quality, and focus are all critical.

HA in Depth: S3 static pages, de-coupled UI, jekyll/hyde
Cloud works.
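
The "Game Day" and "HA in Depth" lessons can be sketched together as a tiny failure drill: inject a backend failure on purpose and check that users still get a static page. Everything here is hypothetical (`BackendUnavailable`, `render_page`, the page strings) and is not taken from OFA's systems.

```python
class BackendUnavailable(Exception):
    """Raised when the dynamic backend cannot serve a request."""


# Stand-in for a pre-rendered page that would be served from static storage.
STATIC_PAGE = "<html>static fallback</html>"


def render_page(fetch_dynamic):
    """Serve dynamic content, falling back to the static page if the backend is down."""
    try:
        return fetch_dynamic()
    except BackendUnavailable:
        return STATIC_PAGE


def healthy_backend():
    return "<html>live content</html>"


def failing_backend():
    # Game-day fault injection: simulate the backend being down.
    raise BackendUnavailable("injected failure")


def game_day_drill():
    """Run the same request path with and without the injected failure."""
    return {
        "normal": render_page(healthy_backend),
        "injected_failure": render_page(failing_backend),
    }


if __name__ == "__main__":
    for scenario, page in game_day_drill().items():
        print(scenario, "->", page)
```

A drill like this is only useful if it is run before the critical day, so the fallback path is known to work when it is needed.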

Maybe look at some of their Ruby code?

https://github.com/democrats/voter-registration

AMAZON REDSHIFT
Redshift runs on HS type instances

HS1.8XL: 128 GB RAM, 16 cores, 16 TB of compressed data, 2 GB/sec read throughput

HS1.XL: 16 GB RAM, 2 cores, 2 TB of compressed data

Extra Large Node
(HS1.XL)

Single node
Cluster 2-32 Nodes (4 TB – 64 TB)

Eight Extra Large Node (HS1.8XL)
Cluster 2-100 Nodes (32 TB – 1.6 PB)
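
The cluster ranges above follow directly from the per-node storage figures (2 TB per HS1.XL node, 16 TB per HS1.8XL node). The sketch below just reproduces that arithmetic. The function names and the 300 TB example workload are hypothetical, chosen to echo the "hundreds of TB" mentioned earlier in the deck.

```python
import math

# Per-node compressed storage and cluster size limits, as stated on the slides.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "min_nodes": 2, "max_nodes": 32},
    "HS1.8XL": {"tb_per_node": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type, node_count):
    """Compressed storage of a cluster, in TB."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= node_count <= spec["max_nodes"]:
        raise ValueError(f"{node_type} clusters take {spec['min_nodes']}-{spec['max_nodes']} nodes")
    return spec["tb_per_node"] * node_count


def nodes_needed(node_type, data_tb):
    """Smallest allowed cluster that holds data_tb of compressed data."""
    spec = NODE_TYPES[node_type]
    count = max(spec["min_nodes"], math.ceil(data_tb / spec["tb_per_node"]))
    if count > spec["max_nodes"]:
        raise ValueError(f"{data_tb} TB does not fit in a {node_type} cluster")
    return count


if __name__ == "__main__":
    print(cluster_capacity_tb("HS1.XL", 32))    # upper end of the XL range, in TB
    print(cluster_capacity_tb("HS1.8XL", 100))  # upper end of the 8XL range, in TB
    print(nodes_needed("HS1.8XL", 300))         # hypothetical 300 TB workload
```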

JDBC/ODBC

10 GigE
(HPC)

Ingestion
Backup
Restoration

Amazon DynamoDB

AMAZON EC2
AMAZON DYNAMODB
AMAZON RDS
AMAZON ELASTIC MAPREDUCE
AMAZON REDSHIFT
AMAZON S3
AWS STORAGE GATEWAY
DATA CENTER

Thank you!

Younjin Jeong - AWS
younjin@amazon.com
