Escape From Amazon: Tips/Techniques for Reducing AWS Dependencies


Soam Acharya, PhD
Chief Scientist, Limelight Video Platform
@soamwork
Oct 2012

Introduction
• Without Amazon we wouldn’t be where we are today
• Audience for this talk:
– Advanced AWS users
• Too much of a good thing
• Have to stop using AWS
– Beginners
• Design system to avoid pitfalls


Agenda
• Why Reduce AWS Dependence?
• Case Study: Delve, now Limelight Video Platform
– Who Are We?
– Our Experiences
• Pre Migration Status
• Challenges
• Current Setup
• Lessons Learned: Tips/Techniques For Reducing AWS Dependencies
& Costs


Why Reduce AWS Dependence?
• Outages
– Not limited to a single service


• Service deprecation
– SimpleDB
• Shared public cloud
– Multi-tenancy issues
• Business Reasons:
– “frenemy” i.e. you compete with Amazon in something
– single vendor lock-in
• Reduces leverage



$$$
• Scenario #1:
– Startup acquisition
– Required to migrate
• Scenario #2:
– Grow too big for your own good
– Economical to run your own hardware


Case Study - Limelight Video Platform (LVP)
• Many world class customers – NFL, Sony, QVC, Pokemon,
MBC, Hearst, Prudential, Alloy Media etc
• Global footprint – 100+ countries, 5000+ websites
• Based in Seattle with employees in SF, NYC, LON, LAX
• Founded in 2006 as Pluggd
• Pivoted in 2008 as Delve Networks
– Online Video Platform (OVP)
– Competes with Ooyala, Brightcove, Kaltura

• Acquired by Limelight Networks in August 2010
– Limelight is a global content delivery network


LVP Workflow

[Diagram: upload, manage, transcode, publish, analytics]


Backend Notes
• SOA
– Java, Spring, Hibernate, MySQL, NoSQL, REST etc


Case Study – LVP AWS Usage History
• Delve Networks:
– Founded by ex-Amazon folks
– Started moving to AWS in Summer 2008
• Used Scalr for cloud management
• At peak:
– Several hundred EC2 instances
– ELB, S3, SimpleDB, EMR, CloudFront, CloudWatch, EBS, SQS

• Acquired by Limelight Networks in August 2010
– Migration work started in late Fall 2010


Migration Challenges – AWS Dependence


Migration Challenges – LVP Growth


Migration Challenges - Other
• LVP
– Personnel
– Service interdependencies
– Growing pains
• Our own services
• AWS outages
• Limelight Integration/Migration Challenges
– Machines:
• Obtaining
• Environment
• Placement
• Maintenance
– Operation philosophies
• CDN vs SaaS


Current Status
• Hybrid model
– Limelight
• 4 data centers
– ~400
– 50+ services/handlers
• Other infrastructure
– Hadoop cluster
– Databases
– CDN services
– AWS services
• Burst into EC2
• S3, DynamoDB, SimpleDB, SQS, Elastic Map Reduce
– Work continuing on reducing dependence on these


Tips/Techniques for Reducing AWS
Dependency and Costs

• Machine Placement
• Caching
• Parallelization
• Open Source + Alternative Services
• Cross service redundancy
• Miscellaneous tips


Tip: Machine Placement

• Our strategy: use EC2 as little as possible for steady state
• Where to put non-EC2 machines?
– Still need access to other AWS services
• Weight of data
– Find data centers as close as possible to target AWS center (N Virginia)
• Proximity is important
– S3 files visible from one data center may not be immediately visible from another

– One data center isn’t enough:
• Service, geo redundancy

Tip: Machine Placement
• Limelight POPs:
– Direct connections to access networks
– Global fiber-optic interconnect
– But:
• POP capacity
• placement within POP
• shipping ..

Machine Placement - PHX

• Started off in PHX
• Close to Limelight HQ
• S3 download tests
conducted every hour
over a week
• Early 2011
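
The hourly S3 download tests mentioned above could be driven by a small probe like the one below. This is a minimal sketch, not the tooling used for the talk: `time_download`, `summarize` and `run_probe` are hypothetical names, and `fetch` stands in for whatever call downloads a fixed test object from S3.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_download(fetch: Callable[[], bytes],
                  clock: Callable[[], float] = time.perf_counter) -> float:
    """Return the seconds taken by one call to fetch()."""
    start = clock()
    fetch()
    return clock() - start


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce a list of download times (seconds) to min / median / max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def run_probe(fetch: Callable[[], bytes], rounds: int,
              clock: Callable[[], float] = time.perf_counter) -> Dict[str, float]:
    """Time `rounds` downloads and summarize them."""
    return summarize([time_download(fetch, clock) for _ in range(rounds)])
```

Running the same probe from each candidate data center (PHX, Houston, ATL, IAD, EC2) on an hourly schedule gives directly comparable numbers per location.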


Machine Placement – SoftLayer/Houston

• From SoftLayer in
Houston
• Has peering
arrangement with
Amazon


Machine Placement - ATL

• From Atlanta POP


Machine Placement - EC2

• From EC2 in N Virginia


Machine Placement - IAD

• From IAD
– Best non EC2 performance
– One external hop away
• But even within IAD:
– Machine NIC
– Switch/Router setup

• Peering helps


Caching

• Tip: cache access to AWS services
– Save on RTT
– Better redundancy, fault tolerance
– AWS bandwidth costs
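
One way to cache access to a remote AWS service is a read-through cache with a time-to-live. The sketch below is an illustration under assumptions, not LVP's implementation: `ReadThroughCache` is a hypothetical class, and `loader` stands in for any call that fetches a value from the remote service. It also serves a stale value when the remote call fails, which is one simple way to get the fault-tolerance benefit listed above.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Cache values from a slow or costly remote loader for ttl_seconds."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        # Fresh hit: no round trip to the remote service.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        try:
            value = self._loader(key)
        except Exception:
            # Remote service unavailable: fall back to a stale copy if any.
            if entry is not None:
                return entry[1]
            raise
        self._entries[key] = (now, value)
        return value
```

In practice the in-process dictionary would likely be replaced by a shared cache such as memcached, so that several services benefit from the same entries.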


Caching: LVP Analytics Reporting

• Need to quickly fetch, assemble analytics reports
• SimpleDB: charged by usage
[Diagram: S3, SimpleDB, DynamoDB; LLNW memcached clusters + Reporting service]

Caching: Transcoding

• Video processors (transcoders, thumbnail processors …) require access to original video
• Bandwidth out of AWS - $$
[Diagram: Video Processing Handlers in AWS Virginia and in IAD; S3 in AWS Virginia]


• Use Limelight Proxy Caching
[Diagram: Video Processing Handlers in AWS Virginia and in IAD, S3, with an LL Proxy added]

• Additional benefits
[Diagram: Video Processing Handlers in AWS Virginia and in IAD, S3, LL Proxy, plus Video Processing Handlers with an LL Proxy in another POP]

Parallelization
• AWS services are set up to be highly distributed
• Construct application/systems to parallelize requests:
– Useful for applications/systems located outside AWS
– Pipelining to get around large RTTs to AWS
• Example:
– Our transcoding
– Our real time analytics processing
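
A minimal sketch of parallelizing requests from outside AWS, assuming the per-request cost is dominated by round-trip time: issue many requests concurrently so the RTTs overlap instead of adding up. `fetch_all` is a hypothetical helper and `fetch_one` stands in for any single-key call to S3, SimpleDB or a similar service.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def fetch_all(keys: Iterable[K], fetch_one: Callable[[K], V],
              max_workers: int = 16) -> Dict[K, V]:
    """Fetch every key concurrently and return a key -> value mapping."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as key_list.
        values = list(pool.map(fetch_one, key_list))
    return dict(zip(key_list, values))
```

With N keys and a round trip of t seconds each, a sequential loop takes roughly N × t, while this version takes roughly (N / max_workers) × t, as long as the remote service can absorb the concurrency.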


Parallelization – RT Processing

• Hadoop process in IAD
[Diagram: “fast” logs (S3), Hadoop process, Job Controller, metadata lookup (SimpleDB), reports (SimpleDB)]

Parallelization – RT Processing

• Move to LL Hadoop cluster in PHX
• Further away from AWS but ….
[Diagram: “fast” logs (S3), Job Controller with four Hadoop nodes, metadata lookup (SimpleDB), reports (SimpleDB)]

Parallelization/Caching – RT Processing

• Introduce caching into the mix
[Diagram: “fast” logs (S3), Job Controller with four Hadoop nodes, cache, SimpleDB, reports (SimpleDB)]

Open Source + Alternative Services
• Moving out of AWS means you have to find alternatives
• Sometimes involves multiple building blocks
• Alternatives to
– SimpleDB
• MongoDB instances
– CloudWatch
• Cloudkick
• Zabbix
– S3
• GlusterFS, Limelight Cloud Storage
– ELB
– Public cloud


ELB Alternative

• Use Limelight’s Traffic
Balancer product (DNS-XD)
• nginx

ELB Alternative II

• Traffic Balancer also
allows geo based
request routing

Private Cloud Alternative

• At AWS:
– Used Scalr for cloud management
– Amazon constantly improving own tools
• At Limelight:
– Original vision:
• Use something like Eucalyptus/OpenStack
• Seamless amalgam of public-private cloud using Scalr
– Rude reality:
• Learning curve
• Price, maintenance
• Didn’t know internal Limelight processes, network topology
• Business reality: start migration ASAP



• Opscode’s Chef
– Infrastructure as code
– Infrastructure as a service
• Hosted version of Chef
• We use Chef for:
– Node management
– Service deployment
• Limelight
• Starting to use in EC2 as well



• Our infrastructure management model:
– Recipes:
• Tomcat service, apache service, java, memcached setup
– Roles:
• Use recipes to construct a service
– Environment:
• Base, dev, staging, production
– Node:
• Environment + roles

• Difficulties:
– Rolling deployments
– Repurposing nodes without virtualization
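
The recipes / roles / environment / node model above can be pictured with a small data-model sketch. This is not Chef's API (Chef itself is Ruby-based); `Node` and `expand_recipes` are hypothetical names used only to show how a node's roles resolve into an ordered list of recipes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A machine: one environment plus an ordered list of roles."""
    name: str
    environment: str  # e.g. "dev", "staging", "production"
    roles: List[str] = field(default_factory=list)


def expand_recipes(node: Node, role_recipes: Dict[str, List[str]]) -> List[str]:
    """Flatten the node's roles into an ordered, de-duplicated recipe list."""
    seen = set()
    recipes: List[str] = []
    for role in node.roles:
        for recipe in role_recipes[role]:
            if recipe not in seen:
                seen.add(recipe)
                recipes.append(recipe)
    return recipes
```

For example, a node with roles `["web", "cache"]`, where `web` maps to `["java", "tomcat"]` and `cache` maps to `["memcached", "java"]`, resolves to `["java", "tomcat", "memcached"]`.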

Cross Service Redundancy

• Backup data
• Example: we keep copies in S3 of reports stored in
SimpleDB, DynamoDB
– Alternative source if SimpleDB, DynamoDB goes down
– Also:
• Easy to copy reports to other alternatives
• Don’t have to incur additional AWS costs pulling entire corpus out of dbs
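
Reading a report with a fallback copy could look like the sketch below. It assumes each store is wrapped in a simple reader function; `read_with_fallback` and the source names are hypothetical, and the readers stand in for real SimpleDB, DynamoDB or S3 calls.

```python
from typing import Any, Callable, List, Sequence, Tuple


def read_with_fallback(
    key: str,
    sources: Sequence[Tuple[str, Callable[[str], Any]]],
) -> Tuple[str, Any]:
    """Try each (name, reader) in order; return the first success.

    Returns (source_name, value). Raises LookupError if every source fails.
    """
    errors: List[str] = []
    for name, reader in sources:
        try:
            return name, reader(key)
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
    raise LookupError(f"all sources failed for {key!r}: " + "; ".join(errors))
```

Listing the primary store first and the S3 copy second keeps normal reads on the primary while still answering requests during an outage of that one service.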


Other Miscellaneous Tips
• S3:
– Compress files!
• Save storage costs
• Less time to transfer over networks

• Elastic Map Reduce:
– Multitenancy issues affect performance
• Time of day
• instance type
– Non cluster compute instances
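
For the S3 compression tip above, a minimal sketch using Python's standard `gzip` module. The helper names are hypothetical, and the upload and download steps themselves are left out.

```python
import gzip


def compress_for_upload(data: bytes, level: int = 6) -> bytes:
    """Gzip a payload before storing it, trading CPU for bytes stored and sent."""
    return gzip.compress(data, compresslevel=level)


def decompress_after_download(blob: bytes) -> bytes:
    """Reverse compress_for_upload after fetching the object."""
    return gzip.decompress(blob)
```

Repetitive text such as logs or reports usually compresses well; already-compressed media such as encoded video gains little.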


Other Miscellaneous Tips

• DynamoDB:
– A big component of the DynamoDB bill is provisioned read/write throughput
• Limits on how often provisioning can be changed
• Can be reduced only once a day
– Toggle speeds if uploads can be batched
• raise write throughput prior to uploading the bulk of our data for the day,
then reduce
[Chart: DynamoDB write speed over the time of day, with markers for the start and completion of most of the day's uploads]
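
The toggle described above can be expressed as a small schedule function. This is a sketch under assumptions: `desired_write_capacity` is a hypothetical helper, the window and unit values are made up for illustration, and applying the returned value (through DynamoDB's UpdateTable operation) is not shown.

```python
from datetime import time


def desired_write_capacity(now: time, window_start: time, window_end: time,
                           base_units: int, bulk_units: int) -> int:
    """Return the write capacity to provision at time-of-day `now`.

    The window should open a little before the bulk upload starts, so the
    higher throughput is already in place when the uploads begin.
    """
    if window_start <= window_end:
        in_window = window_start <= now < window_end
    else:
        # Window wraps past midnight, e.g. 23:00 to 01:00.
        in_window = now >= window_start or now < window_end
    return bulk_units if in_window else base_units
```

Because reductions are limited, the schedule should lower throughput only once, after the bulk of the day's uploads has completed.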
