6. Tour of AWS
Service category    Service name
Compute             Elastic Compute Cloud (EC2)
                    Elastic MapReduce (EMR)
                    Auto Scaling (ASG)
Database            Relational Database Service (RDS)
Messaging           Simple Queue Service (SQS)
                    Simple Notification Service (SNS)
Monitoring          CloudWatch
Networking          Elastic Load Balancing (ELB)
Storage             SimpleDB (SDB)
                    Simple Storage Service (S3)
                    Elastic Block Store (EBS)
7. Tour of AWS (EC2)
EC2 = Elastic Compute Cloud
[Diagram: a multi-tenant computer hosting several EC2 instances; each instance runs an AMI (application + OS) with memory and non-persistent storage]
• Elastic
• Management control
• Flexible (OS, etc.)
• Secure
8. Tour of AWS (EC2)
EC2 Instance: AMI (Application + OS)
Elastic IP Address
• Static IP
• Map to EC2 instance
• Hourly charge (when not mapped)
• Limited # of IP addresses per account
Instance id:                 i-79f90613
Private DNS/IP (transient):  ip-10-202-26-32.ec2.internal / 10.202.26.32
Public DNS/IP (transient):   ec2-184-73-69-47.compute-1.amazonaws.com / 184.73.69.47
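The two sample names above suggest that the DNS name embeds the IP address with the dots replaced by hyphens. A minimal sketch, assuming that pattern holds in general (it is inferred only from the two examples on this slide, and the helper name `ip_from_ec2_dns` is hypothetical):

```python
import re

def ip_from_ec2_dns(hostname: str) -> str:
    """Extract the dotted IP embedded in an EC2-style DNS name.

    Assumes the pattern seen on the slide: an 'ip-' or 'ec2-' prefix
    followed by the four octets joined with hyphens.
    """
    match = re.match(r"^(?:ip|ec2)-(\d+)-(\d+)-(\d+)-(\d+)\.", hostname)
    if match is None:
        raise ValueError(f"unrecognized hostname: {hostname}")
    return ".".join(match.groups())

private_ip = ip_from_ec2_dns("ip-10-202-26-32.ec2.internal")
public_ip = ip_from_ec2_dns("ec2-184-73-69-47.compute-1.amazonaws.com")
```

Because both values are transient, anything derived from them should be looked up again after an instance restarts rather than cached.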
9. Tour of AWS (EC2)
Type                   Compute units  Memory (GB)  Storage (GB)  Platform (bit)  I/O                  Name
Small                  1              1.7          160           32              Moderate             m1.small
Large                  4              7.5          850           64              High                 m1.large
X-Large                8              15           1690          64              High                 m1.xlarge
High-CPU Medium        5              1.7          350           32              Moderate             c1.medium
High-CPU X-Large       20             7            1690          64              High                 c1.xlarge
High-Memory X-Large    6.5            17.1         420           64              Moderate             m2.xlarge
High-Memory 2X-Large   13             34.2         850           64              High                 m2.2xlarge
High-Memory 4X-Large   26             68.4         1690          64              High                 m2.4xlarge
Cluster Compute        33.5           23           1690          64              Very High (10 Gbps)  cc1.4xlarge
Price range: $0.085/hr - $2.40/hr
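One way to use the table above is to filter it by the minimum compute units and memory an application needs. The sketch below copies the figures from the slide; the function name `candidates` is hypothetical, and per-type prices are left out because the slide gives only a range.

```python
# (name, compute units, memory GB, storage GB), in the order of the table above.
INSTANCE_TYPES = [
    ("m1.small", 1, 1.7, 160),
    ("m1.large", 4, 7.5, 850),
    ("m1.xlarge", 8, 15, 1690),
    ("c1.medium", 5, 1.7, 350),
    ("c1.xlarge", 20, 7, 1690),
    ("m2.xlarge", 6.5, 17.1, 420),
    ("m2.2xlarge", 13, 34.2, 850),
    ("m2.4xlarge", 26, 68.4, 1690),
    ("cc1.4xlarge", 33.5, 23, 1690),
]

def candidates(min_units, min_memory_gb):
    """Return names of instance types meeting both minimums, in table order."""
    return [name for name, units, memory, _storage in INSTANCE_TYPES
            if units >= min_units and memory >= min_memory_gb]

big_enough = candidates(min_units=8, min_memory_gb=15)
```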
10. Tour of AWS (EC2)
How to launch an EC2 instance?
Amazon provides scripts or AWS console
RightScale
AMIs are stored in S3 or EBS
Use an existing AMI from Amazon
11. Tour of AWS (EC2)
High Availability
Region
US-East (Northern Virginia)
US-West (Northern California)
EU (Ireland)
Asia Pacific (Singapore)
Netflix: us-east-1c & us-east-1d
12. Tour of AWS – Security Group
[Diagram: three security groups (app1, app2, app3), each containing two EC2 instances]
Access rule: protocol | from port | to port
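The access rule format above can be read as an allow-list with a default of deny. A minimal sketch of that check, modelling only the three fields named on the slide (the class name, helper name, and sample ports are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRule:
    protocol: str   # e.g. "tcp"
    from_port: int  # first port in the allowed range
    to_port: int    # last port in the allowed range (inclusive)

def is_allowed(rules, protocol, port):
    """True if any rule permits this protocol and port; otherwise deny."""
    return any(rule.protocol == protocol
               and rule.from_port <= port <= rule.to_port
               for rule in rules)

# Hypothetical rules for the app1 group: HTTP plus an internal port range.
app1_rules = [AccessRule("tcp", 80, 80), AccessRule("tcp", 7000, 7100)]
```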
13. Tour of AWS - ELB
[Diagram: Client1, Client2 and Client3 connect to the ELB, which spreads traffic over four EC2 instances in two Availability Zones in us-east]
ELB: DNS name, port
Health check: HTTP/HTTPS URL, interval
14. Tour of AWS – Auto Scaling
[Chart: the number of EC2 instances grows with application demand over time]
Launch or terminate EC2 instances based on user-defined triggers
15. Tour of AWS – Auto Scaling
Auto Scaling Configuration
• LaunchConfigName
• Min
• Max
• Load balancer
• Availability zone
Launch Configuration
• AMI id
• Security group
• Instance type
• User data
AMI
• Application
• OS
Triggers
• Health
• CPU Utilization
• Latency
• I/O activity
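A trigger such as CPU utilization can be thought of as a rule that moves the instance count up or down while staying inside the Min and Max of the configuration. The sketch below is only an illustration of that idea; the thresholds, the step size, and the function name `desired_capacity` are assumptions, not Auto Scaling's actual parameters.

```python
def desired_capacity(current, cpu_percent, minimum, maximum,
                     scale_up_at=70.0, scale_down_at=30.0, step=1):
    """Return the next instance count for a CPU-utilization trigger.

    Thresholds and step size are illustrative assumptions.
    """
    if cpu_percent > scale_up_at:
        target = current + step      # launch instances under load
    elif cpu_percent < scale_down_at:
        target = current - step      # terminate instances when idle
    else:
        target = current
    # Never leave the [minimum, maximum] range from the configuration.
    return max(minimum, min(maximum, target))
```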
16. Tour of AWS – CloudWatch
Visibility into resource utilization, operational performance
Metrics: CPU, Network, Disk I/O
Monitored resources: EC2 instances, EBS, Load Balancer, RDS
Viewed through the AWS Management Console
17. Tour of AWS - EBS
Persistent storage for EC2 instances
100 volumes or 20 TB – Max 1TB per volume
Off-instance persistent storage
Attach to and detach from an EC2 instance
Why do we need this?
Persistent file systems
Take backups and store in S3
Public data sets: Human Genome, US Census Data
18. Tour of AWS – S3
Data storage infrastructure – for the Internet
Write, read, delete objects up to 5 GB
Scalable, reliable, unlimited storage
Objects can be made publicly accessible
Per account: ..100
99.999999999% durability and 99.99% availability
Pricing: $0.055/GB -> $0.15/GB
19. Tour of AWS – S3
Interesting APIs
Can I search objects in a bucket? NO
Can I get a list of objects in a bucket? NO
Can I remove a bucket?
Remove all objects first
Can I get list of keys? YES
Also by prefix
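The two behaviours above, listing keys by prefix and having to empty a bucket before removing it, can be illustrated with a toy in-memory model. This is not an S3 client; the `Bucket` class, `remove_bucket` helper, and sample keys are all hypothetical.

```python
class Bucket:
    """Toy stand-in for a bucket: a flat mapping of key -> object bytes."""

    def __init__(self):
        self.objects = {}

    def list_keys(self, prefix=""):
        """Return keys in sorted order, optionally filtered by prefix."""
        return sorted(key for key in self.objects if key.startswith(prefix))

def remove_bucket(buckets, name):
    """Delete a bucket, refusing while it still holds objects."""
    if buckets[name].objects:
        raise RuntimeError("bucket not empty: remove all objects first")
    del buckets[name]

buckets = {"media": Bucket()}
buckets["media"].objects.update(
    {"logs/a": b"1", "logs/b": b"2", "img/c": b"3"})
log_keys = buckets["media"].list_keys(prefix="logs/")
```

Since there is no search, the key naming scheme is the only index: anything that must be found later has to be encoded in the key prefix.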
20. Tour of AWS - SimpleDB
For structured, non-relational text data
Highly available
Zero administrative overhead
Auto indexing
Domain
itemId Email Pets
primary key Item
jdoe jdoe@yahoo.com dog
mjane mjane@gmail.com cat, bird
Domains are collections of items that are described by
attribute-value pairs
21. Tour of AWS - SimpleDB
No Schema
itemId Email Pet
jdoe jdoe@yahoo.com dog
mjane mjane@gmail.com cat, bird
itemId Email Pet Phone
jdoe jdoe@yahoo.com dog
mjane mjane@gmail.com cat, bird 333-444-5555
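The schemaless, multi-valued model in the tables above can be mimicked with nested dictionaries. A minimal in-memory sketch, not the SimpleDB API (the `put_attributes` helper here is a hypothetical stand-in that appends values rather than replacing them):

```python
from collections import defaultdict

# domain: item name -> attribute name -> set of string values
domain = defaultdict(lambda: defaultdict(set))

def put_attributes(item_name, attributes):
    """Add (attribute, value) pairs to an item; values accumulate."""
    for name, value in attributes:
        domain[item_name][name].add(value)

put_attributes("jdoe", [("Email", "jdoe@yahoo.com"), ("Pet", "dog")])
put_attributes("mjane", [("Email", "mjane@gmail.com"),
                         ("Pet", "cat"), ("Pet", "bird")])
# Adding a new attribute to one item needs no schema change.
put_attributes("mjane", [("Phone", "333-444-5555")])
```

Only mjane gains a Phone attribute; jdoe is untouched and simply has no such attribute.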
22. Tour of AWS - SimpleDB
Limits
• 10 GB per domain
• 256 attributes per item
• 1024 bytes per attribute name
• 1024 bytes per attribute value
itemId Email Pet
jdoe jdoe@yahoo.com dog
mjane mjane@gmail.com cat, bird
select <attributes> from <domain> where <query expression>
Defaults to 100 items per select; maximum of 2500 items
23. Tour of AWS - SimpleDB
SimpleDB
Read Consistency
Node 1
itemId Email Pet
jdoe jdoe@yahoo.com dog cat
mjane mjane@gmail.com cat, bird
Node 2
itemId Email Pet
jdoe jdoe@yahoo.com dog cat
mjane mjane@gmail.com cat, bird
Select, GetAttributes
Node 3
itemId Email Pet
jdoe jdoe@yahoo.com dog cat
mjane mjane@gmail.com cat, bird
24. Tour of AWS - SimpleDB
CAP THEOREM (shared-data distributed system)
C, Consistency: all nodes see the same data
A, Availability: tolerant of node failures
P, Partition Tolerance: tolerant of message loss
Pairings shown: CA, AP, CP
Systems placed on the diagram: BigTable, SimpleDB, HBase, Membase
25. Tour of AWS - SimpleDB
Eventually Consistent Read    Consistent Read
Stale reads possible          No stale reads
Lowest latency                Higher latency (500 to 1000ms)
Highest throughput            Lower throughput (1/3)
Conditional Put & Delete
Optimistic concurrency control
Eliminate lost updates due to concurrent writing to same item
Comparing an attribute with specified expected value
Transactional semantics
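The conditional put described above can be sketched with a plain dictionary standing in for an item. The `version` attribute, the function name, and the exception name are assumptions made for illustration; the point is only that a write is applied when the expected value still matches.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the expected attribute value no longer matches."""

def conditional_put(item, updates, expected_name, expected_value):
    """Apply updates only if item[expected_name] == expected_value."""
    if item.get(expected_name) != expected_value:
        raise ConditionalCheckFailed(expected_name)
    item.update(updates)

item = {"Pet": "dog", "version": "1"}
# Writer A read version 1 and succeeds, bumping the version.
conditional_put(item, {"Pet": "cat", "version": "2"}, "version", "1")
# Writer B also read version 1; its stale write is rejected.
try:
    conditional_put(item, {"Pet": "bird", "version": "2"}, "version", "1")
    writer_b_succeeded = True
except ConditionalCheckFailed:
    writer_b_succeeded = False
```

Writer B would normally re-read the item and retry, which is what makes this optimistic rather than lock-based.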
26. Tour of AWS - SQS
Web-scale Message Infrastructure
• Message size up to 64 KB
• Retain up to 14 days
• Message visibility timeout: up to 12 hours
• Concurrent writers & readers
• No FIFO
• Delivery “at least once”
[Diagram: messages m1 through m7 entering and leaving the queue out of order]
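How message visibility and "at least once" delivery interact is easiest to see in a toy model. The class below is an in-memory illustration with an explicit clock, not the SQS API; all names and the sample message are hypothetical, and the ordering it happens to produce is incidental.

```python
class VisibilityQueue:
    """Toy in-memory queue with a visibility timeout."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> [body, time it becomes visible]
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = [body, 0.0]
        self.next_id += 1

    def receive(self, now):
        """Return (id, body) of a visible message and hide it, else None."""
        for msg_id, entry in self.messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Acknowledge a message after it has been processed."""
        self.messages.pop(msg_id, None)

q = VisibilityQueue(visibility_timeout=30)
q.send("encode-title-42")
first = q.receive(now=0)     # delivered, then hidden until t=30
hidden = q.receive(now=10)   # nothing visible yet
again = q.receive(now=31)    # redelivered because it was never deleted
q.delete(again[0])
```

A consumer that crashes before calling `delete` causes a redelivery, so handlers need to tolerate seeing the same message twice.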
27. Tour of AWS - SNS
Notification Infrastructure
[Diagram: messages m1 to m6 are published to a Topic, which delivers them to HTTP/HTTPS, SQS and Email subscribers]
• 100 topics per account
• Max message size: 8 KB of text data
28. Tour of AWS - Security
AWS Account: Access key & Secret key
Services reached over HTTP/HTTPS: EC2, SimpleDB, S3, SQS, SNS
Authentication via HMAC signature ..
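The idea behind HMAC authentication is that the secret key never travels with the request: the client sends the access key plus a signature, and the service recomputes the signature from its own copy of the secret. A sketch using Python's standard library follows; the layout of the string being signed, the choice of SHA-256, and the sample key are assumptions for illustration, not the exact AWS signing rules.

```python
import base64
import hashlib
import hmac

def sign(secret_key: str, string_to_sign: str) -> str:
    """Return a base64-encoded HMAC-SHA256 of the request string."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def verify(secret_key: str, string_to_sign: str, signature: str) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, string_to_sign), signature)

# Hypothetical request description and secret.
request = "GET\nsdb.amazonaws.com\n/\nAction=ListDomains"
signature = sign("example-secret-key", request)
```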
29. Netflix In AWS Cloud
Encoding: use ~4K EC2 instances
Petabytes on S3
CDN
30. Netflix In AWS Cloud
[Architecture diagram: Netflix Data Center (Oracle), ELB, API, Discovery Service, Internal Services, memcached, SQS Consumer, SQS, S3, SimpleDB]
31. Netflix In AWS Cloud
[Diagram: Security Group, Auto Scaling Group, Internal Services]
32. Netflix In AWS Cloud
SimpleDB
Rental history: ~800M items
Queue: ~1B items
S3
Compressed rental history: ~17M objects
Streaming activity logs
Access through customer id or movie id or both
33. Netflix In AWS Cloud
Missing infrastructure services
Discovery service
Middle tier load balancer
Encryption service
Key management
Caching
Wrap memcached server
Discoverable
Instrumented
34. Netflix In AWS Cloud
[Diagram: Web Application, Middle Tier Load Balancer, Discovery Service and Internal Services, connected by heart beats]
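The heart beat in the diagram above hints at how a discovery service can keep its view of live instances current: an instance stays listed only while it keeps checking in. The sketch below illustrates that general technique with an explicit clock; it is not Netflix's implementation, and the class name, TTL, service name, and addresses are all hypothetical.

```python
class DiscoveryRegistry:
    """Toy registry: instances stay listed only while heartbeats arrive."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.last_seen = {}  # (service, address) -> time of last heartbeat

    def heartbeat(self, service, address, now):
        """Register an instance or refresh its lease."""
        self.last_seen[(service, address)] = now

    def lookup(self, service, now):
        """Return addresses whose last heartbeat is within the TTL."""
        return sorted(addr for (svc, addr), seen in self.last_seen.items()
                      if svc == service and now - seen <= self.ttl_seconds)

registry = DiscoveryRegistry(ttl_seconds=90)
registry.heartbeat("rental-history", "10.0.0.1", now=0)
registry.heartbeat("rental-history", "10.0.0.2", now=0)
registry.heartbeat("rental-history", "10.0.0.1", now=60)
live = registry.lookup("rental-history", now=120)
```

A middle tier load balancer could then pick among the addresses returned by `lookup`, so terminated instances drop out without manual cleanup.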
35. Netflix In AWS Cloud
Big Bang Transition
iPhone Launch
Runs entirely in the cloud, with no fallback option
No control once App Store gate is opened
Have to scale on day one
EC2 elasticity
36. Netflix In AWS Cloud
Datacenter vs Cloud
Copy from Adrian’s slide
39. Best Practices
SimpleDB
Sorting is lexicographical
Pad numeric attributes
Use consistent date format (Joda time)
Set the select limit explicitly
Use batch put and batch delete
Dealing with null
Dealing with eventual consistency
Consistent get
Conditional put
Item name
Combining columns
UUID
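Because every value is compared as a string, unpadded numbers and inconsistent dates sort in the wrong order. The sketch below shows the padding idea; the slide mentions Joda time (Java), so the Python date formatting here is a substitute chosen for illustration, and the helper names and the width of 10 are assumptions.

```python
from datetime import datetime, timezone

def pad_number(value: int, width: int = 10) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if value < 0:
        raise ValueError("negative values need an offset before padding")
    return str(value).zfill(width)

def format_timestamp(moment: datetime) -> str:
    """Fixed-width ISO 8601 UTC string; sorts chronologically as text."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

unpadded = sorted(["9", "10", "100"])                 # lexicographic order
padded = sorted(pad_number(n) for n in [9, 10, 100])  # numeric order preserved
stamp = format_timestamp(datetime(2010, 11, 5, 8, 30, 0, tzinfo=timezone.utc))
```

The width has to be chosen up front to cover the largest value the attribute will ever hold, since changing it later means rewriting existing items.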
40. Best Practices
SimpleDB
Index selectivity and performance
# of distinct attribute values in all the items in domain
Sharding
Get around the limits
Scale the write throughput
BatchPutAttributes or BatchDeleteAttributes
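Sharding and batching can be combined: hash the item name to choose a domain, then group the writes for each domain into batches. A minimal sketch follows; the domain naming scheme, the helper names, and the batch size of 25 items per call are assumptions for illustration.

```python
import hashlib

def shard_domain(item_name: str, shard_count: int, base: str = "rentals") -> str:
    """Map an item name to one of shard_count domains using a stable hash."""
    digest = hashlib.sha256(item_name.encode("utf-8")).hexdigest()
    return f"{base}_{int(digest, 16) % shard_count}"

def batches(items, size=25):
    """Split a list into chunks no larger than size for batch calls."""
    return [items[i:i + size] for i in range(0, len(items), size)]

target = shard_domain("jdoe", shard_count=4)
chunks = batches(list(range(60)))
```

A stable hash matters here: the same item name must always map to the same domain, otherwise reads will look in the wrong shard.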
41. Best Practices
S3
Achieving high write throughput
Pre-sort the keys before upload
Prepend an increasing 4- to 6-digit number to each object key
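The two S3 tips above can be combined in a few lines: sort the keys, then give each one a zero-padded, increasing prefix. The separator, the helper name, and the sample keys are hypothetical.

```python
def prefixed_keys(keys, digits=4):
    """Sort keys, then prepend a zero-padded increasing counter to each."""
    ordered = sorted(keys)
    if len(ordered) > 10 ** digits:
        raise ValueError("more keys than the prefix width can number")
    return [f"{index:0{digits}d}-{key}" for index, key in enumerate(ordered)]

upload_order = prefixed_keys(["movie-b.log", "movie-a.log", "movie-c.log"])
```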
SQS
Decouple system
Asynchronous processing
Buffering
Visibility window > processing time