I gave a presentation at the AWS User Group, Singapore on 24 Oct 2013 explaining how we used EC2 Auto Scaling with CloudWatch Custom Metrics for our product.
Using AWS CloudWatch Custom Metrics and EC2 Auto Scaling - VSocial Infrastructure
1. Auto Scaling Spot Instances with Custom CloudWatch Metrics
AWS Singapore User Group
Chris Drumgoole
Head, Product Management
2. Agenda
• Quick overview of VSocial
• Infrastructure Scaling Needs
• Overview of EC2 Auto Scaling with AWS
– Scaling from CloudWatch
• Configuring Auto Scaling Launch
Configs, Groups, and Policies
• Creating your own CloudWatch Metric
• Bringing it all together
4. VSocial Overview
• “Social Media Enterprise Command Center”
• Core Features:
– Social CRM
– Social Media Community Management
• Community Managers use tool to manage FB and
TW communities
• Tool helps manage conversations from multiple
Social Media accounts
5. A glance at the infrastructure…
[Architecture diagram. Components shown: Public Visitors, Load Balancer, Front-End Application Server Cluster, Database Cluster, six Data Pull instances, AWS SQS, AWS SNS. Callout: "This needs to scale!"]
6. Infrastructure Scaling Needs
• Timely Conversation Data Pulls from Social Networks
• Data Pull Processes can take longer due to:
– Increased activity on channel (more tweets or FB
Messages)
– Network Latency
• Slowdowns will cascade across all requests
Requirement:
Way to automatically adjust compute resources when
needed based on real-time performance data, 24/7
8. Introducing Auto Scaling (EC2)
• AWS allows two types of Auto Scaling:
– Minimal Configuration and Management directly from a
CloudWatch Alarm
– More advanced via SDK
• AWS Supports auto scaling On-Demand or Spot
Instances, depending on needs
9. Spot Instances vs. On Demand Instances
• Bidding System
• Considerably cheaper than On Demand*
– Micro Instances:
• On Demand Pricing @ $0.02/hr (~$14.40/mth)
• Spot Pricing from $0.004/hr (~$2.88/mth)
– Small Instances:
• On Demand Pricing @ $0.08/hr (~$57.60/mth)
• Spot Pricing from $0.01/hr (~$7.20/mth)
• Spot Instances not guaranteed
• Great for non-mission critical tasks
* All prices in USD for resources in the Asia Pacific Singapore Region
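The monthly figures above are the hourly rate multiplied by a 720-hour month (24 hours × 30 days). A minimal Python sketch of that arithmetic, using the 2013 Singapore prices quoted on the slide; the names PRICES, monthly_cost and spot_saving_pct are illustrative and not from the talk:

```python
HOURS_PER_MONTH = 24 * 30  # the slide's monthly figures assume a 30-day month

# Hourly prices in USD as quoted on the slide (Asia Pacific Singapore, 2013).
# The spot figures are the lowest prices quoted; the live spot price can be higher.
PRICES = {
    "micro": {"on_demand": 0.02, "spot": 0.004},
    "small": {"on_demand": 0.08, "spot": 0.01},
}


def monthly_cost(hourly_rate):
    """Cost of running one instance continuously for a 30-day month."""
    return round(hourly_rate * HOURS_PER_MONTH, 2)


def spot_saving_pct(size):
    """Percentage saved by the quoted spot price relative to on-demand."""
    price = PRICES[size]
    return round(100 * (1 - price["spot"] / price["on_demand"]), 1)


def summary():
    """One row per instance size: (size, on-demand/month, spot/month, saving %)."""
    return [
        (size, monthly_cost(p["on_demand"]), monthly_cost(p["spot"]), spot_saving_pct(size))
        for size, p in PRICES.items()
    ]
```

At the quoted prices, the spot rate works out to a fraction of the on-demand rate for both sizes, which is the cost argument the slide is making.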
10. Spot Instances and Our Needs
• We need to be able to scale up data pull instances
so we can keep up with volume (and mitigate
slowdowns)
• It’s ok if the machines are taken away from us; we’ll
just request more.
• Cost Savings is essential
13. Configuring Auto Scaling
Before you can finish setting up the CloudWatch Auto
Scaling, you need to follow these steps:
1. Install the AWS CLI tools on your trusty Linux box
2. Create an Auto Scaling Launch Config
3. Create an Auto Scaling Group
4. Create Scale Out (Up) and In (Down) Policies
14. Configuring Auto Scaling:
Launch Configurations
Defines the type of
Instance to create
• Create an AMI
• Launch Config links to
an AMI
• Define Instance Type
• Define max bid price
(in USD, /hr)
• Turn off Monitoring!
$ as-create-launch-config MyLaunchConfigName \
    --image-id ami-123ab456 \
    --instance-type t1.micro \
    --spot-price 0.025 \
    --key our_key_name \
    --monitoring-disabled
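The as-create-launch-config command above belongs to the legacy Auto Scaling command line tools. As a rough equivalent, here is a sketch in Python that assumes the boto3 SDK, which the talk does not use. The helper names are hypothetical and the values mirror the slide's placeholders.

```python
def launch_config_params(name, ami_id, instance_type, max_bid_usd_per_hour, key_name):
    """Build keyword arguments for boto3's create_launch_configuration call."""
    return {
        "LaunchConfigurationName": name,
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "SpotPrice": str(max_bid_usd_per_hour),  # the API takes the max bid as a string
        "KeyName": key_name,
        "InstanceMonitoring": {"Enabled": False},  # mirrors --monitoring-disabled
    }


def create_launch_config(autoscaling_client, params):
    """Send the request; autoscaling_client is assumed to be boto3.client('autoscaling')."""
    return autoscaling_client.create_launch_configuration(**params)


# Same placeholder values as the slide.
PARAMS = launch_config_params(
    "MyLaunchConfigName", "ami-123ab456", "t1.micro", 0.025, "our_key_name"
)
```

With credentials configured, create_launch_config(boto3.client("autoscaling", region_name="ap-southeast-1"), PARAMS) would submit the request; the region name is an assumption based on the Singapore region mentioned in the talk.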
15. Configuring Auto Scaling:
Scaling Groups
• Create a Load
Balancer to “hold”
these launch configs
• State which Availability
Zones you want these to be
launched in
• Lower and upper bounds on
the number of instances
• Meta Configuration
• Link to the Launch
Configuration
$ as-create-auto-scaling-group MyASGroupName \
    --availability-zones ap-southeast-1a,ap-southeast-1b \
    --default-cooldown 120 \
    --load-balancers VSocialCrons \
    --min-size 1 \
    --max-size 45 \
    --tag="k=Name, v=SQSProcessor, p=true" \
    --launch-configuration MyLaunchConfigName
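The --tag value packs three fields into one string: k (key), v (value) and p (whether the tag is propagated to launched instances). The sketch below, in Python, unpacks that string and assembles the group settings as keyword arguments for boto3's create_auto_scaling_group. boto3 is an assumption here, not the tool used in the talk, and the function names are hypothetical.

```python
def parse_legacy_tag(tag):
    """Turn 'k=Name, v=SQSProcessor, p=true' into a boto3-style tag dict."""
    fields = dict(part.strip().split("=", 1) for part in tag.split(","))
    return {
        "Key": fields["k"],
        "Value": fields.get("v", ""),
        "PropagateAtLaunch": fields.get("p", "false").lower() == "true",
    }


def group_params(name, launch_config, zones, load_balancers, min_size, max_size,
                 default_cooldown, tag):
    """Build keyword arguments for boto3's create_auto_scaling_group call."""
    if not 0 <= min_size <= max_size:
        raise ValueError("min_size must be between 0 and max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        "AvailabilityZones": list(zones),
        "LoadBalancerNames": list(load_balancers),
        "MinSize": min_size,
        "MaxSize": max_size,
        "DefaultCooldown": default_cooldown,  # seconds
        "Tags": [parse_legacy_tag(tag)],
    }


# Same placeholder values as the slide.
GROUP = group_params(
    "MyASGroupName", "MyLaunchConfigName",
    ["ap-southeast-1a", "ap-southeast-1b"], ["VSocialCrons"],
    1, 45, 120, "k=Name, v=SQSProcessor, p=true",
)
```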
16. Configuring Auto Scaling:
Policies
• Need both Up and
Down policies
• Link to the Auto Scaling
Group
• Define whether it’s an
addition or subtraction
for “adjustment”
$ as-put-scaling-policy MyPolicyName-add1 \
    --auto-scaling-group MyASGroupName \
    --adjustment=1 \
    --type ChangeInCapacity \
    --cooldown 300
$ as-put-scaling-policy MyPolicyName-del1 \
    --auto-scaling-group MyASGroupName \
    --adjustment=-1 \
    --type ChangeInCapacity \
    --cooldown 600
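Each policy call returns an identifier (a policy ARN) that a CloudWatch alarm can later name as its action, so it is worth capturing. A sketch of the same two policies in Python, assuming boto3's put_scaling_policy rather than the legacy CLI used in the talk; the helper names are hypothetical.

```python
def scaling_policy_params(group, name, adjustment, cooldown_seconds):
    """Build keyword arguments for boto3's put_scaling_policy call."""
    return {
        "AutoScalingGroupName": group,
        "PolicyName": name,
        "AdjustmentType": "ChangeInCapacity",  # add or remove a fixed number of instances
        "ScalingAdjustment": adjustment,       # positive scales out, negative scales in
        "Cooldown": cooldown_seconds,
    }


# The scale-out and scale-in pair from the slide.
POLICIES = [
    scaling_policy_params("MyASGroupName", "MyPolicyName-add1", +1, 300),
    scaling_policy_params("MyASGroupName", "MyPolicyName-del1", -1, 600),
]


def create_policies(autoscaling_client, policies=POLICIES):
    """Create each policy and return {policy name: policy ARN}.

    autoscaling_client is assumed to be boto3.client('autoscaling').
    """
    arns = {}
    for params in policies:
        response = autoscaling_client.put_scaling_policy(**params)
        arns[params["PolicyName"]] = response["PolicyARN"]
    return arns
```

The scale-in policy uses the longer cooldown (600 s against 300 s), so capacity is removed more cautiously than it is added.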
17. • Now you’re ready to configure your
CloudWatch Alarms to utilize the auto
scaling configuration!
• But wait…
20. Creating a CloudWatch Custom Metric
• Caveats:
– Custom Metrics can be stored as frequently as every 5
minutes
– Old data is not kept forever: CloudWatch discards datapoints once they pass its retention period (two weeks at the time of this talk)
• Steps
– Install the AWS SDK for your favorite language
– Write simple script to calculate and store metric
– Schedule script by cron
• Code Sections:
– Set Up
– Metric Data Pulls
– Store New Metric
21. Custom Metric Code – Set Up
// Constants
define("CW_NAMESPACE", "Vocanic/VSocial");
define("CW_MONITORING_AWS_KEY_ID", "XXXXXXXXXXXXXXXXX");
define("CW_MONITORING_AWS_SECRET_ACCESS_KEY", "YYYYYYYYYYYYYYYYY");
// Includes
include '/usr/lib/php5/amazon/sdk-1.5.17.1/sdk.class.php';
// Configure options for CloudWatch object
$options = array();
$options["key"] = CW_MONITORING_AWS_KEY_ID;
$options["secret"] = CW_MONITORING_AWS_SECRET_ACCESS_KEY;
// Set up Amazon Objects
$cw = new AmazonCloudWatch($options);
$cw->set_region(AmazonCloudWatch::REGION_SINGAPORE);
22. Custom Metric Code – Define Time Bounds
// Define time bounds for pulling metric data
// We're only interested in the most recent metric
// (these particular metrics are stored every 5 minutes)
$endTime = time(); // Store current Unix time in end time variable
$startTime = $endTime - 599; // Subtract 1 second less than 10 minutes
$endTime = date("j F Y H:i", $endTime);
$startTime = date("j F Y H:i", $startTime);
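The choice of 599 seconds can be checked with a little arithmetic. If datapoints sit on 5-minute marks and both ends of the window are treated as inclusive (a simplifying assumption; the slide does not say how CloudWatch treats the boundaries), a 599-second window on whole-second timestamps always covers exactly two marks. A Python sketch with illustrative function names:

```python
import time

PERIOD = 300  # seconds between datapoints for the metrics used in the talk


def metric_window(end_epoch=None, lookback=599):
    """Return (start, end) as Unix timestamps, ending now unless end_epoch is given."""
    end = int(time.time()) if end_epoch is None else end_epoch
    return end - lookback, end


def boundaries_in(start, end, period=PERIOD):
    """All period-aligned timestamps t with start <= t <= end."""
    first = -(-start // period) * period  # round start up to the next boundary
    return list(range(first, end + 1, period))
```

For example, a window ending at Unix time 1000 starts at 401 and covers the marks 600 and 900. Widening the lookback to a full 600 seconds would sometimes cover three marks.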
23. Custom Metric Code – Pull Metric Values
// Get Num Deleted Metric
$NumberOfMessagesDeletedMetric = $cw->get_metric_statistics(
    'AWS/SQS', // CloudWatch Namespace of Metric
    'NumberOfMessagesDeleted', // Name of Metric
    $startTime, $endTime, // Timestamp bounds for pulling set of metrics
    60, // Attempt to get a number every 60 seconds, but in reality, it's every 5 min
    'Sum', // Statistics: Get Sum value
    'Count', // Unit: Get a count of the values
    // Define which SQS Queue you want this metric for
    array('Dimensions' => array(array('Name' => 'QueueName', 'Value' => 'My_SQS_Queue_Name')))
);
$NumberOfMessagesDeleted = $NumberOfMessagesDeletedMetric->body->GetMetricStatisticsResult->Datapoints[0]->member->Sum;
* Repeat for the NumberOfMessagesSent metric
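A window that can hold more than one datapoint raises the question of which one to read. The sketch below, in Python, assumes boto3's get_metric_statistics (not the PHP SDK used in the talk) and its documented response shape, a dict with a "Datapoints" list. It picks the newest datapoint explicitly instead of relying on list order. The function names and the 300-second period are assumptions.

```python
from datetime import datetime, timezone


def sqs_metric_request(metric_name, queue_name, start, end):
    """Build keyword arguments for boto3's CloudWatch get_metric_statistics call."""
    return {
        "Namespace": "AWS/SQS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": start,
        "EndTime": end,
        "Period": 300,  # assumption: ask for the 5-minute granularity directly
        "Statistics": ["Sum"],
        "Unit": "Count",
    }


def latest_sum(response):
    """Return (timestamp, sum) of the newest datapoint, or None if there is none."""
    points = response.get("Datapoints", [])
    if not points:
        return None
    newest = max(points, key=lambda point: point["Timestamp"])
    return newest["Timestamp"], newest["Sum"]
```

Returning None for an empty result lets the calling script skip a run instead of publishing a misleading value when no datapoint has arrived yet.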
24. Custom Metric Code – Store New Metric
// Get Timestamp for storing in our custom metric so that timing can match up
$TimeStamp = $NumberOfMessagesSentMetric->body->GetMetricStatisticsResult->Datapoints[0]->member->Timestamp;
// Get Difference of Sent versus Deleted
$SentVSDeleted = $NumberOfMessagesSent - $NumberOfMessagesDeleted;
// Send metric to CloudWatch
$response = $cw->put_metric_data(CW_NAMESPACE, array(
    array(
        'MetricName' => 'DataPullQueueSentvsDel',
        'Timestamp' => $TimeStamp,
        'Value' => $SentVSDeleted
    )));
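The custom metric is simply messages sent minus messages deleted for the same period: a positive value means the queue is filling faster than the data pull workers drain it. A Python sketch of building the equivalent request, assuming boto3's put_metric_data instead of the PHP SDK shown on the slide; the function name is hypothetical, while the namespace and metric name are taken from the slides.

```python
from datetime import datetime, timezone


def sent_vs_deleted_payload(sent, deleted, timestamp, namespace="Vocanic/VSocial"):
    """Build keyword arguments for boto3's CloudWatch put_metric_data call."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": "DataPullQueueSentvsDel",
                "Timestamp": timestamp,  # reuse the source datapoint's time so series line up
                "Value": float(sent - deleted),  # > 0: backlog growing, < 0: backlog shrinking
                "Unit": "Count",
            }
        ],
    }
```

A call would look like cloudwatch.put_metric_data(**sent_vs_deleted_payload(sent, deleted, ts)), where cloudwatch is assumed to be boto3.client("cloudwatch").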
27. Scaling Deciding Factors
• For each scaling rule, you need to create a specific
policy from the CLI:
– Queue Delta > 100 for 5 min, Add 1 Instance
– Queue Delta > 350 for 5 min, Add 2 Instances
– Queue Delta > 1000 for 5 min, Add 4 Instances
– Queue Delta < 10 for 60 minutes, Remove 1 Instance
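To make the thresholds concrete, the sketch below evaluates them against a series of queue-delta samples in Python. It is an illustration of the rule table only, not of how CloudWatch evaluates alarms: it assumes one sample per 5 minutes, and it reports every rule whose condition holds without modelling cooldowns or how overlapping alarms combine, which the slide does not specify.

```python
# (comparison, threshold, consecutive 5-minute samples required, capacity adjustment)
RULES = [
    (">", 100, 1, +1),
    (">", 350, 1, +2),
    (">", 1000, 1, +4),
    ("<", 10, 12, -1),  # 12 samples x 5 minutes = 60 minutes
]


def rules_in_alarm(deltas):
    """Return the adjustments of all rules breached by the newest samples.

    deltas: queue-delta samples, one per 5 minutes, newest last.
    """
    fired = []
    for comparison, threshold, periods, adjustment in RULES:
        window = deltas[-periods:]
        if len(window) < periods:
            continue  # not enough history to judge this rule yet
        if comparison == ">":
            breached = all(d > threshold for d in window)
        else:
            breached = all(d < threshold for d in window)
        if breached:
            fired.append(adjustment)
    return fired
```

A delta of 1200 breaches all three scale-out thresholds at once, so in this simplified model three rules report together.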
28. Danger!
Queue Delta > 1000!!
When alarms are hit, the Auto Scale Out policy is triggered and instances are created.
4 Instances Created