Traditionally, computer hardware was a scarce, expensive resource. Running performance tests often meant scavenging for machines around the office. Today, however, things are different. With Amazon's EC2, a cluster of servers is now just a web service call away. In this presentation you will learn about the design and implementation of Cloud Tools, a Groovy-based framework for deploying and testing Java EE applications on EC2. The framework provides a simple internal DSL for configuring a cluster (database + web container + Apache), deploying a web application, and running performance tests using JMeter. You will also learn about the capabilities of EC2 and how to use it for development and deployment. Finally, we describe how we use Amazon S3 to work around EC2's lack of a persistent file system and to avoid time-consuming uploads of WAR files.
1. Running Java applications on the
Amazon Elastic Compute Cloud
Chris Richardson
Author of POJOs in Action
Founder of Cloud Tools
www.chrisrichardson.net
Copyright (c) 2008 Chris Richardson. All rights reserved.
Slide 1
2. Overall presentation goal
Show how to use
Amazon Elastic Compute Cloud
for developing and deploying Java
applications
3. About Chris
Grew up in England and live in Oakland, CA
Over twenty years of software development
experience
Building object-oriented software since 1986
Using Java since 1996
Using J2EE since 1999
Author of POJOs in Action
Speaker at JavaOne, SpringOne, NFJS,
JavaPolis, Spring Experience, etc.
Chair of the eBIG Java SIG in Oakland
(www.ebig.org)
Run a consulting and training company that
helps organizations build better software faster
and deploy it on Amazon EC2
Founder of Cloud Tools, an open-source project
for deploying Java applications on Amazon EC2:
http://code.google.com/p/cloudtools
4. Agenda
Cloud computing with Amazon EC2
Using Amazon EC2
Overview of Cloud Tools
Developing on Amazon EC2
Deploying on Amazon EC2
5. Computing has come a long way
Past (image: www.computermuseum.org.uk)
Present (image: www.dell.com)
Yet we rarely have enough
6. Cloud computing
A pool of highly scalable, abstracted
infrastructure that hosts your
application, and is billed by
consumption
By James Staten of Forrester Research
7. Power generation
Past Present
8. Amazon-Style Cloud Computing
Elastic Compute Cloud (EC2)
On-demand computing
Elastic Block Storage (EBS)
"SAN on demand"
Simple Storage Service (S3)
Stores blobs of data
Simple Queue Service (SQS)
Hosted queue-based messaging system
SimpleDB
Store data sets
Execute queries
Pay-per-use services managed by Amazon
9. What is Amazon EC2?
Virtualized computing environment
Server instances managed through a web service API
IP addresses and host names assigned dynamically
Pay by the hour ($0.10-0.80/hour) + external
bandwidth ($0.10-0.18/GByte)
Request:
https://ec2.amazonaws.com/?Action=RunInstances
&ImageId=ami-398438493
&MaxCount=3
&MinCount=3
Response:
<RunInstancesResponse>
…
</RunInstancesResponse>
cer@arrakis ~
$ ssh … root@ec2-67-202-41-150.compute-1.amazonaws.com
Last login: Sun Dec 30 18:54:43 2007 from 71.131.29.181
[root@domU-12-31-36-00-38-23:~]
10. Instance types
Type             Virtual Cores  Compute Units/core*  32/64 Bit  Memory  Storage  $/hr
Small            1              1                    32 bit     1.7G    160G     0.10
High-CPU Medium  2              2.5                  32 bit     1.7G    350G     0.20
Large            2              2                    64 bit     7.5G    850G     0.40
Extra Large      4              2                    64 bit     15G     1690G    0.80
High-CPU XL      8              2.5                  64 bit     7G      1690G    0.80
* EC2 Compute Unit = 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
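To make the pay-by-the-hour model concrete, here is a minimal sketch that estimates the instance cost of a short-lived test cluster from the hourly prices in the table above. The class and method names are hypothetical (they are not part of Cloud Tools), the cluster mix in main is made up, and bandwidth, S3 and EBS charges are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ClusterCostEstimate {

    // Hourly prices in US cents, copied from the instance type table above.
    enum InstanceType {
        SMALL(10), HIGH_CPU_MEDIUM(20), LARGE(40), EXTRA_LARGE(80), HIGH_CPU_XL(80);

        final int centsPerHour;

        InstanceType(int centsPerHour) {
            this.centsPerHour = centsPerHour;
        }
    }

    // Instance-hour cost in cents for a mix of instances running for a whole number of hours.
    // Assumption: only instance hours are counted (no bandwidth or storage charges).
    static long costInCents(Map<InstanceType, Integer> instanceCounts, int hours) {
        long total = 0;
        for (Map.Entry<InstanceType, Integer> entry : instanceCounts.entrySet()) {
            total += (long) entry.getKey().centsPerHour * entry.getValue() * hours;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: three small instances and one large instance.
        Map<InstanceType, Integer> cluster = new LinkedHashMap<>();
        cluster.put(InstanceType.SMALL, 3);
        cluster.put(InstanceType.LARGE, 1);
        long cents = costInCents(cluster, 2);
        System.out.println("Estimated cost for 2 hours: " + cents + " cents");
    }
}
```

Working in integer cents avoids floating-point rounding when many small charges are added up.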
11. Operating systems
Use an Amazon-provided machine image (AMI)
32-bit Fedora Core 4
64-bit Fedora Core 6
Many 3rd parties have public AMIs
Various Linux distributions
E.g. Red Hat, RightScale
Sun provides OpenSolaris
Windows is in private beta
Build your own Linux:
Install applications starting with someone else's
AMI and save it
Create an AMI from scratch
12. One minor thing…
Terminate your instance
⇒
your local data is lost.
Either very good or very bad
13. Elastic Block Storage
Mountable storage volumes
"On-demand SAN"
Size: 1 GB to 1 TB
Mount on a single instance
Create snapshots
Stored in S3
Create new volumes from the snapshot
Cost:
$0.10/GByte/month
$0.10 per 1 million I/O requests
14. Elastic IP addresses
Instance IP addresses are dynamically
allocated on start-up
Does not work well for publicly accessible
services, e.g. a website
Elastic IP addresses:
Statically allocated addresses
Associated with your account (max. 5)
Attached to an instance (e.g. public-facing web
server)
You configure DNS to resolve to the elastic IP
address
Pricing:
Non-attached Elastic IP address - $0.01/hour
$0.10 per remap (if > 100 in a month)
15. Regions and availability zones
By default, your database master and slave
could run on the same physical host!
Regions:
Geographically dispersed locations
Currently only one
Availability zone:
Part of a region
Engineered to be insulated from failure in other
zones
Specify availability zone when launching
instances:
Same zone as other instances for free data
transfer
Different zone for higher availability
16. What is the Amazon Simple
Storage Service (S3)?
Flat storage model consisting of buckets
and objects
Bucket – has a name and contains objects
Object – has a key, stores 1 byte to 5 GB
Object key can look like a path ☺
Cost:
$0.15/GB-Month
$0.10-0.18/GB of data transferred
$0.00001-$0.000001/Web Service call
Data transfers between EC2 and S3 are free of
bandwidth charges
Buckets and objects can be:
Public – accessible by anyone
Private – accessible to owner, ACL member
17. S3 REST API
Create a bucket:
PUT / HTTP/1.1
Host: <BucketName>.s3.amazonaws.com
Authorization: AWS AWSAccessKeyId:Signature
…
Create an item in a bucket:
PUT /<ObjectName> HTTP/1.1
Host: <BucketName>.s3.amazonaws.com
Authorization: AWS AWSAccessKeyId:Signature
…
…Bytes…
Download an item:
GET /<ObjectName> HTTP/1.1
Host: <BucketName>.s3.amazonaws.com
Authorization: AWS AWSAccessKeyId:Signature
…
Delete an item:
DELETE /<ObjectName> HTTP/1.1
Host: <BucketName>.s3.amazonaws.com
Authorization: AWS AWSAccessKeyId:Signature
…
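The Signature part of the Authorization header is not spelled out on the slide. The sketch below shows one way to compute it, assuming the HMAC-SHA1 request signing scheme that S3's REST API used at the time: the string to sign is the verb, Content-MD5, Content-Type and Date, each followed by a newline, and then the resource path including the bucket name. The class name and the credentials in main are placeholders, and requests carrying x-amz-* headers are not handled. In practice a library such as JetS3t does this for you.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class S3RequestSigner {

    // String to sign for a request without x-amz-* headers.
    // Pass an empty objectName for bucket-level requests such as "PUT /".
    static String stringToSign(String verb, String contentMd5, String contentType,
                               String date, String bucket, String objectName) {
        String canonicalizedResource = "/" + bucket + "/" + objectName;
        return verb + "\n" + contentMd5 + "\n" + contentType + "\n" + date + "\n"
                + canonicalizedResource;
    }

    // Base64-encoded HMAC-SHA1 of the string to sign, keyed with the secret key.
    static String signature(String secretKey, String stringToSign) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 is not available", e);
        }
    }

    // Value of the Authorization header: "AWS <access key id>:<signature>".
    static String authorizationHeader(String accessKeyId, String secretKey, String stringToSign) {
        return "AWS " + accessKeyId + ":" + signature(secretKey, stringToSign);
    }

    public static void main(String[] args) {
        // Placeholder credentials and names, for illustration only.
        String toSign = stringToSign("GET", "", "", "Tue, 27 Mar 2007 19:36:42 +0000",
                "examplebucket", "photos/puppy.jpg");
        System.out.println("Authorization: "
                + authorizationHeader("EXAMPLE-ACCESS-ID", "example-secret-key", toSign));
    }
}
```

The Date value passed to stringToSign must match the Date header sent with the request, otherwise the service cannot reproduce the signature.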
18. Using EC2 and S3 together
AMIs are stored in S3
EC2 instances use S3:
Use REST API
Store database snapshots in S3
Use 3rd party Linux file system that
stores data in S3
Store EBS volume snapshots in S3
19. So what does this mean?
For developers
Immediate access to many servers
Simplified setup
Great for testing
For deployment
Eliminates capital expenses
Reduces risk of success catastrophe
20. Agenda
Cloud computing with Amazon EC2
Using Amazon EC2
Overview of Cloud Tools
Developing on Amazon EC2
Deploying on Amazon EC2
21. Signing up for Amazon Web
Services
AWS access
identifiers:
Account Id
Access Id
Secret key
Private key
and certificate
Only takes a
few minutes
22. EC2 API and Tools
SOAP and Query APIs
Launch and manage instances, etc.
Amazon-provided CLI tools
CLI equivalents of APIs
AMI creation tools
AWS CLI tools from Tim Kay
CLI for S3 and EC2
Alternatives to Amazon CLI tools
ElasticFox
Awesome Firefox plugin
Launch and manage instances
23. Using the Query API
https://ec2.amazonaws.com/?queryparameters...
Mandatory parameters:
Action – what to do
AWSAccessKeyId – your access id
Version – API version
Timestamp – when request was made
Expires – when it expires (alternative to Timestamp)
Signature – digest of parameters and secret
key
SignatureVersion – set to 1
Other parameters depend on Action
Returns an XML document
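The Signature parameter is the fiddly part of a Query API call. Below is a minimal sketch of SignatureVersion 1 signing as it was specified at the time: sort the parameters by name ignoring case, concatenate each name and value with no separators, compute an HMAC-SHA1 keyed with the secret key, and Base64-encode the result. The class name, the credentials and the parameter values in main are placeholders, not values from the talk.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Ec2QuerySigner {

    // SignatureVersion 1: names sorted case-insensitively, then name+value pairs concatenated.
    static String stringToSign(Map<String, String> params) {
        Map<String, String> sorted = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        sorted.putAll(params);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            sb.append(e.getKey()).append(e.getValue());
        }
        return sb.toString();
    }

    // Base64-encoded HMAC-SHA1 of the string to sign, keyed with the secret key.
    static String signature(String secretKey, Map<String, String> params) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(stringToSign(params).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 is not available", e);
        }
    }

    // Appends SignatureVersion and Signature, then builds the URL-encoded query string.
    static String signedUrl(String endpoint, Map<String, String> params, String secretKey) {
        Map<String, String> all = new LinkedHashMap<>(params);
        all.put("SignatureVersion", "1");
        all.put("Signature", signature(secretKey, all)); // signed before Signature is added
        StringBuilder url = new StringBuilder(endpoint).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : all.entrySet()) {
            if (!first) {
                url.append('&');
            }
            first = false;
            url.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Action", "DescribeInstances");
        params.put("AWSAccessKeyId", "EXAMPLE-ACCESS-ID");    // placeholder
        params.put("Version", "2008-02-01");                  // example API version
        params.put("Timestamp", "2008-06-01T12:00:00Z");      // example timestamp
        System.out.println(signedUrl("https://ec2.amazonaws.com/", params, "example-secret-key"));
    }
}
```

Libraries such as Typica wrap this signing step, so hand-rolled code like this is mainly useful for understanding what goes over the wire.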
24. Example EC2 requests
Action                         Parameters
RunInstances                   MinCount, MaxCount, ImageId, InstanceType, …
TerminateInstances             InstanceId.n
DescribeInstances              InstanceId.n
CreateSecurityGroup            GroupName, GroupDescription
AuthorizeSecurityGroupIngress  GroupName, SourceSecurityGroupName, IpProtocol
RevokeSecurityGroupIngress     …
…
25. aws - simple access to EC2 and S3
http://timkay.com/aws/
Easy to use CLI for EC2 and S3
Implemented in Perl
Authenticates using access id and secret
key stored in ~/.awssecret
27. Launching instances
TIP: launch with a key pair or else you won't
have access
28. Creating your own image
Easier: Modify an existing AMI
Launch AMI
Configure: e.g. yum install …
Harder: Build one from scratch
Launch AMI
Create a file to contain OS installation
Mount as a loopback file
Install OS: yum --installroot
Use AMI tools to bundle and upload to
S3
29. Agenda
Cloud computing with Amazon EC2
Using Amazon EC2
Overview of Cloud Tools
Developing on Amazon EC2
Deploying on Amazon EC2
30. Deploying a web application on EC2
Not rocket science, but there are many servers to configure and multiple files to upload
31. What's Cloud Tools?
Open source project
32 and 64 bit AMIs
CentOS 5.10
Apache/Tomcat/MySQL/JMeter/JetS3t installed
EC2Deploy framework
Launches instances
Configures Tomcat, MySQL, Apache
Deploys web applications
Runs JMeter tests
Written in Groovy
Maven and Grails plugins
Quick and easy deployment to EC2
32. EC2Deploy framework
Provides a DSL for describing a cluster:
Number of Tomcats, MySQL slaves
Database scripts
Location of web applications
Launches EC2 instances
Configures MySQL
Configures Tomcat and deploys web
applications
Configures Apache to proxy the Tomcat
servers
Runs JMeter tests
33. Example EC2Deploy Script
def ec2 = new EC2(…)
ClusterSpec clusterSpec = new ClusterSpec()
    .tomcats(1)
    .instanceType(EC2InstanceType.SMALL)
    .slaves(1)
    .webApp('/home/cer/…/ptrack', "ptrack")
    .catalinaOptsBuilder({optsBuilder, databaseHost, slaves ->
        optsBuilder.arg("-Xmx500m")
        optsBuilder.prop("jdbc.db.server", databaseHost)})
    .schema("ptrack", ["ptrack": "ptrack"],
        ["src/test/resources/testdml1.sql",
         "src/test/resources/testdml2.sql"])
SimpleCluster cluster = new SimpleCluster(ec2, clusterSpec)
cluster.start()
cluster.loadTest("/home/cer/…/jmeter/SimpleTest.jmx", [5, 10, 15])
cluster.stop()
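The chained calls in the script are what make it read like an internal DSL: each configuration method records a setting and returns the same object. A minimal Java sketch of that pattern follows. ClusterSpecSketch and its methods are hypothetical stand-ins written for illustration, not the actual Cloud Tools ClusterSpec class.

```java
final class ClusterSpecSketch {
    private int tomcats = 1;
    private int slaves = 0;
    private String webAppDir = "";
    private String context = "";

    // Each setter returns this so that calls can be chained.
    ClusterSpecSketch tomcats(int n) {
        this.tomcats = n;
        return this;
    }

    ClusterSpecSketch slaves(int n) {
        this.slaves = n;
        return this;
    }

    ClusterSpecSketch webApp(String directory, String contextName) {
        this.webAppDir = directory;
        this.context = contextName;
        return this;
    }

    // Human-readable summary of the accumulated settings.
    String describe() {
        return tomcats + " Tomcat(s), " + slaves + " MySQL slave(s), web app "
                + webAppDir + " at /" + context;
    }

    public static void main(String[] args) {
        ClusterSpecSketch spec = new ClusterSpecSketch()
                .tomcats(2)
                .slaves(1)
                .webApp("/tmp/exploded-war", "ptrack"); // hypothetical path
        System.out.println(spec.describe());
    }
}
```

Groovy adds closures and optional parentheses on top of this, which is why the real script can pass a block to catalinaOptsBuilder.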
34. Domain model
[Class diagram; recoverable elements listed below]
ClusterSpec: name, numberOfTomcats, numberOfMySqlSlaves, ...
SimpleCluster <<façade>>: start(), stop(), loadTest()
WebApp: context, explodedWar, uploadToS3(), updateTomcat()
Application <<abstract>>
TomcatServer, MySqlServer, ApacheServer, JmeterApplication
EC2: startPolling(), stopPolling(), newServers(n), ...
EC2Server: waitUntilRunning(), stop(), ssh(command), ...
EC2InstanceState: instanceId, state, publicDnsName, privateDnsName
EC2Request: executeRequest(params)
SshExecutor: ssh(dns, command, …)
Note: Uses Query API to run, terminate and describe instances
Note: Executes ssh and scp commands
35. Configuration DSL
class ApacheServer extends Application {
    def configure() {
        writeFile fileName: "$apacheConfDir/cluster.conf",
                  templateName: "/templates/cluster.conf",
                  templateArgs: [tomcats: tomcats]
        exec "$apacheBinDir/apachectl restart"
        waitForHttp port: 80, path: tomcats[0].contexts[0]
    }
    …
}
class MySqlServer extends Application {
    def configureAsMaster() {
        writeFile fileName: "/etc/my.cnf", templateName: "/templates/master.my.cnf"
        restartService "mysqld"
        exec command: "mysql -u root",
             templateName: "/templates/createSchema.sql",
             templateArgs: [schemaSpec: schemaSpec]
        executeSchemaScripts()
    }
}
36. Efficiently uploading web
applications, etc.
Non-durable disks = upload the entire web
application
20+ MBs of jars, etc.
Takes a long time (over a DSL connection)
Web application consists of:
90% 3rd party libraries – rarely changing
10% application code and content – only some of it
changes
Use JetS3t to accelerate uploads
Incremental upload of exploded web application to S3
bucket
Incremental download to Tomcat webapps/ directory
First upload is slow but subsequent uploads are fast
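The idea behind an incremental upload can be sketched as a comparison of two manifests that map relative file paths to content hashes: only files whose hash differs from the copy in the bucket need to be uploaded. The code below is an illustration of that idea under stated assumptions, not JetS3t's actual implementation. The class name is hypothetical and the hashes in main are made-up strings (in practice they might be MD5 digests).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class IncrementalUploadPlan {
    final List<String> toUpload = new ArrayList<>();
    final List<String> toDelete = new ArrayList<>();

    // local and remote map relative file paths to content hashes.
    static IncrementalUploadPlan plan(Map<String, String> local, Map<String, String> remote) {
        IncrementalUploadPlan plan = new IncrementalUploadPlan();
        // New or changed files: the remote copy is missing or has a different hash.
        for (Map.Entry<String, String> e : new TreeMap<>(local).entrySet()) {
            if (!e.getValue().equals(remote.get(e.getKey()))) {
                plan.toUpload.add(e.getKey());
            }
        }
        // Files that exist remotely but no longer exist locally.
        for (String path : new TreeMap<>(remote).keySet()) {
            if (!local.containsKey(path)) {
                plan.toDelete.add(path);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("WEB-INF/lib/spring.jar", "hash-1");       // unchanged 3rd party library
        local.put("WEB-INF/classes/App.class", "hash-2b");   // changed application code
        Map<String, String> remote = new HashMap<>();
        remote.put("WEB-INF/lib/spring.jar", "hash-1");
        remote.put("WEB-INF/classes/App.class", "hash-2a");
        IncrementalUploadPlan plan = plan(local, remote);
        System.out.println("Upload: " + plan.toUpload + ", delete: " + plan.toDelete);
    }
}
```

Because the rarely changing libraries hash the same on every run, they drop out of the upload list after the first deployment, which is where the speed-up comes from.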
38. Grails Plugin
Packages EC2Deploy as a Grails
framework plugin
Deploys a Grails application to EC2
$ grails install-plugin <path to plugin>
$ grails cloud-tools-deploy
39. Agenda
Cloud computing with Amazon EC2
Using Amazon EC2
Overview of Cloud Tools
Developing on Amazon EC2
Deploying on Amazon EC2
40. Collecting performance metrics
[Chart: TPS and ART plotted against x-axis values 1, 2, 4, 8, 16, 32, 64; axis scales 0-70 and 0-7000]
Requires hardware
Time consuming
Measure transactions/second (TPS), average response time (ART), utilization, etc.
Multiple test runs with different loads, number of servers, etc.
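For reference, the two headline metrics can be computed from raw test results as follows. This is a minimal sketch with hypothetical names and made-up sample values. It assumes the test duration is measured in seconds and response times in milliseconds, and it is not how JMeter or Cloud Tools compute their reports.

```java
final class LoadTestMetrics {

    // Transactions per second (TPS): completed transactions divided by elapsed seconds.
    static double transactionsPerSecond(long completedTransactions, double durationSeconds) {
        if (durationSeconds <= 0) {
            throw new IllegalArgumentException("duration must be positive");
        }
        return completedTransactions / durationSeconds;
    }

    // Average response time (ART): arithmetic mean of the individual response times.
    static double averageResponseTimeMillis(long[] responseTimesMillis) {
        if (responseTimesMillis.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        long sum = 0;
        for (long t : responseTimesMillis) {
            sum += t;
        }
        return (double) sum / responseTimesMillis.length;
    }

    public static void main(String[] args) {
        // Made-up sample values for illustration.
        System.out.println("TPS: " + transactionsPerSecond(1200, 60.0));
        System.out.println("ART (ms): " + averageResponseTimeMillis(new long[] {100, 200, 300}));
    }
}
```

Repeating the calculation for each load level and server count gives the data points for a TPS/ART chart.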
41. Load testing with Cloud Tools
Runs JMeter with specified number of threads
Collects machine utilization stats
Generates reports
Executes multiple test runs simultaneously
<performanceReport>
  <cpus>1</cpus>
  <threads>10</threads>
  <host>
    <name>database</name>
    <cpuUtil>3.2757014224403784</cpuUtil>
  </host>
  <host>
    <name>tomcat0</name>
    <cpuUtil>94.32473318917411</cpuUtil>
  </host>
  <host>
    <name>apache</name>
    <cpuUtil>0.12280614752518504</cpuUtil>
  </host>
  <host>
    <name>jmeter</name>
    <cpuUtil>7.033683910704496</cpuUtil>
  </host>
  …
  <duration>557.943</duration>
  <tps>10.753786677133686</tps>
  <art>916.6578333333</art>
</performanceReport>
mvn cloudtools:jmeter -Dcloudtools.thread.count=1,4,8
42. Other kinds of testing
Testing failover
Launch cluster
Take down servers
Test recovery scripts, e.g. MySQL slave -> master
Testing database upgrades
Launch cluster
Install snapshot of production data
Apply database migration script
Verify that it works
43. Functional testing
Tests can be slow, e.g.
Web tests
Database intensive tests
Run tests in parallel on EC2
Multiple test drivers, app servers, DBs
Relatively cheap: >$75/hour developer vs.
$0.10/hour machine
Selenium Grid from ThoughtWorks
Open Source framework
Runs Selenium web tests in parallel on EC2
Stay tuned for more general solutions
44. Building on a fresh machine
Debug builds that fail because of a
missing dependency
Maven dependency
Manually installed 3rd party library
Build on a fresh EC2 instance
Great for open-source projects
45. Agenda
Cloud computing with Amazon EC2
Using Amazon EC2
Overview of Cloud Tools
Developing on Amazon EC2
Deploying on Amazon EC2
46. Deploying applications on Amazon
EC2
Great for startups (especially those
without a business model)
Get up and running quickly
No upfront hardware costs
Scale up/down with load
Reduces the risk of a success
catastrophe
Great for enterprises
No need to wait for corporate IT
Use for short-term projects
47. Issues with AWS
Security:
Lack of PCI compliance
Discomfort with sending customer data to a
3rd party
Technology:
Not yet suitable for extremely large
relational databases
Lack of very large machines, e.g. 64G
memory
Lack of multicast
Financials:
Cost of bandwidth
Steady state costs > your own hardware
48. Starter website - $
www.acme.com
Elastic IP A
EC2 Instance
Apache
Tomcat
MySQL
EBS Volume
49. Highly available - $$
[Architecture diagram; labels listed below]
www.acme.com
Elastic IP A, Elastic IP B
Availability Zone A, Availability Zone B
Apache (x2)
Tomcat (x4)
MySQL (master), MySQL (slave)
EBS Volume
50. Batch processing architecture
e.g. media transcoding
51. Easy upgrades
Clone production environment
Apply upgrades
Terminate old instances once you are
sure that everything works
52. Cloud bursting
Host application on your own
hardware
Use AWS for short-term spikes
e.g. use EC2 instances with slave DBs to
handle read-only requests
Periodic batch jobs
E.g. content rendering/transformation
53. Using AWS in your application
Simple Storage Service (S3)
Stores blobs of data
E.g. photo sharing website
Store media
Hand out URLs to S3 objects
Simple Queue Service (SQS)
Hosted queue-based messaging system
Alternative to JMS
Loose coupling between systems
SimpleDB
Store data sets
Execute queries
54. Java libraries for AWS
Generate the REST/SOAP requests
manually
JetS3t
Rich API for accessing S3
https://jets3t.dev.java.net/
Typica
API for SQS, EC2, SimpleDB
http://code.google.com/p/typica/
SimpleJPA
Subset of JPA on SimpleDB
http://code.google.com/p/simplejpa/
55. Summary
Cloud Computing
Immediate access to many servers
Pay as you go – no upfront
investment/commitment required
Easily scale up/down
Cloud Tools
Easy deployment and testing from
Maven and Grails
Configure multiple clusters
Run JMeter tests in parallel
56. Final thoughts
Download Cloud Tools today:
http://code.google.com/p/cloudtools
Buy my book ☺
Send email:
chris@chrisrichardson.net
Visit my website:
http://www.chrisrichardson.net
Talk to me about consulting and
training