2. Agenda
• Introduction
• Rabits?
• Cloud Power
• The Weakest Link
• Understanding Capacity
• Self Everything
3. Rabits?
• Rapid bits
– Small public apps like websites & phone apps
– They want to live outside, in the wild
– They need to get there fast
– Once they are there, they’ll need some space to multiply and scale
– And they move on quickly
4. Examples
• Apps
– Personal & work related
• Branding websites
– Product launches, Special campaigns
• Predictable big events
– Olympics, Elections
• Unpredictable events
– Disasters, Terrorist attacks
– Celebrity Death
5. Business context
• The world has changed over the past decade
– Consumerization of IT
– Technology in everyday life
– Globalization
• New and large scale business opportunities appear
– 2.1 billion internet users
– 4.6 billion phones
– 1.4 billion households with TVs
• Elasticity needed as demand varies wildly
6. Key Success Factors
How to prevent road kill?
Global,
Short time to market,
Performant,
Highly scalable,
Highly available,
Relatively cheap,
Easy
16. Key Success Factors
Global,
Short time to market,
Performant,
Highly scalable,
Highly available,
Relatively cheap,
Easy
17. Agenda
• Introduction
• Rabits?
• Cloud Power
• The Weakest Link
• Understanding Capacity
• Self Everything
18. The weakest link
• Overall scalability and availability
– Limited by the weakest component
• If the backend can only handle 30 users
– It doesn’t matter that the front-end could handle 1,000,000
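A minimal sketch of the point above, using the two figures from this slide. The function names and the dictionary layout are illustrative assumptions, not part of the talk.

```python
def system_capacity(component_capacities):
    """Users the whole chain can serve: the capacity of its weakest component."""
    return min(component_capacities.values())


def weakest_link(component_capacities):
    """Name of the component that limits the whole chain."""
    return min(component_capacities, key=component_capacities.get)


# Figures from the slide: a front-end for 1,000,000 users, a backend for 30.
capacities = {"front-end": 1_000_000, "back-end": 30}

print(weakest_link(capacities), system_capacity(capacities))
```

Raising the front-end figure changes nothing in the result; only raising the backend figure does.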
19. The weakest link
• Typically the weakest link is one of the following:
– Integration points
– Data stores
– Long processes
20. Remember
• Everything has limits!
• Including Azure resources
– Storage account: 5000 requests/sec
– Storage container: 500 requests/sec
– Bandwidth: depends on instance size
– Etc.
• Luckily you can get multiple of these
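"Getting multiple of these" can be sketched as follows: estimate how many storage accounts a target request rate needs, then spread keys over them. The 5000 requests/sec limit is the figure from this slide; the account names, the target rate and the hashing scheme are assumptions made for illustration.

```python
import math
import zlib

STORAGE_ACCOUNT_LIMIT = 5000  # requests/sec per storage account (from the slide)


def accounts_needed(target_rps, limit=STORAGE_ACCOUNT_LIMIT):
    """Smallest number of accounts whose combined limit covers target_rps."""
    return max(1, math.ceil(target_rps / limit))


def pick_account(key, account_names):
    """Map a key to one account; crc32 is stable across processes and machines."""
    index = zlib.crc32(key.encode("utf-8")) % len(account_names)
    return account_names[index]


# Hypothetical target of 12,000 requests/sec and hypothetical account names.
count = accounts_needed(12_000)
accounts = [f"rabitstore{i}" for i in range(count)]
print(count, pick_account("user-42", accounts))
```

A stable hash is used instead of Python's built-in `hash()` because every instance must map the same key to the same account.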
21. But what if you can’t?
• Keep them out of the critical path
– Cache view model data or output
– Queue commands
22. Cache
• Windows Azure AppFabric Cache
– Distributed cache
• Reduces queries
– To less scalable components
• Store data close to the app
– Otherwise the whole point is moot
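The idea of reducing queries to a less scalable component might look like the cache-aside sketch below. A local dictionary stands in for the distributed cache, and `CacheAside` and its `load` callback are hypothetical names, not an AppFabric API.

```python
class CacheAside:
    """Serve reads from the cache; only call the backend on a miss."""

    def __init__(self, load):
        self._load = load        # function that queries the less scalable component
        self._items = {}         # stand-in for the distributed cache
        self.backend_calls = 0   # counts how often the backend was really hit

    def get(self, key):
        if key not in self._items:
            self.backend_calls += 1
            self._items[key] = self._load(key)
        return self._items[key]


# Hypothetical backend lookup that is expensive in reality.
cache = CacheAside(load=lambda key: key.upper())
cache.get("home")
cache.get("home")
print(cache.backend_calls)
```

However many times the same key is read, the backend is queried once.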
23. Queued command processing
• Avoid being swarmed by incoming commands
– Use a queue to throttle
• Handle commands at a controlled speed
– That of the least scalable component
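A rough sketch of throttling with a queue: commands are accepted immediately, but handed to the slow component at a fixed number per tick. The class name, the notion of a "tick" and the limit of 30 are illustrative assumptions; a real deployment would use a durable queue rather than an in-memory one.

```python
from collections import deque


class ThrottledQueue:
    """Accept commands at any rate, release at most max_per_tick per tick."""

    def __init__(self, max_per_tick):
        self.max_per_tick = max_per_tick
        self._pending = deque()

    def enqueue(self, command):
        self._pending.append(command)

    def drain_tick(self, handler):
        """Handle up to max_per_tick commands in arrival order; return the count."""
        handled = 0
        while self._pending and handled < self.max_per_tick:
            handler(self._pending.popleft())
            handled += 1
        return handled

    def __len__(self):
        return len(self._pending)


# 100 commands arrive at once; the backend only copes with 30 per tick.
commands = ThrottledQueue(max_per_tick=30)
for number in range(100):
    commands.enqueue(number)

processed = []
commands.drain_tick(processed.append)
print(len(processed), len(commands))
```

The remaining commands wait in the queue instead of overwhelming the backend.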
26. Side effects of these architectures
• Caches need to be updated regularly
– Time based
– Event based
• User interface must be adapted
– Task orientation required
– ISO 9241-151 requires this anyway
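The two cache update styles named above can be sketched together: entries expire after a time to live (time based) and can be dropped explicitly when the underlying data changes (event based). `RefreshingCache`, the 60 second TTL and the injectable clock are assumptions for illustration.

```python
import time


class RefreshingCache:
    """Cache with time-based expiry and event-based invalidation."""

    def __init__(self, load, ttl_seconds, clock=time.monotonic):
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without waiting
        self._items = {}         # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is None or now >= entry[1]:
            value = self._load(key)               # time based: reload when expired
            self._items[key] = (value, now + self._ttl)
            return value
        return entry[0]

    def invalidate(self, key):
        """Event based: call this when the source data is known to have changed."""
        self._items.pop(key, None)


page_cache = RefreshingCache(load=lambda key: f"rendered {key}", ttl_seconds=60)
print(page_cache.get("home"))
```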
27. What if things break?
• Make sure you have a backup instance!
• Fabric controller
– At least 2 instances in separate fault domains
• Traffic manager
– Spread over multiple datacenters
• Azure storage
– Automatically replicated across datacenters
• SQL Azure
– Replicate using data sync
28. Multiple instances
• Don’t rely on machine dependencies
– Avoid reliance on memory (except as cache)
– Session state is evil
– WCF default WSDL addressing behavior
– Ensure encryption algorithms use service certificates
– ...
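One possible way to avoid machine dependencies, sketched under assumptions: instead of keeping session data in one instance's memory, the state travels with the request in a signed token, and every instance verifies it with the same shared key. The key value below is a placeholder; in line with the slide, a real service would derive it from material shared by all instances (such as a service certificate) rather than a machine-specific key.

```python
import hashlib
import hmac


def sign(payload: bytes, shared_key: bytes) -> bytes:
    """Append an HMAC-SHA256 signature so any instance can check the payload."""
    mac = hmac.new(shared_key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload + b"." + mac


def verify(token: bytes, shared_key: bytes):
    """Return the payload if the signature matches, otherwise None."""
    payload, separator, mac = token.rpartition(b".")
    if not separator:
        return None
    expected = hmac.new(shared_key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload if hmac.compare_digest(mac, expected) else None


SHARED_KEY = b"placeholder-key-shared-by-all-instances"  # assumption, not a real key

token = sign(b"user=42", SHARED_KEY)   # issued by instance A
print(verify(token, SHARED_KEY))       # accepted by instance B
```

A token signed with a machine-specific key would fail verification on every other instance, which is the failure mode the slide warns about.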
29. Technology can help
Key Success Factors: Global, Short time to market, Performant, Highly scalable, Highly available, Relatively cheap, Easy
• Windows Azure Tech
– Queue storage
– AppFabric Service Bus
• Framework Tech
– NServiceBus
– SignalR
30. Agenda
• Introduction
• Rabits?
• Cloud Power
• The Weakest Link
• Understanding Capacity
• Self Everything
31. Keeping it cheap
• Understanding capacity
– Pay for what you can ‘potentially’ use, aka the capacity
– Instances are baskets of capacity: CPU, Memory, …
– Ensure everything is efficiently used before scaling out
Compute Instance Size  CPU          Memory   Instance Storage  I/O Performance      Cost per hour
Extra Small            1.0 GHz      768 MB   20 GB             Low (5 Mbps)         $0.05
Small                  1.6 GHz      1.75 GB  225 GB            Moderate (100 Mbps)  $0.12
Medium                 2 x 1.6 GHz  3.5 GB   490 GB            High (200 Mbps)      $0.24
Large                  4 x 1.6 GHz  7 GB     1,000 GB          High (400 Mbps)      $0.48
Extra Large            8 x 1.6 GHz  14 GB    2,040 GB          High (800 Mbps)      $0.96
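Since you pay for capacity per hour whether it is used or not, a quick cost estimate helps before scaling out. The hourly prices below are the ones in the table; the 720 hour month (30 days) and the function name are assumptions.

```python
# Hourly prices per instance size, copied from the table above.
HOURLY_PRICE = {
    "Extra Small": 0.05,
    "Small": 0.12,
    "Medium": 0.24,
    "Large": 0.48,
    "Extra Large": 0.96,
}


def monthly_cost(size, instances, hours=720):
    """Cost of keeping `instances` of `size` running for `hours` (assumed 30 days)."""
    return round(HOURLY_PRICE[size] * instances * hours, 2)


print(monthly_cost("Extra Small", 1))
print(monthly_cost("Small", 2))
```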
32. Example
• 1 XS web role instance (1 GHz, 768 MB, 5 Mbps)
– Dynamic home page but with relatively static content
• Limited to 50 concurrent users, yet only
– 10% CPU used
– 80% Memory used (by OS)
– Plenty of free disk space
– Limited by bandwidth IO
• Scaling out to 2 instances
– Moves the tipping point
– But wastes 90% CPU, 20% memory
– Twice
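The example can be restated as a small calculation: find the saturated resource, then see how much idle capacity is paid for when scaling out. The CPU and memory figures are from this slide; treating bandwidth as 100% used is an assumption based on "limited by bandwidth IO", and the function names are illustrative.

```python
def bottleneck(utilisation):
    """Resource with the highest utilisation: the one that caps the instance."""
    return max(utilisation, key=utilisation.get)


def idle_capacity(utilisation, instances):
    """Unused capacity per resource, in instance equivalents, at the tipping point."""
    return {
        resource: round((1.0 - used) * instances, 2)
        for resource, used in utilisation.items()
    }


# XS instance at its limit of 50 concurrent users.
xs_instance = {"cpu": 0.10, "memory": 0.80, "bandwidth": 1.00}

print(bottleneck(xs_instance))
print(idle_capacity(xs_instance, instances=2))
```

With two instances, 1.8 instances' worth of CPU sits idle while bandwidth remains the limit.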
33. Demo: Hammering the rabit
Yves Goeleven
Capgemini
@YvesGoeleven
http://cloudshaper.wordpress.com
34. Offload static content
• It is better to remove the bottleneck
– In this case IO
• Offload static content to
– Blob storage, CDN
• Leaves more power to handle dynamic workload
– Increases number of users served
– Better utilization of CPU & Memory
– Relative to bandwidth
CDN = Content Delivery Network
• Content cache near internet edges (24 datacenters), static content close to user
• Great response times, > 200% performance improvement in my test
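A back-of-the-envelope sketch of why offloading helps a bandwidth-bound instance. The 5 Mbps figure is the XS instance bandwidth from the earlier table; the page sizes (100 KB in total, of which 90 KB static) are hypothetical and are not the numbers from the author's test.

```python
def pages_per_second(bandwidth_mbps, bytes_per_page):
    """Pages an instance can push out per second if bandwidth is the only limit."""
    bits_per_second = bandwidth_mbps * 1_000_000
    return bits_per_second / (bytes_per_page * 8)


XS_BANDWIDTH_MBPS = 5        # from the instance size table
PAGE_BYTES = 100_000         # hypothetical: total page weight
STATIC_BYTES = 90_000        # hypothetical: images, scripts, stylesheets

before = pages_per_second(XS_BANDWIDTH_MBPS, PAGE_BYTES)
after = pages_per_second(XS_BANDWIDTH_MBPS, PAGE_BYTES - STATIC_BYTES)
print(before, after)
```

Under these assumed sizes the instance serves ten times as many pages once the static part comes from blob storage or the CDN, because only the dynamic bytes still pass through its 5 Mbps link.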
35. Cache, cache, cache
• The internet has multiple levels of cache
– Browser & proxy cache
– Kernel & output cache
– Memory
– Windows Azure AppFabric Cache
• Ensures low latency
– Memory is faster than IO
– Less time waiting for IO
– Means more resources to handle requests
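Browser and proxy caches are steered by HTTP response headers. As a small sketch, the helper below builds a standard `Cache-Control` header; the function name and the chosen lifetimes are assumptions, and how the header is attached to a response depends on the web framework in use.

```python
def cache_headers(max_age_seconds, public=True):
    """Build a Cache-Control header for browser and proxy caches."""
    if max_age_seconds <= 0:
        # Ask caches to revalidate with the server before reusing the response.
        return {"Cache-Control": "no-cache"}
    scope = "public" if public else "private"   # private: browser cache only
    return {"Cache-Control": f"{scope}, max-age={max_age_seconds}"}


print(cache_headers(3600))                  # static content, shared caches allowed
print(cache_headers(60, public=False))      # per-user page, browser cache only
```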
36. Demo: Hammering the rabit again
37. Balance your workloads
• Visual Studio projects force you into a 1 logical role = 1 physical role instance mindset
– Website = web role, background process = worker role
– Becomes expensive and wastes a lot of capacity
• Combine different types of workload in the same web role instance
– Website (bandwidth heavy)
– Background process (CPU heavy)
• Immediate 50% cost reduction!
[Diagram: a single web role instance hosting both the website and the background process]
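Hosting both workloads in one instance can be pictured as a background worker running next to the request handling code in the same process. The sketch below uses a thread and an in-memory job queue purely for illustration; the function name and the job format are assumptions, and an actual web role would use the platform's own hosting model.

```python
import queue
import threading


def start_background_worker(jobs, results):
    """Run queued jobs on a separate thread next to the request handling code."""
    def run():
        while True:
            job = jobs.get()
            if job is None:          # sentinel: stop the worker
                break
            results.append(job())    # CPU-heavy work happens here
    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


jobs = queue.Queue()
results = []
worker = start_background_worker(jobs, results)

# The website part of the instance hands off work and returns straight away.
jobs.put(lambda: sum(range(1000)))
jobs.put(None)
worker.join(timeout=5)
print(results)
```

The website mostly waits on bandwidth while the worker uses the otherwise idle CPU, so one instance does the job of two.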
38. Monitor your roles
• Ideally all roles operate at 80% overall capacity utilisation
– Leaves room for sudden peaks
– Still efficient use of the capacity you rented
• Monitoring your roles is key
– Add performance counters for CPU, Memory, …
– Store measurements in Windows Azure Storage
• On premises monitoring software
– Polls storage for metrics
– E.g. Cerebrata Diagnostics Manager
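Once measurements are stored, the monitoring side boils down to averaging the samples and comparing them with the 80% target. The sketch below assumes the samples have already been read from storage into dictionaries; the counter names and values are made up for illustration.

```python
from statistics import mean


def summarise(samples):
    """Average each counter over all samples (utilisation as a 0..1 fraction)."""
    counters = samples[0].keys()
    return {name: mean(sample[name] for sample in samples) for name in counters}


def over_target(summary, target=0.80):
    """Counters whose average utilisation exceeds the target."""
    return sorted(name for name, value in summary.items() if value > target)


# Hypothetical measurements for one role, as they might be polled from storage.
samples = [
    {"cpu": 0.50, "memory": 0.90},
    {"cpu": 0.70, "memory": 0.90},
]

summary = summarise(samples)
print(summary, over_target(summary))
```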
39. The holy grail
• Smart auto scaling & dynamic workload allocation
[Diagram: three roles, each filled with a different mix of Bandwidth, Memory, CPU and Disk workloads; scale out at 80%]
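A very simplified sketch of the scaling half of this idea: pick the instance count that brings the busiest resource back to the 80% target. The function name and the sample utilisation figures are assumptions; real auto scaling would also need smoothing and cool-down periods, which are left out here.

```python
import math


def desired_instances(current_instances, utilisation, target=0.80):
    """Instance count that brings the busiest resource back to the target."""
    busiest = max(utilisation.values())
    return max(1, math.ceil(current_instances * busiest / target))


# Two instances with bandwidth at 95%: scale out.
print(desired_instances(2, {"cpu": 0.40, "bandwidth": 0.95}))

# Four instances with everything at or below 30%: scale in.
print(desired_instances(4, {"cpu": 0.30, "memory": 0.25}))
```

Driving the decision from the busiest resource, rather than from CPU alone, matches the earlier example where bandwidth was the limit.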
40. Key Success Factors
Global,
Short time to market,
Performant,
Highly scalable,
Highly available,
Relatively cheap,
Easy
41. Agenda
• Introduction
• Rabits?
• Cloud Power
• The Weakest Link
• Understanding Capacity
• Self Everything
42. Issues of scale
• Rabits join millions of people all over the world
• Some traditional tasks suddenly become very hard
• How do you do these at scale?
– End user training
– Helpdesk & support
– User acceptance tests
– …
43. Self everything
• Self service
– Signup, pay, use, maintain…
• Self marketing
– Use the power of social networks
• Self supporting
– Easy to use, intuitive UI
– Build a community for support
• Self educating, testing
– Offer early betas to the public
– Provide means for feedback
44. Key Success Factors
Global,
Short time to market,
Performant,
Highly scalable,
Highly available,
Relatively cheap,
Easy