3. About me
Program Manager @ Edgar Online, RRD
Windows Azure MVP
Co-organizer of Odessa .NET User Group
Ukrainian IT Awards 2013 Winner – Software Engineering
http://cloudytimes.azurewebsites.net/
http://www.linkedin.com/in/antonvidishchev
https://www.facebook.com/anton.vidishchev
5. Windows Azure Storage
Cloud Storage - Anywhere and anytime access
Blobs, Disks, Tables and Queues
Highly Durable, Available and Massively Scalable
Easily build “internet scale” applications
10 trillion stored objects
900K request/sec on average (2.3+ trillion per month)
Pay for what you use
Exposed via easy and open REST APIs
Client libraries in .NET, Java, Node.js, Python, PHP, Ruby
12. Design Goals
Highly Available with Strong Consistency
Provide access to data in the face of failures/partitioning
Durability
Replicate data several times within and across regions
Scalability
Need to scale to zettabytes
Provide a global namespace to access data around the world
Automatically scale out and load balance data to meet peak traffic demands
13. Windows Azure Storage Stamps
Access blob storage via the URL: http://<account>.blob.core.windows.net/
[Diagram] A Location Service directs data access to storage stamps. Each storage stamp sits behind a load balancer (LB) and is layered into Front-Ends, a Partition Layer, and a DFS Layer; intra-stamp replication runs within each stamp's DFS Layer, and inter-stamp (geo) replication runs between stamps.
15. Availability with Consistency for Writing
All writes are appends to the end of a log, which is an append to the last extent in the log
Write Consistency across all replicas for an extent:
Appends are ordered the same across all 3 replicas for an extent (file)
Only return success if all 3 replica appends are committed to storage
When an extent gets to a certain size, or on write failure/LB, seal the extent's replica set and never append any more data to it
Write Availability: To handle failures during write
Seal the extent's replica set
Append immediately to a new extent (replica set) on 3 other available nodes
Add this new extent to the end of the partition's log (stream)
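A toy, self-contained C# simulation of the seal-and-continue append protocol just described; every type and method name here is invented for illustration and is not the actual service code:

// Toy simulation of the append/seal protocol above; all names are invented.
using System.Collections.Generic;
using System.Linq;

class Extent
{
    public bool Sealed;                        // a sealed extent never takes appends
    public List<byte[]>[] Replicas =           // 3 bit-wise identical replicas
        { new List<byte[]>(), new List<byte[]>(), new List<byte[]>() };

    public bool TryAppend(byte[] record, bool allReplicasHealthy)
    {
        if (Sealed || !allReplicasHealthy) return false;
        foreach (var r in Replicas) r.Add(record); // same order on all 3 replicas
        return true;                               // success only after all 3 commit
    }
}

class PartitionLog
{
    readonly List<Extent> extents = new List<Extent> { new Extent() };

    public void Append(byte[] record, bool allReplicasHealthy)
    {
        if (extents.Last().TryAppend(record, allReplicasHealthy)) return;
        extents.Last().Sealed = true;     // seal the failed/full replica set...
        extents.Add(new Extent());        // ...and continue on a fresh extent
        extents.Last().TryAppend(record, true);
    }
}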
16. Availability with Consistency for Reading
Read Consistency: Can read from any replica, since the data in each replica for an extent is bit-wise identical
Read Availability: Send out parallel read requests if the first read is taking longer than the 95th-percentile latency
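The same pattern is useful client-side. A minimal sketch of a hedged read, assuming read is any async read call you supply and p95 is your observed 95th-percentile latency:

// Minimal sketch of a hedged read: if the first request runs past the
// observed 95th-percentile latency, race it against a second request.
using System;
using System.Threading.Tasks;

static async Task<byte[]> HedgedReadAsync(Func<Task<byte[]>> read, TimeSpan p95)
{
    Task<byte[]> first = read();
    if (await Task.WhenAny(first, Task.Delay(p95)) == first)
        return await first;                         // fast path: done within p95

    Task<byte[]> second = read();                   // slow: start a parallel read
    return await await Task.WhenAny(first, second); // take whichever wins
}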
17. Dynamic Load Balancing – Partition Layer
Spreads index/transaction processing across partition servers
Master monitors traffic load/resource utilization on partition servers
Dynamically load balances partitions across servers to achieve better performance/availability
Does not move data around, only reassigns what part of the index a partition server is responsible for
18. Dynamic Load Balancing – DFS Layer
DFS read load balancing across replicas
Monitor latency/load on each node/replica; dynamically select which replica to read from, and start additional reads in parallel based on the 95th-percentile latency
19. Architecture Summary
Durability: All data is stored with at least 3 replicas
Consistency: All committed data across all 3 replicas is identical
Availability: Can read from any of the 3 replicas; if any issues occur while writing, seal the extent and continue appending to a new extent
Performance/Scale: Retry based on 95th-percentile latencies; auto scale out and load balance based on load/capacity
Additional details can be found in the SOSP paper: "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency", ACM Symposium on Operating Systems Principles (SOSP), Oct. 2011
21. General .NET Best Practices For Azure Storage
Disable Nagle for small messages (< 1400 bytes)
ServicePointManager.UseNagleAlgorithm = false;
Disable Expect 100-Continue*
ServicePointManager.Expect100Continue = false;
Increase the default connection limit
ServicePointManager.DefaultConnectionLimit = 100; // or more
Take advantage of the .NET 4.5 GC
GC performance is greatly improved
Background GC: http://msdn.microsoft.com/en-us/magazine/hh882452.aspx
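Taken together, these are one-time settings; a minimal sketch of applying them at application startup (100 is the value suggested above - tune the connection limit to your workload):

// Apply the recommended ServicePointManager settings once at startup.
using System.Net;

static void ConfigureStorageDefaults()
{
    ServicePointManager.UseNagleAlgorithm = false;    // don't delay small messages
    ServicePointManager.Expect100Continue = false;    // skip the extra round trip
    ServicePointManager.DefaultConnectionLimit = 100; // default is far lower
}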
22. General Best Practices
Locate storage accounts close to compute/users
Understand account scalability targets
Use multiple storage accounts to get more than a single account's targets (see the sketch below)
Distribute your storage accounts across regions
Consider heating up the storage for better performance
Cache critical data sets: to get more requests/sec than the account/partition targets, and as a backup data set to fall back on
Distribute load over many partitions and avoid spikes
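A minimal sketch of spreading blobs across several accounts; the account list and the FNV-1a hash are illustrative choices, not part of the deck:

// Deterministically map each blob name to one of several storage accounts
// so load spreads across accounts (and stays stable across restarts).
static string PickAccount(string blobName, string[] accounts)
{
    uint h = 2166136261;                                  // FNV-1a offset basis
    foreach (char c in blobName) h = (h ^ c) * 16777619;  // FNV-1a prime
    return accounts[h % (uint)accounts.Length];
}

Because the mapping is deterministic, the same blob name always resolves to the same account, so reads find the data that writes placed there.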
23. General Best Practices (cont.)
Use HTTPS
Optimize what you send & receive
Blobs: Range reads, Metadata, HEAD requests
Tables: Upsert, Projection, Point queries
Queues: Update Message
Control parallelism at the application layer
Unbounded parallelism can lead to high latencies and throttling
Enable Logging & Metrics on each storage service
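A sketch of the blob items above using the classic Microsoft.WindowsAzure.Storage client; the connection string, container, and blob names are placeholders:

// Fetch only what you need: a HEAD request for properties, then a range read.
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

var account = CloudStorageAccount.Parse(connectionString); // placeholder string
CloudBlockBlob blob = account.CreateCloudBlobClient()
    .GetContainerReference("data")
    .GetBlockBlobReference("big.log");

blob.FetchAttributes();                  // HEAD request: properties/metadata only
long size = blob.Properties.Length;

var tail = new byte[64 * 1024];          // range read: just the last 64 KB
blob.DownloadRangeToByteArray(tail, 0, size - tail.Length, tail.Length);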
24. Blob Best Practices
Try to match your read size with your write size
Avoid reading small ranges on blobs with large blocks
CloudBlockBlob.StreamMinimumReadSizeInBytes / StreamWriteSizeInBytes
How do I upload a folder the fastest?
Upload multiple blobs simultaneously
How do I upload a blob the fastest?
Use parallel block upload
Concurrency (C): Multiple workers upload different blobs
Parallelism (P): Multiple workers upload different blocks for the same blob
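A sketch of both strategies with the classic .NET client (8.x-style APIs assumed; the folder path is a placeholder, and blob/container come from the earlier sketch):

using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

// Parallelism (P): one blob, up to 30 blocks in flight at once.
var options = new BlobRequestOptions { ParallelOperationThreadCount = 30 };
blob.UploadFromFile(@"C:\upload\big.bin", options: options);

// Concurrency (C): many blobs uploaded simultaneously.
string[] files = Directory.GetFiles(@"C:\upload");
await Task.WhenAll(files.Select(f =>
    container.GetBlockBlobReference(Path.GetFileName(f)).UploadFromFileAsync(f)));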
25. Concurrency vs. Blob Parallelism
C=1, P=1 => averaged ~13.2 MB/s
C=1, P=30 => averaged ~50.72 MB/s
C=30, P=1 => averaged ~96.64 MB/s
A single TCP connection is bound by TCP rate control & RTT
C=30 vs. P=30: the test completed almost twice as fast!
A single blob is bound by the limits of a single partition
Accessing multiple blobs concurrently scales
[Chart] XL VM uploading 512 blobs of 256 MB each (total upload size = 128 GB); y-axis: Time (s), 0-10,000
27. Table Best Practices
Critical Queries: Select by PartitionKey and RowKey to avoid hotspots
Table scans are expensive – avoid them at all costs for latency-sensitive scenarios
Batch: Use the same PartitionKey for entities that need to be updated together
Schema-less: Store multiple types in the same table
Single Index – {PartitionKey, RowKey}: If needed, concatenate columns to form composite keys
Entity Locality: {PartitionKey, RowKey} determines the sort order; store related entities together to reduce IO and improve performance
Table Service Client Layer in 2.1 and 2.2: Dramatic performance improvements and a better NoSQL interface
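A sketch of a point query and a single-partition batch using the classic Microsoft.WindowsAzure.Storage.Table client; the table, key, and entity names are examples:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

CloudTable table = CloudStorageAccount.Parse(connectionString) // placeholder
    .CreateCloudTableClient().GetTableReference("orders");

// Point query: PartitionKey + RowKey hits the single index directly.
TableResult hit = table.Execute(
    TableOperation.Retrieve<DynamicTableEntity>("customer-42", "order-0001"));

// Batch: entities sharing a PartitionKey commit together in one request.
var batch = new TableBatchOperation();
batch.Insert(new DynamicTableEntity("customer-42", "order-0002"));
batch.Insert(new DynamicTableEntity("customer-42", "order-0003"));
table.ExecuteBatch(batch);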
28. Queue Best Practices
Make message processing idempotent: messages become visible again if the client worker fails to delete the message
Benefit from Update Message: extend the visibility time based on the message, or save intermediate state
Message Count: use it to scale workers
Dequeue Count: use it to identify poison messages, or to validate the invisibility time used
Blobs to store large messages: increase throughput by having larger batches
Multiple Queues: to get more than a single queue (partition) target
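A sketch of an idempotent worker step using the classic Microsoft.WindowsAzure.Storage.Queue client; the queue name, timeouts, and poison threshold are example values:

using System;
using Microsoft.WindowsAzure.Storage.Queue;

static void ProcessOne(CloudQueue queue)
{
    CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(1));
    if (msg == null) return;                  // nothing to do

    if (msg.DequeueCount > 5)                 // poison message: stop retrying
    {
        queue.DeleteMessage(msg);
        return;
    }

    // Long-running work: extend the lease so no other worker sees the
    // message while we are still processing it.
    queue.UpdateMessage(msg, TimeSpan.FromMinutes(5), MessageUpdateFields.Visibility);

    // ... do the work; it must be safe to repeat, because if we crash
    // before the delete below, the message becomes visible again.
    queue.DeleteMessage(msg);
}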