3. About me
Program Manager @ Edgar Online, RRD
Windows Azure MVP
Co-organizer of Odessa .NET User Group
Ukrainian IT Awards 2013 Winner – Software Engineering
http://cloudytimes.azurewebsites.net/
http://www.linkedin.com/in/antonvidishchev
https://www.facebook.com/anton.vidishchev
5. Windows Azure Storage
Cloud Storage - Anywhere and anytime access
Blobs, Disks, Tables and Queues
Highly Durable, Available and Massively Scalable
Easily build “internet scale” applications
10 trillion stored objects
900K request/sec on average (2.3+ trillion per month)
Pay for what you use
Exposed via easy and open REST APIs
Client libraries in .NET, Java, Node.js, Python, PHP, Ruby
12. Design Goals
Highly Available with Strong Consistency
Provide access to data in the face of failures/partitioning
Durability
Replicate data several times within and across regions
Scalability
Need to scale to zettabytes
Provide a global namespace to access data around the world
Automatically scale out and load balance data to meet peak traffic demands
13. Windows Azure Storage Stamps
Access blob storage via the URL: http://<account>.blob.core.windows.net/
[Diagram] A Location Service directs data access to storage stamps. Each storage stamp sits behind a load balancer (LB) and is layered into Front-Ends, a Partition Layer, and a DFS Layer; intra-stamp replication runs within each stamp's DFS Layer, and inter-stamp (geo) replication runs between stamps.
15. Availability with Consistency for Writing
All writes are appends to the end of a log, which is an append to the last extent in the log
Write Consistency across all replicas for an extent:
Appends are ordered the same across all 3 replicas for an extent (file)
Only return success if all 3 replica appends are committed to storage
When an extent gets to a certain size, or on write failure/LB, seal the extent's replica set and never append any more data to it
Write Availability: To handle failures during write
Seal the extent's replica set
Append immediately to a new extent (replica set) on 3 other available nodes
Add this new extent to the end of the partition's log (stream)
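A toy, self-contained C# simulation of the seal-and-continue append protocol just described; every type and method name here is invented for illustration and is not the actual service code:

// Toy simulation of the append/seal protocol above; all names are invented.
using System.Collections.Generic;
using System.Linq;

class Extent
{
    public bool Sealed;                        // a sealed extent never takes appends
    public List<byte[]>[] Replicas =           // 3 bit-wise identical replicas
        { new List<byte[]>(), new List<byte[]>(), new List<byte[]>() };

    public bool TryAppend(byte[] record, bool allReplicasHealthy)
    {
        if (Sealed || !allReplicasHealthy) return false;
        foreach (var r in Replicas) r.Add(record); // same order on all 3 replicas
        return true;                               // success only after all 3 commit
    }
}

class PartitionLog
{
    readonly List<Extent> extents = new List<Extent> { new Extent() };

    public void Append(byte[] record, bool allReplicasHealthy)
    {
        if (extents.Last().TryAppend(record, allReplicasHealthy)) return;
        extents.Last().Sealed = true;     // seal the failed/full replica set...
        extents.Add(new Extent());        // ...and continue on a fresh extent
        extents.Last().TryAppend(record, true);
    }
}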
16. Availability with Consistency for Reading
Read Consistency: Can read from any replica, since the data in each replica for an extent is bit-wise identical
Read Availability: Send out parallel read requests if the first read is taking longer than the 95th-percentile latency
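The same pattern is useful client-side. A minimal sketch of a hedged read, assuming read is any async read call you supply and p95 is your observed 95th-percentile latency:

// Minimal sketch of a hedged read: if the first request runs past the
// observed 95th-percentile latency, race it against a second request.
using System;
using System.Threading.Tasks;

static async Task<byte[]> HedgedReadAsync(Func<Task<byte[]>> read, TimeSpan p95)
{
    Task<byte[]> first = read();
    if (await Task.WhenAny(first, Task.Delay(p95)) == first)
        return await first;                         // fast path: done within p95

    Task<byte[]> second = read();                   // slow: start a parallel read
    return await await Task.WhenAny(first, second); // take whichever wins
}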
17. Dynamic Load Balancing – Partition Layer
Spreads index/transaction processing across partition servers
Master monitors traffic load/resource utilization on partition servers
Dynamically load balances partitions across servers to achieve better performance/availability
Does not move data around, only reassigns what part of the index a partition server is responsible for
18. Dynamic Load Balancing – DFS Layer
DFS read load balancing across replicas
Monitor latency/load on each node/replica; dynamically select which replica to read from, and start additional reads in parallel based on the 95th-percentile latency
19. Architecture Summary
Durability: All data is stored with at least 3 replicas
Consistency: All committed data across all 3 replicas is identical
Availability: Can read from any of the 3 replicas; if any issues occur while writing, seal the extent and continue appending to a new extent
Performance/Scale: Retry based on 95th-percentile latencies; auto scale out and load balance based on load/capacity
Additional details can be found in the SOSP paper: "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency", ACM Symposium on Operating Systems Principles (SOSP), Oct. 2011
21. General .NET Best Practices For Azure Storage
Disable Nagle for small messages (< 1400 bytes)
ServicePointManager.UseNagleAlgorithm = false;
Disable Expect 100-Continue*
ServicePointManager.Expect100Continue = false;
Increase the default connection limit
ServicePointManager.DefaultConnectionLimit = 100; // or more
Take advantage of the .NET 4.5 GC
GC performance is greatly improved
Background GC: http://msdn.microsoft.com/en-us/magazine/hh882452.aspx
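Taken together, these are one-time settings; a minimal sketch of applying them at application startup (100 is the value suggested above - tune the connection limit to your workload):

// Apply the recommended ServicePointManager settings once at startup.
using System.Net;

static void ConfigureStorageDefaults()
{
    ServicePointManager.UseNagleAlgorithm = false;    // don't delay small messages
    ServicePointManager.Expect100Continue = false;    // skip the extra round trip
    ServicePointManager.DefaultConnectionLimit = 100; // default is far lower
}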
22. General Best Practices
Locate storage accounts close to compute/users
Understand account scalability targets
Use multiple storage accounts to get more than a single account's targets (see the sketch below)
Distribute your storage accounts across regions
Consider heating up the storage for better performance
Cache critical data sets: to get more requests/sec than the account/partition targets, and as a backup data set to fall back on
Distribute load over many partitions and avoid spikes
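A minimal sketch of spreading blobs across several accounts; the account list and the FNV-1a hash are illustrative choices, not part of the deck:

// Deterministically map each blob name to one of several storage accounts
// so load spreads across accounts (and stays stable across restarts).
static string PickAccount(string blobName, string[] accounts)
{
    uint h = 2166136261;                                  // FNV-1a offset basis
    foreach (char c in blobName) h = (h ^ c) * 16777619;  // FNV-1a prime
    return accounts[h % (uint)accounts.Length];
}

Because the mapping is deterministic, the same blob name always resolves to the same account, so reads find the data that writes placed there.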
23. General Best Practices (cont.)
Use HTTPS
Optimize what you send & receive
Blobs: Range reads, Metadata, HEAD requests
Tables: Upsert, Projection, Point queries
Queues: Update Message
Control parallelism at the application layer
Unbounded parallelism can lead to high latencies and throttling
Enable Logging & Metrics on each storage service
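A sketch of the blob items above using the classic Microsoft.WindowsAzure.Storage client; the connection string, container, and blob names are placeholders:

// Fetch only what you need: a HEAD request for properties, then a range read.
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

var account = CloudStorageAccount.Parse(connectionString); // placeholder string
CloudBlockBlob blob = account.CreateCloudBlobClient()
    .GetContainerReference("data")
    .GetBlockBlobReference("big.log");

blob.FetchAttributes();                  // HEAD request: properties/metadata only
long size = blob.Properties.Length;

var tail = new byte[64 * 1024];          // range read: just the last 64 KB
blob.DownloadRangeToByteArray(tail, 0, size - tail.Length, tail.Length);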
24. Blob Best Practices
Try to match your read size with your write size
Avoid reading small ranges on blobs with large blocks
CloudBlockBlob.StreamMinimumReadSizeInBytes / StreamWriteSizeInBytes
How do I upload a folder the fastest?
Upload multiple blobs simultaneously
How do I upload a blob the fastest?
Use parallel block upload
Concurrency (C): Multiple workers upload different blobs
Parallelism (P): Multiple workers upload different blocks for the same blob
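A sketch of both strategies with the classic .NET client (8.x-style APIs assumed; the folder path is a placeholder, and blob/container come from the earlier sketch):

using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

// Parallelism (P): one blob, up to 30 blocks in flight at once.
var options = new BlobRequestOptions { ParallelOperationThreadCount = 30 };
blob.UploadFromFile(@"C:\upload\big.bin", options: options);

// Concurrency (C): many blobs uploaded simultaneously.
string[] files = Directory.GetFiles(@"C:\upload");
await Task.WhenAll(files.Select(f =>
    container.GetBlockBlobReference(Path.GetFileName(f)).UploadFromFileAsync(f)));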
25. Concurrency vs. Blob Parallelism
C=1, P=1 => averaged ~13.2 MB/s
C=1, P=30 => averaged ~50.72 MB/s
C=30, P=1 => averaged ~96.64 MB/s
A single TCP connection is bound by TCP rate control & RTT
C=30 vs. P=30: the test completed almost twice as fast!
A single blob is bound by the limits of a single partition
Accessing multiple blobs concurrently scales
[Chart] XL VM uploading 512 blobs of 256 MB each (total upload size = 128 GB); y-axis: Time (s), 0-10,000
27. Table Best Practices
Critical Queries: Select by PartitionKey and RowKey to avoid hotspots
Table scans are expensive – avoid them at all costs for latency-sensitive scenarios
Batch: Use the same PartitionKey for entities that need to be updated together
Schema-less: Store multiple types in the same table
Single Index – {PartitionKey, RowKey}: If needed, concatenate columns to form composite keys
Entity Locality: {PartitionKey, RowKey} determines the sort order; store related entities together to reduce IO and improve performance
Table Service Client Layer in 2.1 and 2.2: Dramatic performance improvements and a better NoSQL interface
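A sketch of a point query and a single-partition batch using the classic Microsoft.WindowsAzure.Storage.Table client; the table, key, and entity names are examples:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

CloudTable table = CloudStorageAccount.Parse(connectionString) // placeholder
    .CreateCloudTableClient().GetTableReference("orders");

// Point query: PartitionKey + RowKey hits the single index directly.
TableResult hit = table.Execute(
    TableOperation.Retrieve<DynamicTableEntity>("customer-42", "order-0001"));

// Batch: entities sharing a PartitionKey commit together in one request.
var batch = new TableBatchOperation();
batch.Insert(new DynamicTableEntity("customer-42", "order-0002"));
batch.Insert(new DynamicTableEntity("customer-42", "order-0003"));
table.ExecuteBatch(batch);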
28. Queue Best Practices
Make message processing idempotent: messages become visible again if the client worker fails to delete the message
Benefit from Update Message: extend the visibility time based on the message, or save intermediate state
Message Count: use it to scale workers
Dequeue Count: use it to identify poison messages, or to validate the invisibility time used
Blobs to store large messages: increase throughput by having larger batches
Multiple Queues: to get more than a single queue (partition) target
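A sketch of an idempotent worker step using the classic Microsoft.WindowsAzure.Storage.Queue client; the queue name, timeouts, and poison threshold are example values:

using System;
using Microsoft.WindowsAzure.Storage.Queue;

static void ProcessOne(CloudQueue queue)
{
    CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(1));
    if (msg == null) return;                  // nothing to do

    if (msg.DequeueCount > 5)                 // poison message: stop retrying
    {
        queue.DeleteMessage(msg);
        return;
    }

    // Long-running work: extend the lease so no other worker sees the
    // message while we are still processing it.
    queue.UpdateMessage(msg, TimeSpan.FromMinutes(5), MessageUpdateFields.Visibility);

    // ... do the work; it must be safe to repeat, because if we crash
    // before the delete below, the message becomes visible again.
    queue.DeleteMessage(msg);
}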