Slides for my presentation at the Seattle Hadoop/NoSQL Meetup (http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/events/40509972/).
These slides are based on this earlier presentation: http://www.slideshare.net/MichaelRys/scaling-with-sql-server-and-sql-azure-federations.
5. MySpace’s Solution
Propagate data changes from one DB to other DBs using reliable, async Message Service (Service Broker)
And also used for
6. MySpace’s Service Dispatcher
Coordination point between all SQL Servers
Load-balanced across 30 SQL Servers
Enables multicast/broadcast functionality
18,000 ~2k msgs/sec per dispatcher SQL Server
7. MySpace Architecture
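[Diagram: the Data Tier holds six SQL Server databases partitioned by user ID range (1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, 5001-6000), each with Service Broker. In the Web Tier, the user with userId=1024 changes their status ("I change my status") and their own database is updated ("My DB gets updated"). The change reaches the other databases through the Service Dispatcher in transactions TX1 to TX5.]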
8. Many other customers using similar patterns
Online electronic stores (cannot give names)
Travel reservation systems (e.g. Choice International)
11. [Architecture diagram; slide title not recoverable. Labels on the diagram: Social Networks; Management/Ops Services; Data Backend Services; Social Services; MSN Games Web Portal; WLM Games Host; WLM Services; Bing Games Web Portal; GameBar; Front Door Router; Auth STS; Azure Data Centers.]
15. [Architecture diagram; labels recovered from the slide, exact grouping partly uncertain]
Front Door Router: 250 instances
Social Services (User DB, partitioned over 100 SQL Azure DBs): Find Friends’ Profiles; Get my Profile; Publish feed, read feed
Gamer Services (User DB, partitioned over 100 SQL Azure DBs): Find Friends’ Profiles; Get Friends highscores; Last Played; Favorites; Game Preferences; Write user specific game infos
Leaderboard DB (partitioned over 298 SQL Azure DBs): Social Leaderboards
Game Ingestion Services (Game Catalog): Game binaries; Game metadata; Disable/Enable Games from accessing services
STS
16. • Fanout: Parallel calls to multiple database partitions
• Quorum: Able to tolerate a percentage of request failures during Fanout
• Retry: Retry on database request errors
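The three patterns on this slide can be combined in a few lines of application code. The following is a minimal sketch, not the code used by the service described in the talk: the function names, the default quorum of 0.8, the retry count, and the decision to treat every exception as retryable are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def call_with_retry(query, partition, max_attempts=3):
    """Retry: run query(partition) up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return query(partition)
        except Exception as exc:  # assumption: every error is retryable
            last_error = exc
    raise last_error


def fanout(query, partitions, quorum=0.8, max_attempts=3):
    """Fanout with quorum: query all partitions in parallel and succeed
    if at least the `quorum` fraction of them returned a result."""
    def safe_call(partition):
        try:
            return True, call_with_retry(query, partition, max_attempts)
        except Exception:
            return False, None

    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        outcomes = list(pool.map(safe_call, partitions))

    results = [value for ok, value in outcomes if ok]
    if len(results) < quorum * len(partitions):
        raise RuntimeError(
            f"quorum not met: {len(results)}/{len(partitions)} partitions answered"
        )
    return results
```

Here `query` stands for any callable that runs one request against one partition. With five partitions and a quorum of 0.8, the call still succeeds when one partition stays unavailable after all retries.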
25. Federation
Represents the data being sharded
Federation Root
Database that logically houses federations, contains federation meta data
Federation Key
Value that determines the routing of a piece of data (defines a Federation Distribution)
Federation Member (aka Shard)
A physical container for a set of federated tables for a specific key range and reference tables
Federated Table
Table that contains only atomic units for the member’s key range
Reference Table
Non-sharded table
Atomic Unit
All rows with the same federation key value: always together!
[Diagram: a Sharded Application connects through the Connection Gateway to an Azure DB with Federation Root (Federation Directories, Federation Users, Federation Distributions, …). Federation “Orders_Fed” (Federation Key: CustomerID) has three members: Member PK [min, 100) with atomic units PK=5, PK=25, PK=35; Member PK [100, 488) with atomic units PK=105, PK=235, PK=365; Member PK [488, max) with atomic units PK=555, PK=2545, PK=3565.]
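To make the routing role of the federation key concrete, here is a small sketch of range-based lookup using the split points from the “Orders_Fed” example above. It only illustrates the idea of mapping a key to a member with half-open ranges; the member names and the `member_for` function are hypothetical, and the real lookup is done by the service, not by application code.

```python
from bisect import bisect_right

# Split points from the slide: members cover [min, 100), [100, 488), [488, max).
SPLIT_POINTS = [100, 488]
MEMBERS = ["member_0", "member_1", "member_2"]  # hypothetical member names


def member_for(federation_key):
    """Return the member whose half-open key range contains federation_key."""
    return MEMBERS[bisect_right(SPLIT_POINTS, federation_key)]
```

Because every row with the same federation key value maps to the same member, an atomic unit is never spread across members.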
31. Federation with a Single Shard
CREATE FEDERATION sales (customer_id bigint RANGE)
Database root contains:
• Federation root = DB level object containing federation scheme
• Federation users
• Federation metadata incl. federation map
[Diagram: the connection (Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;) goes through the Gateway to the Existing Database, which holds the federation sales and a single Federation Member with Range: Min...Max.]
32. Introducing Two Connection Modes
• Filtered Connection
– Guarantees that any queries or DML will produce the same results independent of changes to the physical layout of the federation members
– Scoped to an “Atomic Unit”
• Unfiltered Connection
– Scoped to a Federation Member
– Management Connection
35. More Detail
• Supported data types for federation key: bigint, int, GUID, and varbinary(900)
– Only range partitioning
• Federation key must be part of unique index
• Foreign key constraints only allowed between federated tables and from federated table to reference table
• Not all Azure programmability features supported
– Sequence, timestamp
• Additional surface area restrictions
– Indexed views, drop database (members)
• Schemas are allowed to diverge over time
– Furthermore, in v1, schema updates to existing members must be done in each member (where the change is needed)
• USE FEDERATION “rewires a connection”
– Connection is reestablished
– All existing settings and context of the connection are lost (sp_reset_connection)
– Must be in a batch by itself
36. Connect to Atomic Unit: Filtered
USE FEDERATION sales (customer_id=3) WITH FILTERING=ON, RESET;
SELECT * from customer
SELECT * from product
SELECT * from order
When connecting to a specific key value with USE FEDERATION, SELECT will only return records from federated tables that match that value. It will still return all records from non-federated tables.
Inserts and UPDATES operating outside of the value will fail.
[Diagram: the connection (Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;) goes through the Gateway to the Existing Database with the federation sales. The Federation Member with Range: Min...Max holds the tables customer, order and product; with the filter on customer_id=3, SELECT * from customer returns only customer 3.]
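The filtered-connection behavior described on this slide can be mimicked with a toy in-memory model. This is only an illustration of the semantics stated above (federated tables filtered to one key value, non-federated tables returned in full, writes outside the key value rejected); the class and its methods are hypothetical and are not part of any SQL Azure client API.

```python
class FilteredConnection:
    """Toy model of a connection scoped to one atomic unit."""

    def __init__(self, federated, reference, key_column, key_value):
        self.federated = federated    # table name -> list of row dicts
        self.reference = reference    # table name -> list of row dicts
        self.key_column = key_column
        self.key_value = key_value

    def select(self, table):
        # Federated tables: only rows matching the connection's key value.
        if table in self.federated:
            return [row for row in self.federated[table]
                    if row[self.key_column] == self.key_value]
        # Non-federated (reference) tables: all rows.
        return list(self.reference[table])

    def insert(self, table, row):
        # Writes outside the connection's key value fail.
        if row[self.key_column] != self.key_value:
            raise ValueError("row is outside the filtered key value")
        self.federated[table].append(row)
```

For example, a connection created with `key_column="customer_id"` and `key_value=3` sees only customer 3 in the customer table but every row of the product table.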
37. More on Connection Filtering
• Most operations behave differently in filtered vs unfiltered connections
• Connection filtering is a property of the session
– Filter injected dynamically at runtime
– Cannot inspect source code to determine how it behaves
• E.g., running stored proc written for filtered mode on unfiltered connection could lead to unintended results
• There are several operations that will not work in filtered connection in v1
– DDL, DML on reference tables, …
• Fan-out, bulk operations not efficient in filtered mode
– For now, filter=off is our best offer
38. Support Matrix
Connection Type (P = supported, X = not supported)
Operation                                      Filtered   Unfiltered   Named (unfiltered)
Dynamic SELECT                                 P          P            P
DML* (federated tables)                        P          P            P
DML* (reference tables)                        X          P            P
DDL                                            X          P            P
Views (not indexed)                            P          P            P
UDF - activate                                 P          P            P
Stored Proc - activate                         P          P            P
Trigger (all modes) - activate                 P          P            P
CREATE/UPDATE Stats^                           X          P            P
Bulk Ops (openrowset bulk, bcp, bulk insert)   X          P            P
* not including SELECT & modules
^ autostats will work on all connections
System stored procs, intrinsics will be unaffected (run unfiltered)
39. Splitting a Member
USE FEDERATION ROOT WITH RESET
ALTER FEDERATION sales SPLIT AT (customer_id=50)
Connecting to the federation ROOT with USE FEDERATION will pop you out of a member back into the database that hosts the federation.
[Diagram: the connection (Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;) goes through the Gateway to the Existing Database with the federation sales. The single Federation Member with Range: Min...Max holds the tables customer, order and product, with rows for customer IDs 3, 40 and 58.]
40. Two New Members
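USE FEDERATION ROOT WITH RESET
ALTER FEDERATION sales SPLIT AT (customer_id=50)
[Diagram: the connection (Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;) goes through the Gateway to the Existing Database with the federation sales, which now has two Federation Members. The member with Range: Min...50 holds the customer, order and product tables with the rows for customer IDs 3 and 40; the member with Range: 51...Max holds the same tables with the rows for customer ID 58.]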
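The effect of the split can be sketched as a pure function over an in-memory list of members. This is a simplified model, not the service's implementation: it assumes the half-open range notation used earlier in the deck ([min, 100), [100, 488), …), so a key equal to the split value lands in the upper member, whereas the diagram above labels the ranges Min...50 and 51...Max. The tuple layout and function name are hypothetical.

```python
import math


def split_member(members, split_at):
    """Split the member whose half-open range [low, high) contains split_at.

    members: list of (low, high, keys) tuples, where keys are the
    federation key values stored in that member.
    """
    result = []
    for low, high, keys in members:
        if low < split_at < high:
            # Rows move with their atomic unit: all rows of one key stay together.
            result.append((low, split_at, [k for k in keys if k < split_at]))
            result.append((split_at, high, [k for k in keys if k >= split_at]))
        else:
            result.append((low, high, keys))
    return result
```

Starting from one member covering min to max with keys 3, 40 and 58, splitting at 50 yields one member with keys 3 and 40 and one with key 58, matching the diagram.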
41. Two New Members
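USE FEDERATION sales (customer_id=40) WITH FILTERING=ON, RESET;
SELECT * from customer
SELECT * from order
[Diagram: the connection (Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;) goes through the Gateway to the Existing Database with the federation sales and is routed to the member with Range: Min...50. With the filter on customer_id=40, SELECT * from customer returns only customer 40; the rows for customer ID 58 live in the member with Range: 51...Max.]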
Editor's Notes
3000 web servers, WS2003, IIS. There is an app tier between the data and the web tier (not shown to focus on the partitioning aspect). A component in the app tier uses the partitioning function to connect to the relevant database in the data tier.
For multicast/broadcast and feedback functionality: messages include a header (as part of the payload) that contains the target list and the initiator database
MyDB sends a message with my status change and a target list specifying the DBs that store my friends’ data. The Service Dispatcher forwards the message to these DBs. Each DB processes the message, updating my status in a partitioned table. Talk about other MySpace scenarios: Clean state (e.g. on account close); Deploy business logic (stored procedures).
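The forwarding step described in these notes can be sketched as follows. This is not Service Broker code: plain Python lists stand in for the per-database queues, and the `Message` and `Dispatcher` names and fields are hypothetical. It only shows a message carrying its initiator and target list, and a dispatcher delivering it to every listed database.

```python
from dataclasses import dataclass


@dataclass
class Message:
    initiator: str   # database that produced the change
    targets: list    # names of the databases that must receive it
    payload: dict    # e.g. the status change itself


class Dispatcher:
    def __init__(self, inboxes):
        # database name -> list used as a stand-in for that database's queue
        self.inboxes = inboxes

    def forward(self, message):
        """Deliver the message to every database in its target list."""
        for name in message.targets:
            self.inboxes[name].append(message)
        return len(message.targets)
```

A broadcast is then just a message whose target list names every database known to the dispatcher.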
250 Router Services
Note: Big-sized companies invest resources in building these platforms instead of using existing relational platforms!
Performance and Scale:
• Map/Reduce patterns
• Eventual consistency (trade-off due to CAP)
• Sharding
• Caching
• Automate management
Lifecycle:
• Elastic scale on demand (no need to pay for resources until needed)
• Automatic fail-over
• Scalable schema version rollout
• Perf troubleshooting
• Auto alerting
• Auto load balancing
• Auto resourcing (e.g., auto splits based on policies)
• Declarative policy-based management
Code first and revise quickly:
• Working software over comprehensive documentation
• Responding to change over following a plan
• Application-model first (before database); dictates the data model and queries
Flexible data models:
• No a priori modeling: data first, schema later / open schema
• Key/value stores
• Reduced impedance mismatch: JSON, XML, YAML
You don’t know exactly what you are looking for:
• Map/Reduce for ad hoc analysis
• Provide search across all your data instead of just query
Lower pain of adoption and maintenance:
• From code to deployment & “monetization” of data, services, apps and tenants
• Rich services out of the box
• Data and services mashup
• Easy troubleshooting of deployed apps
• No DB or OS admin telling me what to do