3. Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
4. Symmetric Multiprocessing
SMP
Single DB instance
“Shared Everything” Architecture
Servers/CPUs share
memory
disks
Can lead to resource contention as you scale
5. Massively Parallel Processing
MPP
Servers/CPUs have their own dedicated resources
“Shared Nothing” Architecture
“Secret Sauce” is parallelizing operations
Lightning-fast Queries, Data Loads and Updates
Linear Scalability
Problem needs to be partitionable
6. SMP vs MPP
SMP | MPP
HW advancements increasing ability to scale-up | HW advancements increasing ability to scale-up & scale-out
Scaling is limited | Scaling to 1 PB+
High end SMP very expensive | Scale out is relatively low cost
Extremely high concurrency for some workloads | Relatively high concurrency for complex workloads
Less than 1-2 TB of data SMP will almost always be better | > 2 TB up to 1 PB
Full SQL Server functionality | Limited SQL Server functionality
HA must be architected in | HA is built in
7. Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
8. How some solve the problem today
Big SAN
Biggest 64-core Server
Connected together!
What’s wrong with
this picture???
9. System out of balance
This server can consume 16 GB/Sec of IO, but
the SAN can only deliver 2 GB/Sec
Even when the SAN is dedicated to the SQL Data
Warehouse, which it often isn’t
Lots of disks for random IOPS, BUT
limited controllers mean limited IO bandwidth
System is typically IO bound and queries are
slow
Despite significant investment in both Server and
Storage
10. Where Does an I/O Go?
Understand potential throughput of the hardware
Each component in the path has associated
speed/bandwidth
Know where the potential bottlenecks exist
[Figure: components on the I/O path: Host, PCI Bus, HBA, Fiber Channel Ports, Switch, Front End Ports, Array Processors (Controllers/Processors), Cache, Disks]
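The idea that each component on the path has its own bandwidth can be sketched numerically: sustained throughput is bounded by the slowest stage. This is a minimal illustration only. The 16 GB/s and 2 GB/s figures come from the "System out of balance" slide; the HBA and switch rates are hypothetical placeholders, not measured values.

```python
def bottleneck(path):
    """Return (component, rate) for the slowest stage of an I/O path."""
    name = min(path, key=path.get)
    return name, path[name]

# Rates in GB/s for each stage between the CPUs and the disks.
io_path_gb_per_sec = {
    "server CPUs (consumption rate)": 16.0,  # from the slide deck
    "HBAs": 6.4,                             # hypothetical value
    "FC switch": 6.4,                        # hypothetical value
    "SAN": 2.0,                              # from the slide deck
}

component, rate = bottleneck(io_path_gb_per_sec)
cpu_rate = io_path_gb_per_sec["server CPUs (consumption rate)"]
utilisation = rate / cpu_rate

print(f"Bottleneck: {component} at {rate} GB/s")
print(f"CPUs are fed at {utilisation:.1%} of what they can consume")
```

With these inputs the SAN is the limiting stage, which is the imbalance the previous slide describes.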
11. Potential Performance Bottlenecks
[Figure: potential bottlenecks along the I/O path. Components shown: CPU cores, SQL Server, Windows server, HBAs (ports A and B), FC switch, storage controllers A and B with cache, LUNs, disks. Rates to check at each stage: CPU Feed Rate, SQL Server Read Ahead Rate, HBA Port Rate, Switch Port Rate, SP Port Rate, LUN Read Rate, Disk Feed Rate]
12. The alternative: A balanced system
Design a server + storage configuration that can
deliver all the IO bandwidth that CPUs can
consume when executing a SQL Relational DW
workload
Avoid sharing storage devices among servers
Avoid overinvesting in disk drives
Focus on scan performance, not IOPS
Layout and manage data to maximize range
scan performance and minimize fragmentation
13. Sequential I/O
Sequential I/O:
Ideal for data warehousing
Large reads & writes
Scans on large data stores are usually read with sequential read patterns and not random read patterns
Scalable, predictable performance
Requires 1/3 or fewer drives for same performance
Random I/O:
Ideal for OLTP
Small reads and writes
OLTP usually random-read centric
Seek queries are a goal in OLTP query optimization
Seeks usually cause random reads
Not as predictable & scalable for data warehousing
Requires large number of drives
All databases contain both scans and seeks along with other types of reads and writes. DW workloads indicate that the vast majority of reads, though not all, are sequential.
14. What is Fast Track Data Warehouse?
A method for designing a cost-effective,
balanced system for Data Warehouse
workloads
Reference hardware configurations developed
in conjunction with hardware partners using
this method
Best practices for data layout, loading and
management
Relational Database Only – Not SSAS, IS, RS
15. Fast Track Scope
[Figure: Fast Track scope. Reference Architecture Scope (dashed box): Data Warehouse, Dedicated SAN / Storage Array, Data Staging, Bulk Loading. Presentation Layer Systems and Presentation Data: Analysis Services Cubes, Reporting Services, Web Analytic Tools, SharePoint Services, Microsoft Office SharePoint, PerformancePoint, Excel Services]
16. Benefits of Fast Track appliance model
Lower TCO
Minimizes risk of overspending on unbalanced hardware configurations
Commodity Hardware
Choice
HW platform
Implementation vendor
Reduced Risk
Validated by Microsoft
Encapsulates best practices
Known performance & scalability
17. Fast Track DW Reference Configurations
Server | CPU | CPU Cores | SAN | Data Drive Count | Initial Capacity* | Max Capacity**
HP ProLiant DL 385 G6 | (2) AMD Opteron Istanbul six core 2.6 GHz | 12 | (3) HP MSA2312fc | (24) 300GB 15k RPM SAS | 6TB | 12TB
HP ProLiant DL 585 G6 | (4) AMD Opteron Istanbul six core 2.6 GHz | 24 | (6) HP MSA2312fc | (48) 300GB 15k SAS | 12TB | 24TB
HP ProLiant DL 785 G6 | (8) AMD Opteron Istanbul six core 2.8 GHz | 48 | (12) HP MSA2312 | (96) 300GB 15k SAS | 24TB | 48TB
Dell PowerEdge R710 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB
Dell PowerEdge R900 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB
IBM X3650 M2 | (2) Intel Xeon Nehalem quad core 2.67 GHz | 8 | (2) IBM DS3400 | (16) 200GB 15K FC | 4TB | 8TB
IBM X3850 M2 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) IBM DS3400 | (24) 300GB 15k FC | 12TB | 24TB
IBM X3950 M2 | (8) Intel Xeon Nehalem four core 2.13 GHz | 32 | (8) IBM DS3400 | (32) 300GB 15k SAS | 16TB | 32TB
Bull Novascale R460 E2 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB
Bull Novascale R480 E1 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB
* Core-balanced compressed capacity based on 300GB 15k SAS, not including hot spares and log drives. Assumes 25% (of raw disk space) allocated for TempDB.
** Represents storage array fully populated with 300GB 15k SAS and use of 2.5:1 compression ratio. This includes the addition of one storage expansion tray per enclosure. 30% of this storage should be reserved for DBA operations.
18. Fast Track DW
Core-Balanced Architecture
Using 300GB 15k SAS drives
[Figure: SMP server connected through a switch to storage processors SP A and SP B of one MSA2312. Drives 01-10 form RAID groups GP01-GP05. SP A serves LUN1-LUN4 and SP B serves LUN5-LUN8; LUN0 holds the logs; HS marks the hot spare]
ONLY 8 data disks per 4 cores!!!
Each HBA port rated at 4Gb/s or 400MB/s, and 1600MB/s for all 4 SP ports.
Each SP rated at 500MB/s, or 1000MB/s for both SPs
Each LUN rated at 125MB/s
Each SP controls 4 LUNs at 500MB/s, or 1000MB/s per MSA DAE
Each SP port rated at 4Gb/s or 400MB/s, and 1600MB/s for all 4 SP ports.
Per MSA2312 Drive Details
• Each MSA can hold 12 drives; this configuration requires 11
• MSA is 2U in total (capacitor eliminates need for battery)
• Each MSA SP port controls 4 LUNs
• Each pair of LUNs consists of (2) 300GB 15k SAS drives RAID1
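The core-balanced arithmetic on this slide can be worked through in a few lines. This is a sketch that only recombines the ratings quoted on the slide (125MB/s per LUN, 4 LUNs per SP, 2 SPs per MSA, 400MB/s per SP port, 4 SP ports, one MSA per 4 cores). The helper `server_scan_rate` and the assumption that whole MSA building blocks scale linearly are illustrative, not part of the reference architecture documentation.

```python
# Ratings quoted on the slide, in MB/s.
LUN_MB_S = 125
LUNS_PER_SP = 4
SPS_PER_MSA = 2
SP_PORT_MB_S = 400       # 4Gb/s Fibre Channel port
SP_PORTS_PER_MSA = 4
CORES_PER_MSA = 4        # "8 data disks per 4 cores"

sp_mb_s = LUNS_PER_SP * LUN_MB_S                 # one storage processor
msa_disk_mb_s = SPS_PER_MSA * sp_mb_s            # what the LUNs can deliver
msa_port_mb_s = SP_PORTS_PER_MSA * SP_PORT_MB_S  # what the ports can carry
msa_mb_s = min(msa_disk_mb_s, msa_port_mb_s)     # sustained rate per MSA
per_core_mb_s = msa_mb_s / CORES_PER_MSA

def server_scan_rate(cores):
    """Aggregate scan rate for a server built from whole MSA blocks
    (assumes linear scaling, which is an idealisation)."""
    return (cores // CORES_PER_MSA) * msa_mb_s

print(f"Per SP: {sp_mb_s} MB/s, per MSA: {msa_mb_s} MB/s")
print(f"Per core: {per_core_mb_s} MB/s")
print(f"12-core server with 3 MSAs: {server_scan_rate(12)} MB/s")
```

The port capacity (1600MB/s) exceeds what the LUNs deliver (1000MB/s), so in this sketch the disks, not the ports, set the per-MSA rate.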
19. Fast Track Data Warehouse Components
Software:
• SQL Server 2008 Enterprise
• Windows Server 2008
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
Hardware:
• Tight specifications for servers,
storage and networking
• ‘Per core’ building block
20. RA: Tightly Spec'd
RAs include not only hardware but best
practices in:
Windows OS configuration
SQL Server startup options
Database physical layout
Table types
Indexing
Statistics
Managing fragmentation
Loading procedures
21. Fast Track Case Study - Results
Test | Teradata | SQL Server Fast Track DW | Comparison
Loading – Subject Area 1 | 5:10:21 total time | 51:31 total time | SQL Server 6x faster
Loading – Subject Area 2 | 4:36:08 total time | 1:50:01 total time | SQL Server 2.5x faster
Query times – Subject Area 1 | 3:03 avg query time (using 9 benchmark queries) | 0:15 avg query time (using 9 benchmark queries) | SQL Server 12x faster
Query times – Subject Area 2 | 56:44 avg query time (using 4 benchmark queries) | 8:09 avg query time (using 4 benchmark queries) | SQL Server 7x faster
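The comparison column can be checked from the reported times. The sketch below converts each clock value to seconds and divides. It assumes the second Fast Track load time means 1 hour 50 minutes 1 second, which is the reading consistent with the stated 2.5x.

```python
def to_seconds(clock):
    """Convert 'h:mm:ss' or 'm:ss' to a number of seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

# (Teradata time, SQL Server Fast Track time) as reported in the case study.
results = {
    "Loading, Subject Area 1": ("5:10:21", "51:31"),
    "Loading, Subject Area 2": ("4:36:08", "1:50:01"),  # assumed to be h:mm:ss
    "Query times, Subject Area 1": ("3:03", "0:15"),
    "Query times, Subject Area 2": ("56:44", "8:09"),
}

speedups = {
    name: to_seconds(teradata) / to_seconds(fast_track)
    for name, (teradata, fast_track) in results.items()
}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.1f}x faster")
```

The computed ratios come out close to the rounded 6x, 2.5x, 12x and 7x figures on the slide.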
22. Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
23. About DATAllegro…
Technology stack, built with technology partners:
Proprietary Appliance Management and MPP Database
Open Source Database and OS
Industry Standard Servers
Industry Standard Networking
Industry Standard Storage
24. Integration Plans
Provide scale out through MPP on SQL Server and Windows
Offer ‘Appliance like’ user experience to Data Warehouse customers
Lower TCO to high end Data Warehousing
Offer integrated BI platform to small and very large Enterprises
[Figure: DATAllegro stack layers: Open Source Database & OS, Industry Standard Servers, Industry Standard Networking, Industry Standard Storage]
25. MPP Additional Considerations
Principles & approach of SMP carry forward
Deeper level of complexity –
High Availability
Parallelization
Inter node data movement
26. Modular building blocks
Balanced CPU and storage
Both SMP and MPP are based on building blocks that scale
by the CPU core
Adds network, storage processing and disk bandwidth for
each core
Based on maximizing & sustaining true sequential I/O while
minimizing disks
Generally changes balance of systems so more can be
spent on CPU and SW than on storage to give better
overall performance for a given budget
Building blocks can be adjusted for multiple MPP
configurations – high performance, archive and
extreme performance
27. The future of SQL Server Data Warehousing
– Project "Madison"
Predictable Scale out through MPP
Customers with over 400 TB data warehouses
29. Ultra Shared Nothing
An extension of traditional shared nothing design
Push shared nothing architecture into SMP node
IO and CPU affinity within SMP nodes
Eliminate contention per user query
Use full resources for each user query
Multiple physical instances of tables
Distribute large tables
Replicate small tables
Distribute AND Replicate medium tables
Re-Distribute rows “on-the-fly” when necessary
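A small sketch of the distribute/replicate idea: rows of a large table are assigned to distributions by hashing a key, a small table is copied to every distribution, and each distribution can then join locally without moving data. The hash function (CRC32), the count of 8 distributions and the sample data are assumptions for illustration; the slides do not say how Madison hashes rows.

```python
import zlib

NUM_DISTRIBUTIONS = 8  # assumed count for the example

def distribution_for(key):
    """Map a distribution key to a distribution using a stable hash."""
    return zlib.crc32(str(key).encode()) % NUM_DISTRIBUTIONS

# Large fact table: (ss_item_sk, ss_quantity). Sample data only.
store_sales = [(i % 50, i) for i in range(1000)]
# Small dimension table: i_item_sk -> description.
item = {sk: f"item-{sk}" for sk in range(50)}

# Distribute the large table by hashing its key.
distributions = [[] for _ in range(NUM_DISTRIBUTIONS)]
for row in store_sales:
    distributions[distribution_for(row[0])].append(row)

# Replicate the small table: every distribution gets a full copy.
replicated_item = [dict(item) for _ in range(NUM_DISTRIBUTIONS)]

def local_join(d):
    """Join on one distribution using only data held locally."""
    return [(replicated_item[d][sk], qty) for sk, qty in distributions[d]]

joined = [row for d in range(NUM_DISTRIBUTIONS) for row in local_join(d)]
print(len(joined), "joined rows from", NUM_DISTRIBUTIONS, "local joins")
```

Joining two large tables on a key other than the distribution key would require moving rows between distributions first, which corresponds to the "re-distribute on the fly" case.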
30. Madison Server Components
[Figure: Madison server components: Control Nodes (active/passive), Compute nodes (database servers running SQL Server) with Storage, Landing Zone, Backup, Management, and a Spare Database Server for failover; nodes are linked by Dual Infiniband and Dual Fiber Channel]
31. System Architecture
[Figure: system architecture. Control Nodes (active/passive), Database Servers (SQL) and a Spare Database Server on a 20 Gb/s Infiniband DMS backbone (Dual Infiniband); Local SAN over 8 Gb/s Fiber Channel (Dual Fiber Channel). Other labels: Client Drivers, Data Center Monitoring, ETL Load Interface, Corporate Backup Solution, IPoIB, Dedicated LAN, Corporate Network, Private Network]
32. Software Architecture
[Figure: software architecture. Labels: Nexus Query Tool, MS BI (AS, RS), SSIS, Admin Console, IIS; client drivers JDBC, OLE-DB, ODBC, Ado.Net; Madison Service with Core Engine, DSQL, DMS, Services Manager and SQL OS; SQL Server holding DW Schema, Authentication, Configuration and Queue; Compute Nodes with DMS, SQL Server and User Data; Landing Zone with DMS Loader and DMS Client; Backup Node with DMS; Management Node with HPC and AD. Legend: Existing MS software, Built by DWPU, 3rd Party]
33. Control Node & Client Drivers
Client connections always go through the control node
Clustered to a passive node
Processes SQL requests
Prepares execution plan
Orchestrates distributed execution
Local SQL Server to do final query plan processing / result aggregation
Will use same set of drivers used by DATAllegro
Provided by DataDirect
ODBC, OLE-DB, JDBC and ADO.NET client drivers
Wire protocol (SequeLink)
Drivers available for 32-bit and 64-bit
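The control node's role of orchestrating distributed execution and aggregating results can be illustrated with an average: each compute node returns a partial SUM and COUNT, and the control node combines them. The function names and data below are hypothetical; this is a sketch of the general scatter/gather pattern, not Madison's actual plan format.

```python
def compute_node_partial(rows):
    """Runs on each compute node: partial SUM and COUNT over local rows."""
    return sum(rows), len(rows)

def control_node_avg(partials):
    """Runs on the control node: combine partials into the final AVG."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

# Values held by three compute nodes (sample data only).
node_data = [[10, 20, 30], [40], [50, 60]]

partials = [compute_node_partial(rows) for rows in node_data]
avg = control_node_avg(partials)

# Averaging the per-node averages would weight nodes, not rows, equally.
naive = sum(sum(rows) / len(rows) for rows in node_data) / len(node_data)

print("partials:", partials)
print("correct AVG:", avg, "| average of averages:", round(naive, 2))
```

Shipping SUM and COUNT rather than per-node averages keeps the result correct when nodes hold different numbers of rows.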
34. Compute Nodes
A SQL Server 2008 instance
DB engine nodes autonomous on local data
SQL as primary interface
Each MPP node is a highly tuned SMP node with
standard interfaces
35. Landing Zone
Provides high capacity storage for data files
from ETL processes
Integration services available on the landing
zone
Connected to internal network
Available as sandbox for other applications and
scripts that run on internal network.
[Figure: load path: Source -> Landing Zone -> Data Files -> Loader -> Compute Nodes]
36. Backup Node
Builds on SQL Server native backup/restore
facility
Use VDI interface to plug into backup pipeline
Database-level backup
Coordinated backup across the nodes
Quiesce write activity to synchronize
Can only restore to another appliance with
exactly the same number of distributions
37. Data Distribution & Replication
Tables are hash distributed or replicated
[Figure: Control Node, Compute Nodes, Storage Nodes, Landing Zone Node holding text files, Spare Node]
38. Database Distributed & Replicated Tables
Date Dim (D): D_DATE_SK, D_DATE_ID, D_DATE, D_MONTH, …
Customer (C): C_CUSTOMER_SK, C_CUSTOMER_ID, C_CURRENT_ADDR, …
Item (I): I_ITEM_SK, I_ITEM_ID, I_REC_START_DATE, I_ITEM_DESC, …
Store Sales (SS): Ss_sold_date_sk, Ss_item_sk, Ss_customer_sk, Ss_cdemo_sk, Ss_store_sk, Ss_promo_sk, Ss_quantity, …
Promotion (P): P_PROMO_SK, P_PROMO_ID, P_START_DATE_SK, P_END_DATE_SK, …
Customer Demographics (CD): CD_DEMO_SK, CD_GENDER, CD_MARITAL_STATUS, CD_EDUCATION, …
Store (S): S_STORE_SK, S_STORE_ID, S_REC_START_DATE, S_REC_END_DATE, S_STORE_NAME, …
[Figure: the tables above, abbreviated D, C, I, SS, CD, P and S, shown placed on each node]
39. Physical Storage Configuration – Single Node
LUN | LUN 1 | LUN 2 | LUN 3 | LUN 8
User Database(s), distribution FGs | FG Dist A: DistData1.mdf, DistData2.ndf | FG Dist B: DistData3.ndf, DistData4.ndf | FG Dist C: DistData5.ndf, DistData6.ndf | FG Dist H: DistData7.ndf, DistData8.ndf
User Database(s), Replicated FG | ReplData1.mdf, ReplData2.ndf | ReplData3.ndf, ReplData4.ndf | ReplData5.ndf, ReplData6.ndf | ReplData7.ndf, ReplData8.ndf
Staging Database, stage FGs | FG Stage A: StageData1.mdf, StageData2.ndf | FG Stage B: StageData3.ndf, StageData4.ndf | FG Stage C: StageData5.ndf, StageData6.ndf | FG Stage H: StageData1.ndf, StageData2.ndf
Staging Database, Replicated FG | ReplData1.mdf, ReplData2.ndf | ReplData3.ndf, ReplData4.ndf | ReplData5.ndf, ReplData6.ndf | ReplData7.ndf, ReplData.ndf
TempDB | Local Drive 1 | Local Drive 2 | Local Drive 3 | Local Drive 4 | Local Drive 5 | Local Drive 6
TempDB files | TempDB1.mdf | TempDB2.ndf | TempDB3.ndf | TempDB4.ndf | TempDB5.ndf | TempDB6.ndf
Log LUN: UserDB Log, StageDB Log, TempDB Log
40. Create Table – Behind the Scenes
Create Table store_sales
with
distribute_on (ss_item_sk)
partition_on(ss_sold_date_sk)
cluster_on (ss_sold_date_sk)
Behind the scenes:
8 Filegroups (Distribution_a thru Distribution_h), 1 Table per FG
Create Table mad_store_sales_a
Create Table mad_store_sales_ …
Create Table mad_store_sales_h
12 Partitions (ss_sold_date_sk)
[Figure: each partition is made up of N-number of 8K pages; each page holds tuples]
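The fan-out from one logical table to one physical table per filegroup can be sketched as a simple name mapping. The `mad_` prefix, the `_a` to `_h` suffixes and the `Distribution_a` to `Distribution_h` filegroup names are taken from the slide; the helper `physical_tables` and the assumption that the naming generalises to other tables or distribution counts are hypothetical.

```python
import string

def physical_tables(logical_name, num_distributions=8, prefix="mad_"):
    """Map each filegroup to the per-distribution table created for it.

    Naming follows the slide's store_sales example; treating it as a
    general rule is an assumption made for illustration.
    """
    suffixes = string.ascii_lowercase[:num_distributions]
    return {
        f"Distribution_{s}": f"{prefix}{logical_name}_{s}"
        for s in suffixes
    }

mapping = physical_tables("store_sales")
for filegroup, table in mapping.items():
    print(f"{filegroup}: Create Table {table}")
```

Each of these physical tables would then carry the same partitioning and clustering on ss_sold_date_sk as declared for the logical table.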
41. High Availability
Multiple levels of redundancy:
• Leveraging MSCS for node availability
• Cluster aware services:
• SQL Server, Madison, DMS
• Leveraging MSCS for SQL Services, DMS
• 1 spare node for every 8* compute nodes (8x1)
42. Security and Encryption
Retain DA v3 design
Authentication and authorization done by Madison server
Users and Roles as first class principals
Nested role capabilities
Connection to SQL back-ends through high privilege account
SQL nodes reside on private network
No support for integrated auth
Leverages TDE to expose DB-level encryption
Supports key rotation
43. The Logical Data Model
Multiple databases per appliance
Each user database maps to one SQL Server db per
node
Tables
Replicated, Distributed, Replicated + Distributed
Leverage SQL Server compression
Supports Partitioning
Supports secondary indexes
Views
44. Data Types
Most scalar data types supported by SQL Server 2008 are supported by Madison
Main exceptions
Character and binary strings limited to 8K (i.e. no BLOB support)
XML
Sql-Variant
System and CLR UDTs
Latin1_General with binary comparison only
SQL Server Data Types | DAv3 | Madison
bigint, binary | yes | yes
bit | - | yes
char / nchar | yes | yes
date, time | - | yes
datetime (was date in DA) | yes | yes
datetime2 | - | yes
datetimeoffset | - | yes
decimal | yes | yes
float | yes | yes
geometry / geography | - | -
hierarchyid | - | -
int (was integer in DA) | yes | yes
money | - | yes
real | - | yes
smalldatetime | - | yes
smallint | yes | yes
smallmoney | - | yes
sql_variant | - | -
text / ntext / image | - | -
timestamp | - | -
tinyint | yes | yes
varchar / nvarchar / varbinary | yes | yes
v*(max) | - | -
uniqueidentifier | - | -
xml | - | -
45. Supported SQL Syntax
Aligned with ANSI SQL 92
Basic INSERT, UPDATE, DELETE, SELECT
CREATE TABLE AS SELECT
Limited analytical function support
Teradata extensions
Quantile, Sample,…
46. Configuration and Monitoring
Challenge: Is it an appliance or a collection of nodes?
Madison services instrumented
Logs and Performance Counters
Capture and forward SNMP alerts from devices within the appliance
Small subset of DMVs to union underlying node DMVs
Leverage HPC for monitoring
47. Manageability
Web-based main administrative user interface
Based on DATAllegro manageability UI
Monitoring system health and activity
Leveraging HPC pack 2008
Systems management
Monitoring
Cluster health
48. Query Tools
GUI Tool:
Nexus (CoffingDW)
Table & view object explorer
Interactive query execution
Command line tool:
Replacement for DA-SQL
Flavor of SqlCmd
49. MS BI Integration
Integration Services
Madison enabled as a source
Data movement, lookup operations, etc.
Will add a new SSIS destination
Ensure integrated high performance loads
Reporting Services
Fully supported; including parameterized queries
Will customize experience for report builder and report
designer
Analysis Services
Will get connectivity through OLE-DB provider
Will enable both MOLAP and ROLAP storage
50. High Level Release Definitions
“Madison” (aka v1)
Focus on time to market
Compatibility with DATAllegro v3
MS BI integration
H1 2010
Will start running MTPs in the summer
V2+
Closer functional alignment with SQL Server
Better integration with SQL and MS ecosystem, tools and technologies
51. Recap
Data Warehousing Reference Architectures
available today!
SQL Server Fast Track
SQL Server “Madison”
Built for advanced, large scale data warehouses
Shared-nothing MPP architecture
Early evaluation programs starting soon
All feedback welcome:
fransidi@microsoft.com
Thank you!