3. Goals For This Session
Discuss:
• How VNX storage pools work
• How common workloads compare
• Which workloads are compatible
• How to monitor performance
• How to mitigate performance problems
4. Goals For This Session
Also check out this session at 2:55
EMC Session: VNX Performance
Optimization and Tuning - David Gadwah,
EMC
5. VNX Basics
• VNX shines at mixed workloads
[Chart: IOPS - Mixed Workloads. Y axis: # of Users. X axis: Platform (CX4-120, CX4-240, CX4-480, CX4-960; VNX5100, VNX5300, VNX5500, VNX5700, VNX7500). Series: CX4, VNX Series with Rotating drives, VNX Series with Flash drives.]
6. VNX Basics
• VNX is EMC’s mid-tier unified storage
array
• FC, iSCSI or FCoE block connectivity
• Multiple SAS buses backend
• NFS and CIFS file connectivity
• Built for flash
7. VNX Architecture
[Diagram: VNX Unified Storage. Application servers, Exchange servers, clients, virtual servers, Oracle servers and Atmos VE connect over the SAN and LAN (FC, iSCSI, FCoE, 10Gb Ethernet). VNX X-Blades with failover run VNX OE File; VNX SPs with failover run VNX OE Block. Backend: power supplies, SPS, LCCs, Flash drives, SAS drives, Near-Line SAS drives.]
8. VNX Architecture
• Two Storage Processors (SPs) with DRAM
cache, frontend ports (FC, iSCSI,
FCoE) and backend ports (6 Gb SAS)
• Each LUN is owned by one SP, and
accessible by both
• Both SPs have active connections
9. VNX Architecture
• FAST Cache
– Second layer of read/write cache, housed
on solid state drives
– Operates in 64 KB chunks
– Reactive in nature
– Great for random I/O
– Don’t use it for sequential I/O
10. VNX Architecture
• Storage Pools
– Based on RAID
• RAID 5, RAID 1/0, RAID 6
– FAST VP: Fully Automated Storage Tiering
• Pools with multiple drive types: EFD, SAS, NL-
SAS
• Sub-LUN tiering
• Operates at 1 GB chunks
• Adjusts over time, not immediately
– FAST Cache is more immediate
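To make the difference in granularity concrete, here is a minimal sketch that maps a byte offset to its FAST Cache chunk and its FAST VP slice. The 64 KB and 1 GB sizes come from the slides; the function names and the 500 GB LUN are hypothetical, chosen only for illustration.

```python
KB = 1024
GB = 1024 ** 3

FAST_CACHE_CHUNK = 64 * KB  # FAST Cache granularity (from the slides)
FAST_VP_SLICE = 1 * GB      # FAST VP granularity (from the slides)


def chunk_index(byte_offset, chunk_size):
    """Index of the chunk that contains the given byte offset."""
    return byte_offset // chunk_size


def chunks_in_lun(lun_bytes, chunk_size):
    """Number of chunks needed to cover a LUN (ceiling division)."""
    return -(-lun_bytes // chunk_size)


if __name__ == "__main__":
    lun = 500 * GB  # hypothetical LUN size
    print("FAST VP slices:   ", chunks_in_lun(lun, FAST_VP_SLICE))
    print("FAST Cache chunks:", chunks_in_lun(lun, FAST_CACHE_CHUNK))
```

One 1 GB slice spans 16,384 chunks of 64 KB, so a small hot region moves a whole slice between tiers under FAST VP, while FAST Cache can track it in far finer pieces.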
11. VNX Architecture
When should I use traditional RAID Groups? As the
exception:
• Very specific performance tuning (MetaLUNs)
• Internal array features (write intent logs, clone private
LUNs)
• Maybe RecoverPoint journals
• Supportability (I’m looking at you, Meditech)
Remember the limitations:
• Maximum of 16 drives
• Expand via metaLUNs
• No tiering
13. VNX Architecture
• Real-world effect of write penalty:
– 10x 600 GB 15k SAS drives = 1800 read
IOPS
• With RAID 1/0, capable of 900 write IOPS
• With RAID 5, capable of 450 write IOPS
– 1 write operation takes 4 I/O operations
• With RAID 6, capable of 300 write IOPS
– 1 write operation takes 6 I/O operations
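The arithmetic above can be reproduced with a short sketch. The RAID 5 and RAID 6 penalties are stated on the slide; the RAID 1/0 penalty of 2 and the 180 IOPS per drive are inferred from the slide's own numbers (1800 / 10 drives, and 1800 down to 900). The function names are hypothetical, and cache effects are ignored.

```python
# Back-end I/Os generated by one host write, per RAID type.
# RAID 5 = 4 and RAID 6 = 6 are from the slide; RAID 1/0 = 2 is inferred.
WRITE_PENALTY = {"RAID 1/0": 2, "RAID 5": 4, "RAID 6": 6}


def write_iops(drive_count, iops_per_drive, raid_type):
    """Host write IOPS the drives can sustain, ignoring cache."""
    raw = drive_count * iops_per_drive
    return raw // WRITE_PENALTY[raid_type]


def mixed_iops(drive_count, iops_per_drive, raid_type, read_fraction):
    """Host IOPS for a read/write mix: each read costs 1 back-end I/O,
    each write costs WRITE_PENALTY back-end I/Os."""
    raw = drive_count * iops_per_drive
    penalty = WRITE_PENALTY[raid_type]
    return raw / (read_fraction + (1.0 - read_fraction) * penalty)


if __name__ == "__main__":
    for raid in WRITE_PENALTY:
        print(raid, write_iops(10, 180, raid), "write IOPS")
```

The `mixed_iops` helper extends the same idea to a read/write mix, which is closer to how most of the workloads in this session behave.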
14. Workloads
Common workloads seen in the field.
Virtual Disks/VMFS (RAID 5)
DB – Data files (RAID 5)
DB – Transaction files (RAID 1/0)
Unstructured Data, Backups (RAID 6)
15. Real World Workloads
Standard Performance Evaluation Corporation
Benchmarking Real World Performance
Non-profit
Uses generic applications rather than
specific applications
SPEC benchmarks rely on a mix of I/O to simulate a
generic application
This balances the need for real world performance and
consistency over time
16. Ideal Scenario
• Array with single application
• No budget constraints
• Separate storage pools for different
sub-workloads
17. Ideal Scenario
• The ideal SQL:
– PCIe flash and XtremeSW on the host
– FAST Cache in the array
– tempDB:
• Data files on separate RAID 5 storage pool
– User DBs:
• Each has tlogs on separate RAID 1/0 storage pool
• Each has data files on one or more RAID 5 storage
pools, with the appropriate drive configuration
(EFD+FAST)
– Backups / Dump files:
• Separate RAID 6 storage pool, maybe a separate
array
18. Reality – Can’t isolate every
workload
Cost prohibitive, and do we have to?
• Business Critical Application … maybe
• Management & Lower-tier applications… probably not
19. Basic Storage Pool
Layout
• One or Two RAID 5 pools
(ex: Gold & Silver)
– FAST with EFDs, SAS, NL-SAS according
to skew or 5/20/75 rule
• RAID 1/0 pool for transaction logs.
– 15k SAS drives
• RAID 6 pool for backup files and
unstructured data
– 7.2k NL-SAS drives
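As a rough sizing aid for a tiered RAID 5 pool, the sketch below splits a pool's usable capacity by the 5/20/75 rule. It assumes the rule means 5% EFD, 20% SAS and 75% NL-SAS by capacity, which the slide does not spell out; the function name and the 100 TB figure are hypothetical.

```python
# Assumed meaning of the 5/20/75 rule: percent of pool capacity per tier.
TIER_SPLIT_PCT = {"EFD": 5, "SAS": 20, "NL-SAS": 75}


def tier_capacities(usable_tb):
    """Split a pool's usable capacity (TB) across tiers by the 5/20/75 rule."""
    return {tier: usable_tb * pct / 100 for tier, pct in TIER_SPLIT_PCT.items()}


if __name__ == "__main__":
    for tier, tb in tier_capacities(100).items():  # hypothetical 100 TB pool
        print(f"{tier}: {tb} TB")
```

When the actual skew of the workload is known, it should take precedence over this rule of thumb.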
20. RAID 5 Pools
• VMFS
• DB Data Files
• Good for random read/write mix
• Use FAST Cache
Example:
• Gold Pool: 5x EFD, 15x SAS, 16x NL-SAS
• Silver Pool: 15x SAS, 16x NL-SAS
22. RAID 1/0 Pool
• Transaction Logs for many applications
• Specifically for small sequential writes
• Do Not Use FAST Cache
– It’ll be wasted
– It’ll hurt performance
Example:
• 8x 15k SAS drives
23. RAID 6 Pool
• Unstructured data
– Office Files (.doc, .xls, etc.)
– Images
• Backup files
– Split into separate pool if necessary
• Low I/O & high capacity
• Good for long sequential writes
• Do Not Use FAST Cache
– It’ll be wasted
Example:
• 16x 7.2k rpm NL-SAS drives
25. Monitoring and Troubleshooting
There is no “Set it and forget it”
Workloads change over time
• Users get added
• Transaction load increases
• Requirements change
Often no one tells us
27. Troubleshooting Metrics
Where do we start? What do we look at?
• Cache Utilization
– Exceeding the high watermark forces a cache
flush to disk
– Forced Flushes
• SP performance
– Balance the SP load
• Pool LUN migration (metadata)
• Online LUN migration
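The two checks above can be sketched as simple helpers over exported statistics. This is an illustration only: the 80% high watermark is a placeholder value, not a confirmed array setting, and the function names and input format are hypothetical rather than part of any EMC tool.

```python
def cache_state(dirty_pct, high_wm=80):
    """Classify write cache pressure from the percentage of dirty pages.

    high_wm is a placeholder; use the watermark configured on the array.
    """
    if dirty_pct >= 100:
        return "forced flushing"      # cache full, host writes must wait
    if dirty_pct >= high_wm:
        return "watermark flushing"   # above the high watermark
    return "ok"


def sp_imbalance(lun_iops_by_sp):
    """Fraction of total IOPS by which SP A and SP B differ (0.0 = balanced).

    lun_iops_by_sp maps "A" and "B" to lists of per-LUN IOPS.
    """
    a = sum(lun_iops_by_sp.get("A", []))
    b = sum(lun_iops_by_sp.get("B", []))
    total = a + b
    if total == 0:
        return 0.0
    return abs(a - b) / total


if __name__ == "__main__":
    print(cache_state(85))
    print(sp_imbalance({"A": [3000, 1000], "B": [1000]}))
```

A large imbalance suggests moving LUN ownership to the other SP, or migrating busy LUNs, before adding hardware.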
29. The “Toolbox”
VNX Monitoring and Reporting (Off array)
– Historical Data Collection
– Streamlined application based on Watch4net
30. The “Toolbox”
EMC miTrend
– Leverages NAR (Navisphere Analyzer) data that can be
retrieved from the array
– Need EMC or partner (us) to perform the analysis
31. Troubleshooting /
Problem Mitigation
Several options for mitigating a performance problem:
• Add drives
– OE 32 required to rebalance existing data
– Pre-OE 32: must expand the pool by the original drive
count; existing data will not be rebalanced
• Migrate to a different pool
– Live migration avoids the need for an outage
– Performance Throttling minimizes performance impact
32. Troubleshooting /
Problem Mitigation
• Rebalance at the application layer
– Storage vMotion
– Host-based data migration (Open Migrator, etc.)
• Migrate data between arrays
– SANCopy
– Replication (Mirroring/RecoverPoint)
• Reduce workload
– Reschedule for off-hours (backups for example)
– Decommission non-critical workloads
Database
• Mix of RAID 5 (Database) and RAID 1/0 (Log)
• Database – Random Read/Write (XX/XX)
• Log – Small block sequential
RAID 6
• Large block sequential
• Added protection (Rebuild/Double disk failure)
• Typical read more than written
Network attached storage (VNX File)
• RAID 6 – large sequential writes
• Large sequential writes
• Larger drives, longer rebuilds
Backup/Unstruc- If backup windows don’t overlap