This document provides an overview of advanced features and functions of Data Domain systems. It covers topics such as virtual tape libraries (VTL), snapshots, replication, DD Boost integration, capacity and throughput planning, and system monitoring tools. The document consists of multiple lessons that describe these topics in detail and includes configuration examples.
2. Module Objectives
Upon completion of this module, you will be able to:
• Describe VTL and VTL library planning
• Describe snapshots, fastcopy, and data retention
• Describe data replication and recovery
• Describe DD Boost and integration with EMC NetWorker
• Describe capacity and throughput planning
• Describe Data Domain system monitoring tools
3. Lesson: Virtual Tape Library (VTL)
EMC CONFIDENTIAL—INTERNAL USE ONLY. 3
Upon completion of this lesson, you will be able to:
• Describe a Data Domain VTL
• Describe VTL library planning
5. Configuration Terms
Barcode
• Unique ID assigned to a virtual tape when you create it
• In the Data Domain OS, also known as: label, tape label
CAP
• Cartridge access port: emulated tape enter/eject point for moving tapes to/from a library
• In the Data Domain OS, also known as: mail slot
Library
• Emulates a physical tape library with tape drives, changer, CAPs, and slots (cartridge slots)
• In the Data Domain OS, also known as: autoloader, tape silo, tape mount, tape jukebox, vault
Pool
• Collection of tapes that maps to a directory on a file system; used to replicate tapes to a destination
Tapes
• Represented in a system as files. You can export/import them from a vault to a library, and move them within a library across drives, slots, and CAPs
• In the Data Domain OS, also known as: cartridge
Vault
• Holds unused tapes. A tape is always in either a library or the vault
6. VTL Library Planning
• 64 virtual libraries
• 20,000 slots per library
• 100 CAPs per library
• 1,000 CAPs per system
• 800 GiB per tape
• 256 virtual tape drives (DD880 only)
• 128 virtual LTO-1, LTO-2, LTO-3 tape drives (all other models)
• Up to 100,000 tape cartridges in the virtual vault
• Robot loads/changes tape cartridges
• VTL connects over Fibre Channel
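As a planning aid, the limits on this slide can be checked in a few lines of Python. This is only a sketch: `LibraryPlan` and `check_vtl_plan` are hypothetical names, not part of the Data Domain OS, and the code assumes the drive limit applies to the whole system rather than to each library.

```python
from dataclasses import dataclass

# Limits as quoted on the VTL Library Planning slide.
MAX_LIBRARIES = 64
MAX_SLOTS_PER_LIBRARY = 20_000
MAX_CAPS_PER_LIBRARY = 100
MAX_CAPS_PER_SYSTEM = 1_000
MAX_DRIVES_DD880 = 256
MAX_DRIVES_OTHER_MODELS = 128


@dataclass
class LibraryPlan:
    """Hypothetical planning record; not a Data Domain OS object."""
    name: str
    slots: int
    caps: int
    drives: int


def check_vtl_plan(libraries, is_dd880=False):
    """Return a list of limit violations; an empty list means the plan fits."""
    problems = []
    if len(libraries) > MAX_LIBRARIES:
        problems.append(f"{len(libraries)} libraries exceeds {MAX_LIBRARIES}")
    for lib in libraries:
        if lib.slots > MAX_SLOTS_PER_LIBRARY:
            problems.append(f"{lib.name}: {lib.slots} slots exceeds {MAX_SLOTS_PER_LIBRARY}")
        if lib.caps > MAX_CAPS_PER_LIBRARY:
            problems.append(f"{lib.name}: {lib.caps} CAPs exceeds {MAX_CAPS_PER_LIBRARY}")
    total_caps = sum(lib.caps for lib in libraries)
    if total_caps > MAX_CAPS_PER_SYSTEM:
        problems.append(f"{total_caps} CAPs in total exceeds {MAX_CAPS_PER_SYSTEM}")
    # Assumption: the drive limit is a system-wide total.
    max_drives = MAX_DRIVES_DD880 if is_dd880 else MAX_DRIVES_OTHER_MODELS
    total_drives = sum(lib.drives for lib in libraries)
    if total_drives > max_drives:
        problems.append(f"{total_drives} drives in total exceeds {max_drives}")
    return problems


if __name__ == "__main__":
    plan = [LibraryPlan("lib1", slots=5_000, caps=10, drives=32)]
    print(check_vtl_plan(plan) or "plan fits within the quoted limits")
```

The check returns every violation rather than stopping at the first one, so a plan can be corrected in a single pass.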
7. Capacity Planning
• More planning needed at installation
• Expired tapes are NOT deleted; space is not reclaimed until the tape is overwritten or deleted
• Always create more slots than you need
• Load tapes when you need them
• Stop loading tapes once retention requirements are met
8. Lesson: Fastcopy, Snapshots, and Data Retention
Upon completion of this lesson, you will be able to:
• Describe Data Domain fastcopy
• Describe Data Domain snapshots
• Describe Data Domain data retention
• Explain the Data Domain cleaning process
9. Fastcopy
• Copies a source directory to a target directory
• If you change the source or target directory while copying, they will not be equal.
10. Snapshots
Original copy:
• /data/col1/backup
• /data/col1/backup/files
Snapshot copy:
• /data/col1/backup/.snapshot
• /data/col1/backup/files/.snapshot
A snapshot taken at 22:24 GMT saves the data as it was at 22:24 GMT.
11. Retention Lock
• Initiated by archive software or a user
• Prevents retention-locked files from being deleted/modified for up to 70 years
• Licensed feature
• Retention-locked files can be stored, encrypted, and replicated
12. Retention Lock Flow
1. License/enable retention lock
2. Set min/max retention period
3. Create file
4. Lock file (set retention period)
- Extend retention-locked file
- Delete expired retention-locked file
5. Transfer file to Data Domain system
13. Configure Client File Retention Period
Client must initiate retention lock
1. User creates file and sets last access time (atime) to desired retention period
2. Data Domain system administrator sets min/max retention periods on Data Domain system
3. File either committed as a retention-locked file or ignored
Timeline: the valid atime period lies between the minimum retention period and the maximum retention period, both measured from the current time.
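The atime check described on this slide can be sketched in Python. This is a minimal illustration, not the Data Domain implementation: `classify_lock_request` and `request_lock` are hypothetical names, the example periods are made up, and treating the window boundaries as inclusive is an assumption.

```python
import os
from datetime import datetime, timedelta, timezone


def classify_lock_request(atime, now, min_period, max_period):
    """Return "committed" if atime lies in the valid window, else "ignored"."""
    earliest = now + min_period   # current time + minimum retention period
    latest = now + max_period     # current time + maximum retention period
    # Assumption: both boundaries count as valid.
    return "committed" if earliest <= atime <= latest else "ignored"


def request_lock(path, retain_until):
    """Client side: set the file's atime to the desired retention expiry."""
    mtime = os.stat(path).st_mtime  # leave the modification time unchanged
    os.utime(path, (retain_until.timestamp(), mtime))


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    min_period = timedelta(days=1)        # hypothetical administrator setting
    max_period = timedelta(days=365 * 5)  # hypothetical administrator setting
    # atime 30 days ahead falls inside the window.
    print(classify_lock_request(now + timedelta(days=30), now, min_period, max_period))
    # atime 1 hour ahead is below the minimum retention period.
    print(classify_lock_request(now + timedelta(hours=1), now, min_period, max_period))
```

The first call prints `committed` and the second prints `ignored`, matching the two outcomes named in step 3.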
14. File System Cleaning
• Backup software writes backups to the Data Domain system
• File A is deleted with no retention lock: File A is removed at the next cleaning
• File B is deleted after retention lock is initiated: File B is maintained until the retention lock period ends
• Cleaning reclaims physical storage occupied by deleted objects
15. Cleaning
What?
• Reclaim space in disk blocks
Why?
• Housekeeping (reclaim "dead" segments)
• Performance tuning (rewrite duplicate data)
Diagram: Containers 1, 2, and 3. Valid data is copied forward, and dead data becomes free space.
16. Lesson: Replication and Recovery
Upon completion of this lesson, you will be able to:
• Describe the types of Data Domain replication
• Identify how replication improves storage
• Describe the data recovery process
17. Data Replication
• New deduplicated, compressed data is automatically replicated from source to destination
• Replication runs over a WAN or LAN
18. Data Domain Replication Types
• Collection: for entire site backup
• Directory: for partial site backup
• Pool: for VTL files/tape backup
19. Data Domain Collection Replication
Source /backup replicates to destination /backup
• Recovers entire system
• Immediate accessibility
• Read only
• User accounts/passwords replicated from source
• Works with encrypted files
• Works with retention lock
20. Data Domain Directory Replication
Diagram: /backup/dir a and /backup/dir b on the source; /backup/dir a and /backup/dir b on the destinations.
• Recovers selected data
• Destination must have available storage
• CIFS and NFS clients OK
• Do not mix CIFS/NFS data in same directory
• Destination directory created automatically
• Works with encryption
• Works with retention lock
21. Data Domain Pool Replication
Diagram: pool 1, pool 2, and pool 3 on the source replicate to pool 1, pool 2, and pool 3 on the destination.
• Works like directory replication
• Destination doesn't require VTL license
24. Recover Data
• You can configure a Data Domain system to store backup data and retain it onsite for 30-90 days
• Backup data is replicated over the WAN to an off-site disaster recovery system
• In case of disaster, recover from the off-site replica
Diagram: clients, file server, and backup server on site; disaster recovery system off site.
25. Why Resynchronize Recovered Data?
Resynchronization between source and destination over the WAN is used to:
• Recreate a deleted context
• Recover from an out-of-space condition
• Convert collection replication to directory replication
26. Lesson: Data Domain Boost
Upon completion of this lesson, you will be able to:
• Describe DD Boost
• Describe replica awareness
• Describe how DD Boost works with EMC NetWorker
• Describe supported network topologies
• Describe DD Boost advanced load balancing and link failover feature
27. DD Boost
• Provides standard/centralized management features through backup software
• Works with industry-standard backup software
– EMC NetWorker
– Symantec NetBackup (Data Domain plug-in required)
– Symantec Backup Exec (Data Domain plug-in required)
• Enables advanced load balancing and failover
• Requires licenses on the Data Domain system
– DD Boost
– Replication (if used)
Note: Your backup software might require a license to enable the feature. Check your backup software documentation.
28. DD Boost (contd.)
• Clients send data to the backup server
• The backup server runs the OST plug-in with DD Boost
• Deduplication/compression occur in the backup server, so less data is sent over the LAN
• Optimized protocol for high throughput
• Manages connections between backup applications and Data Domain systems with DD Boost
• Deduplicated data is stored on the Data Domain system
29. Replica Awareness
• You manage replication from the backup server console
• The backup server (OST plug-in with DD Boost) initiates and tracks replication for easy management and disaster recovery
• Replication runs over the WAN from the backup site to the disaster recovery site
• Archive to tape as needed
30. DD Boost Advantage
• Without DD Boost
– Backup server(s) not aware of Data Domain replica(s)
– Recovery is a manual process
• With DD Boost
– Backup server dedupes data and minimizes network bandwidth use
– Replication and recovery are centrally configured and monitored
Diagram labels: without OST, backup with manually configured replication; with DD Boost, backup server with OST plug-in, optimized deduplication, replication engine, and DD Boost servers.
31. NetWorker – Work Flow
Components: NetWorker server (control data), local Data Domain system, remote Data Domain system
1. New data backup (Save Set 1)
2. Done (Save Set 1)
3. Save Set 1: update control data
4. Start clone (Clone 1)
5. Data transfer
6. Done (Clone 1)
7. Clone 1: update control data
32. Lesson: Capacity and Throughput Planning
Upon completion of this lesson, you will be able to:
• Describe capacity planning and its importance
• Describe throughput planning and its importance
33. Monitor File System Space Use
• Factors that affect how fast data on disk grows
– Size of data sets being backed up
– Compressibility of data being backed up
– Retention period specified in backup software
• Monitor disk use closely when you back up large data sets that show low compression factors and have long retention times
• You can get a more accurate view of space use from the CLI
• Use filesys show space to monitor post-compression data growth
34. Space Graph
Graph legend:
• Cumulative physical data written to DDS
• Amount of data within backup application
• Compression ratio: pre-compression / data collection
• Available space on DDS
35. Space Graph (contd.)
What does the saw-tooth line for compression ratio represent?
36. Compression Factor Calculation
Compression factor = Original bytes / Data Domain system data written
What does cleaning do to this equation?
It decreases the Data Domain system data written (the denominator) and thus increases the compression factor.
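The effect of cleaning on the compression factor can be illustrated with a short Python sketch. The function name and the byte counts are made up for illustration; only the ratio itself comes from the slide.

```python
def compression_factor(original_bytes, stored_bytes):
    """Original bytes divided by Data Domain system data written."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return original_bytes / stored_bytes


if __name__ == "__main__":
    original = 10_000          # hypothetical pre-compression bytes
    stored_before = 1_000      # hypothetical bytes on disk before cleaning
    stored_after = 800         # cleaning reclaimed 200 bytes of dead data
    print(compression_factor(original, stored_before))  # 10.0
    print(compression_factor(original, stored_after))   # 12.5
```

With the numerator unchanged, shrinking the denominator from 1,000 to 800 raises the factor from 10x to 12.5x.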
37. Capacity Planning: Determine Capacity Needs
Capacity needs depend on:
How much?
• Data size (TB)
• Data type
• Full backup size
• Compression rate (deduplication)
How long?
• Retention policy (duration)
• Schedule
38. Determine Capacity Needs (contd.)
• Data Domain system internal indexes and other components use variable amounts of storage, depending on data type and file sizes
• If different data sets are sent to identical systems, one system may, over time, have room for more or less backup data than another
• Challenging data types
– Pre-compressed (multimedia, .zip, and .tiff)
– Encrypted
39. Compression Requirements with Variables
• 5x: Nearline and archive
– Incremental + weekly full backup with two weeks retention
– Daily full backup with one week retention
– Nearline and archival use compression tends to be capped here
• 10x: Overall compression
– Incremental + weekly full backup with one month of retention
– Daily full backup with two-three weeks retention
• 20x: Overall compression
– Incremental + weekly full backup with two-three months retention
– Daily full backup with three-four weeks retention
40. Calculate Required Capacity
Diagram: the required capacity (total space required) is built up from the 1st full backup, the incremental backups (labeled 4 in the diagram), and the weekly full backups over the number of weeks retained.
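One plausible reading of this slide can be sketched in Python. The way the terms combine is an assumption, since the slide gives only the labels: the sketch takes the first full backup plus, for each retained week, a number of incremental backups and one weekly full backup. All sizes are assumed to be post-compression, and the function name and sample figures are hypothetical.

```python
def required_capacity_gb(first_full_gb, incremental_gb, weekly_full_gb,
                         weeks, incrementals_per_week=4):
    """Estimate total space required, in GB, from post-compression sizes.

    Assumed formula (not confirmed by the slide):
    first full + weeks * (incrementals_per_week * incremental + weekly full)
    """
    weekly_cycle_gb = incrementals_per_week * incremental_gb + weekly_full_gb
    return first_full_gb + weeks * weekly_cycle_gb


if __name__ == "__main__":
    # Hypothetical post-compression sizes, for illustration only.
    print(required_capacity_gb(first_full_gb=100, incremental_gb=5,
                               weekly_full_gb=20, weeks=4))  # 260
```

Keeping the number of incrementals per week as a parameter makes it easy to adjust the estimate to the actual backup schedule.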
41. Calculate Required Throughput
Required throughput = Largest backup / Backup time window
Example: 6 TB / 10 hrs = 600 GB/hr
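The throughput calculation on this slide, written out as a small Python sketch. The function name is hypothetical, and the conversion assumes 1 TB = 1,000 GB, which is what the slide's example implies.

```python
def required_throughput_gb_per_hr(largest_backup_tb, window_hours, gb_per_tb=1000):
    """Largest backup divided by the backup time window, in GB/hr."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return largest_backup_tb * gb_per_tb / window_hours


if __name__ == "__main__":
    # The slide's example: 6 TB in a 10-hour window.
    print(required_throughput_gb_per_hr(6, 10))  # 600.0
```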
42. System Model Capacity and Performance
• Maximum capacity is the amount of usable data storage space
• Maximum capacity is based on the maximum number of drives supported by a model
• Maximum throughput is achieved using either the VTL interface and 4 Gbps Fibre Channel, or DD Boost and 10 Gb Ethernet
• Current model throughput and capacity specifications: http://www.datadomain.com/products/
43. Select Model
• Be conservative when determining which model to use
• Use 75-85% of model capacity and throughput (factor in a 15-25% buffer for capacity and throughput)
Capacity % = (Required capacity / Maximum logical capacity) × 100
Throughput % = (Required throughput / Maximum throughput) × 100
44. Calculate Capacity Buffer for Selected Models
% of maximum capacity = (Required capacity / Maximum capacity) × 100%
• DD140 example: 840 GB / 860 GB = 97%, leaving a 3% buffer: not OK
• DD610 example: 840 GB / 1650 GB = 51%, leaving a 49% buffer: OK
45. Match Required Capacity to Model Specifications
For example, required capacity = 840 GB
• DD140: 860 GB
• DD610: 1,650 GB with 7 drives
Ensure the capacity buffer is big enough
46. Calculate Performance Buffer for Selected Models
% of maximum throughput = (Required throughput / Maximum throughput) × 100%
• DD610 example: 600 GB/hr / 675 GB/hr = 89%, leaving an 11% buffer: not OK
• DD630 example: 600 GB/hr / 1126 GB/hr = 53%, leaving a 47% buffer: OK
47. Match Required Throughput to Model Specifications
For example, required throughput = 600 GB/hr
• DD610: 675 GB/hr
• DD630: 1.1 TB/hr
Ensure the performance buffer is big enough
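The capacity and performance buffer checks from the preceding slides can be sketched in Python. The function names are hypothetical, and using 85% as the cutoff is an assumption taken from the upper end of the 75-85% guideline on the Select Model slide.

```python
def utilization_pct(required, maximum):
    """Required value as a percentage of the model maximum."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return required / maximum * 100


def buffer_ok(required, maximum, max_utilization_pct=85):
    """True if the model leaves at least the assumed minimum buffer."""
    return utilization_pct(required, maximum) <= max_utilization_pct


if __name__ == "__main__":
    # Figures from the capacity and throughput examples in the slides.
    examples = [
        ("DD140 capacity", 840, 860),
        ("DD610 capacity", 840, 1650),
        ("DD610 throughput", 600, 675),
        ("DD630 throughput", 600, 1126),
    ]
    for label, required, maximum in examples:
        used = utilization_pct(required, maximum)
        verdict = "ok" if buffer_ok(required, maximum) else "not ok"
        print(f"{label}: {used:.1f}% used, {100 - used:.1f}% buffer, {verdict}")
```

The same two functions serve both capacity and throughput, because each check is a ratio of a requirement to a model maximum.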
48. Lesson: System Monitoring Tools
Upon completion of this lesson, you will be able to:
• Describe Data Domain system monitoring tools
– SNMP
– syslog
– autosupport
– SUB
49. Monitoring a Data Domain System
Monitoring tools:
1. SNMP
2. syslog
3. autosupport
4. SUB
Daily alerts and autosupport reports go to both the Data Domain system administrator and Data Domain technical support.
50. SNMP
• You can monitor a Data Domain system via SNMP utilities
• You can integrate the Data Domain Management Information Base (MIB) into SNMP monitoring
51. Syslog (Remote Logging)
• Sends system messages over the LAN to a remote syslog server
• Uses TCP port 514
• You collect the logs
52. Autosupport
• Easy to install – just once at system setup
• Helps solve/prevent system problems
– Provides timely notification of significant issues
– Enables rapid response time to address or prevent problems
– Includes critical system data to aid support case triage and management
55. Autosupport Via Enterprise Manager
• Data Domain systems provide alerts, autosupport reports, and logs
• Access through Enterprise Manager
56. Autosupport Reports
• Sent using SMTP to Data Domain technical support daily at 6 am local time (default)
• Contain system ID, uptime information, system command outputs, runtime parameters, logs, system settings, status and performance data, and other debugging information
• Long text report (500-800K)
• Sections parsed into a data warehouse for analysis and reporting
• Subscribers receive daily detailed reports
61. Logs
Every Sunday at 3 am:
1. New log file opened
2. Old log file renamed
CLI: log view filename
62. Support Upload Bundle (SUB)
• Large (multi-GB) tar file
• Contains
– OS settings and log files
– System files (not customer data files) identified as needed for system diagnosis by Data Domain support and engineering
• Used to triage and diagnose a Data Domain system in the field
• CLI commands are used to generate and optionally send (via HTTP) the SUB to the Data Domain support site
• Generated by sysadmin on the Data Domain system via GUI/CLI
63. Module Summary
Key points covered in this module include:
• VTL and VTL library planning
• Snapshots, fastcopy, and data retention
• Data replication and recovery
• DD Boost and integration with EMC NetWorker
• Capacity and throughput planning
• Data Domain system monitoring tools