Has Your Data Gone Rogue?
- 1. © 2016 IBM Corporation
Has Your Data Gone Rogue?
Using IBM Flash and solutions to obtain
enhanced business insights
Tony Pearson, IBM
Master Inventor and Senior IT Architect
- 2.
What is
Happening?
Why did it
Happen?
What might
happen next?
What
actions
should
we take?
Client 1: Rebel Alliance
Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Prescriptive Analytics
- 3.
Structured,
Repeatable,
Linear
OLAP
cube
Unstructured,
Exploratory,
Iterative
Rebels are Inquisitive!
Reports Visualization and
Discovery
Hadoop / Spark
Data
warehousing
Stream
Computing
Integration and
Governance
Text Analytics
Business
Analyst
Data
Scientist
Analyze data
“It’s no longer hard to find the answer to a given question; the hard part is finding the right question. And as questions evolve, we gain better insight into our ecosystem and our business.”
-- Kevin Weil, Lead Analyst at Twitter
- 4.
Clients are facing explosive growth in Unstructured Data,
which is exactly why Analytics is so critical
[Chart: Unstructured Data versus Structured Data, in exabytes, 2009 to 2017. Source: IDC]
Unstructured data growth of 60–80% per year creates Web-scale storage needs
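To make the growth figure concrete, here is a small sketch of what compounding at 60–80% per year implies. It assumes a constant annual rate, which the slide does not claim; the function names are illustrative only.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years for capacity to double at a constant annual growth rate (0.60 = 60%)."""
    return math.log(2) / math.log(1 + annual_growth)

def capacity_after(start: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding growth for a whole number of years."""
    return start * (1 + annual_growth) ** years

for rate in (0.60, 0.80):
    print(f"{rate:.0%} per year: doubles every {doubling_time_years(rate):.2f} years, "
          f"{capacity_after(1.0, rate, 5):.1f}x after 5 years")
```

Under that assumption, capacity doubles roughly every 14 to 18 months, which is the pressure behind "Web-scale" storage.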
Problem: Traditional Legacy Storage Designed for
Transactional, Structured Data
- 5.
IBM Systems Storage Portfolio
Flash for all primary storage workloads
DS8880
FlashSystem
A9000
IBM FlashCore™ Technology Optimized
FlashSystem
A9000R FlashSystem
V9000
All flash array -
virtualizing the hybrid
Data Center
• Best performance with storage services & selectable data reduction
• Targeting database/analytics workloads
All flash array for cloud
service providers
• Best performance with full time data reduction
• Targeting VDI and VMware
FlashSystem 900
All flash array for application acceleration
XIV Gen3
High End
Capacity
Optimized
All flash array for
large deployments
• Best performance with full time data reduction
• Targeting mixed workloads
High End
Server
- Mainframe
- Power
• Extreme reliability and replication
• Available in All Flash & Hybrid configurations
Storwize
V7000
V7000F
Mid-Range
Storwize
V5000
V5000F
Entry /
Mid-Range
SVC
DeepFlash
Elastic
Storage
Server
(ESS)
• Extreme performance
• Targeting database acceleration & Spectrum Storage booster
Big Data
Flash
- 6.
IBM Systems
New Class of Flash: Big Data Flash
Scalable capacity and performance at low price points for big data
Performance can lead
to business results,
faster time to insights
Often do not benefit
from data reduction
technology, already
compressed files
Written once but
read often: video and
images
Source: IDC, 2015
Performance consistently better than that of the best HDDs
today
Cost comparable to that of performance optimized HDDs
Flash media that leverages flash Economics
Systems implementations that support massive scalability and
meet enterprise Requirements
Targeted primarily at big data and secondary storage
environments
Petabyte Scale of
unstructured data
and growing rapidly
Big Data
Attributes
Source: IDC, 2015
- 7.
               HDD                   DeepFlash          Conventional Flash
Price          $                     $$                 $$$
Performance    10s of milliseconds   Sub-millisecond    Microseconds
Attributes
• High ingest rate
• Low change rate
• High read rate
• Extremely latency sensitive
• Can justify price premium
Typical use cases
Big Data analytics (ex: video, health care data), Hadoop, Spark
VDI, Server Virtualization, Database and Application Acceleration
Not conventional Flash, a new class of Flash: Big Data Flash
Scalable capacity and performance at low price points for
big data
• Performance consistently better than that of the best HDDs
• Cost comparable to that of performance-optimized HDDs
• Systems implementations that support massive scalability and meet enterprise requirements
- 8.
IBM InfoSphere BigInsights is a 100% standard Hadoop distribution
By default, open source components are always deployed
Elect to use proprietary capabilities depending on your needs
In some cases, proprietary capabilities offer significant benefits
Open standards first, but with freedom of choice
HDFS
YARN
HIVE
MapReduce
PIG
Spectrum
Scale
Platform
Symphony
Big SQL
Adaptive
MapReduce
BigSheets
Share data with non-Hadoop applications
and simplify data management
Re-use existing tools and expertise,
Avoid additional development costs
Boost performance, support time-critical
workloads, do more with less
True multi-tenancy to boost service levels and
avoid duplication on infrastructure
Simplify access for end-users,
minimize software development
- 9.
Hadoop Analytics: HDFS vs IBM Spectrum Scale™
HDFS Save
Results
Discard
Rest
IBM
HDFS Transparency
Connector allows
HDFS-based programs
to process data without
application changes
(100% compatible)
IBM Spectrum Scale
Application data
stored on IBM
Spectrum Scale is
readily available for
analytics
Save
Results
JFS2
NTFS
EXT4
Data Sources
mashup of structured and
unstructured data from a variety
of sources
Actionable Insights
Provides answers to the
Who, What, Where,
When, Why and How
Business Intelligence
& Predictive Analytics
> Competitive Advantages
> New Threats and Fraud
> Changing Needs
and Forecasting
> And More!
- 10.
Elastic Storage Server (ESS) with Spectrum Scale
5146-GLx models
GL2, GL4, GL6
60-drive 4U drawers
• SSD and Nearline HDD
5146-GSx models
GS1, GS2, GS4, GS6
24-drive 2U drawers
• All SSD
• SSD and 10K HDD
IBM POWER8
servers
NSD
Client
Twin-tailed
Elastic
Storage
Server
TCP/IP or
RDMA
DeepFlash ESS (5147-GFx)
64-drive 3U drawers
• Pre-loaded with 32 drives
• All SSD (8 TB)
- 11.
IBM Systems
New Big Data alternative: instead of HDD, use Big Data Flash
For clients who value application response time and/or throughput per rack unit
Improve application response time by 8X
Improve throughput/rack unit by 2.8X
Improve MTBF
Improve power & cooling costs by 30%-50%
8X faster response time
and same throughput
as the HDD version
28U
25GB/S
File Server
Hard Drives
Hard Drives
File Server
All Flash
All Flash
Move from Big Data
HDD configuration
To this Big Data Flash
configuration
10U
25GB/S
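The 2.8X density claim follows directly from the two configurations on this slide: the same 25 GB/s delivered in 10U instead of 28U. A quick check of that arithmetic, with a helper name chosen only for illustration:

```python
def throughput_per_rack_unit(gb_per_sec: float, rack_units: int) -> float:
    """Throughput density in GB/s per rack unit (U)."""
    return gb_per_sec / rack_units

# Figures taken from the slide: both configurations deliver 25 GB/s.
hdd = throughput_per_rack_unit(25.0, 28)    # Big Data HDD configuration, 28U
flash = throughput_per_rack_unit(25.0, 10)  # Big Data Flash configuration, 10U

print(f"HDD:   {hdd:.2f} GB/s per U")
print(f"Flash: {flash:.2f} GB/s per U")
print(f"Density improvement: {flash / hdd:.1f}X")
```

Because throughput is held equal, the ratio reduces to 28U / 10U.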
- 12.
IBM DeepFlash 150 storage enclosure
- 13.
Introducing IBM DeepFlash™
Elastic Storage Server
8X faster response time, 8X lower latency compared to HDD version*
2 Enclosures, 10U
360 TB of usable Flash
Max Read 26.6 GB/sec;
Max Write 16.6 GB/sec
1 Flash Enclosure, 7U
180TB of usable Flash;
Max Read 13.6 GB/sec;
Max Write 9.3 GB/sec
ESS GF1 ESS GF2
*based on SPEC SFS results
Spectrum Scale
I/O server
(POWER8)
DeepFlash
JBOF
DeepFlash
JBOF
- 14.
Data Protection Schemes
Tolerate 1 drive failure
RAID-1 / RAID-10: K pieces, 2 x K slices, 2.0X
RAID-5: K pieces, K + 1 slices, 1.2X
Tolerate 2 drive failures
Triplication: K pieces, 3 x K slices, 3.0X
RAID-6: K pieces, K + 2 slices, 1.5X
Tolerate “M” failures
Erasure Coding: K pieces, K + M = N slices, 1.3X
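The overhead factors on this slide are all instances of one formula: raw capacity per unit of user data is N / K, where N = K + M. The sketch below reproduces the slide's numbers. The stripe widths (5+1, 4+2, 10+3) are assumptions picked to match the published factors; the slide does not state them, and they are not necessarily the widths any IBM product uses.

```python
from fractions import Fraction

def overhead(k: int, m: int) -> Fraction:
    """Raw capacity consumed per unit of user data for K data + M redundancy slices."""
    return Fraction(k + m, k)

# (K, M) pairs are illustrative assumptions chosen to match the slide's factors.
schemes = {
    "RAID-1 / RAID-10 (mirror)": (1, 1),
    "RAID-5 (5+1)": (5, 1),
    "Triplication": (1, 2),
    "RAID-6 (4+2)": (4, 2),
    "Erasure coding (10+3)": (10, 3),
}

for name, (k, m) in schemes.items():
    print(f"{name:28s} tolerates {m} failure(s), overhead {float(overhead(k, m)):.1f}X")
```

The comparison to note is the last one against triplication: erasure coding can tolerate more failures at less than half the raw capacity.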
- 15.
Share-Nothing versus Shared-Disk Deployments
Share-nothing: many solutions keep 3 replicas of the data
Need more compute? Add another node!
Need more storage capacity? Add another node!
Shared-disk: Elastic Storage Server reduces storage to one copy of the data with Erasure Coding
Scale compute and storage capacity separately
TCP/IP or RDMA
3x versus 1.3x
[Diagram: three copies of each data block versus data plus parity]
- 16.
Introducing Spectrum Control Storage Insights…
• Convergence of analytics, cloud, and data management
• Designed to
Reduce storage costs, without the traditional up-front investments
Enable actionable visibility within minutes
Provide rapid insights to critical assets
Deployed instantly from the cloud
Understand the storage environment and its
usage
Monitor capacity and performance
Reclaim allocated, but unused space
Optimize data placement with advanced analytics
IBM is the only major storage vendor with a
cloud-based SaaS offering for Storage Management
- 17.
Client 2: Galactic Empire
Our major project is
behind schedule!
A major test is
imminent!
Too many
clones!
How do we
keep these
plans secret?
- 18.
IBM FlashSystem Models
900 V9000 A9000 A9000R
Tier 0: Lean & Mean
Tier 1: Robust functionality
Optimized for:
• Application Boost
Optimized for:
• Traditional SAN
• Databases
• Automated Tiering
• Virtualize almost 400 vendor devices
Optimized for:
• Cloud / Multi-tenancy
• Virtual Desktop Infrastructure (VDI)
• Virtual Machines
• VMware, Hyper-V, etc.
- 19.
Source: IDC, The Copy Data Problem: An Order of Magnitude Analysis, doc #239875
50+
Copies
COPY DATA GROWTH
Storage Growth
Time
Primary Data
~35%
YoY
Copy Data will be a $51B problem by 2018
• Consumes as much as 60% of disk capacity
• Drives 65% of Storage Software and 85% of the Storage Hardware spending
• Almost all copies sit idle
Copy Data
Mgmt Gap
Geometric Copy
Data Growth
Linear Data Growth
1
Resilient workload
(Disk Backup) 23
Non prod workload
(Test/Dev or DevOps) 6
Resilient workload
(Mirror) 1
Compliance workload
(Archive) 1
Big Data workload
(Analytics) ?
Primary
Data 1
Today’s IT Challenge: Too many clones!
- 20.
Your Infrastructure
IBM Storwize
V7000,V5000, V3000
IBM Spectrum Copy Data
Management
Software-Defined
Copy Data Management
Platform
• Cloud integrated
• DevOps enabled
Transform
Catalog
• Discover
• Search
Automate
• SLA compliance
• Policy-based
LEVERAGE
Use Cases
Protection and
Disaster Recovery
Hybrid Cloud
Applications
IBM FlashSystem A9000
IBM FlashSystem A9000R
IBM FlashSystem V9000
Also supports:
SAN Volume Controller
Spectrum Virtualize
Spectrum Accelerate
XIV Storage Arrays
VersaStack
EMC VNX and Unity
NetApp
DevOps, Test/Dev
Automated Copy
Management
IT Modernization through “In Place” Copy Data
Management
- 21.
Security Strength is based on Algorithm and
Number of Bits in Key
AES    RSA      ECC    Years
-      1024     160    10^6
-      2048     224    10^9
128    3072     256    10^15
192    7680     384    10^33
256    15360    512    10^51
Data*Data
Data* Data
*
*
Symmetric Key (AES 256)
• Same key is used to encrypt/decrypt
• Fast, ideal for large amounts of data
• Must keep the key secret
Encryption “Public” Key
Decryption “Private” Key
Pairs of different keys are used to encrypt & decrypt data
Encrypt with “Public” key; it may be distributed widely without fear of compromise
Decrypt with “Private” key; must keep this key secret
Asymmetric Key (RSA 2048)
ED
Key
Pair
Data
Data
Data Data
E
AES: Advanced Encryption Standard
RSA: Rivest Shamir Adleman
ECC: Elliptic Curve Cryptography
- 22.
Two-Tier Encryption Scheme
Problem:
Realtors, Landlords, and Apartment
managers must carry hundreds of
keys, one unique to each dwelling
unit
Solution:
All units have their unique key
kept inside a locked box hanging
on the door knob.
Realtors, Landlords, and
Apartment managers carry a
single master key that opens
every lockbox
Data
A
E
D
A
Data
B
B
Encryption:
Each flash, disk, or tape assigned a unique symmetric “Data Key”
Data key itself is encrypted or “wrapped” with master “encryption key”
Decryption:
Data key is decrypted with master “decryption key”
Unique data key for this flash, disk, tape used to read and write contents
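The lockbox idea can be sketched in a few lines. This is a toy illustration only: `keystream_xor` is a deliberately simple, insecure stand-in for AES built from the standard library, and a single symmetric master key stands in for the master key pair described above. The `Device` class and its methods are hypothetical names, not an IBM interface.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """TOY cipher (NOT secure): XOR data with a SHA-256 keystream. Stands in for AES."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

class Device:
    """Hypothetical flash, disk, or tape unit: holds ciphertext plus its wrapped data key."""

    def __init__(self, master_key: bytes):
        data_key = secrets.token_bytes(32)               # unique per device
        self.wrapped_key = keystream_xor(master_key, data_key)  # key in the "lockbox"
        self.blocks = b""

    def _data_key(self, master_key: bytes) -> bytes:
        return keystream_xor(master_key, self.wrapped_key)      # unwrap

    def write(self, master_key: bytes, plaintext: bytes) -> None:
        self.blocks = keystream_xor(self._data_key(master_key), plaintext)

    def read(self, master_key: bytes) -> bytes:
        return keystream_xor(self._data_key(master_key), self.blocks)

    def rekey(self, old_master: bytes, new_master: bytes) -> None:
        # Only the small data key is re-wrapped; the stored data is not rewritten.
        self.wrapped_key = keystream_xor(new_master, self._data_key(old_master))

master = secrets.token_bytes(32)
dev = Device(master)
dev.write(master, b"secret plans")

stored_before = dev.blocks
new_master = secrets.token_bytes(32)
dev.rekey(master, new_master)

print(dev.blocks == stored_before)   # data on media is untouched by the re-key
print(dev.read(new_master))          # readable with the new master key
```

The design point the sketch shows is that changing the master key touches only the wrapped data key, so a re-key does not require re-encrypting the media.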
- 23.
How Encryption Keys are used in different
IBM storage devices
Data
A
A
Data
B
B
• System power-on
• System restart / firmware update
• User-initiated re-key operation
• Tape mount
1 key pair per system 1 key pair per
cartridge
FlashSystem
900
XIV,
DS8000
Spectrum
Virtualize
Enterprise Tape
1 key per
self-encrypting
flash card
1 key per
self-encrypting
drive (SED)
1 key per
storage pool
1 key per
cartridge
ED
Key
Pair
A B
External Master Key:
Asymmetric keys (RSA 2048-bit) stored in volatile memory
Needed only for:
Internal Data Key:
Symmetric key (AES 256-bit) randomly generated, encrypted by master key
and stored on the storage media, used for high-speed read/write activity
- 24.
keystore
IBM Security Key Lifecycle Manager (SKLM)
SKLM
Security
Admin
Storage
Admin
secure communication
ED
Key
Pair
External Master Key:
Asymmetric keys (RSA
2048-bit) stored in volatile
memory, only needed for:
• System power-on
• System restarts (such as firmware upgrades)
• Re-key operations
Device requests key from IBM SKLM,
SKLM sends master key to device
Storage admin requests USB
thumb drive from Security team,
inserts into device
lockbox
Or just leave USB thumb drive in device all the time
- 25.
SKLM
IBM SKLM supports
flash, disk and tape
storage
Spectrum Virtualize supports
either USB or
IBM SKLM
Encrypted storage
pools can mix
devices
Where is Encryption Performed?
IBM Spectrum Virtualize™
SVC, Storwize, FlashSystem V9000, VersaStack
SAS
Internal
storage,
Expansion
drawers
CPU
FlashSystem
900
XIV, DS8000,
FlashSystem
A9000/R
Non-encrypting
storage TS1120,
LTO4 and
newer
SAN
SAS controller
uses HW chip
Uses AES-NI
instructions
Smart enough not to “double encrypt”
- 26.
Motivations for Data-at-Rest Encryption
Broken drives Decommission Mandate Theft
Without
encryption
“90% of drives returned had readable data”
-- Seagate
Physically destroy
drive, or do not
return them to
manufacturer
Hire storage vendor to
securely erase drives,
using Department of
Defense (DoD) method
of multiple over-writes
Fail government
or corporate
compliance
audits
Declare data breach
Pay for all affected
clients and
employees credit
monitoring
Encryption-- USB
drive left in
device
Return broken drives to manufacturer for warranty replacement
Overwrite or erase decryption keys; data is “cryptographically erased”
Remove USB
drives before
auditors or
inspectors
arrive!
Encryption--
Lockbox or
SKLM server Pass audits
No breach if thieves
do not have access
to decryption keys
- 27.
Galactic Empire
• Project is behind schedule, and a major test is imminent
• IBM FlashSystem
• IBM Spectrum Copy Data Management
• Need to protect secret plans
• IBM Security Key Lifecycle Manager
Rebel Alliance
• Reckless, aggressive, and undisciplined
• Rebels are inquisitive!
• IBM DeepFlash ESS
• IBM Spectrum Control Storage Insights
- 28.
And now… enjoy the movie…
May the Force be with us!
- 29.
About the Speaker
Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line. Tony joined IBM Corporation in
1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings on storage topics
covering the entire IBM Storage product line, IBM Spectrum Storage software products, and topics related to Cloud Computing,
Analytics and Cognitive Solutions. He interacts with clients, speaks at conferences and events, and leads client workshops to
help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization
solutions.
Tony writes the “Inside System Storage” blog, which is read by thousands of clients, IBM sales reps and IBM Business Partners
every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and #1
most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage: Volume
I through V.
Over the past years, Tony has worked in development, marketing and consulting for various storage hardware and software
products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in Electrical
Engineering, both from the University of Arizona. Tony holds 19 patents for inventions on storage hardware and software
products.
9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
tpearson@us.ibm.com
Tony Pearson
Master Inventor
Senior IT Architect
IBM Storage
- 30.
The Right Flash for the Right Workload
Key Attributes
Typical
Workloads,
Applications & Use
Cases
Business Critical Storage
z/OS Support
High Performance
Highest Availability
z/OS (GDPS)
Power HA
Power i HA
Three-site/Four-site
Six 9’s Reliability
Enterprise Scalability
High-availability/Low RTO
applications
High-performance OLTP
Real time analytics
High-performance data
warehouse
IBM DS8888
Virtual Storage Infrastructure
Heterogeneous Enterprise-class
Data Services
Dynamic Data Migration
Multi-Vendor Management
Data Reduction (Compression)
Multi-site active-active
Traditional structured workloads
required block storage
Systems of Record
OLTP
Data Warehousing w/ Oracle,
DB2, SQL Server, MySQL,
SAP, SAS
Analytics
FlashSystem V9000
Storwize V7000F
Storwize V5000F
Grid Scale Cloud Storage
Cloud-optimized (QOS, Multi-
Tenancy)
Predictable High Performance
with Data Reduction
Technologies (including
deduplication)
Ease-of-management
Large-scale distributed block
workloads & applications
VDI
SAP (Oracle)
Exchange
VMware / KVM server
environments
CSPs (Mixed workloads,
Multi-tenancy)
Hybrid cloud architectures
FlashSystem A9000
FlashSystem A9000R
Big Data Storage
Multi-protocol support
Policy-driven tiering
Single namespace data ocean
High-performance file storage
High bandwidth
Distributed file/object
Hadoop (M/R)
Media Streaming / Video
SAS
Spark (In-Memory)
HPC
Content Repositories
High-performance backup
target
NAS filer consolidation
IBM DeepFlash ESS
w/ IBM Spectrum Scale
- 31.
Spectrum Control “ice breaker” Assets
- 32.
IBM Spectrum Control on IBM Cloud Marketplace
http://www.ibm.com/marketplace/cloud/analytics-driven-data-management/us/en-us
- 33.
Email:
tpearson@us.ibm.com
Twitter:
twitter.com/az990tony
Blog:
ibm.co/Pearson
Books:
www.lulu.com/spotlight/990_tony
IBM Expert Network on Slideshare:
www.slideshare.net/az990tony
Facebook:
www.facebook.com/tony.pearson.16121
Linkedin:
https://www.linkedin.com/in/az990tony
Additional Resources from Tony Pearson
- 34.
IBM Tucson Executive Briefing Center
• Tucson, Arizona is home for storage hardware and software design and development
• IBM Tucson Executive Briefing Center offers:
• Technology briefings
• Product demonstrations
• Solution workshops
• Take a video tour!
• http://youtu.be/CXrpoCZAazg
- 35.
Trademarks and Other Disclaimers
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a registered trademark of the Central
Computer and Telecommunications Agency which is now part of the Office of Government Commerce. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are
trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT,
and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S.
Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell
Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP,
IBM Corp. and Quantum in the U.S. and other countries.
STAR WARS ROGUE ONE is a trademark of Lucasfilm Ltd. LLC.
Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind.
The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM
list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability,
or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such
commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount
of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements
equivalent to the ratios stated here.
Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your
geography.
Photographs shown may be engineering prototypes. Changes may be incorporated in production models.
© IBM Corporation 2016. All rights reserved. References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the
World Wide Web at http://www.ibm.com/legal/copytrade.shtml. ZSP03490-USEN-00