11. Put your energy into the constraint
Top 5 constraints in IT (Gene Kim, from surveys of 14,000 companies and hundreds of CIOs):
1. Dev environment setup
2. QA environment setup
3. Code architecture
4. Development
5. Product management
17. Development Pipeline for QA
[Chart: a 24-hour timeline of repeated reset/test cycles against a physical data copy. Refresh and reset wait time consumes more than 80% of each cycle; actual testing takes less than 20%.]
18. Data Management Is Not Agile
• 20% of SDLC time is lost waiting for data
• 60% of dev/QA time is consumed by data tasks
Conclusion: data management does not scale to Agile.
- Infosys
Data is the Constraint
25. 2. Bad data leads to bugs: late stage bugs
[Chart: number of bugs found in each stage — Dev, QA, UAT, Production.]
27. 2. Bad data leads to bugs: late stage bugs
[Chart: the cost to correct a defect rises steeply from Dev through Testing, UAT, and Production. Source: Barry Boehm, Software Engineering Economics (1981).]
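Boehm's curve is often summarized as the cost of a fix growing roughly an order of magnitude per downstream stage. A minimal sketch of that relationship (the 10x-per-stage factor is an illustrative rule of thumb, not Boehm's exact published figures):

```python
# Illustrative cost-to-correct curve: the later a defect is found,
# the more it costs to fix. The 10x-per-stage factor is a common
# rule-of-thumb summary, not Boehm's exact figures.
STAGES = ["Dev", "Testing", "UAT", "Production"]

def cost_to_correct(stage: str, base_cost: float = 1.0, factor: float = 10.0) -> float:
    """Cost grows roughly one order of magnitude per downstream stage."""
    return base_cost * factor ** STAGES.index(stage)

for stage in STAGES:
    print(f"{stage:>10}: {cost_to_correct(stage):,.0f}x")
```

The exact multiplier matters less than the shape: anything that shifts bug discovery one stage earlier cuts the fix cost by the whole factor.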
28. 3. Slow environment builds: delays
A developer's request for a database copy passes through a chain of handoffs:
1. Developer asks for a DB and requests access
2. Manager approves
3. DBA requests a system and sets up the DB
4. System admin requests storage and sets up the machine
5. Storage admin allocates storage (takes a snapshot)
29. 3. Slow environment builds: why are handoffs so expensive?
[Diagram: roughly 1 hour of actual work stretches to 1 day of elapsed time at each handoff, and about 9 days end to end.]
51. Physical Data: late stage bugs
[Charts: with physical data, bugs discovered climb steeply in the late stages (up to ~500 in UAT/Production for the legacy process), exactly where Boehm's cost-to-correct curve is highest.]
52. Physical Data: find bugs fast
[Chart: shifting bug discovery earlier, into Dev and Testing, keeps fixes on the cheap end of the cost-to-correct curve.]
53. Virtual Data: Fast Refresh
[Chart: over the same 24 hours, virtual data supports many refresh/test cycles where physical data allows only a few.]
• Quickly refresh: 99% less downtime
• Bookmark, branch, and reset
• Version control
• Data federation: sync across data sources
54. Virtual Data: Version Control
[Diagram: production data flows through time; dev copies (e.g., 2.1, 2.2) branch from it.]
Live archive data for years:
• Archive EBS R11 before upgrading to R12
• Sarbanes-Oxley
• Dodd-Frank
• Financial stress tests
62. Traditional Protection: Network & Perimeter
Perimeter defense (network intrusion detection) and endpoint defense protect the exterior; encryption and masking protect the interior.
"Organizations should use data masking to protect sensitive data at rest and in transit from insiders' and outsiders' attacks."
- Gartner, Magic Quadrant for Data Masking Technology
63. Insider Threats Are Costly
Average annualized cybercrime cost, weighted by attack frequency (consolidated view, n = 252 companies):
• Botnets: $1,075
• Viruses, worms: $1,900
• Malware: $7,378
• Stolen devices: $33,565
• Malicious code: $81,500
• Phishing & social engineering: $85,959
• Web-based attacks: $96,424
• Denial of service: $126,545
• Malicious insiders: $144,542
Source: 2015 Global Cost of Cyber Crime Study, Ponemon Institute
64. Masking: physical vs. virtual
Physical masking: costs more, quality is lower, masking consistently is hard, and moving data from prod to non-prod takes a long time.
Virtual masking: ease of use, instant data, consistent masks.
65. Virtual Data Masking
• Automates discovery of sensitive data
• Provides different masking algorithms for different data types
• Mask once, clone many, with thin cloning
Physical: every copy must be masked (6 hours) and cloned (18 hours).
Virtual: mask once (4 hours), then each thin clone takes 15 minutes.
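Consistency is the key property: the same input must mask to the same output everywhere, or masked copies stop joining correctly across data sources. A minimal sketch of deterministic, hash-based masking (the secret and token format are assumptions for illustration, not Delphix's actual algorithm):

```python
import hashlib

def mask_value(value: str, secret: str = "per-deployment-secret") -> str:
    """Deterministically pseudonymize a value: the same input always maps
    to the same token, so masked data stays consistent across sources and
    across clones (mask once, clone many). Secret/format are illustrative."""
    digest = hashlib.sha256((secret + value).encode()).hexdigest()
    return "MASKED-" + digest[:12]

# The same SSN masks identically wherever it appears,
# so joins between masked tables still line up.
assert mask_value("123-45-6789") == mask_value("123-45-6789")
assert mask_value("123-45-6789") != mask_value("987-65-4321")
```

A real masking tool would also preserve format (a masked SSN still looks like an SSN) and pick algorithms per data type, as the slide notes.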
80. 9 TB database, 1 TB of change per day, over 30 days
[Chart: storage required (TB) by week. With full physical copies (Oracle), storage climbs steeply toward the size of the original copies; with Delphix it stays nearly flat.]
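The storage gap can be sanity-checked with back-of-the-envelope arithmetic. The assumptions below (one full physical copy per weekly refresh, roughly 3x compression on the virtual side) are illustrative readings of the chart, not exact product behavior:

```python
# Back-of-the-envelope storage comparison for a 9 TB database with
# ~1 TB of changed blocks per day, retained for 30 days.
# Assumptions (illustrative): the physical approach keeps a full copy
# per weekly refresh; the virtual approach keeps one compressed base
# image (~3x compression) plus only the compressed changed blocks.
DB_SIZE_TB = 9
CHANGE_PER_DAY_TB = 1
DAYS = 30

physical_tb = DB_SIZE_TB * (DAYS // 7 + 1)                   # one full copy per week
virtual_tb = DB_SIZE_TB / 3 + CHANGE_PER_DAY_TB * DAYS / 3   # base + deltas, compressed

print(f"physical: ~{physical_tb} TB, virtual: ~{virtual_tb:.0f} TB")
```

Even with generous assumptions for the physical side, the virtual footprint stays a small fraction of the physical one, which is what the chart's flat Delphix line shows.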
81. RPO & RTO
• RPO: any time in the last 30 days, down to the second
• RTO: minutes, push-button
[Chart: storage required (TB) by week with Delphix stays far below the original physical copies.]
87. Virtual Data Quotes
• Projects went "12 months to 6 months." - New York Life
• Insurance product "about 50 days ... to about 23 days" - Presbyterian Health
• "Can't imagine working without it" - State of California
90. "A database refresh in 15 minutes? That is mind blowing! Delphix nailed it for us."
- Matt Lawrence, Sr. Director, Wind River (Intel)
"It took 3 weeks to build a dev environment; now with Delphix it takes less than a day, and the DB part is less than 15 minutes."
- Marty Boos, StubHub (eBay)
"Delphix goes beyond storage. Delphix is so much more than we thought it was."
- Michael Brow, State of Colorado
91. "Worth investing in this product; the technology is strong and the value prop is high."
- Deloitte
"I'm convinced about Delphix's technology. Delphix can really increase the quality of Dev/QA."
- Oaktable member
"Delphix allows us to move fast and set up database copies in seconds. Delphix is powerful and allowed us to scale from 2 projects to 11. We need Delphix to scale our agile environment."
- Tim Campos, CIO, Facebook
92. The Goal: eliminate the constraint
"Improvement not made at the constraint is an illusion." - Theory of Constraints
If you look at what's really impeding flow from development to operations to the customer, it's typically IT operations. Operations can never deliver environments on demand; you have to wait months or quarters to get a test environment. When that happens, terrible things happen. People hoard environments. They invite people onto their teams because they know they have a reputation for having a cluster of test environments, so people end up testing on environments that are years old, which doesn't actually achieve the goal.
One of the most powerful things organizations can do is enable development and testing to get the environments they need when they need them. One of the best predictors of DevOps performance is that IT operations can make environments available on demand to development and test, so that they can build and test the application in an environment that is synchronized with production.
Eliyahu Goldratt
IT bottlenecks. Setting priorities. Company goals. Defining metrics. Fast iterations.
The IT version of "The Goal" by E. Goldratt.
"One of the most powerful things that organizations can do is to enable development and testing to get the environments they need when they need them."
What happens now in the industry? Typically the application development life cycle looks like this: we have a production database with production applications running on top of it, and developers either customizing that application or writing new functionality for it. We need copies of that data to make sure our code runs correctly when it gets to production. We have teams of people (DBAs, sysadmins, storage admins, etc.) making these copies. It's slow, tedious work to copy all this data, and all the while developers and QA testers are waiting for those copies.
The three recurring problems:
1. Not enough resources: contention on shared environments; lack of enough environments.
2. Late-stage bug discovery: faulty data (subsets, synthetic data, old data) leading to bugs.
3. Slow environment builds: delays, developers waiting, QA slow and expensive.
Not sure if you've run into this, but I have personally experienced the following. When I was talking to one development group at eBay, they shared a single copy of the production database between the developers on the team. What this meant is that whenever a developer wanted to modify that database, they had to submit their changes to code review, and that code review took 1 to 2 weeks. I don't know about you, but that kind of delay would stifle my motivation, and I have direct experience with the kind of disgruntlement it can cause. When I was last a DBA, all schema changes went through me. It took me about half a day to process schema changes. That delay was too much, so the developers unilaterally decided to go to an EAV (entity-attribute-value) schema, which meant developers could add new fields without consulting me and without stepping on each other's feet. It also meant the SQL code was unreadable and performance was atrocious.
Besides creating developer frustration, sharing a database also makes refreshing the data difficult: it takes a while to refresh the full copy, and even longer to coordinate a time when everyone stops using the copy so the refresh can run. The result is that the copy rarely gets refreshed and the data gets old and unreliable.
To circumvent the problems of sharing a single copy of production, many shops we talk to create subsets. One company we talked to spends 50% of its time copying databases; they have to subset because there isn't enough storage, and the subsetting process constantly needs fixing and modification. Now, what happens when developers use subsets?
We talked to Presbyterian Healthcare, and they told us that they spend 96% of their QA cycle time building the QA environment and only 4% actually running the QA suite. This happens for every QA run, meaning that for every dollar spent on QA there are only 4 cents of actual QA value; the other 96% is infrastructure time and overhead.
Internet vs. browser.
Automate or die: the revolution will be automated. The worst enemy of companies today is thinking that they have the best processes that exist, that their IT organizations are using the latest and greatest technology, and that nothing better exists in the field. This mentality will be the undermining of many companies.
http://www.kylehailey.com/automate-or-die-the-revolution-will-be-automated/
Data IS the constraint. Business skeptics are saying to themselves that data processes are just a rounding error in most of their project timelines, and that they are sure their IT has developed processes to fix that. That's the fundamental mistake. The very large and often hidden data tax lies in all the ways that we've optimized our software, data protection, and decision systems around the expectation that data is simply not virtual. The belief that there is no agility problem is part of the problem.
http://www.kylehailey.com/data-is-the-constraint/
Due to the constraints of building cloned database environments, organizations end up in a "culture of no," where developers stop asking for a copy of a production database because the answer is always "no." If developers need to debug an anomaly seen in production, or need to write a custom module that requires a copy of production, they know not to even ask, and just give up. The fastest query is the query never run.
In the physical database world, 3 clones take up 3x the storage. In the virtual world, 3 clones take up roughly 1/3 the storage of a single physical copy, thanks to block sharing and compression.
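Block sharing is why virtual clones are so cheap: every clone reads from one shared image and privately stores only the blocks it overwrites. A toy copy-on-write sketch (illustrative, not Delphix's actual file system):

```python
# Toy copy-on-write clone: clones share the source's blocks and only
# store the blocks they overwrite, so N clones cost far less than
# N full physical copies.
class Clone:
    def __init__(self, source_blocks):
        self._source = source_blocks   # shared, treated as read-only
        self._changed = {}             # private: only overwritten blocks

    def read(self, i):
        return self._changed.get(i, self._source[i])

    def write(self, i, data):
        self._changed[i] = data        # copy-on-write: source untouched

source = ["blk0", "blk1", "blk2"]
clones = [Clone(source) for _ in range(3)]
clones[0].write(1, "patched")
assert clones[0].read(1) == "patched"
assert clones[1].read(1) == "blk1"     # other clones still see the source
```

Three clones here cost three small dictionaries of changed blocks, not three full copies, which is the storage math behind the 3x-vs-1/3 comparison above.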
Delphix radically changes this paradigm. Delphix is software delivered as a virtual machine OVA file that you spin up on any commodity Intel hardware. You give it any storage, and Delphix maps its own proprietary file system onto that storage. Through the web UI you point it at any database or data source: Oracle, SQL Server, Sybase, PostgreSQL, flat files, etc.
At link time Delphix takes one full copy; it does this once and never again. The data is compressed, so if it is 3 TB on the source it will be about 1 TB on Delphix. From then on, Delphix pulls in only the changed blocks, and with those changed blocks builds up a timeline of data versions. The default window is 2 weeks, but you can configure it to be 2 months or 2 years. You can spin up a copy of the data, down to the second, at any point in the time window.
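The timeline idea can be sketched as a toy model: one full base copy plus per-timestamp deltas of changed blocks is enough to reconstruct the data as of any retained moment (illustrative, not Delphix's actual on-disk format):

```python
# Toy changed-block timeline: a full base image plus per-timestamp
# deltas lets us rebuild the data as of any retained point in time.
def provision(base, deltas, at):
    """Rebuild the block map as of timestamp `at` by replaying deltas."""
    image = dict(base)
    for ts in sorted(deltas):
        if ts > at:
            break
        image.update(deltas[ts])
    return image

base = {0: "a", 1: "b"}                      # full copy taken once, at link time
deltas = {10: {1: "b'"}, 20: {0: "a'"}}      # only changed blocks thereafter

assert provision(base, deltas, 5)  == {0: "a", 1: "b"}    # before any change
assert provision(base, deltas, 15) == {0: "a", 1: "b'"}   # mid-window
assert provision(base, deltas, 25) == {0: "a'", 1: "b'"}  # latest
```

Because only deltas are stored after the initial copy, extending the retention window to months or years grows storage by changed blocks, not by full copies.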
Now, with a few mouse clicks and in a few minutes, we can spin up copies on developer machines, QA machines, UAT, etc. When we make copies, no data is moved: the copies point to data that already exists on Delphix. There is no data on the target machines; all the data is on Delphix, which looks like a NAS/NFS file server to the targets. Each target gets a read-writeable point-in-time snapshot of the data. Delphix also tracks all the block changes on the virtual databases, which enables things like rolling them back, branching them, versioning them, sharing them, and bookmarking the data.
All this is simple to run; Delphix can generally be managed by a junior DBA in a quarter of their time. The coolest thing, especially for a DevOps process, is the self-service interface for developers and testers, where they can refresh data from production, roll back changes, and bookmark and share data between dev and QA. We can treat data the way we treat code. For example, StubHub went from 5 copies of production in development to 120, giving each developer their own copy, and estimated a 20% reduction in bugs that made it to production.
Slowdowns mean bottlenecks.
Physically independent but logically correlated: cloning multiple source databases at the same point in time can be a daunting task, like herding cats. One example from our customers is Informatica, which had a project to integrate 6 databases into one central database. The project was estimated at 12 months, with much of that time coming from orchestrating copies of the 6 databases at the same point in time. After installing Delphix they did it in 6 months: "I delivered this early, I generated more revenue, I freed up money and put it into innovation." They won a Ventana Research award for this project.
Data masking should be a budgeted item in enterprise IT spending. JP Morgan, joined by other banks and major companies, is going to spend a large amount on cybersecurity, yet still doesn't feel that sum is enough. Why is that?
Traditional security is network security, a.k.a. perimeter defense: it keeps the exterior protected, enhanced by endpoint defense, which locks down phones and laptops in this era of bring-your-own-device (BYOD). That said, organizations are taking increasingly long to detect network and system intrusions; according to a Trustwave survey, in 80% of cases an external party informed the company of the breach. That's why it's so important to protect the interior: the data itself. As an analogy, perimeter security is like building castle walls, but protecting the interior means strong body armor for all of the knights you send out onto the open battlefield.
Unshackle yourself from massive infrastructure drag and bureaucratic quagmires, and put a jetpack on your IT organization and application development projects. Moving the data IS the big gorilla: eliminating the data tax is crucial to the success of your company, and if huge databases can be ready at target data centers in minutes, the rest of the excuses are flimsy.
Virtual data uses a small footprint. A truly virtual data platform can deliver full-size datasets cheaper than subsets; it can move the time or location pointer on its data very rapidly, and can store any version that's needed in a library at remarkably low cost. It can also massively improve application quality by making it reliable and dead simple to return one or many databases to a common baseline in a very short amount of time. Applications delivered with agile data can afford many more full-size virtual copies, eliminating the wait time, extra work, and side effects caused by sharing. With the cost of data falling so dramatically, businesses can radically increase their utilization of existing hardware and storage, delivering much more rapidly without any additional cost. An agile data platform presents data so rapidly and reliably that the data becomes commoditized, and servers that sit idle because it would take too long to rebuild them can now switch roles on demand.
Now let's look at Delphix Data as a Service. With Delphix and Data as a Service, provisioning copies of data becomes push-button functionality that finishes in minutes. How does this work? Delphix is provided as software: a virtual machine that manages storage and maps its own advanced, specialized file system onto it. Delphix can be used with any storage, such as EMC, NetApp, Fujitsu, JBODs, etc. Once Delphix is installed and has been allocated storage, it can be pointed at a data source. Once, and only once, Delphix pulls in a full copy of the data source and compresses it. From then on, Delphix pulls in only the changed data blocks and stores them, creating a timeline of data. From that timeline, clone copies of production can be spun up in minutes on target machines, at any point in time, down to the second. Each clone is, for all intents and purposes, a completely independent, full-size, read/write copy of production. Delphix can typically be managed by a single person in just a fraction of their time, and it provides a developer-centric self-service interface where developers and QA can provision their own copies of data with typical developer features such as rollback, bookmark, branching, and refresh.
<div>Icon made by <a href="http://www.freepik.com" title="Freepik">Freepik</a> from <a href="http://www.flaticon.com" title="Flaticon">www.flaticon.com</a> is licensed under <a href="http://creativecommons.org/licenses/by/3.0/" title="Creative Commons BY 3.0">CC BY 3.0</a></div>