A quick look at some of the available functionality for SQL Server developers who have access to Visual Studio 2010 and SQL-Hero.
Visual Studio 2010 Premium (and, to a degree, Professional) delivers capabilities similar to those of the VS 2008 Database Pro Edition, so generating large volumes of sample data for your database has only become more accessible over time.
Because other tools exist in this space and not all SQL developers use Visual Studio, we’ll also look at the third-party data generation facility in SQL-Hero, seeing how to create thousands (or millions!) of records very quickly using a powerful rules engine, and how to automate this process to support continuous integration strategies.
SQL Server Managing Test Data & Stress Testing January 2011
1. Managing Test Data and Stress Testing Your SQL Applications
Speaker: Joel Champagne
San Francisco SQL Server User Group
January 12, 2011
Mark Ginnebaugh, User Group Leader
www.bayareasql.org
2. Tonight’s Speaker
Joel Champagne
Developer of large enterprise applications for 20 years
Focus on data, in particular using the Microsoft stack (SS*S) and .NET
Involved in all stages of application life-cycle, from
envisioning through implementation
Areas of Interest:
Tool development work
Developer productivity
Tonight’s Topic: Managing Test Data and Stress Testing Your SQL Applications
3. Upcoming Training
• Upcoming full‐day training (minimal cost, target is late Feb or March 2011):
www.codexframework.com/training
– SQL Source Control
– Stress/Volume Testing
– SQL Unit Testing
– SQL‐Hero – more details (www.codexframework.com)
• Email Joel: joelc@codexframework.com
• Twitter: @sqlheroguy
4. What I want to cover…
• The why & how
• Large data volumes – benefits, practical looks
• Specific examples, in detail
• Obfuscation of existing data
• Load testing
• Both MS and non‐MS options
• Let’s keep it interactive
5. In the beginning…
• … of the development process
– We can know characteristics of entities
– We can know ways to optimize (e.g. indexes)
– We can have good intentions
• Ultimately, the little details matter:
– Style counts! – not always shortest or most
elegant performs best
– SQL can seem like an art instead of a science
sometimes
6. Problem is…
• How can we know what we’ve got is going to:
– Perform well, not just as we develop, but years from
now, in production
– Perform well if reality changes
– Actually behave as expected with lots of data
• And, we’d like to:
– Work with semi‐realistic data, even before users have
had a chance to do a lot of interaction with the app
– In some cases may want to work with a “well known”
data set, to support repeatable unit & system testing
7. Solutions
• VS 2010
– Database project ‐> Data Generation Plan
– Premium/Ultimate
– Custom generator extensibility
• SQL‐Hero
– Generate data option
• Others
– Custom developed (scripting, bcp, PowerShell,
etc.)
8. Things to consider…
• “Realism”
– Cardinality
– Use of NULL
– Foreign key lookups
– Implied rules (sequence number example)
– … essentially rules at both column and table level
(constraints)
– Names, addresses, etc. – pros, cons
• Deterministic vs. True Random
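The considerations above can be illustrated with a minimal Python sketch. It is not how VS 2010 or SQL‐Hero generate data; the table shapes, name pool, NULL ratio, and seeds are all hypothetical. A seeded generator gives the deterministic, "well known" data set, orders draw their customer from rows that already exist (foreign key lookup), and each customer's orders get a running sequence number (an implied rule).

```python
import random

def generate_customers(count, seed=42, null_ratio=0.1):
    """Return (customer_id, last_name, middle_initial) rows; same seed, same rows."""
    rng = random.Random(seed)  # private RNG so the output is repeatable
    last_names = ["Smith", "Jones", "Garcia", "Chen", "Patel"]  # small pool = low cardinality
    rows = []
    for customer_id in range(1, count + 1):
        last_name = rng.choice(last_names)
        # A share of rows gets NULL (None) to mimic an optional column.
        middle_initial = None if rng.random() < null_ratio else rng.choice("ABCDEFGH")
        rows.append((customer_id, last_name, middle_initial))
    return rows

def generate_orders(customers, count, seed=43):
    """Return (order_id, customer_id, sequence_number) rows."""
    rng = random.Random(seed)
    customer_ids = [c[0] for c in customers]  # foreign key lookup pool
    next_seq = {}
    orders = []
    for order_id in range(1, count + 1):
        customer_id = rng.choice(customer_ids)
        # Implied rule: sequence numbers count up from 1 per customer.
        seq = next_seq.get(customer_id, 0) + 1
        next_seq[customer_id] = seq
        orders.append((order_id, customer_id, seq))
    return orders

customers = generate_customers(1000)
orders = generate_orders(customers, 2000)
```

Dropping the fixed seed (for example, `random.Random()` with no argument) gives a different data set on every run, which is the trade-off between repeatable tests and broader coverage.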
10. Scenario #1
• New development effort, empty database
• We have a search screen we’ve written – seems like it’s fast but not a lot of data
• … what about 3 years from now?
– 12,000,000+ Customers
– 26,000,000+ Orders
– 16,500,000 Addresses
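As a rough illustration of why volume matters for a search screen, the sketch below loads generated rows and compares the query plan before and after adding an index. It uses SQLite from the Python standard library purely as a stand‐in for SQL Server; the table, column, and index names are hypothetical, and the row count is far smaller than the volumes listed above.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT)")

# Load 100,000 generated rows; a fixed seed keeps the data set repeatable.
rng = random.Random(1)
names = [f"Name{n}" for n in range(5000)]
conn.executemany(
    "INSERT INTO customer (customer_id, last_name) VALUES (?, ?)",
    ((i, rng.choice(names)) for i in range(1, 100001)),
)

def plan(sql, params=()):
    """Return the query plan detail strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the plan text

search = "SELECT customer_id FROM customer WHERE last_name = ?"
plan_before = plan(search, ("Name42",))  # no index yet: full table scan
conn.execute("CREATE INDEX ix_customer_last_name ON customer (last_name)")
plan_after = plan(search, ("Name42",))   # index available: index search
print(plan_before)
print(plan_after)
```

On an almost empty table both plans return instantly, which is why the difference only shows up once the table holds a realistic number of rows.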
11. Options #1
• VS 2010
• SQL‐Hero
• Data Model
• Demos
12.
13. Scenario #2
• Let’s take an end‐to‐end look at a “real example”
from a customer…
– Team structure
– “The strategy”
• A fourth database, user participation
– Design doc from BA
– The process of creating data and testing
– Tuning efforts
– Re‐testing
– Conclusions…
14.
15.
16.
17. Scenario #3
• Large database, on‐going development work, post‐implementation
• As we try to modify or fix bugs, some issues rely on production‐quality data
• Option: Copy prod to dev/QA
• Problems:
– Security
– Coordinating with on‐going dev work
• Another scenario: just looking to add more data
to an existing DB
18. Options #3
• VS 2010
– Data Transformation
• SQL‐Hero
– “Scramble” existing data option
• Demos
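One simple way to picture "scrambling" is shuffling a sensitive column's values across rows, so the overall distribution of values survives but no value stays attached to its original row. The Python sketch below assumes rows are plain dictionaries and uses hypothetical column names; it does not describe how the VS 2010 or SQL‐Hero features work internally, and a shuffle alone may not meet stricter anonymization requirements.

```python
import random

def scramble_column(rows, column, seed=7):
    """Return copies of rows with one column's values shuffled across rows."""
    rng = random.Random(seed)  # seeded so a scramble run can be repeated
    values = [row[column] for row in rows]
    rng.shuffle(values)
    # Rebuild each row with its original keys but a reassigned column value.
    return [{**row, column: value} for row, value in zip(rows, values)]

people = [
    {"id": 1, "last_name": "Adams", "city": "Oakland"},
    {"id": 2, "last_name": "Baker", "city": "San Jose"},
    {"id": 3, "last_name": "Clark", "city": "Fremont"},
    {"id": 4, "last_name": "Davis", "city": "Berkeley"},
]
scrambled = scramble_column(people, "last_name")
```

Keys and non‐sensitive columns are left alone, so foreign key relationships and query behavior stay close to what production data would show.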
20. Scenario #4
• Actual “stress testing”, being high concurrent load
• Need to understand the results of high load
– Slowness
– Often times, blocking
– Sometimes deadlocks we didn’t anticipate
– Have seen lead to on‐going monitoring, tuning
efforts
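The basic shape of a concurrent load run can be sketched in Python: several worker threads call the same unit of work at once and record how long each call took. Here `fake_query` is a hypothetical stand‐in that only sleeps; in a real test it would execute a statement against the database under test, and the worker and call counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(work, workers=8, calls_per_worker=5):
    """Call work() from several threads at once; return per-call latencies in seconds."""
    def worker(_):
        latencies = []
        for _ in range(calls_per_worker):
            start = time.perf_counter()
            work()
            latencies.append(time.perf_counter() - start)
        return latencies

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(worker, range(workers)))
    # Flatten the per-worker lists into one list of latencies.
    return [latency for batch in results for latency in batch]

def fake_query():
    # Hypothetical stand-in for executing one statement against the database.
    time.sleep(0.01)

latencies = run_load(fake_query)
print(f"calls={len(latencies)} max={max(latencies):.3f}s "
      f"avg={sum(latencies) / len(latencies):.3f}s")
```

Slowness and blocking tend to show up as a growing gap between average and maximum latency as the worker count rises, which is why collecting per‐call timings is more useful than a single total.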
21. Options #4
• SQLIO
– “SQLIOStress creates separate data and log files to simulate the I/O patterns that SQL Server will generate to its data file (.mdf) and its log file (.ldf). SQLIOStress does not use the SQL Server engine to perform the stress activity so it can be used to exercise a computer before you install SQL Server.” (From SQLIOStress Readme.doc)
• SQL Profiler
– Can collect a workload and “replay”
• VS 2010
– Extensive Load Testing support – Test rigs, multiple agents possible
– Invocation of test cases in .NET – pros and cons
• SQL‐Hero
– Executing script with concurrency option – generate high load using “real” trace – simple, visualization for results ‐ successful
– Data visualization of collected trace information (often the key in analysis)
– Data visualization of trace information, over longer timeframes
– Production monitoring
– Screen shots: daily life examples
– Future: Template to build web test code to invoke from VS2010 test rig
22.
23.
24.
25.
26.
27. Putting it together…
• Build process
– Build DB from SC ‐> Populate Fully ‐> Use
• Advantage: you know it will match the source of truth
• Disadvantage: longer‐term testing, relying on existing data
– Scripting / automation
– Microsoft guidance doc:
http://vsdatabaseguide.codeplex.com
– Hybrid options common
• Another important element: Unit testing
– Knowing if something becomes broken, early
28. To learn more or inquire about speaking opportunities, please
contact:
Mark Ginnebaugh, User Group Leader mark@designmind.com