1st in the "Rewriting the Rules of Performance Testing" series. Scott Barber and Dan Bartow discuss ways load and performance teams have "cheated" in the past due to constraints that are eliminated by new cloud-based approaches to testing.
1. SOASTA Webinar Series
Rewriting the Rules of Performance Testing
CLOUD TESTING RULE 1: Stop Cheating and Start Running Realistic Tests
2. BC (Before Cloud): We Worked With What We Had…
Before the web, when apps served hundreds, there was… (circa 1991)
When apps peaked at thousands, we had a few more options (turn of the 21st century)
“Virtual Users” were a valuable commodity: 1 VU = $1,200!
Yet many were left wanting. Untested websites, 2011: 75%
3. Necessity Led to Workarounds
How we’ve “cheated” to get the job done
1) Modified “Think Time” to stretch VUs
Example: 2 virtual users ≠ 1 VU made to do the work of two by cutting think time
2) Extrapolated results based on small lab tests
Educated or assisted guessing is no match for measuring at real scale
4. Necessity Led to Workarounds
How we’ve “cheated” to get the job done
3) Tested pages or assets in a silo, ignoring the realistic pace and flow of user behavior
Optimizes limited test hardware, but disregards session state, caching, etc.
4) Accepted blind spots by focusing on limited, single metrics (e.g. response time)
Without complete end-to-end views, everything’s a black box
5. Let’s Look at the NEW RULES
Establishing Accuracy and Realism
Scott Barber
6. 1) Modifying Think Time: The Wrong Way
“If all you have is a hammer, everything looks like a nail”
-- Bernard Baruch
To Cheat a Software License
• We did what we had to so we could generate some semblance of load
• We often found real and serious performance issues, but they were often not the “right” ones
• Compared to *not* cheating, we added value
• We still couldn’t simulate production, and we still got burned
Stretch Limited Hardware
• We had the same issue with hardware, so we overloaded what we had
• Again, we found real and serious performance issues
• Again, it increased value, but again, we rarely found the “right” issues
• And, again, we got burned in production
7. 1) Modifying Think Time: The Right Way
The only way to simulate production…
…is to simulate production.
Users Think… and Type
• Guess what? They all do it at different speeds!
• Guess what else? It's your job to figure out how to model and script those
varying speeds
Determine how long they think
• Log files
• Industry research
• Observation
• Educated guess/Intuition
• Combinations are best
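The list above can be sketched as a simple sampling routine. Here is a minimal, hedged example in Python: the lognormal shape and every parameter are illustrative assumptions, not measurements from any real log, and a real script would fit the distribution to its own log data.

```python
# Sketch: draw each pause from a distribution instead of using one
# fixed think time. Parameters below are illustrative assumptions.
import math
import random

def think_time_s(rng: random.Random,
                 median_s: float = 8.0,
                 sigma: float = 0.6,
                 floor_s: float = 1.0,
                 ceiling_s: float = 60.0) -> float:
    """Sample one think time: lognormal around a median, clamped to sane bounds."""
    sample = rng.lognormvariate(math.log(median_s), sigma)
    return min(max(sample, floor_s), ceiling_s)

rng = random.Random(7)  # seeded so a run is repeatable
pauses = [think_time_s(rng) for _ in range(5)]
print([f"{p:.1f}s" for p in pauses])
```

Because each virtual user sleeps for a different, varying interval, the generated load stops marching in lockstep and starts looking like real traffic.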
8. 1) Modifying Think Time: The Right Way
When you get it wrong, it’s… Frightening.
When you get it right, it’s… Not Frightening.
9. 2) Extrapolating Capacity: The Wrong Way
Extrapolating performance test results is black magic
DON’T DO IT
Unless you are, or were trained by, Connie Smith, Ph.D.
The most common types of bad extrapolation…
• 1 leg of an n-leg system ≠ 1/nth of total capacity
• Fractional virtual resources ≠ fractional capacity
Other types of bad extrapolation...
• Faster processors in production ≠ faster response time
• More resources ≠ faster response time
• Any extrapolation that presumes linear correlations
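The last bullet is the heart of it, and even the simplest textbook queueing model shows why. As a hedged illustration (an M/M/1 single-server queue with an assumed 100 ms service time, purely for demonstration), response time grows non-linearly as utilization climbs:

```python
# Why linear extrapolation fails: in an M/M/1 queue the average
# response time is R = S / (1 - rho), so doubling utilization does
# far worse than double the response time near saturation.
# The 100 ms service time is an illustrative assumption.

def mm1_response_time(service_time_s: float, utilization: float) -> float:
    """Average response time R = S / (1 - rho) for an M/M/1 queue."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1.0 - utilization)

if __name__ == "__main__":
    s = 0.100  # 100 ms average service time
    for rho in (0.25, 0.50, 0.75, 0.90, 0.95):
        r_ms = mm1_response_time(s, rho) * 1000
        print(f"utilization {rho:.0%}: avg response {r_ms:.0f} ms")
```

Going from 50% to 95% utilization multiplies response time tenfold in this model; a straight-line extrapolation from a half-loaded lab test would miss that cliff entirely.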
10. 2) Measuring Capacity: The Right Way
Realistically, there are 3 ways to predict capacity
Trust your gut & cross your fingers
• Gut feelings are sometimes very accurate
• They can also cost you your job
Reverse cross-validate
• Use post-release production data to modify & re-measure the test environment
• Use the new results to make predictions for production
• Check new predictions vs. reality, revise, repeat
Find a way to run some tests in the actual production environment
• You can learn a lot from loads below expected peak
• A few hours of scheduled maintenance in the middle of the night can
change *everything*
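The reverse cross-validation bullets describe a feedback loop, which can be sketched as code. This is a deliberately crude illustration: the single lab-to-production scaling factor and all the throughput numbers are invented, and a single linear factor is exactly the kind of shortcut the previous slide warns about; the valuable part is the predict, compare, revise cycle, not the factor itself.

```python
# Sketch of the reverse cross-validation loop: predict production
# from lab results, compare against what production actually did,
# then revise the calibration and repeat. All numbers are illustrative.

def revise_factor(factor: float, predicted: float, actual: float) -> float:
    """Nudge the lab->prod calibration factor toward observed reality."""
    return factor * (actual / predicted)

factor = 1.0               # naive starting assumption: lab == prod
lab_peak = 1200.0          # req/s measured in the test environment
prod_actuals = [1800.0, 1750.0, 1820.0]  # post-release observations

for actual in prod_actuals:
    predicted = lab_peak * factor
    print(f"predicted {predicted:.0f} req/s, observed {actual:.0f} req/s")
    factor = revise_factor(factor, predicted, actual)
```

Each release gives another data point, so the predictions tighten over time instead of being a one-shot guess.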
11. 3) Modeling User Flows: The Wrong Way
You can’t test everything…
…the possibilities are literally endless.
Implementing functional use cases or scenarios…
• Will have you scripting until the sun explodes, AND
• Will regularly miss “easy” stuff by choosing and prioritizing poorly
Picking the most common, or most “important” flow
• Is unlikely to catch the worst performance issues
• Is likely to lead the application to be “hyper-tuned” for that scenario
• Is likely to yield unwanted surprises
13. 3) Modeling User Flows: The Right Way
Tell lots of little lies? …No! FIBLOTS:
• Frequent: common activities (get from logs)
• Intensive: e.g. resource hogs (get from developers/admins)
• Business-critical: even if these activities are both rare and not risky
• Legal: SLAs, contracts, and other stuff that will get you sued
• Obvious: what the users will see and are most likely to complain about; what is likely to earn you bad press
• Technically risky: new technologies, old technologies, places where it’s failed before, previously under-tested areas
• Stakeholder-mandated: don’t argue with the boss (too much)
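Once a FIBLOTS pass has produced a short list of flows and rough weights, turning it into a workload model can be as simple as weighted sampling. A hedged sketch in Python: the flow names and weights below are invented for illustration, and real weights would come from logs, developers, and the other sources the slides list.

```python
# Sketch: simulated users pick flows in proportion to weights derived
# from a FIBLOTS analysis. Flow names and weights are illustrative.
import random
from collections import Counter

WORKLOAD_MODEL = {             # flow -> share of simulated sessions
    "browse_catalog": 0.50,    # Frequent (from logs)
    "search": 0.25,            # Frequent
    "generate_report": 0.10,   # Intensive (resource hog)
    "month_end_billing": 0.05, # Business-critical, even though rare
    "checkout": 0.10,          # Obvious / Legal (SLA-bound)
}

def pick_flow(rng: random.Random) -> str:
    """Choose one flow for a simulated session, weighted by the model."""
    flows, weights = zip(*WORKLOAD_MODEL.items())
    return rng.choices(flows, weights=weights, k=1)[0]

rng = random.Random(42)  # seeded for repeatability
mix = Counter(pick_flow(rng) for _ in range(10_000))
for flow, count in mix.most_common():
    print(f"{flow:18s} {count / 10_000:.1%}")
```

The point is that rare-but-critical flows (month-end billing, SLA-bound checkout) stay in the mix at a small, deliberate share, instead of being dropped because they are not the "most common" scenario.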