3. Overview
What is Performance?
Why Performance?
How to Measure Performance?
Improving Performance
Tools
Best Practices
4. What is Performance?
Three parts of performance
Speed
How quickly the application responds to a user
Scalability
How the application handles the expected number of users and beyond
Stability
How stable the application is under load
Load could be expected or unexpected
7. How to Measure Performance
• Live single-user execution
  • How quickly the application responds throughout the full application call
• Automated unit and integration testing
• Multi-user performance testing
  • Load testing
  • Stress testing
  • Peak testing
  • Duration testing
  • Failover testing
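A multi-user test can be sketched in a few lines. The example below is a minimal illustration, not a replacement for a real load-testing tool: `fake_request` is a hypothetical stand-in for an HTTP call to the application, and the user counts are arbitrary. A load test runs it at the expected user count; a stress test would keep raising `users` until latency or errors become unacceptable.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(task):
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def run_load(task, users, requests_per_user):
    """Simulate `users` concurrent users, each issuing requests back to back."""
    def one_user(_):
        return [timed_call(task) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return [latency for user in per_user for latency in user]


def summarize(latencies):
    """Report the figures usually compared between test runs."""
    ordered = sorted(latencies)
    p95_index = math.ceil(0.95 * len(ordered)) - 1  # nearest-rank percentile
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def fake_request():
    # Hypothetical stand-in for a real HTTP call to the application.
    time.sleep(0.005)


if __name__ == "__main__":
    print(summarize(run_load(fake_request, users=10, requests_per_user=5)))
```

Comparing the median with the 95th percentile and the maximum between runs shows whether a change helped typical users, the slowest users, or neither.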
8. Improving Performance
80-90% of the end-user response time is spent on the frontend. Start there.
- Steve Souders
9. Front End – Asset Size
Compress assets
• GZIP (all but PNG)
• http://zoompf.com/blog/2012/02/lose-the-wait-httpcompression
Minify JS & CSS
Don’t overload asset contents (no unused CSS or JS)
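The effect of compression on text assets is easy to check locally. The sketch below uses Python's standard `gzip` module; the stylesheet content is invented for illustration, and random bytes stand in for a format that is already compressed (such as PNG), which is why gzip is skipped for those.

```python
import gzip
import os


def gzip_savings(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing `payload`."""
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Hypothetical stylesheet: repetitive text, as CSS and JS usually are.
css = b".button { color: #333; padding: 4px 8px; margin: 0 auto; }\n" * 200

# Random bytes behave like already-compressed data: gzip cannot shrink them.
already_compressed = os.urandom(10_000)

if __name__ == "__main__":
    print(f"CSS:        gzip saves {gzip_savings(css):.0%}")
    print(f"compressed: gzip saves {gzip_savings(already_compressed):.0%}")
```

The savings on a real site depend on the asset mix, so measure your own files rather than relying on this synthetic input.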
10. Front End – Client Rendering
Excess DOM
Understand JS events
DOM manipulation
Fonts
Avoid repaints and reflows
Cache JS
References:
• http://perfectionkills.com/profiling-css-for-fun-and-profit-optimization-notes/
• http://blog.monitor.us/2012/03/27-website-performance-javascript/
11. Front End – Progressive Enhancement
Chunked encoding
• ASP.NET doesn’t chunk by default
• If you turn it on and write to the response line by line, each write is sent as its own chunk (big perf hit for large HTML)
AJAX
Defer JS
CSS on top
JS on the bottom
Load JS asynchronously
12. Latency
Use CDNs
Caching (add an Expires header)
• 35% reduction in bandwidth
Combine JS and CSS files
Load JS asynchronously
Sprites
Inline images
Prefetch and cache assets for future use
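Inlining an image means embedding its bytes in the page as a `data:` URI, which removes one HTTP request. A minimal sketch follows; `icon_bytes` is a hypothetical placeholder rather than a complete image, and real code would read the bytes from a file.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data: URI usable in an img src or CSS url()."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Hypothetical placeholder bytes (a GIF signature only, not a full image).
icon_bytes = b"GIF89a"

if __name__ == "__main__":
    uri = to_data_uri(icon_bytes, "image/gif")
    print(f'<img src="{uri}" alt="">')
    print(len(icon_bytes), "bytes ->", len(uri), "characters")
```

Base64 grows the payload by roughly a third and the inlined bytes cannot be cached separately from the page, so this tends to pay off only for small images.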
13. Tools for Frontend Measurement
Grading Tools
PageSpeed
-can be a plugin
-can be a CDN
YSlow!
-add-on
-runs locally
Profiling
SpeedTracer
-Chrome add-on
Web developer tools (Chrome/IE/Firefox)
Firebug
Timing
Blaze.io
-mobile timing
Webpagetest
-can select location and enter URL
-mobile section uses Blaze.io
PCAP Web Performance Analyzer
-uses HARs/pcaps to analyze webpages
Loads.in
-tests how long page loads from various locations
14. Tools for Performance analysis
Analytics
Google Analytics
StatsD
-Node.js daemon that collects stats
Network Analyzer
Fiddler
Wireshark (free and open source)
-captures and inspects network traffic
tcpdump
-captures network packets, e.g. TCP traffic to a host
cURL
-command-line HTTP client
dig
-investigates DNS
APIs
Navigation Timing (http://www.w3.org/TR/navigation-timing/)
-shows all parts of network in page load
-doesn’t split by resource (future)
Boomerang.js
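The Navigation Timing API exposes timestamps for each step of a page load, and subtracting them gives the phases. The sketch below assumes a record has been exported from `window.performance.timing` into a dictionary; the attribute names come from the W3C specification, while the sample values and the phase labels are made up for illustration.

```python
def page_load_phases(t: dict) -> dict:
    """Break a Navigation Timing record (epoch milliseconds) into phases."""
    return {
        "dns": t["domainLookupEnd"] - t["domainLookupStart"],
        "connect": t["connectEnd"] - t["connectStart"],
        "time_to_first_byte": t["responseStart"] - t["requestStart"],
        "download": t["responseEnd"] - t["responseStart"],
        "total": t["loadEventEnd"] - t["navigationStart"],
    }


# Made-up sample values, as if copied from window.performance.timing.
sample = {
    "navigationStart": 1000, "domainLookupStart": 1005, "domainLookupEnd": 1045,
    "connectStart": 1045, "connectEnd": 1125, "requestStart": 1126,
    "responseStart": 1326, "responseEnd": 1400, "loadEventEnd": 2100,
}

if __name__ == "__main__":
    for phase, millis in page_load_phases(sample).items():
        print(f"{phase:>20}: {millis} ms")
```

In this sample most of the total is spent after the HTML arrives, which is the pattern the frontend slides describe.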
15. Tools for Content Delivery
JSON delivery
Jdrop
-Store JSON data in the cloud
CDN
CloudFlare
CloudFront
PageSpeed
16. Tools for Testing
Performance Testing
Blitz.io
-ability to run up to 250 users in 60 seconds for free
-can be automated (use in continuous integration)
Browsermob-proxy
-captures HAR data during functional tests
Code analysis
Benchmark.js
-framework for measuring method response times
Css-stress
-profiles CSS stylesheets
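Measuring method response times follows the same pattern in any language: run each candidate many times and compare the best timings. The sketch below shows the idea with Python's standard `timeit` module rather than Benchmark.js; the two string-building functions are hypothetical examples chosen only because they produce the same result in different ways.

```python
import timeit


def concat_loop(n: int) -> str:
    """Build a string by repeated concatenation."""
    out = ""
    for i in range(n):
        out += str(i)
    return out


def concat_join(n: int) -> str:
    """Build the same string with a single join."""
    return "".join(str(i) for i in range(n))


def best_time(func, *args, repeat=5, number=200) -> float:
    """Best-of-`repeat` time in seconds for `number` calls of func(*args)."""
    return min(timeit.repeat(lambda: func(*args), repeat=repeat, number=number))


if __name__ == "__main__":
    for candidate in (concat_loop, concat_join):
        print(candidate.__name__, f"{best_time(candidate, 1000):.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; which candidate wins depends on the runtime and the input size.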
17. Tools for Mobile Optimizations
Images
src.sencha.io
-resizes the image to fit the physical screen (mobile)
ImageAlpha
-converts 24-bit PNG to 8-bit PNG (mobile)
Font
FontSquirrel
-generates font that’s best for your device
18. Backend - SOA
• Rely on service oriented architecture
• Separate your data
  • Transactional vs. reporting
• Separate I/O and CPU bound processes on different machines
• Utilize event sourcing patterns
• Concurrent operations
  • Be careful!
• Eventual Consistency
19. Backend - Cache
• Cache as much as you can
  • But not too much!
• Use the right caching tool
• Understand different caching patterns
  • Primed Cache
  • Demand Cache
• Take advantage of ASP.NET Caching
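The two caching patterns differ only in when the data is loaded. The sketch below is an in-memory illustration under simple assumptions: `load_profile` is a hypothetical slow lookup, and there is no eviction or size limit, which a real cache needs to honor "but not too much".

```python
class DemandCache:
    """Loads a value the first time it is requested (lazy load)."""

    def __init__(self, loader):
        self._loader = loader
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)
        return self._store[key]


class PrimedCache(DemandCache):
    """Prefetches a predictable set of keys before the first request arrives."""

    def __init__(self, loader, keys_to_prime):
        super().__init__(loader)
        for key in keys_to_prime:
            self._store[key] = loader(key)


load_calls = []  # records every trip to the slow backing store


def load_profile(user_id):
    # Hypothetical slow lookup (database or service call).
    load_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


if __name__ == "__main__":
    cache = PrimedCache(load_profile, keys_to_prime=[1, 2])
    cache.get(1)  # served from the primed entries
    cache.get(3)  # not predicted, so loaded on demand
    print("backing-store calls:", load_calls)
```

A primed cache pays its loading cost up front, so it suits data whose demand can be predicted; everything else falls back to demand loading.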
20. Backend - DB
• Long running queries
• Write/write conflicts
• Large joins
• Indexes
21. Backend
• Biggest problem: context switching
• Measure GC pauses
• Optimize worker threads
• Watch for thread deadlocks
• Be careful of large object heap fragmentation
• Be careful of Object Relational Mappings
• Don’t rely on Exceptions for logic
• Utilize Connection Pooling
  • Threads
  • DB
• Utilize Batch requests and responses
• Understand operation impact on .NET Performance
• Use memory deliberately (L1/L2 cache, RAM and disk differ widely in speed)
22. Tools for Profiling
Web profiling
Glimpse
Mini-profiler
Instrumented profiling
ANTS Performance Profiler ($499)
JetBrains dotTrace ($399)
-can integrate with ReSharper
dynaTrace (varies)
-can be integrated with CI
-can do monitoring/profiling/diagnostics and much more
-most expensive
VS2010 Profiler
-offered in Visual Studio
23. Tools for Profiling
Heap profiler
PerfView
PerfMon
Benchmarking
MeasureIt
Comparison for APM and BTM tools (http://www.scribd.com/doc/53541961/Competitive-Analysis-Application-Performance-Management-and-Business)
-Goes over a set of tools (none are free and most are enterprise) for monitoring and transaction tracing
25. Best Practices – What to test
• Critical business transactions
• Frequently used transactions
• Performance-intensive transactions
26. Best Practices - notes
• DO NOT prematurely optimize
• DO start with 3rd party tools but roll your own solutions if necessary for performance
• DO NOT be afraid to modify standard solutions
• DO go for the first bottleneck and always retest after
  • Follow “test fix retest” pattern
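Going for the first bottleneck starts with measuring where the time goes. The sketch below times each stage of a hypothetical request pipeline and names the slowest one; the stage names are invented and the sleeps stand in for real work.

```python
import time


def time_stages(stages):
    """Run each (name, callable) stage once and return {name: seconds}."""
    timings = {}
    for name, stage in stages:
        start = time.perf_counter()
        stage()
        timings[name] = time.perf_counter() - start
    return timings


def bottleneck(timings):
    """Name of the slowest stage: the one to fix first."""
    return max(timings, key=timings.get)


# Hypothetical request pipeline; the sleeps stand in for real work.
stages = [
    ("parse_request", lambda: time.sleep(0.001)),
    ("query_database", lambda: time.sleep(0.05)),
    ("render_page", lambda: time.sleep(0.005)),
]

if __name__ == "__main__":
    timings = time_stages(stages)
    print("fix first:", bottleneck(timings))
```

After fixing the reported stage, run the measurement again: the next-slowest stage becomes the new bottleneck, which is the test, fix, retest loop.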
27. Contact
@dimtruck
Dimitry.ushakov@accenture.com
http://dimitryushakov.com/codecamp2012/presentation
• Will have slides of the presentation and references
Editor's Notes
2 minutes (4 total)
-Quickly cover what you mean by performance. Performance could mean different things to different people.
3 minutes (7 minutes)
4 minutes (11 minutes)
Quickly go through different types of testing
-Live profiling
-Automated testing
  -Includes unit tests (NUnit)
  -Includes integration tests (Selenium)
-Load testing
  -Load: regular load testing
  -Stress: keep increasing the number of users until it breaks
  -Peak: test against peak load
  -Break: kill various parts of the application and monitor user experience
  -Failover: if you have multiple data centers, kill a data center and test again
START DEMO HERE
-Go over what it is: Facebook for DOGS
-Run through the application
-Point out slowness in loading images
-Point out slowness in loading entire page
-Point out how going back to the page does not help
97% of mobile and 85% of desktop response time is spent on the frontend
Measure with waterfall (SHOW EXAMPLE) - know the differences (http://www.stevesouders.com/blog/2011/08/26/waterfall-ui-conventions/)
GZIP
-improves by 13-25%
-unavailable in some old browsers
Asset size
-don’t download unnecessary information (extra CSS and JS)
-excess DOM
-don’t download and hide content
-don’t download and shrink content
-minify JS and CSS
CAVEAT: mobile pattern:
-embed JS and CSS in the first response. Save JS and CSS to localStorage on the phone; subsequent requests then load only the files that are needed, based on a cookie [set with stored JS/CSS].
-More work needed to figure out which page to return but still faster.
-caveat on caveat: localStorage read is MUCH SLOWER than object read since it’s a disk read [data size stored does not matter per key] (http://calendar.perfplanet.com/2011/localstorage-read-performance/)
SHOW DEMO (STEP 1):
-Turn on GZIP
-Use minified files
Optimize client rendering
-optimize images with CSS sprites, gradients and inline images
-use jsperf to compare different functions
-understand DOM behavior for different events
  -e.g. understand the difference between bubbling and capturing events
-avoid reflows (DOM manipulation, layout change, window resizing, table manipulation, absolute/fixed better than static/relative position, font change, stylesheet add/remove)
-avoid repaints (add outline, change visibility, change bg color)
-DOM manipulation: add subtrees to the DOM tree at once
-clean up objects after you’re done (delete key)
-primitive operators over function calls
FONTS
-Host the fonts on a CDN
-GZIP all except .woff (already compressed)
-Cache all font files for 30+ days
-Remove excess glyphs (characters) from the font files
-Ensure @font-face is the first rule of the first stylesheet on the page (IE)
-Load through JavaScript (http://www.artzstudio.com/2012/02/web-font-performance-weighing-fontface-options-and-alternatives/)
SHOW DEMO:
-remove bad fonts and update through tool
-remove unnecessary repaints and reflows
Progressive Enhancement
-deliver HTML first and quickly (chunked encoding)
-load visible content first
-defer JS until the end
-decorate after everything is loaded
-use the ControlJS file for async loading http://stevesouders.com/controljs/
-media queries
-Synchronous scripts block all following elements from rendering
  -insertBefore (async) vs. script tag (sync)
  -async
  -xhr eval (bad), xhr injection, script in iframe, script DOM element, script defer, document.write
-use anywhere.js/bootstrap.js to asynchronously load third-party libraries for JS with dependencies
SHOW DEMO (STEP 2):
-put CSS on top
-put JS on the bottom
3 minutes (24 minutes)
-TCP slow start - to control congestion inside the network
  -server starts by sending a few packets, waits for acknowledgement and then increases the number sent in each subsequent round
Talk about latency fluctuations
-65-145ms on desktop vs up to 900ms on mobile
Mobile
-Wireless provider slowdown
-Android closes TCP sockets after 6 secs of inactivity
-3G device establishes radio link to carrier’s cell tower (consumes battery)
  -takes 1-2 seconds to connect to cell tower
-Steve Souders wrote about these findings http://www.stevesouders.com/blog/2011/09/21/making-a-mobile-connection/
CDN
-Japan is the fastest growing mobile commerce market (Rakuten's Online Commerce Revenue, Desktop vs Mobile)
-CloudFront from Amazon (doesn’t do gzip out of the box)
-CloudFlare (free service)
No redirects
-redirects on mobile are expensive (due to latency)
Avoid CSS imports
Don’t download the same asset twice (especially if not cached)
Prefetch
-predownload necessary assets before they’re required
Caching
-adding Expires header
-35% reduction in bandwidth
Work around TCP slow start
Dropped packets
DNS lookups
http://www.websiteoptimization.com/speed/tweak/inline-images/
SHOW DEMO (STEP 3):
-add all images in webpage as sprite
-combine JS and CSS files [COMBRES]
-Turn on caching [web.config]
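The cost of TCP slow start can be estimated with simple arithmetic. The sketch below is an idealized model, not how any particular TCP stack behaves: it assumes the window doubles every round trip, no packets are lost, and there is no upper limit on the window. The initial window of 3 segments and the 1460-byte segment size are assumptions chosen for illustration.

```python
import math


def slow_start_round_trips(size_bytes, initial_window=3, segment_bytes=1460):
    """Round trips needed to deliver a response if the window doubles each trip."""
    segments_left = math.ceil(size_bytes / segment_bytes)
    window = initial_window
    trips = 0
    while segments_left > 0:
        segments_left -= window  # send one full window, then wait for ACKs
        window *= 2              # idealized growth: no loss, no window cap
        trips += 1
    return trips


if __name__ == "__main__":
    for size in (4_000, 30_000, 100_000):
        trips = slow_start_round_trips(size)
        print(f"{size:>7} bytes: {trips} round trips")
```

Under these assumptions a 100,000-byte response needs five round trips, so the 900 ms mobile latency quoted above would add about 4.5 seconds, while a 4,000-byte response fits in one. This is why smaller assets and fewer new connections matter most on high-latency links.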
SHOW YSLOW!
SHOW SPEEDTRACER
SHOW BLAZE.IO FINISH
SHOW FIREBUG
SHOW LOADS.IN
-blitz.io
-browsermob-proxy (http://opensource.webmetrics.com/browsermob-proxy/)
-benchmark - allows benchmarking on functions (http://benchmarkjs.com/)
-cssstress - to test css profiling (https://github.com/andyedinborough/stress-css)
-src.sencha.io -> (http://www.sencha.com/learn/how-to-use-src-sencha-io/) resizes the image to fit the physical screen (MOBILE)
-imagealpha -> (http://pngmini.com/) converts 24-bit PNG to paletted 8-bit PNG (MOBILE)
  -8-bit is kinda like a gif (256 color palette)
Rely on service oriented architecture
Separate your data
-Transactional vs. reporting
Separate I/O and CPU bound processes on different machines
-Random disk I/O
Utilize event sourcing patterns
-queuing/messaging
-service bus
-CQRS
-eventual consistency
-concurrent operations
  -async
  -be careful of tail latency (concurrent requests that the parent request depends on all completing)
  -duplicate work to get rid of latency (send multiple requests for the same data to different servers; take only the first response and disregard the rest) (http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-more-machines-equal.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HighScalability+%28High+Scalability%29&utm_content=Google+Feedfetcher)
EVENTUAL CONSISTENCY
-use a write DB and a read DB and sync between the two.
-no locks, etc.
-(http://pbs.cs.berkeley.edu/#demo)
-understand the tradeoffs
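The "duplicate work" idea from the notes above can be sketched with threads: send the same request to several servers and use whichever answer arrives first. This is a minimal illustration; `make_replica` builds hypothetical servers that differ only in latency, and error handling is left out (if the first replica to finish raised an exception, this version would re-raise it).

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def first_response(replicas, key):
    """Ask every replica for `key` and return whichever answer arrives first."""
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica, key) for replica in replicas]
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower replicas
    return next(iter(done)).result()


def make_replica(name, delay_seconds):
    # Hypothetical server: identical data, different latency.
    def fetch(key):
        time.sleep(delay_seconds)
        return f"{key} from {name}"
    return fetch


if __name__ == "__main__":
    replicas = [make_replica("slow", 0.3), make_replica("fast", 0.01)]
    print(first_response(replicas, "user:1"))
```

The tradeoff is extra load: every request is multiplied by the number of replicas, so this is usually reserved for reads where tail latency matters more than capacity.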
CACHE
-Use memcached or Redis
-A significant portion of an application’s resources is consumed by I/O operations, which usually are data accesses.
-Increases memory usage
  -too much memory utilized will start using swap (& move pages to virtual mem - hard disk) - increased I/O - no caching benefits
-If data requests can be predicted, then prefetch into cache; else, lazy load
  -Primed cache - set of required resources can be predicted (PREFETCH)
  -Demand cache - set of required resources cannot be predicted
-Cache Accessor
-Cache Search Sequence
-Cache Collector
-Cache Replicator
-Cache Statistics
-ASP.NET Caching (http://msdn.microsoft.com/en-us/library/cc668225)
  -Application Cache - not stored in memory for life of the application
  -Page output cache - store rendered HTML
    -cache whole page or part of it
    -for those pages that are always the same (profile page)
  -Caching data for ASP.NET
SHOW DEMO (STEP 4):
-update to use memory cache
-Long & short running queries
-Write/write conflicts
-Large joins
Be careful of Object Relational Mappings
Don’t rely on Exceptions for logic
Pooling
-Threads
-DB
Batch requests
Batch responses (stored procedures returning multiple results)
Ahead-of-time compilation (NGen in CLR) may improve over JIT
Large object fragmentation
-Split arrays into smaller units so that they remain below 85kB (and so are never allocated on the large object heap).
-You can allocate the largest and longest-living objects first (if your objects are files which are queued for processing, for example).
-In some cases, it may be that periodically stopping and restarting the program is the only option.
Understand operation impact in .NET on performance
-Integer arithmetic is 1 instruction count while object allocation can be more than 1,000 (http://msdn.microsoft.com/en-us/magazine/cc507639.aspx)
Again memory is good. The right memory is best:
-Memory consumption
-CPU-intensive application manipulating large amount of data
-L1 Cache ⇒ L2 Cache ⇒ RAM ⇒ DISK (speed decreases 10 fold in memory and 10,000 fold in disk)
-Cold startup
WORKER THREADS
-The request architecture of ASP.NET balances between request threads and available resources.
-Thread gating - number of concurrently executing requests for which CPU power is available.
-Monitor thread gating in the Windows Performance monitor using the Pipeline Instance Count performance counter.
-Page request generally stops until external resource (db/service requests) responds
-The result can be many concurrently executing requests and many waiting threads in the ASP.NET worker process or application pool.
-To reduce this effect on performance, you can manually set the limit on the number of threads in the process.
-To do so, change the MaxWorkerThreads and MaxIOThreads attributes in the processModel section of the Machine.config file.
-You can determine what the appropriate number of threads is by performance-testing your application.
Thread deadlocks
-single thread can be blocked waiting on disk (fetching data), network (cross-machine resources), event/locks (waiting on other threads)
SHOW DEMO FOR GLIMPSE
SHOW DEMO FOR PROFILER
Show example for PerfMon and MeasureIt
5 minutes (59 minutes)
SDLC
-Requirements - performance goals
-Architecture - performance metrics/SLAs
-Development - profiling/prototyping
-Testing - load testing
-Continuous Integration testing
  -blitz.io
  -http://code.google.com/p/harstorage/ with Selenium testing
  -profiling on Unit Tests - (VS 2010 profiler - http://blogs.msdn.com/b/profiler/archive/2010/01/08/how-to-profilenunit-tests-using-the-visual-studio-2010-profiler.aspx)
-Identify critical business transactions
-frequently used transactions
-performance-intensive transactions
-do not prematurely optimize
-CQRS - split up write services and read services
-Start with 3rd party tools but if necessary, roll out your own
-Sometimes standard solutions don’t fit - don’t be afraid to write custom (http://drjosiah.blogspot.com/2011/09/improving-performance-by-1000x.html)
-When diagnosing, go for bottleneck and retest. That will lead you to your next bottleneck
-Identify critical business transactions
-frequently used transactions
-performance-intensive transactions
SPDY (http://dev.chromium.org/spdy/spdy-whitepaper)
BEFORE
-single request per connection
-exclusively client-initiated requests
-uncompressed request/response headers (gzip doesn’t help)
-redundant headers (User-Agent, Host, Accept* are usually static)
AFTER:
-multiple requests per connection (TCP conn doesn’t close)
-compresses/eliminates headers
-SSL first class
-request prioritization
-server push (X-Associated-Content)
-suggest to the client to ask for specific resources (X-Subresources)
HTTP 2.0 - MS - just a proposal unlike SPDY (http://tools.ietf.org/html/draft-montenegro-httpbisspeed-mobility-01)
ResourceTiming (http://w3c-test.org/webperf/specs/ResourceTiming/)
ZeroMQ
SHOW DEMO:
-Show difference between the first response in waterfall to the last response in waterfall and the difference (RUN YSLOW AGAIN AND SHOW DIFFERENCE)