Performance tests - it's a trap

Slides from Devoxx UA 2018

Published in: Software

  1. 1. Performance tests - it’s a trap Andrzej Ludwikowski
  2. 2. About me ➔ ➔ aludwikowski.blogspot.com ➔ github.com/aludwiko ➔ @aludwikowski
  3. 3. Disclaimers ● Performance tests with Gatling ● no micro-benchmarking
  4. 4. 3
  5. 5. Performance tests? It’s a trap!
  6. 6. Simulate production as close as possible ● hardware ○ CPU, RAM, storage, ... ● software ○ OS, Virtualization, DBs, … ● load ● isolation
  7. 7. Data, data, data ● monitoring
  8. 8. Monitoring ● hosts ● DBs ● message buses ● applications ● everything! Prometheus
  9. 9. Monitoring
  10. 10. Monitoring AWS DC/OS host Docker application JVM
  11. 11. Monitoring
  12. 12. Data, data, data ● monitoring ● logging
  13. 13. Logging ● DBs queries response time ● HTTP endpoints response time ● correlation id ● dynamic configuration ● etc.
  14. 14. Logging - streams
  15. 15. Logging - alerts
  16. 16. Logging
  17. 17. Logging
  18. 18. Data, data, data ● monitoring ● logging ● profiling
  19. 19. Profiling - Java VisualVM
  20. 20. Profiling - Java VisualVM
  21. 21. Profiling - JMC + Flight Recorder
  22. 22. Profiling - JMC + Flight Recorder
  23. 23. Profiling - Honest Profiler
  24. 24. Data, data, data ● monitoring ● logging ● profiling
  25. 25. Your performance intuition is wrong! 1. collect the data (monitoring, logging, profiling) 2. find the bottleneck (based on data) 3. fix the bottleneck 4. collect the data and check the assumptions 5. go to 1.
  26. 26. Math Lies, damned lies, and statistics: ● arithmetic mean = 2.9 ● median = 1 ● standard deviation = 6 (only for normal distribution)
  27. 27. Anscombe’s quartet http://bravenewgeek.com/tag/coordinated-omission/ Property: Value ● Mean of x: 9 ● Sample variance of x: 11 ● Mean of y: 7.50 ● Sample variance of y: 4.125 ● Correlation between x and y: 0.816 ● Linear regression line: y = 3 + 0.5x ● Coefficient of determination of the linear regression: 0.67
  28. 28. Math Lies, damned lies, and statistics: ● arithmetic mean = 2.9 ● median = 1 ● standard deviation = 6 (only for normal distribution) Use: ● percentiles ○ 50th = 1 ○ 70th = 1 ○ 90th = 2.9 ○ 95th = 11.45 ○ 99th = 18.29
  29. 29. Math Lies, damned lies, and statistics: ● arithmetic mean = 2.9 ● median = 1 ● standard deviation = 6 (only for normal distribution) Use: ● percentiles ○ 50th = 1 ○ 70th = 1 ○ 90th = 2.9 ○ 95th = 11.45 ○ 99th = 18.29 Check percentiles implementation!
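To make the point of the "Math" slides concrete, here is a minimal sketch in plain Scala. The data set is hypothetical (it is not the one behind the numbers on the slides), and `LatencyStats`, `mean` and `percentile` are illustrative names. It uses the nearest-rank definition of a percentile, which is only one of several definitions in use, and that is one reason to check the implementation behind any percentile a tool reports.

```scala
object LatencyStats {
  // Arithmetic mean of the samples.
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Nearest-rank percentile: the smallest sample such that at least
  // p percent of all samples are less than or equal to it (0 < p <= 100).
  def percentile(xs: Seq[Double], p: Double): Double = {
    val sorted = xs.sorted
    val rank = math.ceil(p * sorted.size / 100.0).toInt
    sorted(math.max(rank, 1) - 1)
  }

  // Hypothetical response times in seconds: 18 fast requests and 2 slow ones.
  val samples: Seq[Double] = Seq.fill(18)(1.0) ++ Seq(10.0, 20.0)

  def main(args: Array[String]): Unit = {
    println(f"mean = ${mean(samples)}%.2f")
    Seq(50.0, 90.0, 95.0, 99.0).foreach { p =>
      println(f"p$p%.0f = ${percentile(samples, p)}%.2f")
    }
  }
}
```

With this data the mean sits well above the typical request, while the median hides the slow requests entirely; only the high percentiles show them.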
  30. 30. Percentiles val recordedTimes = //normal behavior (1 to 100000).map(_ => random.nextInt(100)) ++ //standard peaks (1 to 100).map(_ => 1000) ++ //very high peaks (1 to 10).map(_ => 10000) //print histogram for the same data set (1 to 3).foreach{_ => printHistogram(recordedTimes) } -----------1 75: 75.000000 95: 95.000000 99: 98.000000 99.9: 1000.000000 max: 1000 -----------2 75: 75.000000 95: 94.000000 99: 98.000000 99.9: 99.000000 max: 99 -----------3 75: 77.000000 95: 95.000000 99: 99.000000 99.9: 1000.000000 max: 10000
  31. 31. Percentiles Reservoir implementation: ● ExponentiallyDecayingReservoir ● UniformReservoir ● SlidingTimeWindowReservoir ● SlidingWindowReservoir
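One plausible explanation for the three different histograms printed from the same data set is sampling: a reservoir keeps only a bounded sample of the recorded values, so rare peaks may or may not survive. The sketch below is not the code of any metrics library. It is a plain-Scala version of uniform reservoir sampling (Vitter's Algorithm R), and the reservoir size of 1028 as well as the names `ReservoirSketch` and `sample` are assumptions made for illustration.

```scala
import scala.util.Random

object ReservoirSketch {
  // Uniform reservoir sampling (Algorithm R): keeps a fixed-size random
  // sample in which every value seen so far has the same chance to remain.
  def sample(values: Iterator[Int], size: Int, rnd: Random): Vector[Int] = {
    val reservoir = new Array[Int](size)
    var seen = 0
    values.foreach { v =>
      if (seen < size) reservoir(seen) = v
      else {
        val slot = rnd.nextInt(seen + 1) // uniform in [0, seen]
        if (slot < size) reservoir(slot) = v
      }
      seen += 1
    }
    reservoir.take(math.min(seen, size)).toVector
  }

  def main(args: Array[String]): Unit = {
    val rnd = new Random()
    // Same shape of data as on the slide: normal traffic plus rare peaks.
    val recorded =
      (1 to 100000).map(_ => rnd.nextInt(100)) ++
        (1 to 100).map(_ => 1000) ++
        (1 to 10).map(_ => 10000)
    // Each run draws a different sample, so the sampled max may differ.
    (1 to 3).foreach { run =>
      val s = sample(recorded.iterator, 1028, rnd)
      println(s"run $run: sampled max = ${s.max}, true max = ${recorded.max}")
    }
  }
}
```

Only 10 of the 100110 values are the very high peaks, so a sample of about a thousand values will often contain none of them, and the reported maximum then changes from run to run.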
  32. 32. Percentiles - sliding window 10, 15, 20, 10, 20, 55, 15, 20, 15, 10 95th ~ 50
  33. 33. Percentiles 95th ~ 40
  34. 34. HDRHistogram ● dynamic range of values ● small memory footprint ● reasonable precision
       Value      Percentile  TotalCount  1/(1-Percentile)
           0  0.000000000000         970              1.00
          63  0.500000000000       64196              2.00
         127  0.750000000000      100000              4.00
         127  0.875000000000      100000              8.00
         127  0.937500000000      100000             16.00
         127  0.968750000000      100000             32.00
         127  0.984375000000      100000             64.00
         127  0.992187500000      100000            128.00
         127  0.996093750000      100000            256.00
         127  0.998046875000      100000            512.00
        1023  0.999023437500      100100           1024.00
        1023  0.999511718750      100100           2048.00
        1023  0.999755859375      100100           4096.00
        1023  0.999877929688      100100           8192.00
       16383  0.999938964844      100110          16384.00
       16383  1.000000000000      100110
      #[Mean = 57, StdDeviation = 129]
      #[Max = 16383, Total count = 100110]
      #[Buckets = 20, SubBuckets = 2]
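The values in the HDRHistogram output (63, 127, 1023, 16383) are not the recorded 99, 1000 and 10000 but the upper edges of the buckets those values fell into. The sketch below is a deliberate simplification, not HdrHistogram's actual algorithm: it assumes pure power-of-two buckets, which happens to reproduce the values in this low-precision output. `BucketSketch` and `bucketUpperBound` are illustrative names.

```scala
object BucketSketch {
  // Upper bound of the power-of-two bucket that holds v:
  // 0 -> 0, 1 -> 1, 2..3 -> 3, 4..7 -> 7, ..., 64..127 -> 127, ...
  def bucketUpperBound(v: Long): Long = {
    require(v >= 0, "latencies cannot be negative")
    if (v == 0) 0L else (java.lang.Long.highestOneBit(v) << 1) - 1
  }

  def main(args: Array[String]): Unit = {
    Seq(0L, 50L, 99L, 1000L, 10000L).foreach { v =>
      println(s"$v is reported as ${bucketUpperBound(v)}")
    }
  }
}
```

Coarse buckets keep the memory footprint small at the cost of precision; a real histogram would normally be configured with more sub-buckets than the output above suggests.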
  35. 35. Which percentile? ● 70th ● 80th ● 90th ● 95th ● 99th ● 99.9th ● 99.99th http://bravenewgeek.com/tag/coordinated-omission/
  36. 36. Coordinated omission problem ● Coordinated omission problem by Gil Tene http://bravenewgeek.com/tag/coordinated-omission/
  37. 37. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ system.exit(0)
  38. 38. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ system.exit(0) samples: 31, 50th: 1, 70th: 1, 90th: 1, 95th: 1, 99th: 21.30, 99.9th: 29.13, 99.99th: 29.91, avg: 1.93
  39. 39. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ system.exit(0) samples: 1001, 50th: 1, 70th: 1, 90th: 1, 95th: 1, 99th: 1, 99.9th: 1, 99.99th: 27.1, avg: 1.02
  40. 40. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ system.exit(0)
  41. 41. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ system.exit(0) samples: 60, 50th: 1, 70th: 12, 90th: 24, 95th: 27, 99th: 29.4, 99.9th: 29.9, 99.99th: 29.99, avg: 8.25
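The 31-sample and 60-sample tables can be reproduced with a small sketch. The interpretation is an assumption, since the slides do not spell it out: a load generator sends one request per second and waits for each response, 30 requests take 1 s, and then one request stalls for 30 s. The 29 requests that should have been sent during the stall are never recorded. The correction adds them back with the time they would have waited. `CoordinatedOmission`, `recorded` and `corrected` are illustrative names, and the percentile uses linear interpolation between ranks.

```scala
object CoordinatedOmission {
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Percentile with linear interpolation between the two closest ranks.
  def percentile(xs: Seq[Double], p: Double): Double = {
    val sorted = xs.sorted
    val rank = p / 100.0 * (sorted.size - 1)
    val lo = math.floor(rank).toInt
    val hi = math.ceil(rank).toInt
    sorted(lo) + (rank - lo) * (sorted(hi) - sorted(lo))
  }

  // What the load generator recorded: 30 requests took 1 s, one took 30 s.
  val recorded: Seq[Double] = Seq.fill(30)(1.0) :+ 30.0

  // Assumed correction: one request was due every second, so the 29 requests
  // never sent during the stall would have waited 29, 28, ..., 1 s.
  val corrected: Seq[Double] = recorded ++ (1 to 29).map(_.toDouble)

  def main(args: Array[String]): Unit = {
    Seq("recorded" -> recorded, "corrected" -> corrected).foreach { case (name, xs) =>
      println(f"$name%-9s samples=${xs.size}%d p50=${percentile(xs, 50)}%.2f " +
        f"p99=${percentile(xs, 99)}%.2f avg=${mean(xs)}%.2f")
    }
  }
}
```

Under these assumptions the recorded samples give a 99th percentile of about 21.3 s and an average just under 2 s, while the corrected samples give about 29.4 s and 8.25 s, in line with the tables.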
  42. 42. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/ samples: 1001, 50th: 1, 70th: 1, 90th: 1, 95th: 1, 99th: 1, 99.9th: 1, 99.99th: 27.1, avg: 1.02; samples: 31 | 60, 50th: 1 | 15.5, 70th: 1 | 30, 90th: 1 | 30, 95th: 1 | 30, 99th: 21.30 | 30, 99.9th: 29.13 | 30, 99.99th: 29.91 | 30, avg: 1.93 | 15.5
  43. 43. Coordinated omission problem
  44. 44. Coordinated omission problem ● total time: 60 s ● max: 30 s ● 99th: 1 s
  45. 45. Coordinated omission problem ● total time: 60 s ● max: 30 s ● 99th: 1 s ● time in % for max: 50%
  46. 46. Coordinated omission problem ● total time: 60 s ● max: 30 s ● 99th: 1 s ● time in % for max: 50% ● expected time in % for 99th: 50% - 1% = 49%
  47. 47. Coordinated omission problem ● total time: 60 s ● max: 30 s ● 99th: 1 s ● time in % for max: 50% ● expected time in % for 99th: 50% - 1% = 49% ● real time for 99th: 60 s * 49% = 29.4 s
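The sanity check on these slides generalizes to any percentile. The sketch below assumes requests are due at a constant rate and treats the response time outside the stall as zero, so it is only a back-of-the-envelope estimate; `StallEstimate` and `waitAtPercentile` are illustrative names.

```scala
object StallEstimate {
  // Approximate waiting time at percentile p when the system is stalled for
  // stallSeconds out of totalSeconds and requests are due at a constant rate.
  def waitAtPercentile(totalSeconds: Double, stallSeconds: Double, p: Double): Double = {
    val stalledShare = stallSeconds / totalSeconds // 30 s / 60 s = 50%
    val aboveShare = 1.0 - p / 100.0               // 1% of requests are slower than p99
    math.max(0.0, totalSeconds * (stalledShare - aboveShare))
  }

  def main(args: Array[String]): Unit = {
    // 60 s * (50% - 1%), the figure derived on the slide
    println(f"p99 estimate = ${waitAtPercentile(60, 30, 99)}%.1f s")
  }
}
```

If a tool reports a 99th percentile of 1 s for a run in which the system was unresponsive for half of the time, this estimate shows how far off that number is.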
  48. 48. Coordinated omission problem http://bravenewgeek.com/tag/coordinated-omission/
  49. 49. wrk2 http://bravenewgeek.com/tag/coordinated-omission/ ● constant throughput load ● accurate latency details ● Lua API ● no Coordinated Omission problem
  50. 50. Performance tests - why so hard? ● Tests…
  51. 51. Performance tests - why so hard? ● Tests…
  52. 52. Tools
  53. 53. Gatling
  54. 54. Why Gatling? + non-blocking, asynchronous stack (scala, akka, netty) + scala !!!111oneoneone (maven, sbt support) + DSL + recorder + math is good + reports + integration & performance tests - not resistant to Coordinated Omission problem
  55. 55. Readable DSL class FirstScenario extends MySimulation { val firstScenario = scenario("First Scenario Name") .exec(http("Go to root page").get("/")) .pause(7 seconds) setUp(firstScenario.inject(atOnceUsers(1)).protocols(httpConf)) }
  56. 56. Modular DSL class ComplexScenario extends MySimulation { val complexScenario = scenario("Complex demo scenario") .exec(http("root page").get("/")).pause(7) .exec(http("search for").get("/computers?f=macbook")).pause(2) .exec(http("position at").get("/computers/6")).pause(3) .exec(http("root page").get("/")).pause(2) .exec(http("go to page").get("/computers?p=1")).pause(670 milliseconds) .exec(http("open new computer form").get("/computers/new")).pause(1) .exec(http("add new computer") .post("/computers") .formParam("name", "MyComputer").formParam("introduced", "2012-05-30").formParam("company", "37")) setUp(complexScenario.inject(atOnceUsers(1)).protocols(httpConf)) }
  57. 57. Modular DSL class ComplexScenarioV2 extends MySimulation { val complexScenario = scenario("Complex demo scenario") .exec(goToRootPage).pause(7) .exec(searchFor("macbook")).pause(2) .exec(positionAt(6)).pause(3) .exec(goToRootPage).pause(2) .exec(goToPage(1)).pause(670 milliseconds) .exec(openNewComputerForm).pause(1) .exec(addNewComputer) setUp(complexScenario.inject(atOnceUsers(1)).protocols(httpConf)) }
  58. 58. Modular DSL class ComplexScenarioV3 extends MySimulation { val search = exec(goToRootPage).pause(7) .exec(searchFor("macbook")).pause(2) .exec(positionAt(6)).pause(3) val addComputer = exec(goToRootPage).pause(2) .exec(goToPage(1)).pause(670 milliseconds) .exec(openNewComputerForm).pause(1) .exec(addNewComputer) val complexScenario = scenario("Complex demo scenario").exec(search, addComputer) setUp(complexScenario.inject(atOnceUsers(1)).protocols(httpConf)) }
  59. 59. DSL - checks scenario("DSL demo") .exec(http("go to page") .get("/computers") .check(status.is(200)) .check(status.in(200 to 210)))
  60. 60. DSL - checks scenario("DSL demo") .exec(http("go to page") .get("/computers") .check(regex("computers") .find(1) .exists)) https://www.pinterest.com/pin/491666484294006138/
  61. 61. DSL - checks scenario("First Scenario Name") .exec(http("Request name") .get("/computers") .check(jsonPath("$..foo.bar[2].baz").ofType[String].notExists) .check(xpath("//input[@id='text1']/@value").is("test")) .check(css("...").transform(_.split('|').toSeq).is(Seq("1", "2"))))
  62. 62. DSL - virtual user session scenario("DSL demo") .exec(http("Authorize") .get("/auth") .check(regex("token").find(1).exists .saveAs("authorizationToken"))) .exec(http("Authorized resource") .get("/authorized_resource?token=${authorizationToken}"))
  63. 63. DSL - looping repeat(5, "i") { exec(goToPage("${i}".toInt)) .pause(1) } ● repeat ● foreach ● during ● asLongAs ● forever https://blog.hubspot.com/blog/tabid/6307/bid/32019/Why-Every-Marketer-Needs-Closed-Loop-Reporting.aspx#sm.0005lrqj811waf3ntmn1cul3881gr
  64. 64. DSL - polling exec( polling .every(10 seconds) .exec(searchFor("thinkpad")) ) http://www.firmus-solutions.com/terms-conditions.html
  65. 65. DSL - conditions doIf(session => session("user").as[String].startsWith("admin")) { exec(goToAdminPage) } ● doIf ● doIfEquals ● doIfOrElse ● doSwitch ● doSwitchOrElse ● randomSwitch https://en.wikipedia.org/wiki/Decision_tree
  66. 66. DSL - error management exec(sendMoney) .tryMax(10){ exec(checkIfMoneyReceived) } ● tryMax ● exitBlockOnFail ● exitHereIfFailed Alice Bob Kafka
  67. 67. DSL - setup setUp(myScenario .inject( nothingFor(4 seconds), atOnceUsers(10), rampUsers(10) over (5 seconds)) .protocols(httpConf)) .maxDuration(10 minutes) ● constantUsersPerSec ● rampUsersPerSec ● splitUsers ● heavisideUsers
  68. 68. DSL - setup setUp(myScenario .inject( nothingFor(4 seconds), atOnceUsers(10), rampUsers(10) over (5 seconds)) .protocols(httpConf)) .maxDuration(10 minutes) ● constantUsersPerSec ● rampUsersPerSec ● splitUsers ● heavisideUsers
  69. 69. DSL - setup setUp(myScenario .inject(atOnceUsers(10)) .protocols(httpConf)) .assertions( global.responseTime.max.lt(50), global.failedRequests.percent.is(0) )
  70. 70. DSL - setup setUp(myScenario .inject(atOnceUsers(10)) .protocols(httpConf)) .throttle( reachRps(100) in (30 second), holdFor(1 minute), jumpToRps(50), holdFor(2 hours) )
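To read the throttle profile on this slide, it can help to write it out as a function of time. The sketch below is not Gatling code. It is a plain-Scala model that assumes the ramp starts at 0 requests per second and is linear, and that the run is over once the last step ends. In Gatling the throttle acts as an upper limit on throughput, so the scenario still has to inject enough users to reach it. `ThrottleSketch` and `rpsLimit` are illustrative names.

```scala
object ThrottleSketch {
  // Requests-per-second ceiling t seconds into the run for the profile
  // reachRps(100) in 30 s, holdFor 1 min, jumpToRps(50), holdFor 2 h.
  def rpsLimit(t: Double): Double = {
    val rampEnd = 30.0
    val firstHoldEnd = rampEnd + 60.0
    val secondHoldEnd = firstHoldEnd + 2 * 60 * 60
    if (t < 0) 0.0
    else if (t < rampEnd) 100.0 * t / rampEnd // assumed linear ramp from 0
    else if (t < firstHoldEnd) 100.0
    else if (t < secondHoldEnd) 50.0
    else 0.0 // profile finished (assumption: the run ends here)
  }

  def main(args: Array[String]): Unit = {
    Seq(0.0, 15.0, 30.0, 89.0, 90.0, 3600.0).foreach { t =>
      println(f"t = $t%.0f s -> at most ${rpsLimit(t)}%.0f rps")
    }
  }
}
```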
  71. 71. DSL - feeders val companies = List("apple", "lenovo", "hp") val randomCompanyName = Iterator.continually( Map("company" -> companies(Random.nextInt(companies.size)))) val searching = scenario("Searching") .feed(randomCompanyName) .exec(searchFor("${company}")) ● RecordSeqFeederBuilder ● CSV ● JSON ● JDBC ● Sitemap ● Redis ● … http://favim.com/orig/201104/23/Favim.com-22725.jpg
  72. 72. DSL - feeders val companies = List("apple", "lenovo", "hp") val randomCompanyName = Iterator.continually( Map("company" -> companies(Random.nextInt(companies.size)))) val searching = scenario("Searching") .feed(randomCompanyName) .exec(searchFor("${company}")) ● RecordSeqFeederBuilder ● CSV ● JSON ● JDBC ● Sitemap ● Redis ● … http://favim.com/orig/201104/23/Favim.com-22725.jpg
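The custom feeder on this slide is just an endless iterator of records, which can be tried out without Gatling. The sketch below repeats the slide's idea in plain Scala; the explicit `Random` parameter and the name `FeederSketch` are additions for illustration.

```scala
import scala.util.Random

object FeederSketch {
  val companies = List("apple", "lenovo", "hp")

  // An endless stream of records; each record binds "company" to a random name.
  def randomCompanyName(rnd: Random): Iterator[Map[String, String]] =
    Iterator.continually(Map("company" -> companies(rnd.nextInt(companies.size))))

  def main(args: Array[String]): Unit = {
    // Each virtual user would consume one record; here we just print five.
    randomCompanyName(new Random()).take(5).foreach(println)
  }
}
```

In a scenario, `.feed(...)` pulls the next record for each virtual user and stores its entries in the session, which is what makes `${company}` resolvable in later requests.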
  73. 73. Reporting
      ================================================================================
      ---- Global Information --------------------------------------------------------
      > request count                                         10 (OK=10     KO=0     )
      > min response time                                     40 (OK=40     KO=-     )
      > max response time                                    177 (OK=177    KO=-     )
      > mean response time                                    55 (OK=55     KO=-     )
      > std deviation                                         41 (OK=41     KO=-     )
      > response time 50th percentile                         42 (OK=42     KO=-     )
      > response time 75th percentile                         43 (OK=43     KO=-     )
      > response time 95th percentile                        117 (OK=117    KO=-     )
      > response time 99th percentile                        165 (OK=165    KO=-     )
      > mean requests/sec                                  0.909 (OK=0.909  KO=-     )
      ---- Response Time Distribution ------------------------------------------------
      > t < 800 ms                                            10 (100%)
      > 800 ms < t < 1200 ms                                   0 (  0%)
      > t > 1200 ms                                            0 (  0%)
      > failed                                                 0 (  0%)
      ================================================================================
  74. 74. Reporting
  75. 75. Reporting
  76. 76. Reporting
  77. 77. DSL - other goodies ● Custom validators ● HTTP ○ SSL ○ SSE (Server-Sent Events) ○ basic cookies support ● WebSocket ● JMS ● Pluggable architecture: ○ cassandra plugin ○ kafka plugin ○ rabbitMQ ○ AMQP
  78. 78. Distributed tests
  79. 79. Distributed tests IT system
  80. 80. Distributed tests ● manually ● Gatling FrontLine ● Flood.io IT system
  81. 81. Flood.io
  82. 82. Gatling FrontLine
  83. 83. Are performance tests useless? It depends
  84. 84. Are performance tests useless? It depends on context!
  85. 85. About me ➔ ➔ aludwikowski.blogspot.com ➔ github.com/aludwiko ➔ @aludwikowski
