Anúncio

Java Performance Tuning

Software Developer em Thomson Reuters Thailand
14 de Oct de 2008
Anúncio

Mais conteúdo relacionado

Anúncio

Java Performance Tuning

  1. Java Performance Tuning Atthakorn Chanthong
  2. What is software tuning?
  3. User Experience The software has poor response time. I need it runs more faster
  4. Software tuning is to make application runs faster
  5. Many people think Java application is slow, why?
  6. There are two major reasons
  7. The first is the Bottleneck
  8. The Bottleneck Increased Memory Use Lots of Casts Automatic memory management by Garbage Collector All Object are allocated on the Heap. Java application is not native
  9. The second is The Bad Coding Practice
  10. How to make it run faster?
  11. The bottleneck is unavoidable
  12. But the man could have a good coding practice
  13. A good design
  14. A good coding practice
  15. Java application normally run fast enough
  16. So the tuning game comes into play
  17. Knowing the strategy
  18. Tuning Strategy 1 Identify the main causes 2 Choose the quickest and easier one to fix 3 Fix it, and repeat again for other root cause
  19. Inside the strategy
  20. Tuning Strategy Need more faster, repeat again Profile, Measure Problem Priority Yes, it’s better Identify the location of bottleneck Test and compare Still bad? Before/after alteration The result Think a hypothesis isn’t good enough Code alteration Create a test scenario
  21. How to measure the software performance?
  22. We use the profiler
  23. Profiler is a programming tool that can track the performance of another computer program
  24. The two common usages of profiler are to analyze a software problem Profiler Profile application Monitor application performance memory usage
  25. How we get the profiler?
  26. Don’t pay for it!
  27. An opensource profiler is all around
  28. Some interesting opensource profilers
  29. Opensource Profiler JConsole http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html
  30. Opensource Profiler Eclipse TPTP http://www.eclipse.org/tptp/index.php
  31. Opensource Profiler NetBeans Built-in Profiler http://profiler.netbeans.org/
  32. Opensource Profiler This is pulled out from NetBeans to act as standalone profiler VisualVM https://visualvm.dev.java.net/
  33. And much more
  34. Opensource Profiler JRat Cougaar DrMem InfraRED Profiler4j Jmeasurement TIJMP
  35. We love opensource
  36. Make the brain smart with good code practice
  37. 1st Rule Avoid Object-Creation
  38. Object-Creation causes problem Why?
  39. Avoid Object-Creation Lots of objects in memory means GC does lots of work Program is slow down when GC starts
  40. Avoid Object-Creation Creating object costs time and CPU effort for application
  41. Reuse objects where possible
  42. Pool Management Most container (e.g. Vector) objects could be reused rather than created and thrown away
  43. Pool Management VectorPoolManager V1 V3 V4 V5 getVector() returnVector() V2
  44. Pool Management public void doSome() { for (int i=0; i < 10; i++) { Vector v = new Vector() … do vector manipulation stuff } } public static VectorPoolManager vpl = new VectorPoolManager(25) public void doSome() { for (int i=0; i < 10; i++) { Vector v = vectorPoolManager.getVector( ); … do vector manipulation stuff vectorPoolManager.returnVector(v); } }
  45. Canonicalizing Objects Replace multiple object by a single object or just a few
  46. Canonicalizing Objects public class VectorPoolManager { private static final VectorPoolManager poolManager; private Vector[] pool; private VectorPoolManager(int size) { .... } public static Vector getVector() { if (poolManager== null) poolManager = new VectorPoolManager(20); ... return pool[pool.length-1]; } } Singleton Pattern
  47. Canonicalizing Objects Boolean b1 = new Boolean(true); Boolean b2 = new Boolean(false); Boolean b3 = new Boolean(false); Boolean b4 = new Boolean(false); 4 objects in memory Boolean b1 = Boolean.TRUE Boolean b2 = Boolean.FALSE Boolean b3 = Boolean.FALSE Boolean b4 = Boolean.FALSE 2 objects in memory
  48. Canonicalizing Objects String string = quot;55quot;; Integer theInt = new Integer(string); No Cache String string = quot;55quot;; Integer theInt = Integer.valueOf(string); Object Cached
  49. Canonicalizing Objects private static class IntegerCache { private IntegerCache(){} static final Integer cache[] = new Integer[-(-128) + 127 + 1]; static { for(int i = 0; i < cache.length; i++) cache[i] = new Integer(i - 128); } } public static Integer valueOf(int i) { final int offset = 128; if (i >= -128 && i <= 127) { // must cache return IntegerCache.cache[i + offset]; } return new Integer(i); } Caching inside Integer.valueOf(…)
  50. Keyword, ‘final’ Use the final modifier on variable to create immutable internally accessible object
  51. Keyword, ‘final’ public void doSome(Dimension width, Dimenstion height) { //Re-assign allow width = new Dimension(5,5); ... } public void doSome(final Dimension width, final Dimenstion height) { //Re-assign disallow width = new Dimension(5,5); ... }
  52. Auto-Boxing/Unboxing Use Auto-Boxing as need not as always
  53. Auto-Boxing/UnBoxing Integer i = 0; //Counting by 10M while (i < 100000000) { i++; } Takes 2313 ms Why it takes 2313/125 =~ 20 times longer? int p = 0; //Counting by 10M while (p < 100000000) { p++; } Takes 125 ms
  54. Auto-Boxing/UnBoxing Object-Creation made every time we wrap primitive by boxing
  55. 2nd Rule Knowing String Better
  56. String is the Object mostly used in the application
  57. Overlook the String The software may have the poor performance
  58. Compile-Time String Initialization Use the string concatenation (+) operator to create Strings at compile-time.
  59. Compile-Time Initialization for (int i =0; i < loop; i++) { //Looping 10M rounds String x = quot;Helloquot; + quot;,quot; +quot; quot;+ quot;Worldquot;; } Takes 16 ms for (int i =0; i < loop; i++) { //Looping 10M rounds String x = new String(quot;Helloquot; + quot;,quot; +quot; quot;+ quot;Worldquot;); } Takes 672 ms
  60. Runtime String Initialization Use StringBuffers/StringBuilder to create Strings at runtime.
  61. Runtime String Initialization String name = quot;Smithquot;; for (int i =0; i < loop; i++) { //Looping 1M rounds String x = quot;Helloquot;; x += quot;,quot;; x += quot; Mr.quot;; x += name; } Takes 10298 ms String name = quot;Smithquot;; for (int i =0; i < loop; i++) { //Looping 1M rounds String x = (new StringBuffer()).append(quot;Helloquot;) .append(quot;,quot;).append(quot; quot;) .append(name).toString(); } Takes 6187 ms
  62. String comparison Use appropriate method to compare the String
  63. To Test String is Empty for (int i =0; i < loop; i++) { //10m loops if (a != null && a.equals(quot;quot;)) { } }. Takes 125 ms for (int i =0; i < loop; i++) { //10m loops if (a != null && a.length() == 0) { } } Takes 31 ms
  64. If two strings have the same length String a = “abc” String b = “cdf” for (int i =0; i < loop; i++) { if (a.equalsIgnoreCase(b)) { } } Takes 750 ms String a = “abc” String b = “cdf” for (int i =0; i < loop; i++) { if (a.equals(b)) { } Takes 125 ms }
  65. If two strings have different length String a = “abc” String b = “cdfg” for (int i =0; i < loop; i++) { if (a.equalsIgnoreCase(b)) { } } Takes 780 ms String a = “abc” String b = “cdfg” for (int i =0; i < loop; i++) { if (a.equals(b)) { } Takes 858 ms }
  66. String.equalsIgnoreCase() does only 2 steps It checks for identity and then for Strings being the same size
  67. Intern String To compare String by identity
  68. Intern String Normally, string can be created by two ways
  69. Intern String By new String(…) String s = new String(“This is a string literal.”); By String Literals String s = “This is a string literal.”;
  70. Intern String Create Strings by new String(…) JVM always allocate a new memory address for each new String created even if they are the same.
  71. Intern String String a = new String(“This is a string literal.”); String b = new String(“This is a string literal.”); a “This is a string literal.” The different memory address b “This is a string literal.”
  72. Intern String Create Strings by Literals Strings will be stored in Pool Double create Strings by laterals They will share as a unique instances
  73. Intern String String a = “This is a string literal.”; String b = “This is a string literal.”; a Same memory address “This is a string literal.” b
  74. Intern String We can point two Stings variable to the same address if they are the same values. By using String.intern() method
  75. Intern String String a = new String(“This is a string literal.”).intern(); String b = new String(“This is a string literal.”).intern(); a Same memory address “This is a string literal.” b
  76. Intern String The idea is … Intern String could be used to compare String by identity
  77. Intern String What “compare by identity” means?
  78. Intern String If (a == b) Identity comparison (by reference) If (a.equals(b)) Value comparison
  79. Intern String By using reference so identity comparison is fast
  80. Intern String In traditionally style String must be compare by equals() to avoid the negative result
  81. Intern String But Intern String… If Strings have different value they also have different address. If Strings have same value they also have the same address.
  82. Intern String So we can say that (a == b) is equivalent to (a.equals(b))
  83. Intern String For these string variables String a = quot;abcquot;; String b = quot;abcquot;; String c = new String(quot;abcquot;).intern() They are pointed to the same address with the same value
  84. Intern String for (int i =0; i < loop; i++) { if (a.equals(b)) { } } Takes 312 ms for (int i =0; i < loop; i++) { if (a == b) { } Takes 32 ms }
  85. Intern String Wow, Intern String is good Unfortunately, it makes code hard understand, use it carefully
  86. Intern String String.intern() comes with overhead as there is a step to cache Use Intern String if they are planed to compare two or more times
  87. char array instead of String Avoid doing some stuffs by String object itself for optimal performance
  88. char array String x = quot;abcdefghijklmnquot;; for (int i =0; i < loop; i++) { if (x.charAt(5) == 'x') { } } Takes 281 ms String x = quot;abcdefghijklmnquot;; char y[] = x.toCharArray(); for (int i =0; i < loop; i++) { if ( (20 < y.length && 20 >= 0) && y[20] == 'x') { } } Takes 156 ms
  89. 3rd Rule Exception and Cast
  90. Stop exception to be thrown if it is possible Exception is really expensively to execute
  91. Avoid Exception Object obj = null; for (int i =0; i < loop; i++) { try { obj.hashCode(); } catch (Exception e) {} } Takes 18563 ms Object obj = null; for (int i =0; i < loop; i++) { if (obj != null) { obj.hashCode(); } Takes 16 ms }
  92. Cast as Less We can reduce runtime cost by grouping cast object which is several used
  93. Cast as Less Integer io = new Integer(0); Object obj = (Object)io; for (int i =0; i < loop; i++) { if (obj instanceof Integer) { byte x = ((Integer) obj).byteValue(); double d = ((Integer) obj).doubleValue(); float f = ((Integer) obj).floatValue(); } } Takes 31 ms for (int i =0; i < loop; i++) { if (obj instanceof Integer) { Integer icast = (Integer)obj; byte x = icast.byteValue(); double d = icast.doubleValue(); float f = icast.floatValue(); } } Takes 16 ms
  94. 4th Rule The Rhythm of Motion
  95. Loop Optimization There are several ways to make a faster loop
  96. Don’t terminate loop with method calls
  97. Eliminate Method Call byte x[] = new byte[loop]; for (int i = 0; i < x.length; i++) { for (int j = 0; j < x.length; j++) { } } Takes 109 ms byte x[] = new byte[loop]; int length = x.length; for (int i = 0; i < length; i++) { for (int j = 0; j < length; j++) { } Takes 62 ms }
  98. Method Call generates some overhead in Object Oriented Paradigm
  99. Use int to iterate over loop
  100. Iterate over loop by int for (int i = 0; i < length; i++) { for (int j = 0; j < length; j++) { } } Takes 62 ms for (short i = 0; i < length; i++) { for (short j = 0; j < length; j++) { } } Takes 125 ms
  101. VM is optimized to use int for loop iteration not by byte, short, char
  102. Use System.arraycopy(…) for copying object instead of running over loop
  103. System.arraycopy(….) for (int i = 0; i < length; i++) { x[i] = y[i]; } Takes 62 ms System.arraycopy(x, 0, y, 0, x.length); Takes 16 ms
  104. System.arraycopy() is native function It is efficiently to use
  105. Terminate loop by primitive use not by function or variable
  106. Terminate Loop by Primitive for(int i = 0; i < countArr.length; i++) { for(int j = 0; j < countArr.length; j++) { } } Takes 424 ms for(int i = countArr.length-1; i >= 0; i--) { for(int j = countArr.length-1; j >= 0; j--) { } } Takes 298 ms
  107. Primitive comparison is more efficient than function or variable comparison
  108. The average time of switch vs. if-else is about equally in random case
  109. Switch vs. If-else for(int i = 0; i < loop; i++) for(int i = 0; i < loop; i++) { { if (i%10== 0) switch (i%10) { { case 0: break; } else if (i%10 == 1) case 1: break; { ... ... case 7: break; } else if (i%10 == 8) case 8: break; { default: break; } else if (i%10 == 9) } { } } } Takes 2623 ms Takes 2608 ms
  110. Switch is quite fast if the case falls into the middle but slower than if-else in case of falling at the beginning or default case ** Test against a contiguous range of case values eg, 1,2,3,4,..
  111. Recursive Algorithm Recursive function is easy to read but it costs for each recursion
  112. Tail Recursion A recursive function for which each recursive call to itself is a reduction of the original call.
  113. Recursive vs. Tail-Recursive public static long factorial1(int n) { if (n < 2) return 1L; else return n*factorial1(n-1); } Takes 172 ms public static long factorial1a(int n) { if (n < 2) return 1L; else return factorial1b(n, 1L); } public static long factorial1b(int n, long result) { if (n == 2) return 2L*result; else return factorial1b(n-1, result*n); Takes 125 ms }
  114. Dynamic Cached Recursive Do cache to gain more speed
  115. Dynamic-Cached Recursive public static long factorial1(int n) { if (n < 2) return 1L; else return n*factorial1(n-1); } Takes 172 ms public static final int CACHE_SIZE = 15; public static final long[ ] factorial3Cache = new long[CACHE_SIZE]; public static long factorial3(int n) { if (n < 2) return 1L; else if (n < CACHE_SIZE) { if (factorial3Cache[n] == 0) factorial3Cache[n] = n*factorial3(n-1); return factorial3Cache[n]; } else return n*factorial3(n-1); Takes 94 ms }
  116. Recursion Summary Dynamic-Cached Tail Recursive is better than Tail Recursive is better than Recursive
  117. 5th Rule Use Appropriate Collection
  118. Accession ArrayList vs. LinkedList
  119. Random Access ArrayList al = new ArrayList(); for (int i =0; i < loop; i++) { al.get(i); } Takes 281 ms LinkedList ll = new LinkedList(); for (int i =0; i < loop; i++) { ll.get(i); } Takes 5828 ms
  120. Sequential Access ArrayList al = new ArrayList(); for (Iterator i = al.iterator(); i.hasNext();) { i.next(); } Takes 1375 ms LinkedList ll = new LinkedList(); for (Iterator i = ll.iterator(); i.hasNext();) { i.next(); } Takes 1047 ms
  121. ArrayList is good for random access LinkedList is good for sequential access
  122. Random vs. Sequential Access ArrayList al = new ArrayList(); for (int i =0; i < loop; i++) { al.get(i); } Takes 281 ms LinkedList ll = new LinkedList(); for (Iterator i = ll.iterator(); i.hasNext();) { i.next(); } Takes 1047 ms
  123. Random Access is better than Sequential Access
  124. Insertion ArrayList vs. LinkedList
  125. Insertion at zero index ArrayList al = new ArrayList(); for (int i =0; i < loop; i++) { al.add(0, Integer.valueOf(i)); } Takes 328 ms LinkedList ll = new LinkedList(); for (int i =0; i < loop; i++) { ll.add(0, Integer.valueOf(i)); } Takes 109 ms
  126. LinkedList does insertion better than ArrayList
  127. Vector is likely to ArrayList but it is synchronized version
  128. Accession and Insertion Vector vs. ArrayList
  129. Random Accession ArrayList al = new ArrayList(); for (int i =0; i < loop; i++) { al.get(i); } Takes 281 ms Vector vt = new Vector(); for (int i =0; i < loop; i++) { vt.get(i); } Takes 422 ms
  130. Sequential Accession ArrayList al = new ArrayList(); for (Iterator i = al.iterator(); i.hasNext();) { i.next(); } Takes 1375 ms Vector vt = new Vector(); for (Iterator i = vt.iterator(); i.hasNext();) { i.next(); } Takes 1890 ms
  131. Insertion ArrayList al = new ArrayList(); for (int i =1; i < loop; i++) { al.add(0, Integer.valueOf(i)); } Takes 328 ms Vector vt = new Vector(); for (int i =0; i < loop; i++) { vt.add(0, Integer.valueOf(i)); } Takes 360 ms
  132. Vector is slower than ArrayList in every method Use Vector if only synchronize needed
  133. Summary Type Random Sequential Insertion (get) (Iterator) ArrayList 281 1375 328 LinkedList 5828 1047 109 Vector 422 1890 360
  134. Addition and Accession Hashtable vs HashMap
  135. Addition Hashtable ht = new Hashtable(); for (int i =0; i < loop; i++) { ht.put(Integer.valueOf(i), Integer.valueOf(i)); } Takes 453 ms HashMap hm = new HashMap(); for (int i =0; i < loop; i++) { hm.put(Integer.valueOf(i), Integer.valueOf(i)); } Takes 328 ms
  136. Accession Hashtable ht = new Hashtable(); for (int i =0; i < loop; i++) { ht.get(Integer.valueOf(i)); } Takes 94 ms HashMap hm = new HashMap(); for (int i =0; i < loop; i++) { hm.get(Integer.valueOf(i)); } Takes 47 ms
  137. Hashtable is synchronized so it is slower than HashMap
  138. Q&A
  139. Reference O'Reilly Java Performance Tuning 2nd http://www.javaperformancetuning.com http://www.glenmccl.com/jperf/
  140. Future Topic I/O Logging, and Console Output Sorting Threading Tweak JVM and GC Strategy
  141. The End
Anúncio