2. Thread Problems
● App Server freezes and brings down our production environment.
● App Server unresponsive.
● App Server crashing.
● App Server hangs.
● App Server CPU usage 100%. High load. High CPU. CPU 100% utilization
● GC thrashing.
● Slow performance.
● Serious performance issues with API calls
● Hung threads.
● App Server goes to hung state.
● Server stuck.
● Slow response from application.
● Application deployed on App Server becomes slow; after restarting the service it runs normally, but after a while it slows down again.
3. Understand your environment and available tools
● Physical & virtual host configuration and capacity (total # of assigned
CPU cores, RAM etc.)
● OS vendor, version and patch level
● Middleware vendor, versions and patch level
● Java vendor & versions (including 32-bit vs. 64-bit); including patch
level
● Third party API’s used within the Java or Java EE applications
● Existing monitoring tools that you can leverage for historical data and
trend analysis
● History of the environment, known issues, resources utilization etc.
● Business traffic breakdown per application along with average & peak
traffic level of the platform; including peak business periods
4. Process & Thread
A process is generally the largest independent unit of execution recognised by the OS.
Each process has its own memory space.
A thread is a subdivision of a process that shares the memory space of its parent process.
Each thread has its own private stack and registers, including a program counter.
[Diagram: two processes, each with its own memory space (shown at 0x8000) mapped onto physical memory. Each process contains Thread 1 and Thread 2, and each thread has its own stack, registers and PC. The OS thread scheduler assigns the threads to the processors.]
5. Java Thread
Thread states are defined in java.lang.Thread.State:
● NEW: The thread has been created but has not been started yet.
● RUNNABLE: The thread is executing in the JVM. (It may still be waiting for OS resources, such as a processor.)
● BLOCKED: The thread is waiting for another thread to release a monitor lock so that it can acquire it.
● WAITING: The thread is waiting indefinitely after calling a wait, join or park method.
● TIMED_WAITING: The thread is waiting after calling a sleep, wait, join or park method with a timeout. (The difference from WAITING is that a maximum waiting time is specified by the method parameter, so TIMED_WAITING can end when the time elapses as well as through external changes.)
● TERMINATED: The thread has finished execution.
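The states above can be observed from code with Thread.getState(). The following is a minimal sketch, not taken from the original slides; the class name ThreadStateDemo and the worker thread name are made up for illustration. It shows a worker thread in NEW before start, in BLOCKED while another thread holds the monitor it needs, and in TERMINATED after it finishes.

```java
class ThreadStateDemo {
    /** Returns the worker's state before start, while blocked on a monitor, and after it ends. */
    static Thread.State[] observe() {
        final Object lock = new Object();
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; the worker only needs to acquire the monitor once.
            }
        }, "state-demo-worker");

        Thread.State beforeStart = worker.getState(); // NEW: created, not started
        Thread.State whileBlocked;
        try {
            synchronized (lock) {
                worker.start();
                // We hold the monitor, so the worker must end up BLOCKED on it.
                while (worker.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1);
                }
                whileBlocked = worker.getState();
            }
            worker.join(); // the monitor is free now, so the worker can finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Thread.State afterJoin = worker.getState(); // TERMINATED
        return new Thread.State[] {beforeStart, whileBlocked, afterJoin};
    }

    public static void main(String[] args) {
        for (Thread.State state : observe()) {
            System.out.println(state); // prints NEW, BLOCKED, TERMINATED
        }
    }
}
```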
7. Thread in Application Server
In a typical Thread Dump snapshot generated from a Java EE container JVM:
● Some Threads could be performing raw computing tasks such as XML parsing, IO / disk access etc.
● Some Threads could be waiting for some blocking IO calls such as a remote Web Service call, a DB /
JDBC query etc.
● Some Threads could be involved in garbage collection at that time e.g. GC Threads
● Some Threads will be waiting for some work to do (Threads not doing any work typically go in wait()
state)
● Some Threads could be waiting for some other Threads work to complete e.g. Threads waiting to
acquire a monitor lock (synchronized block{}) on some objects
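A quick way to get a feel for this mix inside a running JVM is to count its Java threads by state. The sketch below is an illustration only (the class name ThreadStateSummary is hypothetical). It uses Thread.getAllStackTraces(), which covers live Java threads but not the JVM's internal native threads such as the GC threads.

```java
import java.util.EnumMap;
import java.util.Map;

class ThreadStateSummary {
    /** Counts the live Java threads of the current JVM per Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, count) -> System.out.println(state + ": " + count));
    }
}
```

In an application server, a large BLOCKED count relative to the pool size is usually the first hint of lock contention, while a large WAITING count often just means idle pool threads.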
8. How to get the list of Threads of a Process?
Linux top -b -n 1 -H -p <PID>
Solaris prstat -L -p <PID> 1 1
AIX ps -mp <PID> -o THREAD
9. Sample output of top command
top -b -n 1 -H -p <PID>
Useful for viewing threads that
utilize excessive CPU
10. Thread Dump Format
● Thread dump's format isn't a part of the SDK specification
● Each SDK vendor provides a unique thread dump format and its own JVM
information
12. Sample Java Thread Dump - Format Explanation
- Thread name; often used by middleware vendors to identify the Thread Id along with its associated Thread Pool name and state (running, stuck etc.)
- Thread type & priority, ex: daemon prio=3 ** middleware software typically creates its Threads as daemon Threads, meaning they run in the background and provide services to their user, e.g. your Java EE application **
- Java Thread ID, ex: tid=0x000000011e52a800 ** In HotSpot dumps this hexadecimal value is the address of the JVM's internal thread structure; it is not the auto-incrementing long 1..n returned by java.lang.Thread.getId() **
- Native Thread ID, ex: nid=0x251c ** Crucial information, as this native Thread Id allows you to correlate, for example, which Threads from an OS perspective are using the most CPU within your JVM **
- Java Thread State and detail, ex: waiting for monitor entry [0xfffffffea5afb000] java.lang.Thread.State: BLOCKED (on object monitor) ** Allows you to quickly learn about the Thread state and its potential current blocking condition **
- Java Thread Stack Trace; this is by far the most important data that you will find in the Thread Dump. This is also where you will spend most of your analysis time, since the Java Stack Trace provides you with 90% of the information that you need in order to pinpoint the root cause of many problem pattern types, as you will learn later in the training sessions
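To correlate a busy thread reported by top -H with a thread in the dump, the decimal thread id shown by the OS has to be matched against the nid value. The sketch below assumes the dump prints nid in hexadecimal, as in the samples in this deck; the class and method names are made up for illustration.

```java
class NativeThreadId {
    /** Formats a decimal OS thread id (e.g. the PID column of top -H) as an nid value. */
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    /** Converts an nid value such as "0x251c" back to the decimal OS thread id. */
    static long fromNid(String nid) {
        String hex = nid.startsWith("0x") ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(toNid(9500));       // prints 0x251c
        System.out.println(fromNid("0x251c")); // prints 9500
    }
}
```

With the hexadecimal value in hand, search the thread dump for nid=0x251c to find the stack trace of the thread that the OS reports as consuming CPU.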
13. Java Heap breakdown; starting with HotSpot VM 1.6, you will also find at the bottom of the Thread Dump snapshot a breakdown of the
HotSpot memory spaces utilization such as your Java Heap (YoungGen, OldGen) & PermGen space. This is quite useful when excessive
GC is suspected as a possible root cause so you can do out-of-the-box correlation with Thread data / patterns found
17. IBM JVM Thread Dump Format
IBM Thread Dumps/Javacores provide much
more information
18. IBM JVM Thread Dump Events
Thread Dump is generated by ...
● kill -3 <PID>
● OutOfMemory Error (automatic generation)
WebSphere Application Server:
● wsadmin> $AdminControl invoke $jvm
dumpThreads
● Admin console:
○ Navigate to Troubleshooting > Java dumps and
cores
○ Select the server(s) to collect dumps from
○ Click on Java Core, System Dump, or Heap
Dump to produce the specified file.
19. IBM JVM Thread Dump Format
HW and OS environment detail
JRE detail and Java start-up arguments
20. IBM JVM Thread Dump Format
User and environment variables
21. IBM JVM Thread Dump Format
Java Heap detail and GC history
22. IBM JVM Thread Dump Format
Java and JVM object monitor lock and deadlock detail
23. IBM JVM Thread Dump Format
Java EE middleware, third party &
custom application Threads
25. Thread Problem: High CPU
High CPU is defined as one or more Java VM processes consuming excessive CPU on your physical host(s).
Excessive CPU can also be described as abnormally high CPU utilization compared with a known & established baseline.
Ex: if the average CPU utilization of your Java VM under peak load is 40%, the excessive CPU threshold can be set around 80%.
26. Thread Problem: High CPU
Common high CPU Thread scenarios
● Heavy or excessive garbage collection (Threads identified are the actual GC Threads of the HotSpot
VM)
● Heavy or infinite looping (application or middleware code problem, corrupted & looping non Thread
safe HashMap etc.)
● Excessive IO / disk activity (Excessive Class loading or JAR file searches)
27. Threads Breakdown of a JVM Process
You will need to understand and perform a full breakdown of all Threads of your Java VM so you can pinpoint the biggest contributors.
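Besides OS tools, a rough breakdown can be taken from inside the JVM with the standard java.lang.management API. The following is a sketch under stated assumptions: the class name ThreadCpuBreakdown is hypothetical, the JVM must support per-thread CPU time, and the figures are cumulative CPU time since each thread started rather than current utilization. It only sees Java threads, so GC and other native JVM threads still have to be checked with the OS commands.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ThreadCpuBreakdown {
    /** Returns up to 'limit' report lines, ordered by cumulative CPU time, highest first. */
    static List<String> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> rows = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows; // per-thread CPU time is not available on this JVM
        }
        List<long[]> samples = new ArrayList<>(); // each entry: {threadId, cpuNanos}
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (cpuNanos >= 0) {
                samples.add(new long[] {id, cpuNanos});
            }
        }
        samples.sort(Comparator.comparingLong((long[] s) -> s[1]).reversed());
        for (long[] sample : samples.subList(0, Math.min(limit, samples.size()))) {
            ThreadInfo info = mx.getThreadInfo(sample[0]);
            String name = (info != null) ? info.getThreadName() : "(terminated)";
            rows.add(String.format("%-30s id=%d cpu=%d ms", name, sample[0], sample[1] / 1_000_000));
        }
        return rows;
    }

    public static void main(String[] args) {
        topThreads(5).forEach(System.out::println);
    }
}
```

Taking two such samples a few seconds apart and comparing them gives a better picture of which threads are burning CPU right now.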
29. Thread Problem: Contention & Deadlock
Thread contention is a state in which one thread is waiting for a lock held by another thread to be released.
● Different threads frequently access shared resources in a web application.
● For example, to write a log entry, the thread trying to write it must obtain a lock before accessing the shared resource.
Deadlock is a special type of thread contention in which two or more threads each wait for the others to release their locks, so none of them can complete its own task.
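A JVM can also be asked directly whether any of its threads are deadlocked. The sketch below uses ThreadMXBean.findDeadlockedThreads() from the standard java.lang.management API; the class name DeadlockCheck is hypothetical, and the call may throw UnsupportedOperationException on a JVM that does not support monitoring of ownable synchronizers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class DeadlockCheck {
    /** Returns one description per deadlocked thread, or an empty list if there is no deadlock. */
    static List<String> deadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> result = new ArrayList<>();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        if (ids == null) {
            return result;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            result.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> found = deadlockedThreads();
        System.out.println(found.isEmpty() ? "No deadlock detected" : String.join("\n", found));
    }
}
```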
31. "BLOCKED_TEST pool-1-thread-1" prio=6 tid=0x0000000006904800 nid=0x28f4 runnable [0x000000000785f000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:282)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
- locked <0x0000000780a31778> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:432)
- locked <0x0000000780a04118> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:85)
- locked <0x0000000780a040c0> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:168)
at java.io.PrintStream.newLine(PrintStream.java:496)
- locked <0x0000000780a04118> (a java.io.PrintStream)
at java.io.PrintStream.println(PrintStream.java:687)
- locked <0x0000000780a04118> (a java.io.PrintStream)
at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:44)
- locked <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
at com.nbp.theplatform.threaddump.ThreadBlockedState$1.run(ThreadBlockedState.java:7)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Locked ownable synchronizers:
- <0x0000000780a31758> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
Example
32. "BLOCKED_TEST pool-1-thread-2" prio=6 tid=0x0000000007673800 nid=0x260c waiting for monitor entry [0x0000000008abf000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:43)
- waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
at com.nbp.theplatform.threaddump.ThreadBlockedState$2.run(ThreadBlockedState.java:26)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Locked ownable synchronizers:
- <0x0000000780b0c6a0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
"BLOCKED_TEST pool-1-thread-3" prio=6 tid=0x00000000074f5800 nid=0x1994 waiting for monitor entry [0x0000000008bbf000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:42)
- waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
at com.nbp.theplatform.threaddump.ThreadBlockedState$3.run(ThreadBlockedState.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Locked ownable synchronizers:
- <0x0000000780b0e1b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
34. "DEADLOCK_TEST-1" daemon prio=6 tid=0x000000000690f800 nid=0x1820 waiting for monitor entry [0x000000000805f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
- waiting to lock <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
- locked <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)
"DEADLOCK_TEST-2" daemon prio=6 tid=0x0000000006858800 nid=0x17b8 waiting for monitor entry [0x000000000815f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
- waiting to lock <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
- locked <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)
"DEADLOCK_TEST-3" daemon prio=6 tid=0x0000000006859000 nid=0x25dc waiting for monitor entry [0x000000000825f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
- waiting to lock <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
- locked <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)
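The dump above can be read as a wait-for graph: DEADLOCK_TEST-1 waits for the monitor held by DEADLOCK_TEST-2, which waits for the one held by DEADLOCK_TEST-3, which in turn waits for the one held by DEADLOCK_TEST-1. The following sketch automates that reading for HotSpot-style text. It is an illustration rather than a complete dump parser: the class name DumpDeadlockFinder is hypothetical, it only looks at "waiting to lock" and "locked" lines, and it ignores ownable synchronizers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DumpDeadlockFinder {
    private static final Pattern THREAD = Pattern.compile("^\"([^\"]+)\"");
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-f]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-f]+)>");

    /** Returns the thread names forming a lock cycle, or an empty list if none is found. */
    static List<String> findCycle(String dump) {
        Map<String, String> waitsFor = new LinkedHashMap<>(); // thread -> monitor it waits for
        Map<String, String> owner = new HashMap<>();          // monitor -> thread holding it
        String current = null;
        for (String line : dump.split("\n")) {
            String text = line.trim();
            Matcher header = THREAD.matcher(text);
            if (header.find()) {
                current = header.group(1);
                continue;
            }
            if (current == null) {
                continue;
            }
            Matcher waiting = WAITING.matcher(text);
            Matcher locked = LOCKED.matcher(text);
            if (waiting.find()) {
                waitsFor.put(current, waiting.group(1));
            } else if (locked.find()) {
                owner.put(locked.group(1), current);
            }
        }
        // Follow "waits for monitor -> held by thread" edges until a thread repeats.
        for (String start : waitsFor.keySet()) {
            List<String> path = new ArrayList<>();
            String thread = start;
            while (thread != null && !path.contains(thread)) {
                path.add(thread);
                thread = owner.get(waitsFor.get(thread));
            }
            if (start.equals(thread)) {
                return path; // we came back to where we started: a cycle
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
                "\"DEADLOCK_TEST-1\" daemon prio=6 waiting for monitor entry",
                "- waiting to lock <0x00000007d58f5e60> (a Monitor)",
                "- locked <0x00000007d58f5e48> (a Monitor)",
                "\"DEADLOCK_TEST-2\" daemon prio=6 waiting for monitor entry",
                "- waiting to lock <0x00000007d58f5e78> (a Monitor)",
                "- locked <0x00000007d58f5e60> (a Monitor)",
                "\"DEADLOCK_TEST-3\" daemon prio=6 waiting for monitor entry",
                "- waiting to lock <0x00000007d58f5e48> (a Monitor)",
                "- locked <0x00000007d58f5e78> (a Monitor)");
        System.out.println(findCycle(dump)); // prints [DEADLOCK_TEST-1, DEADLOCK_TEST-2, DEADLOCK_TEST-3]
    }
}
```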
35. Stuck/Hung Thread (Long running thread)
A long-running thread can be considered stuck or hung.
It may signal a problem.
What counts as a long duration is relative.
Possible causes:
● Deadlock (halt)
● Infinite loop
● Resource contention
36. Oracle WebLogic Server (Stuck Thread)
● In a WebLogic server, all incoming requests are handled by a thread pool which is controlled by a work manager.
● Worker threads that are taken out of the pool and not returned after a specified time period are marked as [STUCK] by the work manager.
● A thread is considered stuck if it has been continually working (not idle) for a set period of time.
37. WebSphere App Server (Hung Thread)
● A hung thread is a thread that is being blocked by a blocking call or is waiting on a monitor (sync
locked object) to be released so that it can use it.
● Hung detection in WAS in the SystemOut log file:
○ A message ID of WSVR0605W indicates that a thread MAY be hung
○ A message ID of WSVR0606W notifies you that a previously reported hung thread actually completed its work.
○ Hang detection works only with WebSphere managed threads (e.g. thread pools) and does
NOT monitor user created threads.
38. WebSphere App Server (Hung Thread)
Configuring the hang detection policy:
http://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.iseries.doc/ae/ttrb_confighangdet.html
WSVR0605W: Thread "WebContainer : 1" has been active for 612,000 milliseconds and may be hung. There are 3 threads in total in the server that may be hung.
WSVR0605W: Thread "WebContainer : 3" (0000005f) has been active for 679868 milliseconds and may be hung. There is/are 3 thread(s) in total in the server that may be hung.
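When scanning a SystemOut log for these warnings, the thread name and the active time are the two values worth extracting. The sketch below is a hypothetical helper, not part of WebSphere. It assumes each message sits on one log line and follows the layout of the second sample above: a quoted thread name, a hexadecimal thread id in parentheses, and a duration in milliseconds without thousands separators.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HungThreadLogParser {
    // Layout assumed from the sample message; real logs may differ between versions.
    private static final Pattern WSVR0605W = Pattern.compile(
            "WSVR0605W: Thread \"([^\"]+)\" \\(([0-9a-fA-F]+)\\) has been active for (\\d+) milliseconds");

    /** Values extracted from one WSVR0605W message. */
    static final class HungThread {
        final String name;
        final String threadId;
        final long activeMillis;

        HungThread(String name, String threadId, long activeMillis) {
            this.name = name;
            this.threadId = threadId;
            this.activeMillis = activeMillis;
        }
    }

    /** Parses one log line; returns empty if it is not a WSVR0605W message in the assumed layout. */
    static Optional<HungThread> parse(String logLine) {
        Matcher m = WSVR0605W.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new HungThread(m.group(1), m.group(2), Long.parseLong(m.group(3))));
    }

    public static void main(String[] args) {
        String line = "WSVR0605W: Thread \"WebContainer : 3\" (0000005f) has been active for "
                + "679868 milliseconds and may be hung.";
        parse(line).ifPresent(t ->
                System.out.println(t.name + " active for " + (t.activeMillis / 1000) + " s"));
        // prints: WebContainer : 3 active for 679 s
    }
}
```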
39. IBM Thread and Monitor Dump Analyzer for Java (TMDA)
● A tool that allows identification of hangs,
deadlocks, resource contention, and bottlenecks in
Java threads.
● Analyzes javacore files (also known as "javadumps", an IBM term).
A javacore is NOT the same as a core file, which is generated by a system dump.
https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=2245aa39-fa5c-4475-b891-14c205f7333c
40. IBM Thread and Monitor Dump Analyzer for Java
Features:
● Thread detail view
● Monitor detail view
● List of hang suspects
● Thread comparison view
● Thread comparison summary
● Java Monitor lock comparison view
Tutorial:
http://www-01.ibm.com/support/docview.wss?uid=swg27011855&aid=1
41. IBM Thread and Monitor Dump Analyzer for Java
Location of javacore:
● Set by environment variable: IBM_JAVACOREDIR
● Specified by JVM parameter -DWORKING_DIR
● Written to the directory from which the Java process was started
File name:
42. WAS Threads
WAS has many internal (common) threads.
Thread name is useful in determining what process owns the thread and how that thread is used
43. IBM Resources:
Problem determination for javacore files from WebSphere Application Server - include sample javacore file
http://www-01.ibm.com/support/docview.wss?uid=swg21181068
MustGather: Generating Javacores and Userdumps Manually For Performance, Hang or High CPU Issues on Windows
http://www-01.ibm.com/support/docview.wss?uid=swg21138203
Deep Dive on Java Virtual Machine (JVM) Javacores and Javadumps
http://www-01.ibm.com/support/docview.wss?uid=swg27017906&aid=1
MustGather: Crash on Microsoft Windows
MustGather: Crash on AIX
MustGather: Crash on Linux
44. Exercise:
● Run WAS in standalone mode using Oracle JDK/OpenJDK
● Run ./wsadmin.sh from <WAS_INSTALL_DIR>/bin, then work from the "wsadmin>" prompt
If credentials are set and you want to use Jython, use:
wsadmin.sh -lang jython -user user_name -password password
○ Jacl:
set jvm [$AdminControl completeObjectName type=JVM,process=server1,*]
$AdminControl invoke $jvm dumpThreads
○ Jython:
ServerJVM = AdminControl.completeObjectName('type=JVM,process=server1,*')
AdminControl.invoke(ServerJVM,'dumpThreads')
● See directory /opt/IBM/WebSphere/profiles/AppSrv01
● Find file javacore.YYYYMMDD.XXXXXX.XXXX.XXXX.txt
● Try generating another javacore using the Admin Console.
● Open using TMDA