SlideShare a Scribd company logo
1 of 36
Production Debugging @ 100mph
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/java8-debugging
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
About Me
Co-founder – Takipi (God mode in Production Code).
Co-founder – VisualTao (acquired by Autodesk).
Director, AutoCAD Web & Mobile.
Software Architect at IAI Aerospace.
Coding for the past 16 years - C++, Delphi, .NET, Java.
Focus on real-time, scalable systems.
Blogs at takipiblog.com
Overview
Dev-stage debugging is forward-tracing.
Production debugging is focused on backtracing.
Modern production debugging poses two challenges: state isolation and data distribution.
Direct correlation between quality of data to MTTR.
Agenda
1. Distributed logging – best practices.
1. Preemptive jstacks
2. Java 8 – state of the stack
3. Inspecting state with Btrace
1. Extracting state with custom Java agents.
Solid Logging Practices
1. Code context.
2. Time + duration.
3. Thread ID (preferably name).
4. Transaction ID (for async & distributed debugging).
Make sure these are baked into your logging context –
Transaction ID
• Logging is usually a multi–threaded / process affair.
• Generate a UUID at every thread entry point into your app – the transaction ID.
• Append the ID into each log entry.
• Try to maintain it across machines – critical for distributed / async debugging.
Thread Names
• Thread name is a mutable property.
• Can be set to hold transaction specific state.
• Some frameworks (e.g. EJB) don’t like that.
• Can be super helpful when debugging in tandem with jstack.
Thread Names (2)
• Transaction ID
• Servlet parameters, Queue message ID
• Start time
Thread.currentThread().setName(Context, TID, Params, Time,..)
"pool-1-thread-1" #17 prio=5 os_prio=31 tid=0x00007f9d620c9800 nid=0x6d03
in Object.wait() [0x000000013ebcc000]
”MsgID: AB5CAD, type: Analyze, queue: ACTIVE_PROD, TID: 5678956, TS:
11/8/20014 18:34 "
#17 prio=5 os_prio=31 tid=0x00007f9d620c9800 nid=0x6d03 in Object.wait()
[0x000000013ebcc000]
Your last line of defense - critical to pick up on unhandled exceptions.
Setting the callback:
This is where thread Name + TLS are critical as the only surviving state.
Global Exception Handlers
public static void Thread.setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler eh)
void UncaughtExceptionHandler.uncaughtException(Thread t, Throwable e) {
logger.error(“Uncaught error in thread “ + t, e);
}
Preemptive jstack
• A production debugging foundation.
• Presents two issues –
– Activated only in retrospect.
– No state: does not provide any variable state.
• Let’s see how we can overcome these with preemptive jstacks.
Preemptive jstack - Demo
github.com/takipi/jstack
60-100% > Atomics
Native frames, monitors
Java 8 stack traces
BTrace
• An advanced open-source tool for extracting state from a live JVM.
• Uses a Java agent and a meta-scripting language to capture state.
• Pros: Lets you probe variable state without modifying / restarting the JVM.
• Cons: read-only querying using a custom syntax and libraries.
BTrace - Restrictions
• Can not create new objects.
• Can not create new arrays.
• Can not throw exceptions.
• Can not catch exceptions.
• Can not make arbitrary instance or static method calls - only the public static methods of
com.sun.btrace.BTraceUtils class may be called from a BTrace program.
• Can not assign to static or instance fields of target program's classes and objects. But,
BTrace class can assign to it's own static fields ("trace state" can be mutated).
• Can not have instance fields and methods. Only static public void returning methods are
allowed for a BTrace class. And all fields have to be static.
• Can not have outer, inner, nested or local classes.
• Can not have synchronized blocks or synchronized methods.
• can not have loops (for, while, do..while)
• Can not extend arbitrary class (super class has to be java.lang.Object)
• Can not implement interfaces.
• Can not contains assert statements.
• Can not use class literals.
BTrace - Demo
kenai.com/projects/btrace
Custom Java Agents
• An advanced technique for instrumenting code dynamically.
• The foundation for most profiling / debugging tools.
• Two types of agents: Java and Native.
• Pros: extremely powerful technique to collect state from a live app.
• Cons: requires knowledge of creating verifiable bytecode.
Custom Agent - Demo
github.com/takipi/debugAgent
Auto generating bytecode (ASMifier)
Native Agents
• Java agents are written in Java. Have access to the Instrumentation API.
• Native agents – written in C++.
• Have access to JVMTI – the JVM’s low-level set of APIs and capabilities.
– JIT compilation, GC, Monitor, Exception, breakpoints, ..
• More complex to write. Capability performance impact.
• Platform dependent.
Takipi - Detect, priotitize and debug bugs at high-scale.
tal.weiss@takipi.com
@takipid
takipiblog.com
Thanks!
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/java8-
debugging

More Related Content

More from C4Media

More from C4Media (20)

Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Debugging Java 8: New Techniques for Fixing Production Code

  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /java8-debugging
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4. About Me Co-founder – Takipi (God mode in Production Code). Co-founder – VisualTao (acquired by Autodesk). Director, AutoCAD Web & Mobile. Software Architect at IAI Aerospace. Coding for the past 16 years - C++, Delphi, .NET, Java. Focus on real-time, scalable systems. Blogs at takipiblog.com
  • 5. Overview Dev-stage debugging is forward-tracing. Production debugging is focused on backtracing. Modern production debugging poses two challenges: state isolation and data distribution. Direct correlation between quality of data to MTTR.
  • 6. Agenda 1. Distributed logging – best practices. 1. Preemptive jstacks 2. Java 8 – state of the stack 3. Inspecting state with Btrace 1. Extracting state with custom Java agents.
  • 7. Solid Logging Practices 1. Code context. 2. Time + duration. 3. Thread ID (preferably name). 4. Transaction ID (for async & distributed debugging). Make sure these are baked into your logging context –
  • 8. Transaction ID • Logging is usually a multi–threaded / process affair. • Generate a UUID at every thread entry point into your app – the transaction ID. • Append the ID into each log entry. • Try to maintain it across machines – critical for distributed / async debugging.
  • 9. Thread Names • Thread name is a mutable property. • Can be set to hold transaction specific state. • Some frameworks (e.g. EJB) don’t like that. • Can be super helpful when debugging in tandem with jstack.
  • 10. Thread Names (2) • Transaction ID • Servlet parameters, Queue message ID • Start time Thread.currentThread().setName(Context, TID, Params, Time,..) "pool-1-thread-1" #17 prio=5 os_prio=31 tid=0x00007f9d620c9800 nid=0x6d03 in Object.wait() [0x000000013ebcc000] ”MsgID: AB5CAD, type: Analyze, queue: ACTIVE_PROD, TID: 5678956, TS: 11/8/20014 18:34 " #17 prio=5 os_prio=31 tid=0x00007f9d620c9800 nid=0x6d03 in Object.wait() [0x000000013ebcc000]
  • 11. Your last line of defense - critical to pick up on unhandled exceptions. Setting the callback: This is where thread Name + TLS are critical as the only surviving state. Global Exception Handlers public static void Thread.setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler eh) void UncaughtExceptionHandler.uncaughtException(Thread t, Throwable e) { logger.error(“Uncaught error in thread “ + t, e); }
  • 12. Preemptive jstack • A production debugging foundation. • Presents two issues – – Activated only in retrospect. – No state: does not provide any variable state. • Let’s see how we can overcome these with preemptive jstacks.
  • 13. Preemptive jstack - Demo github.com/takipi/jstack
  • 16.
  • 17. Java 8 stack traces
  • 18.
  • 19.
  • 20. BTrace • An advanced open-source tool for extracting state from a live JVM. • Uses a Java agent and a meta-scripting language to capture state. • Pros: Lets you probe variable state without modifying / restarting the JVM. • Cons: read-only querying using a custom syntax and libraries.
  • 21. BTrace - Restrictions • Can not create new objects. • Can not create new arrays. • Can not throw exceptions. • Can not catch exceptions. • Can not make arbitrary instance or static method calls - only the public static methods of com.sun.btrace.BTraceUtils class may be called from a BTrace program. • Can not assign to static or instance fields of target program's classes and objects. But, BTrace class can assign to it's own static fields ("trace state" can be mutated). • Can not have instance fields and methods. Only static public void returning methods are allowed for a BTrace class. And all fields have to be static. • Can not have outer, inner, nested or local classes. • Can not have synchronized blocks or synchronized methods. • can not have loops (for, while, do..while) • Can not extend arbitrary class (super class has to be java.lang.Object) • Can not implement interfaces. • Can not contains assert statements. • Can not use class literals.
  • 23.
  • 24.
  • 25.
  • 26. Custom Java Agents • An advanced technique for instrumenting code dynamically. • The foundation for most profiling / debugging tools. • Two types of agents: Java and Native. • Pros: extremely powerful technique to collect state from a live app. • Cons: requires knowledge of creating verifiable bytecode.
  • 27. Custom Agent - Demo github.com/takipi/debugAgent
  • 28.
  • 29.
  • 30.
  • 32. Native Agents • Java agents are written in Java. Have access to the Instrumentation API. • Native agents – written in C++. • Have access to JVMTI – the JVM’s low-level set of APIs and capabilities. – JIT compilation, GC, Monitor, Exception, breakpoints, .. • More complex to write. Capability performance impact. • Platform dependent.
  • 33. Takipi - Detect, priotitize and debug bugs at high-scale. tal.weiss@takipi.com @takipid takipiblog.com Thanks!
  • 34.
  • 35.
  • 36. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/java8- debugging