Leveraging Performance Counters
and Execution Logs to Diagnose 
Memory‐Related Performance Issues
Mark D. Syer, Zhen Ming Jiang, Meiyappan Nagappan, 
Ahmed E. Hassan, Mohamed Nasser and Parminder Flora
mdsyer@cs.queensu.ca
1
2
Failures in ULS systems are typically 
due to performance issues
3
4
“...triggered a latent memory leak… By Monday
morning, the rate of memory loss became quite
high and consumed enough memory on the
affected storage servers that they were unable
to keep up with normal request handling
processes.”
5
Load testing 
may detect 
failures before 
they occur in 
the field
6
7
Performance analysts collect
counters & logs
[Figure: memory usage over time]
8
Memory 
Leak!
Diagnosing memory issues 
requires counters and logs
Diagnosing
memory issues
is difficult
9
Huge amount of data
Rapidly evolving systems
[Figure: memory usage over time]
10
Combining counters 
and logs is difficult
Memory 
Leak!
Generate
Signatures
Detect
Outliers
Inspect
Outliers
Our approach identifies the events 
causing performance issues
11
[Figure: memory (MB) over time, sampled at 00:00, 00:08, 00:16 and 00:24]
12
We generate a signature each time 
memory is sampled
Abstract log lines to events
00:01, Alice starts a conversation with Bob
00:01, Alice says `hi' to Bob
00:02, Alice says `are you busy?' to Bob
00:11, Bob says `yes' to Alice
00:12, Alice says `ok' to Bob
00:18, Alice ends a conversation with Bob
13
00:00, 5MB
00:08, 15MB
00:16, 15MB
00:24, 5MB
Combine the counters and events
00:01, USER starts a conversation with USER 
00:01, USER says MSG to USER 
00:02, USER says MSG to USER 
00:11, USER says MSG to USER 
00:12, USER says MSG to USER 
00:18, USER ends a conversation with USER 
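The abstraction step shown on this slide can be sketched in a few lines of Python. This is only an illustration for the chat example: the rule set, the `KNOWN_USERS` tuple and the `abstract_line` helper are assumptions, not the authors' implementation, and a real system would need rules that match its own log format.

```python
import re

# Hypothetical abstraction rules for the chat example on the slide.
# Quoted message bodies (`...') become MSG; known user names become USER.
KNOWN_USERS = ("Alice", "Bob")  # assumed to be known in advance
MSG_PATTERN = re.compile(r"`[^']*'")
USER_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in KNOWN_USERS) + r")\b"
)


def abstract_line(line):
    """Split an 'HH:MM, text' log line into (timestamp, event template)."""
    timestamp, text = line.split(", ", 1)
    # Replace message bodies first so that names inside a message are not touched.
    text = MSG_PATTERN.sub("MSG", text)
    text = USER_PATTERN.sub("USER", text)
    return timestamp, text


log_lines = [
    "00:01, Alice starts a conversation with Bob",
    "00:01, Alice says `hi' to Bob",
    "00:02, Alice says `are you busy?' to Bob",
    "00:11, Bob says `yes' to Alice",
    "00:12, Alice says `ok' to Bob",
    "00:18, Alice ends a conversation with Bob",
]
events = [abstract_line(line) for line in log_lines]
```

With these rules, the six log lines collapse into three distinct event templates, which is what makes counting them per interval possible.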
14
Count the events and calculate the 
memory delta in each time interval
Interval ending                        00:08   00:16   00:24
USER starts a conversation with USER       1       0       0
USER says MSG to USER                      2       2       0
USER ends a conversation with USER         0       0       1
Δ Memory                                10MB       0   -10MB
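A minimal sketch of this counting step, using the sample data from the slides, might look as follows. The function names, the dictionary layout of a signature, and the choice to assign an event to the interval `(previous sample, current sample]` are assumptions made for illustration.

```python
from collections import Counter


def to_ticks(stamp):
    """Convert an 'HH:MM'-style stamp into a single integer for comparison."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


def build_signatures(samples, events):
    """Build one signature per interval between consecutive memory samples.

    samples: list of (timestamp, memory_mb), sorted by time.
    events:  list of (timestamp, event_template).
    """
    signatures = []
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        lo, hi = to_ticks(t0), to_ticks(t1)
        # Assumed convention: an event belongs to the interval (t0, t1].
        counts = Counter(e for t, e in events if lo < to_ticks(t) <= hi)
        signatures.append(
            {"end": t1, "counts": dict(counts), "delta_mb": m1 - m0}
        )
    return signatures


samples = [("00:00", 5), ("00:08", 15), ("00:16", 15), ("00:24", 5)]
events = [
    ("00:01", "USER starts a conversation with USER"),
    ("00:01", "USER says MSG to USER"),
    ("00:02", "USER says MSG to USER"),
    ("00:11", "USER says MSG to USER"),
    ("00:12", "USER says MSG to USER"),
    ("00:18", "USER ends a conversation with USER"),
]
signatures = build_signatures(samples, events)
```

Each entry in `signatures` pairs the event counts of one interval with the memory delta observed over that interval, matching the columns for 00:08, 00:16 and 00:24 on the slide.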
15
Detect
Outliers
Inspect
Outliers
We identify and inspect 
outlying signatures
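The slides do not say which technique is used to detect outlying signatures. As a stand-in only, the sketch below flags intervals whose memory delta lies far from the mean, using a plain z-score rule. The function name, the threshold of 2.0 and the sample deltas are all assumptions for illustration, not the authors' method.

```python
import statistics


def flag_outliers(deltas_mb, threshold=2.0):
    """Return the indices of memory deltas whose z-score exceeds the threshold.

    This is a generic z-score rule, used here only as a placeholder for
    whatever outlier detection the approach actually applies.
    """
    mean = statistics.mean(deltas_mb)
    spread = statistics.pstdev(deltas_mb)
    if spread == 0:
        return []  # all deltas are identical, so nothing stands out
    return [
        index
        for index, delta in enumerate(deltas_mb)
        if abs(delta - mean) / spread > threshold
    ]


# Hypothetical memory deltas (MB), one per signature.
deltas = [0, 1, 0, -1, 0, 0, 1, 0, 50]
flagged = flag_outliers(deltas)
```

An analyst would then inspect the event counts of the flagged signatures to see which events coincide with the unusual change in memory.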
16
Can we diagnose...
17
Memory bloat?
Memory leaks?
Memory spikes?
Precision
Effort Reduction
18
Our approach flags events
with high precision
[Figure: precision (axis from 0 to 100) for memory bloat, memory leak and memory spike]
19
+80%
Effort Reduction
20
Precision
+80%
Our approach flags a small number
of events for expert analysis
[Figure: # log lines (5,303) vs. # flagged events (1); effort reduction: 99.98%]
21
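The 99.98% figure is consistent with reading effort reduction as the share of log lines an expert no longer has to inspect. That definition is an assumption here, since the slides give only the numbers. A quick check:

```python
def effort_reduction(total_log_lines, flagged_events):
    """Percentage of log lines the analyst can skip (assumed definition)."""
    return 100.0 * (1 - flagged_events / total_log_lines)


# Values taken from the slide: 5,303 log lines, 1 flagged event.
reduction = effort_reduction(5303, 1)
print(f"{reduction:.2f}%")  # prints 99.98%
```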
Our approach flags a small number
of events for expert analysis
[Figure: effort reduction (axis from 99.9 to 100) for memory bloat, memory leak and memory spike]
22
Effort Reduction
23
>99.98%
Precision
+80%
24
