2 © 2014 CA. ALL RIGHTS RESERVED.
Agenda
• Why so many metrics with APM?
– “Big Data”?
• What we are learning with CA-ABA (analytics)
• How to find KPIs
• What’s new for CA-APM 9.6 Release
Typical APM Cluster
• Dozens to hundreds of applications
– 2800 JVMs/CLRs
• Up to 5M metrics, every 15 seconds
• Large applications span multiple data centers
– 2-8 APM clusters, typical
– 30-70 EM Collectors for a nationwide portal application
• 12M to 28M metrics, every 15 seconds
… certainly sounds like big data!!!
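To put the cluster figures above in perspective, a quick back-of-the-envelope calculation converts metrics per 15-second interval into data points per second and per day. This is only arithmetic on the numbers from the slide; the function name and labels are illustrative.

```python
# Back-of-the-envelope sizing from the figures on this slide.
INTERVAL_SECONDS = 15            # APM reporting interval
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def data_points(metrics_per_interval):
    """Return (points per second, points per day) for a given metric count."""
    per_second = metrics_per_interval / INTERVAL_SECONDS
    per_day = metrics_per_interval * (SECONDS_PER_DAY // INTERVAL_SECONDS)
    return per_second, per_day


for label, count in [("single cluster", 5_000_000),
                     ("large application, low end", 12_000_000),
                     ("large application, high end", 28_000_000)]:
    per_second, per_day = data_points(count)
    print(f"{label}: {per_second:,.0f} points/s, {per_day:,} points/day")
```

A single 5M-metric cluster works out to roughly 333,000 data points per second, or 28.8 billion per day.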
What is Big Data???
APM information is “big”… but it is not “big data” without enrichment
5M Metrics that you don’t fully understand
OR
5M Metrics that you don’t fully understand, with ENRICHMENT from:
– Trouble Management
– Version Control
– Time of ____ Constraints
– Air Traffic Advisories
– Weather Forecast
– AP News Updates
– Marketing Campaigns
yielding Correlation, Trends, Insights, Anomalies
Challenges for Big Data
• Data Variety – different sources give different perspectives. Does your data have a significant perspective?
• Validation – is the data source meaningful/predictive?
• Consistency – are the values trustworthy?
• Data Structure and Nomenclature – Mapping, Transformation
• Temporal Impedance Mismatch
– APM: real-time with 15 second reporting interval
– Trouble Management: +15-30 minutes later
– Stock Ticker: +15-30 minutes later
– Air Traffic Advisories: +30-60 minutes later
– Version Control: days to weeks in advance
– Marketing Campaign Assessment: 2-4 weeks later
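One way to reason about the temporal mismatch above is to map each late-arriving record back to the window of 15-second APM intervals it could describe. The sketch below assumes the delays listed on the slide are reporting lags in minutes; `REPORTING_LAG_MINUTES`, `candidate_window`, and the example timestamp are hypothetical, not part of any CA product.

```python
from datetime import datetime, timedelta

APM_INTERVAL = timedelta(seconds=15)

# Assumed reporting lags (min, max) in minutes, taken from the slide.
REPORTING_LAG_MINUTES = {
    "trouble_management": (15, 30),
    "stock_ticker": (15, 30),
    "air_traffic_advisories": (30, 60),
}


def candidate_window(source, reported_at):
    """Return (start, end, interval_count) of APM time the record may refer to."""
    min_lag, max_lag = REPORTING_LAG_MINUTES[source]
    start = reported_at - timedelta(minutes=max_lag)
    end = reported_at - timedelta(minutes=min_lag)
    intervals = int((end - start) / APM_INTERVAL)
    return start, end, intervals


# Arbitrary example: a trouble ticket reported at 12:00.
start, end, intervals = candidate_window(
    "trouble_management", datetime(2014, 11, 10, 12, 0, 0))
print(start.time(), end.time(), intervals)
```

Under these assumptions, a single trouble ticket has to be correlated against 60 candidate APM intervals rather than one, which is why the sources cannot simply be joined on timestamp.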
KPI Management Maturity
[Diagram: VALUE plotted against KPI MATURITY]
• SGCM (Platform): Stalls, GC Settings, Concurrency, Memory Management Trends
• APC (Application): Availability, Performance, Capacity
• EKB (Transaction): Errors, Key Resource Performance, Business Transaction Survey
What We are Learning with CA-ABA
ABA Logical Architecture
APM Cluster (5M Metrics) -> 100k Metrics (via RegEx) -> Anomaly Engine -> Anomalies, Alerts
Why only 100k Metrics???
Why not 5M???
RegEx == Regular Expression
• analytics.metricfeed.process.3 = Custom Metric Host (Virtual)|Custom Metric Process (Virtual)|Custom Business Application Agent (Virtual)
• analytics.metricfeed.metric.3 = By Business Service|[^|]+|[^|]+|[^|]+:.+
RegEx is hard… but easy to validate
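Validation can be as simple as running the expression against a handful of known metric paths and checking which ones match. The sketch below uses Python's `re` module and assumes the `|` path separators are meant literally, so they are escaped here; the sample metric paths and the `matching_metrics` helper are made up for illustration.

```python
import re

# Assumption: "|" is the metric path separator and must match literally,
# so it is escaped. Unescaped, "|" means alternation in a regular expression.
METRIC_PATTERN = re.compile(r"By Business Service\|[^|]+\|[^|]+\|[^|]+:.+")

# Hypothetical metric paths used only to exercise the pattern.
samples = [
    "By Business Service|Trading|Place Order|Checkout:Average Response Time (ms)",
    "By Business Service|Trading|Place Order:Average Response Time (ms)",
    "Frontends|Apps|Portal:Responses Per Interval",
]


def matching_metrics(pattern, metric_paths):
    """Return the metric paths that the pattern matches in full."""
    return [path for path in metric_paths if pattern.fullmatch(path)]


selected = matching_metrics(METRIC_PATTERN, samples)
for path in selected:
    print(path)
```

Only the first sample has exactly three path segments after `By Business Service`, so it is the only one selected. Checking the matched count against the expected count is a quick sanity test before feeding metrics to the analytics engine.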
[Chart: metricfeed.3, Series1; axis 0 to 200]
Broader collection of metrics but only 87/500 == 17.4% are generally known as useful
Suspects Identified via Baseline Technique
[Chart: Suspects via Baseline Techniques (Average RT only), Series1; categories: SiteMinder, Backends, JSP, Frontends, JMX, Custom; axis 0 to 18]
100% Useful metrics, ready for validation: 47/43625 == 0.1%
Metric Count TypeView
What is an Application?
• Front-ends
– Browser? Webservice? Messaging?
• Back-ends
– Databases, Webservices, Messaging, Mainframes, Trading_Partners
• Muck-in-the-Middle
– Software quality, stability and scalability
• We want to identify KPIs for each of these elements
– helps us build a useful dashboard for Operations
– helps expose what the resources are really doing
– helps us define acceptance criteria, to act proactively
– helps us to triage really effectively
How to Find KPIs
Capacity KPIs – “Tree Rings”
Performance KPIs
High Volume + Significant Response Time
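A minimal sketch of this selection rule follows, assuming each component reports a volume (responses per interval) and an average response time. The thresholds, the ranking by volume times response time, and the component names are all hypothetical choices for illustration, not values from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    responses_per_interval: int     # volume
    avg_response_time_ms: float


# Hypothetical thresholds; tune these for your own environment.
MIN_VOLUME = 100
MIN_RESPONSE_TIME_MS = 50.0


def performance_kpis(components):
    """Keep components that are both high volume and significantly slow."""
    picked = [c for c in components
              if c.responses_per_interval >= MIN_VOLUME
              and c.avg_response_time_ms >= MIN_RESPONSE_TIME_MS]
    # Rank by total time spent per interval (volume x response time).
    return sorted(picked,
                  key=lambda c: c.responses_per_interval * c.avg_response_time_ms,
                  reverse=True)


components = [
    Component("ReportServlet", 2, 9000.0),   # slowest, but rarely called
    Component("LoginServlet", 1200, 80.0),   # busy and noticeably slow
    Component("OrderDAO", 4000, 60.0),       # busiest, moderately slow
    Component("HealthPing", 5000, 1.0),      # busy but trivial
]
kpis = [c.name for c in performance_kpis(components)]
print(kpis)
```

A plain ranking by response time would put `ReportServlet` first even though it is called twice per interval; filtering on volume as well keeps the focus on components that affect many transactions.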
Create a Simple Alert and Threshold (ConnectionStatus)
Create a Simple Alert, Find Restart and Threshold (MetricCount)
“UP” – but not actually doing anything!!!
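The two alerts can be combined to catch an agent that is connected but reporting far fewer live metrics than usual. The sketch below is an assumption-laden illustration: `connected` stands in for whatever the ConnectionStatus metric reports, the baseline would come from your own history, and `min_ratio` is a hypothetical threshold.

```python
def agent_state(connected, metric_count, baseline_metric_count, min_ratio=0.5):
    """Classify an agent from its connection flag and live metric count."""
    if not connected:
        return "DOWN"
    # Connected, but far fewer live metrics than the baseline: the agent
    # is "UP" without the application actually doing anything.
    if metric_count < baseline_metric_count * min_ratio:
        return "UP_BUT_IDLE"
    return "OK"


print(agent_state(True, 120, 1500))
print(agent_state(True, 1400, 1500))
print(agent_state(False, 0, 1500))
```

A sudden drop and recovery in metric count is also a simple way to spot restarts, which is why both metrics are worth alerting on.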
Understanding Your Environment
• Identify the KPIs
– Availability
  - Agent ConnectionStatus
  - Number Live Metrics (Metric Count)
– Performance
  - High Volume components with significant response time
  - NOT “Top 10 Response Time”
– Capacity
  - Highest Volume Components
• Don’t Wait for Production!!!
– Make it part of your pre-production review
– Manage the application lifecycle by trending KPIs
KPI Evolution
Good                        Better (additional)               Best (additional)
Stalls                      Availability – Connected Status   Errors
GC Settings                 Availability – Metric Count       Key Resource Performance
Concurrency                 Suspect Performance               Business Transaction Survey
Memory Management (graph)   Suspect Capacity
Platform: Coarse information ..but not really APM
Application, Transactions, Resources: The APM Advantage
What’s New in CA APM 9.6
Simplified, automated, and built on CA APM strengths.
Faster, Easier APM
• Intelligent Deep Transaction Trace is now dynamic, automated, and requires less developer involvement for deep dives into apps supporting the transactions
• Simplified Triage with easier drill down with Application Triage Map including Socket Grouping
• Improved response times with software-based Transaction Impact Monitor (end-user experience)
• Expanding APM’s scope with Java 7 EM & Agents
Seamless Mainframe Awareness
• Increased insight by adding DB2 details to transaction traces
• Greater awareness with CA SYSVIEW MQ alerts & complete status in APM
• Driving further cross-enterprise depth with CTG traces to fully expand backend calls
• Other mainframe-based enhancements
Preparing to Upgrade
• HealthCheck the existing cluster prior to any upgrade
• Good:
– Do a clean install of the APM Cluster, alongside the existing cluster version.
  - Manually duplicate management modules, domains.xml, etc.
  - Bring down the old version, then bring up the new
• Better:
– Install the new version in a separate, reduced-size environment
– Migrate a few applications to the new environment for validation
– Upgrade the primary environment after validation is achieved
• Best:
– Install a new GOLD environment in production, separate from the original cluster
– Migrate agents, as schedules permit, until the original cluster can be decommissioned
– This provides an opportunity to introduce pre-production review and generally correct any bad deployment habits
Resources
• APM Community Site (https://communities.ca.com/web/ca-wily-global-user-community)
– Cookbook: APM HealthCheck
– Understanding Which Metrics Matter (KPI discussion)
– Cookbook: Application Audit
  - more details on the baseline techniques and process
• APM best practices – Realizing Application Performance Management
– available on Amazon.com and Apress.com
  - Baselines, Test Plans, App Audits, Triage, Firefighting
  - Organizational Models, Service Catalogs
• APM Web Page: Ca.com/apm

Redefine Triage by Learning the Golden Nuggets of APM
