It’s all about telemetry
          Monitoring what matters in a useful way.




Tuesday, June 26, 12
Theo Schlossnagle                            @postwait


               I write software
               I write books
               I give talks
               I participate in the industry
               I speak frankly about industry issues




Data, data, everywhere.

               A billion pageviews / month.
               100k database queries / second.
               1MM memcache queries / second.
               500k MQ messages / second.
               10MM I/O operations / second.




Big Data

          Most new big data problems are

          solvable




Big Data

          Most new big data problems are
          created by our solutions, and thus
          solvable
          despite their ROI



That’s a whole lot of data

               Think in terms of logs (too many do)
                       About 26 trillion log lines / month
                       @ 40 bytes compressed: 1PB / month
               Just because it is possible
               does not mean it will return on investment
               (and does not mean it won’t)
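The figures above can be roughly reproduced with a little arithmetic. This is only a sketch under stated assumptions: a 30-day month, decimal petabytes, and one log line per I/O operation. The slides do not say which sources were counted; the 10MM I/O operations / second rate on its own happens to land near 26 trillion.

```python
# Back-of-the-envelope check of the log-volume estimate.
# Assumptions (not stated on the slides): a 30-day month, decimal units,
# and one log line per I/O operation.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # 2,592,000

def monthly_log_volume(events_per_second, bytes_per_line):
    """Return (lines per month, bytes per month) for a steady event rate."""
    lines = events_per_second * SECONDS_PER_MONTH
    return lines, lines * bytes_per_line

lines, size = monthly_log_volume(events_per_second=10_000_000, bytes_per_line=40)
print(f"{lines / 1e12:.1f} trillion lines / month")   # 25.9 trillion lines / month
print(f"{size / 1e15:.2f} PB / month")                # 1.04 PB / month
```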



It’s all “useful”; which data?

               Think in terms of cost/benefit.
               Sure the data is useful, but it costs money to store
               Does it cost you more to have it or not to have it?
               Maybe the right approach is to keep that level of detail
               for a few days?




Double-edged sword.



               Eroding granularity over time
               keeps storage under control




Double-edged sword.

               Eroding granularity over time
               keeps storage under control

               MISTAKE
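A minimal sketch of why eroding granularity can hurt, using made-up numbers: averaging one-minute samples into a single hourly value keeps storage small but flattens a short spike. The `erode` helper and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def erode(samples, factor):
    """Average consecutive groups of `factor` samples into one value."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]

# One hour of per-minute latency samples (ms): steady 10 ms with a 5-minute spike.
minute_samples = [10.0] * 60
minute_samples[30:35] = [500.0] * 5

hourly = erode(minute_samples, 60)
print(max(minute_samples))  # 500.0: the spike is obvious at full granularity
print(hourly)               # a single value near 50.8: the spike is averaged away
```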
1 year
          at a glance

1 week
          looks normalish

1 day
          confidence of normalcy increases

1 week
          that looks different

1 day
          yup, that’s not at all like that other week

Other methods

               What do you store?
               How do you store it?
               Why is it useful?
               Winning the cost benefit game by
               reducing costs more significantly than
               reducing benefits



          [Chart: Benefit and Cost curves vs. monitoring activity]

          Positive Value
          Be in the green.

          [Chart: Benefit and Cost curves vs. monitoring activity, x-axis 0 to 10]

          There’s a bigger picture
          It’s not as easy as you think.

          [Chart: Benefit and Cost curves vs. monitoring activity]

          Value is difference, not area
          Green can be misleading

          [Chart: Value vs. monitoring activity]

          Value = Benefit - Cost
          Green means we have positive return

          [Chart: Value vs. monitoring activity]

          It’s not about return
          Well, it’s not only about return

          [Chart: Value vs. monitoring activity]

          It’s about maximizing return
          This is a bit like black magic
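To make "Value = Benefit - Cost" and "maximizing return" concrete, here is a small sketch. The slides give no formulas, so the `benefit` and `cost` curves below are purely hypothetical (diminishing returns against a steadily growing cost); only the shape of the argument matters.

```python
import math

# Hypothetical curves, chosen only for illustration; the slides give no formulas.
def benefit(activity):
    return 1.0 - math.exp(-2.0 * activity)   # diminishing returns

def cost(activity):
    return 0.3 * activity                     # grows steadily

def value(activity):
    return benefit(activity) - cost(activity)

# Grid search over monitoring activity from 0 to 3 in steps of 0.01.
grid = [i / 100 for i in range(301)]
best = max(grid, key=value)

print(f"value at 3.0:  {value(3.0):.3f}")    # still positive, but well below the peak
print(f"best activity: {best:.2f}, value {value(best):.3f}")
```

Both points are "in the green", yet one returns several times more than the other, which is why positive return alone is not the goal.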

Technique 1: text



               Store changes
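One way to read "store changes" for text data, as a minimal sketch: keep a new record only when the observed value differs from the last one stored. The `ChangeLog` class and the version strings are hypothetical, not taken from the speaker's system.

```python
class ChangeLog:
    """Keep a text metric's history as (timestamp, value) change points."""

    def __init__(self):
        self.changes = []

    def observe(self, timestamp, value):
        # Record the observation only if it differs from the last stored value.
        if not self.changes or self.changes[-1][1] != value:
            self.changes.append((timestamp, value))

    def value_at(self, timestamp):
        """Return the value in effect at `timestamp`, or None if before the first."""
        current = None
        for ts, value in self.changes:
            if ts > timestamp:
                break
            current = value
        return current

log = ChangeLog()
for ts, version in [(0, "1.2.0"), (60, "1.2.0"), (120, "1.2.1"), (180, "1.2.1")]:
    log.observe(ts, version)

print(log.changes)         # [(0, '1.2.0'), (120, '1.2.1')]
print(log.value_at(90))    # 1.2.0
```

Four observations collapse to two stored records, and any point in time can still be answered.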




Technique 2: numeric
               Store rollups
               (i.e. statistical aggregates over fixed windows)
               over 1 minute store
                       min/max/avg/stddev/covariance/50%/95%/99%
               lots of information
               heavy lossy compression of high-frequency data
               loses population distribution information
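A rough sketch of a one-minute rollup, assuming input as `(timestamp, value)` pairs with timestamps in seconds. The `rollup` and `percentile` helpers are hypothetical; percentiles use the simple nearest-rank method, and covariance is left out because it needs a second series.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def rollup(points, window_seconds=60):
    """Group (timestamp, value) points into fixed windows and summarize each."""
    windows = defaultdict(list)
    for ts, value in points:
        windows[int(ts // window_seconds) * window_seconds].append(value)
    result = {}
    for start, values in sorted(windows.items()):
        values.sort()
        result[start] = {
            "count": len(values),
            "min": values[0],
            "max": values[-1],
            "avg": mean(values),
            "stddev": pstdev(values),
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return result

# 100 samples in the first minute (values 1..100), 2 in the second.
points = [(i * 0.5, float(i + 1)) for i in range(100)] + [(60, 7.0), (61, 9.0)]
summary = rollup(points)
print(summary[0]["p95"], summary[60]["avg"])   # 95.0 8.0
```

Each window ends up as a handful of numbers no matter how many samples arrived, which is the compression; the individual samples, and with them the shape of the distribution, are gone.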


Database replication
          Lag (green) and rate of lag change (purple)

Storage Usage
          We can see growth.
          More useful, we can use this to project.
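Projecting from growth can be as simple as fitting a straight line and asking when it crosses capacity. The sketch below assumes linear growth and uses made-up daily samples; `fit_line` and `days_until_full` are hypothetical helpers, and real data is rarely this well behaved.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def days_until_full(days, used_gb, capacity_gb):
    """Project the day on which usage reaches capacity, or None if not growing."""
    slope, intercept = fit_line(days, used_gb)
    if slope <= 0:
        return None
    return (capacity_gb - intercept) / slope

# Made-up daily usage samples growing by exactly 2 GB per day.
days = [0, 1, 2, 3, 4, 5, 6]
used_gb = [500, 502, 504, 506, 508, 510, 512]
print(days_until_full(days, used_gb, capacity_gb=1000))   # 250.0
```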
With simple numeric data
          Unknowns can be predicted

With simple numeric data
          In sane ways with confidence

Full Disclosure

               You see awesome examples of predictive analytics
               Like the real-world one on the previous slide
               In practice, almost all data streams predict one thing:
                       they have no fucking clue.




Technique 3: histograms

               Store histograms
               over 1 minute store
                       counts of datapoints seen in various buckets
               retains complete population distribution
               loss of precision
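A minimal sketch of the histogram idea, assuming fixed-width buckets for simplicity (the slides do not say which bucketing scheme is used). The `histogram` helper and the sample service times are made up.

```python
from collections import Counter

def histogram(values, bucket_width):
    """Count how many values fall into each fixed-width bucket."""
    counts = Counter()
    for v in values:
        counts[int(v // bucket_width) * bucket_width] += 1
    return counts

# One minute of made-up service times in ms: two distinct populations.
samples = [12, 13, 14, 14, 16, 17, 48, 51, 52, 53]
hist = histogram(samples, bucket_width=5)
for lower in sorted(hist):
    print(f"{lower:>3}-{lower + 5:<3} {'#' * hist[lower]}")

print(sum(samples) / len(samples))   # 29.0, a time no request actually took
```

The bucket counts keep both populations visible, while the average describes neither. What is lost is precision: a 12 ms and a 14 ms request become indistinguishable within the 10-15 bucket.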




Histograms 101
               This.
               This is a histogram.
               It shows the frequency of
               values within a population.
               Height represents frequency




Histograms 101
               This.
               This is a histogram.
               It shows the frequency of
               values within a population.
               Now, height and color
               represent frequency




Histograms 101
               This.
               This is a histogram.
               It shows the frequency of
               values within a population.
               Now, only color
               represents frequency




Histograms ➠ time series
               This.
               This is a histogram.
               It shows the frequency of
               values within a population.
               Now, only color
               represents frequency




Histograms ➠ time series
               This.
               This is a histogram.
               It shows the frequency of
               values within a population.
               Now, only color
               represents frequency
               at a single time interval


API Service Times
          We can see a full population shift
          of several milliseconds
Combining techniques

               In our system (as a reference point)
               Arbitrary numbers of numeric data points
               on a single stream
               occupy 32 bytes of space for statistical aggregates and
               occupy about 2k of space for a histogram
               This means we can store these transforms on
               numeric data in perpetuity



Combining techniques
               Text is a bit harder
               You need to be careful
               Some data sources can be constantly changing
               Producing gobs of change data
                       You’re doing it wrong
                       Find these and fix them



Correlating Events
          Change Management vs. Performance

What to monitor?



               Most people don’t monitor the things that matter most




Monitor the Business

               Financials:
                       Revenues. Costs. Margins. AR. Account delinquency.
               Marketing:
                       Web analytics. Campaigns. Costs. Returns.
                       Convergence.




Monitor the Support


               Customer Service:
                       Problems. Time investment. Customer satisfaction.
                       Resolution time.




Monitor the Engineering

               Engineering:
                       Deployments. Test coverage.
                       Bug reports. Bug fixes. Effort spent.
               Operations:
                       Faults. Pages. Escalations. Provisioning time.
                       Equipment defect rates. 3rd party failure rates.



Monitor the Service
               Systems:
                       Networks. Systems. Storage.
               Databases:
                       Performance. Error rates. Backups.
               Middleware:
                       Herein lies the magic and room for awesomeness



Monitor the Middleware


               Your systems are complex
               Monitor their interactions
                       Messaging, APIs, etc.




Monitor all the things.

                       But, perhaps most importantly...




                                USE UNIFIED TOOLING




What we use...

                       reconnoiter
                         SNMP, nad, resmon, statsd, HTTP traps, jdbc, etc.
                       statsd (clients)
                       javascript beacons
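For readers who have not used a statsd client, this is roughly all one does: format a metric as `name:value|type` and send it over UDP. A minimal sketch; the metric names, host, and helper functions are hypothetical, and 8125 is the conventional statsd port rather than anything stated in the talk.

```python
import socket

def statsd_line(name, value, metric_type):
    """Format one metric in the statsd line protocol: name:value|type."""
    return f"{name}:{value}|{metric_type}"

def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget a metric over UDP (8125 is the usual statsd port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))

# Hypothetical metric names: a timer in milliseconds and a counter.
timing = statsd_line("api.signup.service_time", 42, "ms")
counter = statsd_line("api.signup.count", 1, "c")
print(timing)    # api.signup.service_time:42|ms
print(counter)   # api.signup.count:1|c
```

Calling `send_metric(timing)` would emit the datagram to a listener on localhost. Because it is UDP, the application neither waits for nor depends on the collector.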




Middleware mix
          API service times, traffic, user signup rates.

Thank you!




IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

It's all about telemetry

  • 1. It’s all about telemetry Monitoring what matters in a useful way. Tuesday, June 26, 12
  • 2. Theo Schlossnagle (@postwait): I write software. I write books. I give talks. I participate in the industry. I speak frankly about industry issues.
  • 3. Data, data, everywhere. A billion pageviews / month. 100k database queries / second. 1MM memcache queries / second. 500k MQ messages / second. 10MM I/O operations / second.
  • 4. Big Data: Most new big data problems are solvable.
  • 5. Big Data: Most new big data problems are created by our solutions, and thus solvable despite their ROI.
  • 6. That’s a whole lot of data. Think in terms of logs (too many do): about 26 trillion log lines / month; @ 40 bytes compressed: 1PB / month. Just because it is possible does not mean it will return on investment (and does not mean it won’t).
  • 7. It’s all “useful”; which data? Think in terms of cost/benefit. Sure, the data is useful, but it costs money to store. Does it cost you more to have it or not to have it? Maybe the right approach is to keep that level of detail for a few days?
  • 8. Double-edged sword. Eroding granularity over time keeps storage under control.
  • 9. Double-edged sword. Eroding granularity over time keeps storage under control. [Stamped across the slide: MISTAKE]
  • 10. 1 year: at a glance.
  • 11. 1 week: looks normalish.
  • 12. 1 day: confidence of normalcy increases.
  • 13. 1 week: that looks different.
  • 14. 1 day: yup, that’s not at all like that other week.
  • 15. Other methods. What do you store? How do you store it? Why is it useful? Winning the cost benefit game by reducing costs more significantly than reducing benefits.
  • 16. Positive Value: Be in the green. [Chart: Benefit and Cost against monitoring activity ➠]
  • 17. There’s a bigger picture: It’s not as easy as you think. [Chart: Benefit and Cost against monitoring activity ➠]
  • 18. Value is difference, not area: Green can be misleading. [Chart: Benefit and Cost against monitoring activity ➠]
  • 19. Value = Benefit - Cost: Green means we have positive return. [Chart: value against monitoring activity ➠]
  • 20. It’s not about return: Well, it’s not only about return. [Chart: value against monitoring activity ➠]
  • 21. It’s about maximizing return: This is a bit like black magic. [Chart: value against monitoring activity ➠]
  • 22. Technique 1: text. Store changes.
  • 23. Technique 2: numeric. Store rollups (i.e. statistical aggregates over fixed windows): over 1 minute, store min/max/avg/stddev/covariance/50%/95%/99%. Lots of information. Heavy lossy compression of high-frequency data. Loses population distribution information.
  • 24. Database replication: Lag (green) and rate of lag change (purple).
  • 25. Storage Usage: We can see growth. More useful, we can use this to project.
  • 26. Storage Usage: We can see growth. More useful, we can use this to project.
  • 27. With simple numeric data.
  • 28. With simple numeric data: Unknowns can be predicted.
  • 29. With simple numeric data: In sane ways with confidence.
  • 30. Full Disclosure: You see awesome examples of predictive analytics, like the real-world one on the previous slide. In practice, almost all data streams predict one thing: they have no fucking clue.
  • 31. Technique 3: histograms. Store histograms: over 1 minute, store counts of datapoints seen in various buckets. Retains complete population distribution. Loss of precision.
  • 32. Histograms 101: This. This is a histogram. It shows the frequency of values within a population. Height represents frequency.
  • 33. Histograms 101: This. This is a histogram. It shows the frequency of values within a population. Now, height and color represent frequency.
  • 34. Histograms 101: This. This is a histogram. It shows the frequency of values within a population. Now, only color represents frequency.
  • 35. Histograms 101: This. This is a histogram. It shows the frequency of values within a population. Now, only color represents frequency.
  • 36. Histograms ➠ time series: This. This is a histogram. It shows the frequency of values within a population. Now, only color represents frequency.
  • 37. Histograms ➠ time series: This. This is a histogram. It shows the frequency of values within a population. Now, only color represents frequency.
  • 38. Histograms ➠ time series: This. This is a histogram. It shows the frequency of values within a population. Now, only color represents frequency at a single time interval.
  • 39. API Service Times: We can see a full population shift of several milliseconds.
  • 40. Combining techniques. In our system (as a reference point), arbitrary numbers of numeric data points on a single stream occupy 32 bytes of space for statistical aggregates and occupy about 2k of space for a histogram. This means we can store these transforms on numeric data in perpetuity.
  • 41. Combining techniques. Text is a bit harder. You need to be careful: some data sources can be constantly changing, producing gobs of change data. You’re doing it wrong. Find these and fix them.
  • 42. Correlating Events: Change Management vs. Performance.
  • 43. Correlating Events: Change Management vs. Performance.
  • 44. What to monitor? Most people don’t monitor the things that matter most.
  • 45. Monitor the Business. Financials: Revenues. Costs. Margins. AR. Account delinquency. Marketing: Web analytics. Campaigns. Costs. Returns. Convergence.
  • 46. Monitor the Support. Customer Service: Problems. Time investment. Customer satisfaction. Resolution time.
  • 47. Monitor the Engineering. Engineering: Deployments. Test coverage. Bug reports. Bug fixes. Effort spent. Operations: Faults. Pages. Escalations. Provisioning time. Equipment defect rates. 3rd party failure rates.
  • 48. Monitor the Service. Systems: Networks. Systems. Storage. Databases: Performance. Error rates. Backups. Middleware: Herein lies the magic and room for awesomeness.
  • 49. Monitor the Middleware. Your systems are complex. Monitor their interactions. Messaging, APIs, etc.
  • 50. Monitor all the things. But, perhaps most importantly...
  • 51. Monitor all the things. But, perhaps most importantly... USE UNIFIED TOOLING.
  • 52. What we use... reconnoiter: SNMP, nad, resmon, statsd, HTTP traps, jdbc, etc. statsd (clients). javascript beacons.
  • 53. Middleware mix: API service times, traffic, user signup rates.
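
A note on the arithmetic in slide 6. The "26 trillion log lines / month" figure appears to match slide 3's 10MM I/O operations / second if each operation is logged as one line over a 30-day month. That mapping is an inference, not something the slides state. A quick check under that assumption:

```python
# Assumption (not stated in the slides): one log line per I/O operation,
# and a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

io_ops_per_second = 10_000_000  # slide 3: 10MM I/O operations / second
bytes_per_line = 40             # slide 6: 40 bytes compressed per log line

lines_per_month = io_ops_per_second * SECONDS_PER_MONTH
bytes_per_month = lines_per_month * bytes_per_line

print(f"{lines_per_month / 1e12:.1f} trillion log lines / month")
print(f"{bytes_per_month / 1e15:.2f} PB / month (decimal petabytes)")
```

Under these assumptions the result lands close to the slide's rounded figures of 26 trillion lines and 1PB per month.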
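
One way to read slides 8 to 14 is that averaging old data down to coarser intervals discards exactly the detail needed to compare one day against another. The sketch below illustrates that reading with made-up numbers. The `erode` helper and the sample data are hypothetical and are not taken from the talk.

```python
from statistics import mean

def erode(samples, factor):
    """Average each run of `factor` consecutive samples into one point."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]

# Hypothetical data: 60 one-minute latency samples (ms) with a single spike.
minute_samples = [10.0] * 60
minute_samples[17] = 500.0

# Erode one hour of per-minute data down to a single hourly point.
hourly = erode(minute_samples, 60)

print("max at 1-minute granularity:", max(minute_samples))
print("hourly points after eroding:", hourly)
```

The 500 ms spike survives at one-minute granularity but is flattened into an hourly average of roughly 18 ms, which looks unremarkable next to the 10 ms baseline.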
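
Slides 19 to 21 argue that positive value is not the goal; maximum value is. The sketch below makes that concrete with invented curves. The `benefit` and `cost` functions are purely hypothetical shapes (diminishing returns against linear cost) and do not come from the charts in the deck.

```python
import math

def benefit(activity):
    """Hypothetical benefit curve with diminishing returns."""
    return 1.0 - math.exp(-2.0 * activity)

def cost(activity):
    """Hypothetical cost curve, linear in monitoring activity."""
    return 0.3 * activity

def value(activity):
    """Slide 19: Value = Benefit - Cost."""
    return benefit(activity) - cost(activity)

# Scan monitoring activity from 0 to 3 in steps of 0.01.
levels = [i / 100 for i in range(301)]
best_activity = max(levels, key=value)

print("best activity level:", best_activity)
print("value at best level:", round(value(best_activity), 3))
print("value at activity 3.0:", round(value(3.0), 3))
```

With these made-up curves, activity 3.0 is still "in the green" (value above zero) but returns far less than the peak, which is the distinction slide 20 draws.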
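
Slide 22's "store changes" technique for text can be sketched as follows: keep an observation only when its value differs from the last one stored. The function name and sample readings are hypothetical.

```python
def store_changes(observations):
    """Keep (timestamp, value) only when value differs from the last stored value."""
    stored = []
    for timestamp, value in observations:
        if not stored or stored[-1][1] != value:
            stored.append((timestamp, value))
    return stored

# Hypothetical readings of a software version string, polled once a minute.
readings = [
    (0, "1.4.2"),
    (60, "1.4.2"),
    (120, "1.4.2"),
    (180, "1.4.3"),
    (240, "1.4.3"),
]

changes = store_changes(readings)
print(changes)
```

Five polls collapse to two stored records. As slide 41 warns, this only pays off when the text is mostly stable; a source whose value changes on every poll stores everything.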
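
The rollups in slide 23 can be sketched as below: group samples into fixed one-minute windows and keep a handful of aggregates per window. This is a minimal illustration, not the speaker's implementation. It uses nearest-rank percentiles and the population standard deviation, both of which are assumptions, and it omits covariance because that needs a second series.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def rollup(points, window_seconds=60):
    """Group (timestamp, value) points into fixed windows and summarize each."""
    windows = defaultdict(list)
    for timestamp, value in points:
        windows[timestamp - timestamp % window_seconds].append(value)
    result = {}
    for start, values in sorted(windows.items()):
        values.sort()
        result[start] = {
            "count": len(values),
            "min": values[0],
            "max": values[-1],
            "avg": mean(values),
            "stddev": pstdev(values),
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return result

# Hypothetical samples: one value per second for two minutes.
points = [(t, float(t % 60)) for t in range(120)]
summary = rollup(points)
print(summary[0])
```

Each window shrinks to a fixed-size record however many samples arrived, which is the "heavy lossy compression" the slide describes. The shape of the distribution inside the window is gone.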
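
The projection mentioned in slides 25 and 26 can be approximated with a straight-line fit. The sketch below uses ordinary least squares on hypothetical daily storage readings. The talk does not say which model was used, so the linear fit is an assumption, and slide 30's caveat applies: most real streams do not extrapolate this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def days_until_full(days, used_tb, capacity_tb):
    """Project days from the last reading until capacity; None if not growing."""
    slope, intercept = fit_line(days, used_tb)
    if slope <= 0:
        return None
    return (capacity_tb - intercept) / slope - days[-1]

# Hypothetical daily storage readings in TB, growing 0.5 TB per day.
days = list(range(10))
used_tb = [20.0 + 0.5 * d for d in days]

remaining = days_until_full(days, used_tb, capacity_tb=40.0)
print("projected days until full:", remaining)
```

With this perfectly linear sample the fit recovers the growth rate exactly; noisy data would call for a confidence interval around the projection.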
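
Slide 31's histogram technique can be sketched as per-minute bucket counts. The fixed 5 ms bucket width here is an assumption for illustration only; the slides do not specify a bucketing scheme. The example also shows the trade-off the slide names: the distribution is retained, but values are only known to bucket precision.

```python
from collections import Counter, defaultdict

BUCKET_WIDTH_MS = 5  # assumption: fixed-width buckets, not specified in the slides

def bucket_of(value):
    """Lower bound of the bucket containing value."""
    return int(value // BUCKET_WIDTH_MS) * BUCKET_WIDTH_MS

def histograms_per_minute(points):
    """Map each minute start to a Counter of bucket lower bound -> count."""
    result = defaultdict(Counter)
    for timestamp, value in points:
        result[timestamp - timestamp % 60][bucket_of(value)] += 1
    return result

def approx_percentile(histogram, p):
    """Lower bound of the bucket holding the p-th percentile (nearest rank)."""
    total = sum(histogram.values())
    target = max(1, -(-p * total // 100))  # ceiling division on integers
    seen = 0
    for bucket in sorted(histogram):
        seen += histogram[bucket]
        if seen >= target:
            return bucket
    raise ValueError("empty histogram")

# Hypothetical bimodal minute: 50 requests at 12 ms and 10 requests at 87 ms.
points = [(t, 12.0) for t in range(50)] + [(t, 87.0) for t in range(50, 60)]
hist = histograms_per_minute(points)[0]

print(dict(hist))
print("p50 bucket:", approx_percentile(hist, 50))
print("p95 bucket:", approx_percentile(hist, 95))
```

A rollup of the same minute would report an average of about 24.5 ms, a latency no request actually had. The histogram keeps both modes visible, at the cost of reporting 12 ms as "somewhere in the 10 to 15 ms bucket".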