We all like building and deploying cloud applications. But what happens once that’s done? How do we know if our application behaves like we expect it to behave? Of course, logging! But how do we get that data off of our machines? How do we sift through a bunch of seemingly meaningless diagnostics? In this session, we’ll look at how we can keep track of our Azure application using structured logging, AppInsights and AppInsights analytics to make all that data more meaningful.
What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group
1. What is going on?
Application Diagnostics on Azure
Maarten Balliauw
@maartenballiauw
2. Who am I?
Maarten Balliauw
Antwerp, Belgium
Developer Advocate, JetBrains
Founder, MyGet
AZUG
Focus on web
ASP.NET MVC, Azure, SignalR, ...
Former MVP Azure & ASPInsider
Big passion: Azure
http://blog.maartenballiauw.be
@maartenballiauw
3. Agenda
Logging
Why?
Logging sucks!
The need for semantic/structured logging
Application Insights
SDK, Azure portal
Application Insights Analytics
Find the needle in the haystack!
Build, measure, improve
5. Why logging?
Troubleshooting – Did a problem occur? Where? Our code? The machine?
Performance / Cost – Did we make the application slower?
Improvement – Can a problem be detected or avoided?
Trends – Need more storage? Need more servers?
Customer Experience – Are people happy? Or seeing error after error?
Business Decisions – Can we increase sales based on log data / telemetry?
6. So here’s what we typically do...
System.Diagnostics.Trace.TraceInformation(
"Something happened");
System.Diagnostics.Trace.TraceWarning(
"Error! " + ex.Message);
Does this help troubleshooting? Improve the application? Analyze trends?
7. And here’s a typical log...
App.exe [12:13:03:985] Information: 0 : Customer address updated.
App.exe [12:13:04:011] Error: 8 : System.NullReferenceException occurred. Value
can not be null.
App.exe [12:13:04:567] Information: 0 : Machine policy value 'Debug' is 0
App.exe [12:13:04:569] Verbose: 9 : ******* RunEngine
******* Action:
******* CommandLine: **********
App.exe [12:13:04:578] Information: 9 : Entered CheckResult()
App.exe [12:13:04:689] Debug: 0 : Created. Req=0, Ret=8, Font: Req=null
Does this help troubleshooting? Improve the application? Analyze trends?
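To make the problem with string-only logs concrete, here is a minimal sketch in Python (standard library only) that tries to recover data from two of the lines above. The regular expressions and the names `LINE`, `PAIR` and `parsed` are assumptions made for this illustration; they are not part of any logging library.

```python
import re

# Two lines copied from the sample log above.
log_lines = [
    "App.exe [12:13:03:985] Information: 0 : Customer address updated.",
    "App.exe [12:13:04:689] Debug: 0 : Created. Req=0, Ret=8, Font: Req=null",
]

# Hand-written patterns: we have to guess the layout from the text itself.
LINE = re.compile(
    r"^(?P<source>\S+) \[(?P<time>[\d:]+)\] (?P<level>\w+): (?P<id>\d+) : (?P<message>.*)$"
)
PAIR = re.compile(r"(\w+)=(\w+)")

parsed = []
for line in log_lines:
    match = LINE.match(line)
    if match:
        entry = match.groupdict()
        # Scrape key=value pairs out of the free-text message.
        entry["values"] = PAIR.findall(entry["message"])
        parsed.append(entry)

for entry in parsed:
    print(entry["level"], entry["values"])
```

The first line yields no values at all (which customer was updated?), and the second yields `Req` twice with different meanings. Aggregating "Req" / "Ret" from text like this means guessing at structure that was never recorded.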
9. Log files suck.
Stupid string data
Unless using ETW / proper System.Diagnostics listener
Typical log file has no “context”
Typical log file has no correlation
How to get data off our machines?
How to report/analyze?
10. Log files suck.
Stupid string data
Typical log file has no “context”
Process/thread
Machine name
User performing the action
Data specific to the event being logged
Typical log file has no correlation
How to get data off our machines?
How to report/analyze?
11. Log files suck.
Stupid string data
Typical log file has no “context”
Typical log file has no correlation, e.g.:
What was the server load when this was logged?
How much memory was consumed at the time?
Which web request invoked this action?
How to get data off our machines?
How to report/analyze?
13. Log files suck.
Stupid string data
Typical log file has no “context”
Typical log file has no correlation
How to get data off our machines?
How to report/analyze?
Log processing and analysis – log, store, process, query, proactive analysis
Logstash, ElasticSearch, ... – cumbersome to set up
14. What if we had (better) contextual information?
15. Structured/Semantic logging
“Of or relating to meaning, especially meaning in language”
– the dictionary
Windows Event Log
Event Tracing for Windows (ETW)
Semantic Logging Application Block (PnP SLAB)
Serilog
…
16. Event Source
XML representation containing
contextual information
application-specific data elements
Can be read by humans (just a message, like any log)
Can be read by machines (XML, baby!)
add “meaning” – we know what the data points can be
allow automated aggregation / trend analysis
Fast!
(not very developer friendly)
18. Event Source
Super awesome
It has structured logging and context
Can collect all kinds of data combined (e.g. HTTP + own event source)
Hard to maintain
Versioning
Ceremony
One method for every single thing we want to log (or at least an event id)
Requires thinking about the events to log
Lots of data
19. Serilog - https://serilog.net/
Logging library like any other, mostly
Sinks https://github.com/serilog/serilog/wiki/Provided-Sinks
Console, file, event log
Database
ElasticSearch
AppInsights
Structured data as a first-class concept
Enrichers, filters, …
20. Structured data in Serilog
Simple scalar values (bool, string, int, …)
Collections and dictionaries
Full objects
Rendered by a sink to, for example, a string
Can be stored by a sink to provide additional dimensions on data
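The idea of a message that is both rendered as text and kept as structured data can be sketched in a few lines of Python. This is an illustration of the concept only, not Serilog's API: `LogEvent`, `log_event` and the property names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict

# Matches {Name} placeholders in a message template.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


@dataclass
class LogEvent:
    """A log entry that keeps its template and its named properties."""
    template: str
    properties: Dict[str, Any] = field(default_factory=dict)

    def render(self) -> str:
        """What a text sink (console, file) would print."""
        return _PLACEHOLDER.sub(
            # Unknown placeholders are left as-is instead of failing.
            lambda m: str(self.properties.get(m.group(1), m.group(0))),
            self.template,
        )


def log_event(template: str, **properties: Any) -> LogEvent:
    return LogEvent(template, dict(properties))


event = log_event(
    "Invoice {InvoiceId} created for customer {CustomerId}",
    InvoiceId=42,
    CustomerId="C-1001",
    Total=99.5,  # not in the template, but still kept as a property
)

print(event.render())      # human-readable message
print(event.properties)    # machine-readable dimensions
```

A text sink only needs `render()`, while a sink that stores data can index `properties` and later filter or group on `CustomerId` without parsing any strings.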
22. Enrichment of data
Enrich log entries with additional context
Machine name
User name
Process / thread information
ASP.NET client IP / hostname
ASP.NET user agent
…
Using Serilog enrichers (when applicable)
In custom ASP.NET middleware
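As a rough sketch of what enrichment means, the Python standard library's `logging.Filter` can attach the same kind of context to every record. This only mirrors the concept of an enricher; it is not Serilog, and `ContextEnricher` plus the attribute names `machine_name`, `process_id` and `thread_name` are names chosen for this example.

```python
import io
import logging
import os
import socket
import threading


class ContextEnricher(logging.Filter):
    """Adds machine/process/thread context to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.machine_name = socket.gethostname()
        record.process_id = os.getpid()
        record.thread_name = threading.current_thread().name
        return True  # never drop the record, only enrich it


# Write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s [%(machine_name)s pid=%(process_id)s %(thread_name)s] %(message)s"
))

logger = logging.getLogger("enrichment_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.addFilter(ContextEnricher())

# The call site stays free of context plumbing.
logger.info("Customer address updated")

output = stream.getvalue()
print(output, end="")
```

The point is that the business code only logs the message; the context is added in one place and travels with every entry.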
23. But still...
How to get the data off our machine(s)?
How to analyze/aggregate/predict?
Which customer had this Exception and why?
Did our sales go up after we made this page faster?
What button on the front-end crashes this microservice?
Are # Exceptions increasing?
Do I need special infrastructure to store and process logs?
24. The cloud to the rescue? (Azure)
Azure Insights
Azure platform’s core monitoring and alerting services (standard Metrics, Alerts, Autoscale, Email Notifications and Audit Logs)
Application Insights
Application monitoring & logging for your applications (web, mobile, desktop, …) in the cloud or on other servers
Log Analytics (OMS)
Log Analytics, part of the Operations Management Suite (OMS), ingests data from Azure servers and provides a rich ‘global’ view & advanced log-search-based alerting capabilities
26. Application Insights
Azure Service + library/SDK
Solves “where to store”, “how to ship”, “how to analyze”
Enriches data
Correlates data (e.g. client + server + dependency/DB/...)
Telemetry!
Allows structured logging
Allows rich querying, alerting
Works for many, many platforms and languages
Web, Windows, Xamarin, any application type really
.NET, Java, JavaScript, Objective-C, PHP, Python, Ruby, ...
28. Do I have to run on Azure?
No. REST API to send data to
Various tools and SDKs to collect data specific to language/platform
Status Monitor on non-Azure machines
30. Data collected
Performance Counters
Requests (both server/client side)
Traces (more later)
Exceptions
Dependencies
Custom Metrics & Events
That is a lot of data that can be correlated!
37. Metrics and Events
Send custom events and metrics
Name of event/metric + a value
Events – help find how the application is used and can be optimized
User resized column in a grid
User logged in
User clicked “Share”
No more stock
...
Metrics – help alerting / finding missing functionality
Purchase occurred
Search returned zero results
# Support calls
...
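The "name + a value" model can be sketched with a tiny in-memory collector in Python. This is not the Application Insights SDK: `InMemoryTelemetry`, `track_event`, `track_metric` and the event/metric names below are hypothetical and only show what kind of questions the data can answer.

```python
from collections import Counter, defaultdict
from statistics import mean


class InMemoryTelemetry:
    """Stand-in for a telemetry client; keeps everything in memory."""

    def __init__(self):
        self.events = []                  # (name, properties) tuples
        self.metrics = defaultdict(list)  # name -> list of values

    def track_event(self, name, **properties):
        self.events.append((name, properties))

    def track_metric(self, name, value):
        self.metrics[name].append(float(value))


telemetry = InMemoryTelemetry()

# Events: how is the application used?
telemetry.track_event("UserLoggedIn", user="alice")
telemetry.track_event("ShareClicked", user="alice", page="/Home/About")
telemetry.track_event("ShareClicked", user="bob", page="/Home/Contact")

# Metrics: numbers we may want to alert on.
telemetry.track_metric("SearchResultCount", 0)
telemetry.track_metric("SearchResultCount", 12)

event_counts = Counter(name for name, _ in telemetry.events)
search_values = telemetry.metrics["SearchResultCount"]
zero_result_searches = sum(1 for value in search_values if value == 0)
average_results = mean(search_values)

print(event_counts.most_common())
print(zero_result_searches, average_results)
```

Counting events by name shows which features are used; a metric such as the number of zero-result searches hints at missing functionality.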
39. So what about logging...
Configure Serilog to use Application Insights as a sink
Custom data becomes available (*)
So we can correlate our own data with performance, requests, traces, exceptions, dependencies, ...
But how to do this? And how to consume?
(*) Make sure the entire object is in the log message: data that is not printed is not tracked
42. Application Insights Analytics
Former MS-internal tool “Kusto”
Near-realtime log ingestion and analysis
Lets you run custom queries
requests
| where timestamp > ago(24h)
| summarize count() by client_CountryOrRegion
| top 10 by count_
| render piechart
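For readers new to the pipe syntax, each stage of the query above maps onto a simple step over rows. The Python sketch below runs the same steps on a handful of made-up rows; the sample data is invented for illustration, and the final `render piechart` stage is left out.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

# Made-up rows standing in for the "requests" table.
requests = [
    {"timestamp": now - timedelta(hours=1), "client_CountryOrRegion": "Belgium"},
    {"timestamp": now - timedelta(hours=2), "client_CountryOrRegion": "Denmark"},
    {"timestamp": now - timedelta(hours=3), "client_CountryOrRegion": "Denmark"},
    {"timestamp": now - timedelta(hours=30), "client_CountryOrRegion": "Belgium"},
]

# | where timestamp > ago(24h)
recent = [r for r in requests if r["timestamp"] > now - timedelta(hours=24)]

# | summarize count() by client_CountryOrRegion
counts = Counter(r["client_CountryOrRegion"] for r in recent)

# | top 10 by count_
top10 = counts.most_common(10)

print(top10)
```

Each pipe takes the rows produced by the previous stage, so a query reads top to bottom as filter, aggregate, rank.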
43. Queries over logs!
Input dataset
traces
customEvents
pageViews
requests
dependencies
exceptions
availabilityResults
customMetrics
performanceCounters
browserTimings
| Operators
where
count
project
join
limit
order
| Renderer
table
chart (bar, pie, time, area, scatter)
https://docs.microsoft.com/en-us/azure/application-insights/app-insights-analytics-reference
45. Not covered in this talk
VS plugin for Application Insights
Run some analysis inside VS
Continuous export
Export data from Application Insights continuously if you want to analyze in other tools
Retain data longer than 7 days
Join with other data (preview)
https://docs.microsoft.com/en-us/azure/application-insights/app-insights-analytics-using#import-data
Join logging and metrics with other data sources, query with AI Analytics
48. But... Why?
Troubleshooting – Did a problem occur? Where? Our code? The machine?
Performance / Cost – Did we make the application slower?
Improvement – Can a problem be detected or avoided?
Trends – Need more storage? Need more servers?
Customer Experience – Are people happy? Or seeing error after error?
Business Decisions – Can we increase sales based on log data/telemetry?
49. Structured logging
Enrich technical logs with business data
Allows correlating when things go wrong
Allows making business decisions when using a good data analysis tool on top
50. Application Insights
Is that good analysis tool! (for web, mobile, desktop!)
Solves
Collecting various sources
Ingesting various sources
Enrichment with structured logging (e.g. Serilog)
Helps analyzing all that data
Default views, metrics, alerts, monitoring
Custom querying – Application Insights Analytics
Build, measure, improve!
Both technical and business – there isn’t always a divide between both...
51. Thank you!
Need training, coaching, mentoring, performance analysis?
Hire me! https://blog.maartenballiauw.be/hire-me.html
http://blog.maartenballiauw.be
@maartenballiauw
Editor's Notes
https://pixabay.com/p-960806/
Highlight which customer? Awesome, stack trace for Exception, “Debug is 0” great! RunEngine entry very useful. What if we want to aggregate “Req” / “Ret”?
More structured/contextual data! Yay!
Open Demo-EventSource
Look at PerformBusinessLogic – some actions happening there, logging some items
Look at CRMSystemEventSource
In itself, pretty straightforward – attributes to inform the event tracing infrastructure about what something is and does
One method per event we want to trace – easy to use
BUT – how to version this? Do we really need to add a new method, decorate it with an attribute, ...
For every single thing we want to log? And then invoke that method when needed? Cumbersome...
Open Demo-Serilog
Look at NuGet packages installed – Serilog and some sinks
Browse NuGet for more Serilog packages and see millions
Setting up Serilog – check Main method and explain what happens, show multiple sinks, show enrichers
Show business logic again
Still has log statements, no way around that
Logging is now structured – we provide a message which can contain some info about the data we are logging
But for example when catching the Exception, we can write a simple message and log the actual Exception with it
Run the application, see colored console output, see a file has been created
Explore the collapsed region: we can write to other sinks e.g. Windows Event Log
Or attach an observable – explore the code and mention that even data that is not rendered by a sink is still present with the message, so if we could ship this to a system we could search/track/measure Exceptions, Customer objects, ... based on property name
Open portal.azure.com
Show how to create Application Insights service
Open service, click “Getting Started”
Explain there are a few entry points to collect all data
Client-side (JS)
Server-side (or mobile) – different data points from inside app, from inside server, Exceptions, ...
Custom metrics and events
Open Demo_WebApp
Show AppInsights package reference added, show Startup.cs where to configure (3 simple “Use...”)
Show _Layout.cshtml, run app, show snippet
Back to portal – data! (overview)
Response times (collected from server and perf counters)
Page view load time (correlated from client-side JS)
# requests
# failures
Click server response time, see more detail, grouping, aggregation
We can search data, correlate other data, ...
Open portal.azure.com
MyGet AppInsights
Application Map
Show Availability tests test the server-side
Show client-side uses server-side
Click around, explain what we can see here
Open portal.azure.com
MyGet AppInsights
Smart Detection
We see nothing! Which is good. Smart Detection tries to be proactive in detecting things that are “not normal”
Look at settings for list of examples
Alerts (open Alerts)
Add alert, see on which metrics we can alert
Show examples, show web tests feeding into alerts, ...
Open portal.azure.com
MyGet AppInsights
Live Metrics Stream
Connects to our running application and inspects the data that comes in real-time
Metrics, health, server CPU and memory, ...
We can filter per server as well
Open portal.azure.com
MyGet AppInsights
Browse through the blades
Availability – explain web tests, explain these can be simple or actual tests created in VS that have multiple steps and checks
Failures – we can see diagrams of exceptions vs. other metrics, as well as Exceptions
Open Exception, see the occurrences of this specific one
See Exception message, stack trace, ...
Analyze trends / root cause: what happened 5 minutes before this event?
Performance – what is our slowest page? Drill into details, find out why
Also lets us improve, then see if the page drops in the ranking: continuous improvement
Servers – Some server details, quickly go through this
Browser – Browser stats, verify client-side code, page render times, find slowest rendering pages, ...
Usage – Correlate technical side with user-side
Sessions, users, custom events (more later), ...
Open portal.azure.com
MyGet AppInsights
In Usage tab, note these “feature names”
Custom events! We track feature usage there to see if features are actually being used or not
Switch to Demo_WebApp
In HomeController -> About, show the way to track metrics
In Contact.cshtml, show how to track metrics/events client-side as well
Switch to Demo AppInsights, show the portal’s Metrics tab, show how we can filter/correlate
Open Demo_WebApp
Show the HomeController – still same code as before, this time using _logger from ASP.NET Core
Mention using full object data to make sure the object is streamed to AppInsights later on
Startup.cs
Show Serilog configuration – log to AppInsights
Show how it ties into ASP.NET Core logging infrastructure
Run the application, make a few requests to /Home/About, /Home/Contact, /Home/About?throw, ...
Switch to Demo appinsights, use Search
Lots of data – verbosity should probably be tuned...
Search for “invoice” – yay custom log data!
Expand one of them, click “All available telemetry” -> correlates all traces we had for this specific one, correlating our trace with request, exception, ...
Expand the Exception, see that we even have customer id and customer email with it – super useful to have it all tied together
Open portal.azure.com
Demo AppInsights
Show some of the sample queries, show the editor, show the various rendering types (auto complete in there)
Run a few sample queries and find custom dimensions
All traces for our About page
traces
| where operation_Name == "GET /Home/About"
| order by timestamp desc
| limit 500
Number of invoices per customer (customMetrics!)
customMetrics
| extend customerId = tostring(customDimensions.['Customer.Id'])
| project customerId, value
| summarize avg(value) by customerId | render piechart
MyGet AppInsights
All requests from yesterday that have an Exception
requests
| where timestamp > ago(1d)
| where success == "False"
| join kind = leftouter (
exceptions
| where timestamp > ago(1d)
) on operation_Id
| summarize exceptionCount = count() by operation_Name, outerMessage
| order by exceptionCount asc
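What the join in this query does can be sketched in Python on a few invented rows: failed requests are matched to exceptions through `operation_Id`, and requests without a matching exception are kept with an empty message (the left outer part). The sample rows and the variable names are assumptions for this illustration, and the time filter is omitted.

```python
from collections import Counter

# Invented rows standing in for failed requests and for exceptions.
failed_requests = [
    {"operation_Id": "a1", "operation_Name": "GET /Home/About"},
    {"operation_Id": "b2", "operation_Name": "GET /Home/About"},
    {"operation_Id": "c3", "operation_Name": "GET /Home/Contact"},
]
exceptions = [
    {"operation_Id": "a1", "outerMessage": "Invoice not found"},
    {"operation_Id": "b2", "outerMessage": "Invoice not found"},
]

# Index exceptions by the correlation key.
by_operation = {}
for exception in exceptions:
    by_operation.setdefault(exception["operation_Id"], []).append(exception)

# join kind = leftouter (...) on operation_Id
joined = []
for request in failed_requests:
    # Keep the request even when no exception matches it.
    matches = by_operation.get(request["operation_Id"]) or [{"outerMessage": ""}]
    for exception in matches:
        joined.append((request["operation_Name"], exception["outerMessage"]))

# | summarize exceptionCount = count() by operation_Name, outerMessage
exception_count = Counter(joined)

# | order by exceptionCount asc
for key, count in sorted(exception_count.items(), key=lambda item: item[1]):
    print(key, count)
```

The shared `operation_Id` is what makes this possible: it is the correlation that a plain log file lacks.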
Are we being DDoS-ed by one specific client?
requests | summarize count() by client_IP | top 10 by count_ | render piechart
Explain concept of package sources – at one point we DDoS-ed ourselves (infinite loop)
Found using AI Analytics: