illuminate is a machine learning-based performance analytics tool that automatically diagnoses performance issues in servers and applications without human intervention. It has a small memory, CPU, and network footprint, uses adaptive machine learning to interpret data and scale with applications, and provides a holistic view of both application and system performance across servers. illuminate identifies the largest bottlenecks through machine learning, aggregates similar issues across servers, and auto-triggers on SLA breaches. It supports Linux systems and has a secure web-based dashboard.
Illuminate - Performance Analytics driven by Machine Learning
1. Focus on what's important
Performance analytics driven by machine learning
2. Why we built illuminate
• To diagnose performance issues
  - Without human intervention
  - Machine scalable performance diagnostics and analysis
• Industry not happy with existing tools
  - Current state of the art is about raw metrics, no analysis
  - "Collect everything and hope for the best"
• We could build on a proven methodology
  - Methodology proven through years of customer engagements
  - Required the use of Machine Learning
3. Why illuminate is different!
• Lightweight
  - Small memory footprint, small CPU footprint, small network footprint
• Intelligent
  - illuminate's Machine Learned algorithm interprets the data for you!
• Adaptive
  - Scales up or down with your application
• Pervasive
  - Fills out the server piece of the performance puzzle
4. illuminate analysis
• Machine Learning finds largest bottleneck
  - Points you in the right direction quickly
  - Concentrate on biggest problem!
• illuminate looks at the overall server
  - The problem may not be caused by the Java application!
• illuminate aggregates across servers
  - If X servers have a similar problem we wrap that up in one report
• Auto-triggers on SLA breaches
  - Performance/business trigger points, e.g. login page within 2 secs
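The SLA trigger point above can be pictured with a small sketch. This is not illuminate's actual trigger logic, which the slides do not describe; the names SlaRule, breaches and should_trigger are hypothetical, and only the "login page within 2 secs" budget comes from the slide.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SlaRule:
    """A hypothetical SLA rule: a named operation and its latency budget."""
    name: str
    max_seconds: float


def breaches(rule: SlaRule, samples: Sequence[float]) -> List[float]:
    """Return the response-time samples (in seconds) that exceed the budget."""
    return [s for s in samples if s > rule.max_seconds]


def should_trigger(rule: SlaRule, samples: Sequence[float],
                   min_breaches: int = 1) -> bool:
    """Trigger a diagnosis once at least `min_breaches` samples break the rule."""
    return len(breaches(rule, samples)) >= min_breaches


# The example from the slide: the login page should respond within 2 secs.
login_rule = SlaRule(name="login page", max_seconds=2.0)
print(should_trigger(login_rule, [0.8, 1.4, 2.6]))  # one sample is over budget
```

Raising min_breaches is one simple way to avoid triggering on a single slow request; whether illuminate does anything similar is not stated in the deck.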
5. A sample of the bottlenecks that we find
Bottleneck                  Description
High Pause Times            Shows you high pause times due to Garbage Collection (GC)
Too much time in GC         Shows you if your application is not progressing due to GC
Running out of memory       Shows you if your application is close to an OOME
Heavy Disk I/O              Shows you if you are reading or writing too much from disk
Waiting on external system  Shows you threads that are waiting for a response from an external system, e.g. database, web service, etc.
Blocked threads             Shows you threads that are blocked (with profiles)
Deadlocked threads          Shows you deadlocked threads (with profiles)
Sleeping threads            Shows you threads that are sleeping (with profiles)
Hot Loop                    Shows you if your code is in a hot (infinite) loop
Context Switching           Shows you if your application is battling others for CPU time
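As an illustration of the raw data behind one row of this table, the sketch below derives a context-switch rate from /proc/stat, whose "ctxt" line holds the total number of context switches since boot. It is only an assumption that a tool like illuminate might start from a counter like this; the slides do not say how the Context Switching bottleneck is detected, and the function names here are hypothetical.

```python
import time


def read_context_switches(stat_text: str) -> int:
    """Extract the cumulative context-switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before: int, after: int, interval_seconds: float) -> float:
    """Turn two cumulative readings into a rate over the sampling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (after - before) / interval_seconds


def sample_rate(path: str = "/proc/stat", interval_seconds: float = 1.0) -> float:
    """Sample the counter twice on a Linux host and return switches per second."""
    with open(path) as f:
        first = read_context_switches(f.read())
    time.sleep(interval_seconds)
    with open(path) as f:
        second = read_context_switches(f.read())
    return switches_per_second(first, second, interval_seconds)
```

On a Linux server, calling sample_rate() returns the system-wide rate. Deciding when that rate means the application is "battling others for CPU time" is the analysis step, which this sketch does not attempt.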
7. Some technical details
• Supports RedHat/Debian Linux systems
  - Including Amazon AWS, MS Azure, Google Cloud Compute
  - Run via init.d or simple run-headless.sh shell script
  - /proc should be available to read from
• Daemon Comms over SSL'd Websockets
  - SaaS hosted dashboard (secure public, or secure in-house)
  - Self Updating Daemons / Dashboard as a virtual appliance
• Supports modern web browsers
  - IE8+ (10 preferred), FF, Chrome, Safari, Opera
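The "/proc should be available to read from" requirement can be checked before installing the daemon. The sketch below is a minimal pre-flight check. Which /proc files illuminate actually reads is not stated in the slides, so the "stat" and "meminfo" entries are an assumption, and proc_is_readable is a hypothetical helper.

```python
import os
from typing import Iterable


def proc_is_readable(root: str = "/proc",
                     required: Iterable[str] = ("stat", "meminfo")) -> bool:
    """Return True if every required file under `root` is readable by this user."""
    return all(os.access(os.path.join(root, name), os.R_OK) for name in required)


# Run as the same user that will run the daemon.
print(proc_is_readable())
```

This matters mostly in containers or hardened hosts, where /proc may be missing or mounted with restricted permissions.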