.conf Go 2022 - Observability Session
- 1. © 2022 SPLUNK INC.
Splunk Observability
.conf Go 2022 - October 4th, Zurich
- 2. © 2022 SPLUNK INC.
Agenda
Introduction to Observability
Why Splunk for Observability
ACIC Use Case
Accenture
- 3. © 2022 SPLUNK INC.
Speakers
Thomas Hochstrasser, Specialist Sales Observability @ Splunk
Francesco Sbaraglia, SRE Tech Lead @ Accenture
- 4. © 2022 SPLUNK INC.
Why Splunk for Observability
End-to-End Visibility So You Can Build, Troubleshoot, and Innovate Faster
- 5. © 2022 SPLUNK INC.
Urgency for Digital Initiatives Has Never Been Higher
Since 2020, COVID has led to:
• 70% increase in internet use
• 76% increase in ecommerce
• 65% of customer interactions are now digital
Sources:
Forbes Article. “COVID-19 Pushes Up Internet Use 70% and Streaming More than 12%”, March 2020.
Digital Commerce 360 Article. “Charts: How the coronavirus is changing ecommerce”, August 2020.
McKinsey & Co Article. “How COVID-19 has pushed companies over the technology tipping point”, October 2020.
- 6. © 2022 SPLUNK INC.
Modernization Initiatives Require Thousands of Changes
[Diagram: four stages of DEV/OPS environments, from VMs only to private and public cloud]
- 7. © 2022 SPLUNK INC.
Need to Validate Thousands of Changes Against Thousands of Scenarios
[Diagram: Bank Branch, Mobile Banking, Online Banking, Mortgage Services, Investment Services, Account Service, Credit Bureau API, and E-Payment Microservices, running on VMs and private and public cloud]
- 8. © 2020 SPLUNK INC.
83% of organizations are looking for new monitoring approaches to handle cloud complexity
Only 11% are satisfied with existing monitoring tools
Source: 451 Research, IT Monitoring Meltdown: Just 11% of decision-makers are satisfied with their monitoring tools, Nancy Goehring, 06 Aug 2020
- 9. © 2022 SPLUNK INC.
It’s Time for a New Approach
Last 5 Years
• Siloed tools, hundreds of static dashboards
• Vanilla infrastructure and app visibility
• Manually setting up alerts for known failure scenarios
• Vendor lock-in with proprietary instrumentation
New Approach
• Know where to look and who to engage
• Complete business visibility
• Proactively spot unknowns and root causes
• Full control of your data
- 10. © 2022 SPLUNK INC.
Observability Is a Data Problem
The more observable a system, the quicker we can understand why it’s acting up and fix it
• Metrics: Do I have a problem? (Detect)
• Traces: Where is the problem? (Troubleshoot)
• Logs: Why is the problem happening? (Root Cause)
Full-Stack Visibility and Context-Rich Insights
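The three questions on this slide (metrics to detect, traces to troubleshoot, logs to find the root cause) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Span` and `LogLine` types, the service names, the sample data, and the 500 ms threshold are all hypothetical and do not come from the slides or from any Splunk API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Span:
    trace_id: str
    service: str
    duration_ms: float  # time spent inside this service (hypothetical field)


@dataclass
class LogLine:
    trace_id: str
    service: str
    level: str
    message: str


def detect(latencies_ms, threshold_ms):
    """Metrics: do I have a problem? True if mean latency exceeds the threshold."""
    return mean(latencies_ms) > threshold_ms


def troubleshoot(spans):
    """Traces: where is the problem? Return the span with the longest duration.

    Simplification: real traces are trees of nested spans; here every span
    is treated as an independent measurement.
    """
    return max(spans, key=lambda s: s.duration_ms)


def root_cause(logs, span):
    """Logs: why is it happening? Error messages from the same trace and service."""
    return [
        line.message
        for line in logs
        if line.trace_id == span.trace_id
        and line.service == span.service
        and line.level == "ERROR"
    ]


# Hypothetical sample data for one slow request.
latencies = [120.0, 900.0, 950.0]
spans = [
    Span("t1", "frontend", 40.0),
    Span("t1", "payment", 860.0),
    Span("t1", "account", 25.0),
]
logs = [
    LogLine("t1", "payment", "ERROR", "connection pool exhausted"),
    LogLine("t1", "frontend", "INFO", "request received"),
    LogLine("t2", "payment", "ERROR", "timeout calling credit bureau"),
]

if detect(latencies, threshold_ms=500.0):
    slow = troubleshoot(spans)
    print(slow.service, root_cause(logs, slow))
```

The point of the sketch is the hand-off: the metric only says that something is wrong, the trace narrows it to one service, and the trace ID ties that service to the log lines that explain it.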
- 11. © 2022 SPLUNK INC.
Observability Center of Excellence
Key indicators from our customers, when building a CoE
• Observability / Monitoring as code included in the Pipeline
• Proactive Synthetic Monitoring
• Real Time Service Dependency Mapping
• Real User Monitoring, Business Impact
• Full stack monitoring via RUM, Infra and APM
• Performance Testing as code
• Session Click Path
• Self Service, Self Healing
• Logging as part of Observability
• Central Wiki with Observability Best Practices
• SaaS based solution to eliminate TOIL with a single UI
• Multi Vendor/Cloud Support
- 12. © 2022 SPLUNK INC.
Observability to Detect, Troubleshoot and Optimize
Splunk Observability
• Digital Experience Monitoring (RUM / Synthetics)
• APM
• Infrastructure Monitoring
• Log Analysis
• On-Call
On-Prem | Hybrid Cloud | Multi-Cloud | Cloud-Native
Real-Time | Analytics-Powered | Enterprise-Grade | OpenTelemetry-Native | Full-Stack
- 13. © 2022 SPLUNK INC.
Atlassian relies on Splunk for cloud monitoring and observability
“Splunk helps us improve customer experience and keeps our business humming by monitoring our cloud infrastructure, microservices and applications.” (Head of Engineering)
● Scale and performance needed to support 150k+ customers
● Splunk Core + Splunk Infrastructure Monitoring Metrics
● Easy-to-use and customizable AI-driven analytics
● Detailed visibility and transparency into usage to avoid outages
Cloud Monitoring: Splunk Infrastructure Monitoring + Splunk Cloud
• Real-time monitoring of all public cloud infrastructure and serverless functions
• Cloud Cost Management
- 15. © 2022 SPLUNK INC.
Behind The Scenes...
Systems are becoming more and more critical!
• Log levels: Debug 439, Info 384, Warning 87, Error 20, Critical 5
• Systems: Cloud Services, Storage, Partner Systems, Databases, Network, Systems/Infrastructure
- 16. © 2022 SPLUNK INC.
Observability
The History
• Observability comes from the field of control theory
• Control theory originally applied to physical machines with the goal of “minimizing delay, overshoot, or steady-state error”
• It’s all about managing the relationship between a machine’s inputs, outputs, and internal state. How do we do this?
[Diagram: Inputs, Internal state, Outputs]
As per Wikipedia, observability can be defined as:
“…a measure of how well internal states of a system can be inferred from knowledge of its external outputs.”
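In control theory this definition has a precise form for linear time-invariant systems. As a brief illustration (the symbols A, B, C, D, x, u, y and n are standard textbook notation and do not appear in the slides), a system with an n-dimensional state x, input u and output y is observable exactly when its observability matrix has full rank:

```latex
\dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t)

\mathcal{O} =
\begin{bmatrix}
C \\ CA \\ CA^{2} \\ \vdots \\ CA^{n-1}
\end{bmatrix},
\qquad
\text{the system is observable} \iff \operatorname{rank}(\mathcal{O}) = n
```

Software systems are not linear in this sense, so the condition is an analogy rather than a test one can run, but the idea carries over: the outputs (metrics, traces, logs) must be rich enough to reconstruct the internal state.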
- 17. © 2022 SPLUNK INC.
Observability in one page
Data Sources: App, OS, Cloud, Infrastructure, Help Desk & Ticket System, DevSecOps pipelines, SIEM Logs & Metrics
Data Collection
The Golden Triangle of Observability
The Four Golden Signals
• LATENCY: The time it takes to service a request
• TRAFFIC: How much demand is being placed on the service
• ERRORS: The error rate observed during service activity
• SATURATION: How “full” the service is while the system is serving requests
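As a rough illustration of how the four golden signals could be derived from raw request data, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Request` type, the `golden_signals` function and its parameters are hypothetical names, traffic is read as requests per second, latency is summarized as a nearest-rank 95th percentile, and saturation is approximated as in-flight requests divided by a fixed capacity.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    ok: bool


def golden_signals(requests, window_s, in_flight, capacity):
    """Summarize one observation window into the four golden signals."""
    if not requests:
        raise ValueError("at least one request is needed")
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * len(durations) + 99) // 100
    return {
        "latency_p95_ms": durations[rank - 1],
        "traffic_rps": len(requests) / window_s,
        "error_rate": sum(not r.ok for r in requests) / len(requests),
        "saturation": in_flight / capacity,
    }


# Hypothetical window: 20 requests in 10 seconds, two of which failed.
sample = [Request(duration_ms=10.0 * i, ok=(i > 2)) for i in range(1, 21)]
signals = golden_signals(sample, window_s=10, in_flight=8, capacity=10)
print(signals)
```

A percentile is used for latency rather than the mean because a few very slow requests can hide behind a healthy-looking average.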
- 18. © 2022 SPLUNK INC.
How can we help SREs with their daily struggles?
Let’s introduce Artificial Intelligence for IT Operations
What is AIOps?
Source: Gartner
- 20. © 2022 SPLUNK INC.
Why SREs & DevOps need AIOps & Observability
Transformation efforts should focus on maturing tightly coupled SRE principles and AIOps capabilities into modern engineering and DevOps processes
[Diagram (illustrative): End-to-End Services and Platforms]
• Highly Available Services Demand: Customers, Employees, Consuming Systems
• Development & Infrastructure Engineers: Design and Build
• Site Reliability Engineers
• Technology Operations: Sustain and Remediate
• Diagram labels: Defects, Changes, Automation, Recurring Problems and Toil Reduction, Alerts and Incidents, Insights, Production Readiness Reviews and Tools
• Service Level Objectives and Incentives
• AIOps & Observability Platforms
Copyright © 2022 Accenture. All rights reserved.
- 21. © 2022 SPLUNK INC.
Accenture Cloud Innovation Center Use Case
The HYBRID-MULTI CLOUD
[Architecture diagram: Data Sources, Platform, Output]
• Data Sources: Apps, Private Cloud, On-prem, Cloud Env., ITSM, OpenShift Cluster, Web servers, Virtual Machine, Database, Containers in Microservices, Service Bus, Storage
• Collectors: Splunk Forwarder, Splunk OpenTelemetry
• Data flows: Logs, Events, RUM, Metrics, Traces, Alerts
• Platform: AIOps Engine
• Output: Dashboards, Queries, Automation / Orchestration, …
AIOps Outcomes for SRE
▪ Hybrid environments monitoring
▪ Single view, full-stack visibility
▪ Proactive and reactive monitoring
▪ System observability
▪ Configurable KPIs and dashboards
▪ Intelligent Incident Management
▪ Machine Learning and Predictive Analytics
- 24. © 2022 SPLUNK INC.
Observability with Splunk Drives Results
Increase Uptime & Reliability
• 90% reduction in unplanned downtime
• 75% faster MTTD & MTTR
Improve Operational Efficiency
• 42% faster root cause isolation & reduction in lengthy war room meetings
• 70% improvement in developer productivity
Deliver Flawless Customer Experiences
• 64% reduction in customer-facing errors
• 30% faster page load times
Innovate Faster
• 96% faster innovation velocity
• 8x higher code quality
- 25. © 2022 SPLUNK INC.
See and / or Try Splunk Observability Yourself
https://www.splunk.com/en_us/products/observability.html