In this talk I cover the use cases for machine learning and centralized logging when monitoring a distributed, multi-layered microservices architecture.
9. LOG ANALYTICS FOR
MICROSERVICES
• Service logs
10/01/17 00:53:51 INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file /tmp/logzio-logback-buffer/listener-metrics/logzio-logback-appender/data/page-48.dat was just deleted.
• Service metrics
10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value:
54321
11. THE CHALLENGES WITH LOGGING
MICROSERVICES
• Transient
• Distributed
• Independent
• Multilayered
12. LOGGING IN A DOCKERIZED
WORLD
$ docker logs
2016-06-02T13:05:22.614090Z 0 [Note] InnoDB: 5.7.12 started; log sequence number
2522067
13. LOGGING IN A DOCKERIZED
WORLD
$ docker stats
CONTAINER       CPU %   MEM USAGE / LIMIT    MEM %   NET I/O               BLOCK I/O
3747bd397456    0.01%   3.641 MB / 2.1 GB    0.17%   3.366 kB / 648 B      0 B / 0 B
396e42ba0d15    0.11%   1.638 MB / 2.1 GB    0.08%   9.79 kB / 648 B       348.2 kB / 0 B
468bf755240a    3.19%   45.67 MB / 2.1 GB    2.17%   25.19 MB / 17.95 MB   774.1 kB / 0 B
5f16814a3c0e    0.01%   495.6 kB / 2.1 GB    0.02%   8.564 kB / 648 B      0 B / 0 B
74cdfa7b8a0c    0.04%   3.908 MB / 2.1 GB    0.19%   2.028 kB / 648 B      0 B / 0 B
99bafb7600fc    0.00%   32.95 MB / 2.1 GB    1.57%   0 B / 0 B             2.093 MB / 20.48 kB
14. LOGGING IN A DOCKERIZED
WORLD
$ docker daemon
time="2016-06-05T12:03:49.716900785Z" level=debug msg="received containerd event:
&types.Event{Type:"exit",
Id:"3747bd397456cd28058bb40799cd0642f431849b5c43ce56536ab7f55a98114f",
Status:0x0,
Pid:"4120a7625a592f7c95eab4b1b442a45370f6dd95b63d284714dbb58f00d0a20d",
Timestamp:0x57541525}"
15. OH, AND THERE’S THIS…
• Large & complex application & operational logs
• Multiple different formats
• Multiple log files per component / instance
• Slow & labor-intensive
• Error-prone processing
• Relies on an individual’s skills
• Expensive
• Hard to find what is relevant and important in log data
• Scaling and securing an open-source implementation is expensive and almost impossible
16. CENTRALIZED LOGGING TO THE
RESCUE
• Centralized data collection and management
• Provides inferable context to logs
• Analysis, event correlation and visualization
17. OLD SCHOOL LOGGING
$ grep ' 30[1234] ' /var/log/apache2/access.log | grep -v baidu | grep -v Googlebot
173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26
"-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"
192.3.83.5 - - [04/Sep/2015:06:10:22 +0000] "GET /?q=node/add HTTP/1.0" 301
26 "http://morpht.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1)
AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5"
19. A BIT ABOUT ELK
• World’s most popular open source log
analysis platform
• 4.5M downloads a month!
• Centralized logging AND: search, BI, SEO,
IoT, and more
20. THE MARKET IS DOMINATED BY OPEN SOURCE SOLUTIONS
• Open source / flexible – over the past 3 years, the market shifted attention from proprietary to open source; a fast-growing community, no vendor lock-in and no license cost
• Simple and beautiful – it’s simple to get started and play with ELK, and the UI is just beautiful
• Fast. Very fast. – blazing quick responses even when searching through millions of documents
• ELK Stack: 500,000+ companies vs. 15K companies
21. TYPICAL ELK PIPELINE
• Log shipper
• Collecting and parsing (Logstash)
• Full-text search and analysis engine – scalable, fast, highly available, with a REST API (Elasticsearch)
• Visualizations and dashboards (Kibana)
26. • Configure Logstash (input, filter,
output)
filter {
if [type] == "dockerlogs" {
if ([message] =~ "^\tat ") {
drop {}
}
grok {
break_on_match => false
match => [ "message", " responded with %{NUMBER:status_code:int}" ]
tag_on_failure => []
}
}
}
STEP 3 – PARSING
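For intuition, here is a plain-Python sketch of what that grok match extracts. The regex mirrors the `%{NUMBER:status_code:int}` pattern above; `parse_status` is an illustrative helper, not part of Logstash:

```python
import re

# Mirrors the grok pattern: " responded with %{NUMBER:status_code:int}"
STATUS_RE = re.compile(r" responded with (?P<status_code>\d+)")

def parse_status(message):
    """Return the integer status code, or None when the pattern is absent
    (matching tag_on_failure => [] above: no failure tag, just no field)."""
    m = STATUS_RE.search(message)
    return int(m.group("status_code")) if m else None

print(parse_status("GET /health responded with 200"))  # 200
```

Messages without the phrase simply yield no field, which is why the filter sets `tag_on_failure` to an empty list.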
27. • DO NOT expose
Elasticsearch
(‘network.host’)
• Use proxies
• Isolate
Elasticsearch
• Change default
ports
STEP 4 – SECURITY
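A minimal sketch of the `network.host` advice above as an `elasticsearch.yml` fragment (values are illustrative; in a cluster you would bind to a private subnet address instead):

```yaml
# elasticsearch.yml – do NOT bind to 0.0.0.0 on a reachable host
network.host: 127.0.0.1   # loopback only; clients go through a proxy
http.port: 9200           # consider moving off the default port
```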
29. OTHER SOLUTIONS
• Hosted ELK (Logz.io, Elastic Cloud,
Sematext)
• Other logging/monitoring SaaS
(Datadog, Papertrail, Loggly)
30. THE BIG ELEPHANT (ELK) IN THE ROOM
• Not knowing what question to ask
• Needle in the haystack syndrome
• Logs cannot be analyzed by a human alone
• Anomaly detection does not work
31. ANOMALY DETECTION DOESN’T WORK
• Not every anomaly is an error
• Not every error represents itself in
an anomaly
• Apps run as step functions
34. WHAT IS MACHINE LEARNING?
“Machine learning is a type of artificial
intelligence that provides computers with
the ability to learn without being
explicitly programmed.” (TechTarget)
35. SUPERVISED MACHINE LEARNING (BY
EXAMPLE)
1. Labeling – gathering and labeling logs
• User behavior
• Inter-user similarities
• Public resources
2. Training a classifier – defining which logs are important
3. Integration within the system
36. ‘skb rides the rocket’
kernel: xen_netfront: xennet: skb rides the rocket: 19 slots
(http://serverfault.com/questions/647489/what-is-causing-
skb-rides-the-rocket-errors)
Syslog message, the result of packet loss due to a kernel bug in Linux.
Logs are a stream of aggregated, time-ordered events collected from the output streams of running processes and backing services
Does anyone not use logs?
When running builds to identify compile errors
When you’re running a system – for troubleshooting your system
For learning about the behavior of your system
So anyone creating, deploying or running software needs logs!
Service logs – service_id, request_id (for tracing across the architecture), type, timestamp
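Those service-log fields can be sketched as a structured (JSON) log line; the function and field names here are illustrative, not a prescribed schema:

```python
import json
import time
import uuid

def service_log(service_id, request_id, event_type, **fields):
    """Emit one structured log line. The request_id is what lets you trace
    a single request across services in the architecture."""
    record = {
        "service_id": service_id,
        "request_id": request_id,
        "type": event_type,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    record.update(fields)
    return json.dumps(record)

print(service_log("billing", str(uuid.uuid4()), "INFO", message="charge ok"))
```

Because every field is a named key, a pipeline like Logstash can index the line without guessing at its format.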
Metric collection - to measure improvements, new code
Resource utilizations (CPU, memory, Network, Filesystem)
Runtime metrics (Jenkins build times)
Microservices are stateless. That means that an instance of a service can be created, stopped, restarted, and destroyed at any time without impacting other services. Any logging functionality we implement can’t rely on the service persisting for any period of time.
Microservices are independent. With microservices, only the execution environment is aware of the context. Kubernetes is aware of pods for example but not the hosting machine.
Microservices are distributed. You’ll likely find yourself logging related data from two completely independent platforms. To log effectively, we need a way to correlate events across the infrastructure.
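Correlating events across independent platforms can be sketched by grouping records on a shared request_id (a toy example with made-up records, assuming every service logs that field as described above):

```python
from collections import defaultdict

def correlate_by_request(*streams):
    """Group log records from independent services by request_id so one
    request can be followed across the infrastructure."""
    grouped = defaultdict(list)
    for stream in streams:
        for record in stream:
            grouped[record["request_id"]].append(record)
    return dict(grouped)

api_logs = [{"request_id": "r1", "service": "api", "msg": "request received"}]
db_logs  = [{"request_id": "r1", "service": "db",  "msg": "query ok"}]
trace = correlate_by_request(api_logs, db_logs)  # trace["r1"] spans both services
```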
Let’s take the Docker execution environment for example. You have three different types of logs and metrics that can be extracted.
Multiply all of this – at Logz.io, for example, we’re running about 60 Docker hosts, each with 4-5 containers…
In modern environments, log analysis remains an extremely complicated and resource-consuming task for even the most experienced developer, DevOps or IT operations teams out there, despite all the sophisticated analytics and monitoring tools available.
That’s because, at the end of the day, behind these tools stands a human being who needs to connect the dots and make informed, timely decisions; they need to know how to extract signals and actionable meaning from millions of log messages.
In essence, centralized logging detaches logging from the containers running your microservices
Using parsing and filtering you can give your logs context
Structured logs plus a comfortable UI make analysis easier
All three services are started automatically
The image persists /var/lib/elasticsearch (the directory where Elasticsearch stores its data) as a volume.
Install a log forwarder to send to Logstash – this depends on the Docker driver used.
Logspout is a log router for Docker containers that runs inside Docker. It attaches to all containers on a host, then routes their logs wherever you want. It also has an extensible module system.
Logspout is a very small Docker container (15.2MB virtual)
docker inspect afaac897ab50 | grep LogPath
Each Docker image has its own logging format, so these filters will be very specific
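The file that LogPath points at (with Docker's default json-file driver) holds one JSON object per line, with `log`, `stream`, and `time` keys; a small sketch of reading it:

```python
import json

def parse_docker_json_log(lines):
    """Yield (time, stream, message) from Docker's json-file driver output,
    which writes {"log": ..., "stream": ..., "time": ...} per line."""
    for line in lines:
        rec = json.loads(line)
        yield rec["time"], rec["stream"], rec["log"].rstrip("\n")

sample = ['{"log":"InnoDB: started\\n","stream":"stdout","time":"2016-06-02T13:05:22Z"}']
for ts, stream, msg in parse_docker_json_log(sample):
    print(ts, stream, msg)
```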
Bind the nodes to localhost or private IP
Use proxies to communicate with clients – to add user control and to do request filtering, put in front of Kibana
False alarms and a low signal-to-noise ratio
Not every anomaly is an error
Developer introducing a new log line
Access usage
Seasonality changes
Not every error represents itself in an anomaly
Resource utilization
Memory leak
Applications run as a step function
Anomaly detection works on continuous functions
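Why a step change defeats such a detector can be shown with a toy z-score detector over made-up request counts: a healthy deploy that doubles steady traffic gets flagged even though nothing is wrong.

```python
import statistics

def zscore_anomalies(series, threshold=1.5):
    """Flag indices whose value is more than `threshold` standard
    deviations from the series mean (a deliberately naive detector)."""
    mean = statistics.mean(series)
    stdev = statistics.pstdev(series)
    if stdev == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mean) / stdev > threshold]

# Requests per minute: a deploy doubles steady traffic -- a step, not an error.
baseline = [100] * 20
after_deploy = [200] * 5
print(zscore_anomalies(baseline + after_deploy))  # flags the healthy new level
```

The detector flags every point at the new level: an "anomaly" that is not an error, exactly the false-alarm problem above.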
Enables you to train a self-improving system that asks the questions for us
Can sift through vast amounts of data and flag relevant events
Supervised machine learning is based on the idea of learning by example
Labeling – gathering and labeling logs – coloring the data in different colors
Opened/unopened
Error logs
Exceptions logs
Training a classifier – defining which logs are important. Simply put, a classifier is a formula that you build in order to answer a question. Using the labels, we build a mathematical representation of a log message, which is then fed into the formula; if the result passes a specific threshold, the log is relevant.
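A toy, stdlib-only sketch of that idea (the vocabulary, weights, labels, and function names are all made up for illustration; a real system would use a proper learning algorithm, not these hand-rolled weights):

```python
def featurize(message, vocabulary):
    """Bag-of-words representation: 1 per vocabulary word present."""
    tokens = set(message.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]

def train(labeled_logs, vocabulary):
    """Crude weights: +1 for each vocabulary word seen in an 'important'
    log, -1 for each seen in an unimportant one."""
    weights = [0.0] * len(vocabulary)
    for message, important in labeled_logs:
        sign = 1 if important else -1
        for i, f in enumerate(featurize(message, vocabulary)):
            weights[i] += sign * f
    return weights

def is_important(message, vocabulary, weights, threshold=0.0):
    """The 'formula' from the note: score the log, compare to a threshold."""
    score = sum(w * f for w, f in zip(weights, featurize(message, vocabulary)))
    return score > threshold

vocab = ["error", "exception", "started", "ok"]
labeled = [("disk error on write", True),
           ("unhandled exception in worker", True),
           ("service started ok", False)]
weights = train(labeled, vocab)
print(is_important("error reading block device", vocab, weights))
```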
Integration within the system – using Hadoop and Spark
As IT operations become agile and dynamic, they are also getting immensely complex.
2 main challenges in logging microservices:
Logging in a distributed architecture
Finding the needle in the haystack
Proposed solutions:
Centralized logging
Machine learning approach
Turns manual Dev, DevOps and IT operations into an automated process
Poses the questions for you – revealing events that would otherwise go undetected