Fluentd/Logstash + Elasticsearch + Kibana

Log Analysis
Fluentd/Logstash + Elasticsearch + Kibana

César Araújo & Tomás Lima

Fluentd, what is it?

It is like syslogd, but uses JSON for message exchange.

Fluentd, how does it work?
application -> app.log -> fluentd (plugins) -> storage

Raw line in app.log:
2013-12-18 01:33:51 video=video1.avi # path=/home/cesar/videos/ # duration=125s # type=promotional

Event after fluentd:
2013-12-18 01:33:51
myapp.Playlog {
  "video": "video1.avi",
  "path": "/home/cesar/videos/",
  "duration": 125,
  "type": "promotional"
}
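The step from raw line to structured event can be sketched in Python. This is only an illustration of the idea, not Fluentd's own parser: the function name `parse_playlog` and the assumption that `#` separates the key=value pairs are ours, read off the sample line above.

```python
import json
from datetime import datetime

RAW_LINE = ("2013-12-18 01:33:51 video=video1.avi # path=/home/cesar/videos/ "
            "# duration=125s # type=promotional")


def parse_playlog(line, tag="myapp.Playlog"):
    """Turn one raw app.log line into a (timestamp, tag, record) event."""
    # The first two space-separated tokens are the date and the time.
    date_part, time_part, rest = line.split(" ", 2)
    timestamp = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")

    record = {}
    # Assumption: the rest is a list of key=value pairs separated by '#'.
    for pair in rest.split("#"):
        key, _, value = pair.strip().partition("=")
        record[key] = value

    # The structured event shows the duration as a number, so drop the 's'.
    record["duration"] = int(record["duration"].rstrip("s"))
    return timestamp, tag, record


timestamp, tag, record = parse_playlog(RAW_LINE)
print(timestamp, tag, json.dumps(record))
```

The three returned values correspond to the timestamp, tag and attributes that make up a Fluentd event.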

Fluentd, how does it work?
application -> fluentd -> storage

2013-12-18 01:33:51
myapp.Playlog {
}

Fluentd, how does it work?
Diagram: application, three fluentd instances (Filter / Buffer / Routing), storage

Fluentd, how does it work?
Input plugins: syslogd, scribe, tail
fluentd: Filter / Buffer / Routing
Output plugins: s3, mongodb, elasticsearch

Fluentd, how does it work?
Diagram (REAL TIME): fluentd on a server running apache and on Server 1, Server 2 and Server 3; further fluentd instances; mysql and couchbase

Fluentd, how does it work?
timestamp: 2013-12-18 01:33:51
tag: myapp.Playlog
attributes: { }

Fluentd, how does it work?

<source>
  type tail
  # Location of the playlog file
  path /var/log/application/terminal.log
  # File where fluentd stores its read position in the log
  pos_file /var/log/td-agent/terminal.log.pos
  # Fluentd tag used to identify these events
  tag xp.terminal
  format /^(?<time>[^ ]* [^ ]*) ?(?<Thread>[^ ]*) ?(?<TypeOfAlert>[^ ]*) ?(?<Machine>[XP]*[0-9a-fA-F]{12})-?(?<Player>[^ ]*).*filename="(?<Filename>[0-9]*.[^"]*).*/
</source>
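To see what the `format` expression extracts, the same pattern can be tried in Python. This is a sketch under assumptions: the sample line is invented for the example (the slides do not show a real terminal.log line), and the only change to the pattern is Python's `(?P<name>...)` spelling of named groups.

```python
import re

# Python spells named groups (?P<name>...) where the Fluentd config uses (?<name>...).
PATTERN = re.compile(
    r'^(?P<time>[^ ]* [^ ]*) ?(?P<Thread>[^ ]*) ?(?P<TypeOfAlert>[^ ]*)'
    r' ?(?P<Machine>[XP]*[0-9a-fA-F]{12})-?(?P<Player>[^ ]*).*filename='
    r'"(?P<Filename>[0-9]*.[^"]*).*'
)

# Hypothetical log line, invented for this example.
SAMPLE = '2013-12-18 01:33:51 main INFO XP0123456789AB-player1 play filename="123.avi" ok'

match = PATTERN.match(SAMPLE)
fields = match.groupdict() if match else {}
print(fields)
```

In Fluentd the `time` capture becomes the event timestamp and the other named captures become attributes of the record.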

Logstash, what is it?

Event and log management with the following phases:
• Input - Collection of events/logs

• Filter - Analysis
• Output - Sending to storage
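The three phases can be pictured as a chain of small functions. The sketch below is not Logstash code; the function names, the `length` field and the in-memory `storage` list are all hypothetical stand-ins chosen for illustration.

```python
import json


def read_input(lines):
    """Input phase: collect raw events, one per line."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filter(events):
    """Filter phase: analyse each event and add structured fields."""
    for event in events:
        # Hypothetical enrichment: record the message length in characters.
        event["length"] = len(event["message"])
        yield event


def write_output(events, storage):
    """Output phase: serialise each event and send it to storage."""
    for event in events:
        storage.append(json.dumps(event))


storage = []  # stands in for a real store such as Elasticsearch
write_output(apply_filter(read_input(["first line\n", "second\n"])), storage)
print(storage)
```

Because each phase is a generator, events flow through one at a time instead of being collected in full before the next phase starts.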

Input

stdin
tcp
udp
syslog
unix

imap
xmpp
irc
redis
sqlite

twitter
elasticsearch
drupal_dblog

Input - Examples
input {
  file {
    type => "syslog"
    path => [ "/var/log/messages", "/var/log/syslog", "/var/log/*.log" ]
  }
}
input {
  redis {
    host => "127.0.0.1"
    type => "redis-input"
    key => "logstash"
    message_format => "json_event"
  }
}

# “index”

Filters and Codecs

Filters
csv
date
grep
kv
useragent
json

Codecs
json
json_lines
multiline
netflow
plain

Filter - Examples
filter {
  grok {
    type => "syslog"
    pattern => "%{SYSLOGLINE}"
  }
  date {
    type => "syslog"
    timestamp => "MMM  d HH:mm:ss"
    timestamp => "MMM dd HH:mm:ss"
    timestamp8601 => ISO8601
  }
}
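The date filter lists several patterns because syslog timestamps come in more than one shape. A rough Python equivalent, assuming English month names and a year supplied by the caller (syslog timestamps carry none), might look like this; `parse_timestamp` and `assumed_year` are names made up for the sketch.

```python
from datetime import datetime


def parse_timestamp(text, assumed_year=2013):
    """Try the syslog format first, then fall back to ISO 8601."""
    try:
        # Syslog timestamps carry no year, so one is assumed and prepended.
        # %d accepts "8", "08" and the space-padded " 8" alike.
        return datetime.strptime(f"{assumed_year} {text}", "%Y %b %d %H:%M:%S")
    except ValueError:
        pass
    # Raises ValueError if the text is not ISO 8601 either.
    return datetime.fromisoformat(text)


print(parse_timestamp("Dec  8 01:33:51"))
print(parse_timestamp("Dec 18 01:33:51"))
print(parse_timestamp("2013-12-18T01:33:51"))
```

Trying the formats in order mirrors the idea of listing several candidate patterns for the same field.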

Output

stdout
tcp
udp
syslog
xmpp
irc

nagios
elasticsearch
elasticsearch_river
mongodb
redis

Output - Examples

output {
  stdout { debug => true debug_format => "json"}
  redis { host => "logs.i.att.io" data_type => "list" key => "logstash" }
}
output {
  stdout { debug => true debug_format => "json"}
  elasticsearch { host => "127.0.0.1" }
}

Elasticsearch, what is it?

A real-time database with no predefined data schema

Elasticsearch, how does it work?

Relational database    Elasticsearch
Database               Index
Table                  Type
Row                    Document
Column                 Field
Schema                 Mapping

Elasticsearch, how does it work?
Cluster: 1 or more nodes

Node: in production, 1 node = 1 server

Shard: an Apache Lucene instance.
configured internally in ES
transparent at the application level
can move to other nodes

Index: a "namespace"
maps to one or more primary shards
maps to zero or more replica shards
defines the mapping of the fields received as input
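One way to picture how an index spreads documents over its primary shards is a hash of the document id taken modulo the number of primary shards. The sketch below illustrates only that idea: `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for whatever hash function Elasticsearch uses internally.

```python
import zlib


def pick_shard(doc_id, number_of_primary_shards):
    """Map a document id to one primary shard of its index."""
    # zlib.crc32 is only a stand-in for Elasticsearch's internal hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


shards = {doc_id: pick_shard(doc_id, 3) for doc_id in ["1", "2", "3", "4"]}
print(shards)
```

The same id always lands on the same shard, which is why the application never needs to know where a shard currently lives.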

Elasticsearch, data structure.

Diagram: Index > Doc-Type > Document {…}

An index can have multiple types, and each type can have multiple documents.

All documents are JSON.

Searches can span multiple indices.
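The index > type > document hierarchy can be modelled with nested dictionaries. This is a toy in-memory model, not how Elasticsearch stores data; the `logs` index and its document are invented, and `search` is a hypothetical helper showing one query running over several indices.

```python
# index name -> type name -> document id -> JSON-like document
cluster = {
    "twitter_dev": {"tweet": {"1": {"tweet": "Estou no porto linux!", "name": "BrixSat"}}},
    # Hypothetical second index, invented for this example.
    "logs": {"syslog": {"1": {"message": "disk almost full", "name": "server1"}}},
}


def search(indices, field, value):
    """Return (index, type, id) for every document whose field equals value."""
    hits = []
    for index in indices:
        for doc_type, docs in cluster[index].items():
            for doc_id, doc in docs.items():
                if doc.get(field) == value:
                    hits.append((index, doc_type, doc_id))
    return hits


print(search(["twitter_dev", "logs"], "name", "BrixSat"))
```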

Elasticsearch, how to search?
RESTful: GET, POST, PUT, DELETE
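Each verb maps onto an ordinary HTTP request. The sketch below builds such requests with Python's standard library without sending them; the base URL, the `myindex`/`mytype` names and the `build_request` helper are assumptions made for the example.

```python
import json
from urllib.request import Request

BASE_URL = "http://127.0.0.1:9200"  # assumed local node


def build_request(method, path, body=None):
    """Build, without sending, an HTTP request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(f"{BASE_URL}{path}", data=data, method=method,
                   headers={"Content-Type": "application/json"})


examples = [
    build_request("PUT", "/myindex/mytype/1", {"name": "example"}),           # insert
    build_request("GET", "/myindex/mytype/1"),                                # retrieve
    build_request("POST", "/myindex/_search", {"query": {"match_all": {}}}),  # search
    build_request("DELETE", "/myindex"),                                      # remove
]
for request in examples:
    print(request.get_method(), request.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would actually send it to a running node.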

Elasticsearch, mapping?
curl -XPUT 127.0.0.1:9200/_template/myindex -d '
{
  "template" : "*",
  "order" : 10,
  "settings" : {
    "number_of_shards" : 1,
    "number_of_replicas" : 1
  },
  "mappings" : {
    "_default_" : {
      "properties" : {
        "source_time" : { "type": "date" },
        "type" : { "type": "string", "index": "not_analyzed" },
        "event_hash" : { "type": "string", "index": "not_analyzed" },
        "ip" : { "type" : "string", "index": "not_analyzed" },
        "asn" : { "type" : "integer" },
        "cc" : { "type": "string", "index": "not_analyzed" },
        "url" : { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'
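Hand-written JSON is easy to get subtly wrong, for example with a trailing comma. One option is to build the template body as a Python dictionary and let `json.dumps` serialise it. The field names and settings below are copied from the slide; the `NOT_ANALYZED` constant is just a shorthand introduced for this sketch.

```python
import json

# Shorthand for the string fields that should be stored without analysis.
NOT_ANALYZED = {"type": "string", "index": "not_analyzed"}

template = {
    "template": "*",
    "order": 10,
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    "mappings": {
        "_default_": {
            "properties": {
                "source_time": {"type": "date"},
                "type": NOT_ANALYZED,
                "event_hash": NOT_ANALYZED,
                "ip": NOT_ANALYZED,
                "asn": {"type": "integer"},
                "cc": NOT_ANALYZED,
                "url": NOT_ANALYZED,
            }
        }
    },
}

# json.dumps always emits syntactically valid JSON.
body = json.dumps(template, indent=2)
print(body)
```

The resulting string can be used as the request body of the PUT to `/_template/myindex`.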

Elasticsearch, how to insert?
PUT /index/type/id

Labels on the request: Action, Where, What, Content

Elasticsearch, how to insert?
PUT /index/type/id

Request:
curl -XPUT 'http://192.168.2.6:9200/twitter_dev/tweet/1' -d '
{
  "tweet" : "Estou no porto linux!",
  "name" : "BrixSat"
}'

Response:
{
  "_index":"twitter_dev", "_type":"tweet",
  "_id":"1",
  "_version": 1,
  "ok":true
}

Elasticsearch, how to retrieve?
GET /index/type/id

Request:
curl -XGET 'http://192.168.2.6:9200/twitter_dev/tweet/1'

Response:
{
  "_index" : "twitter_dev", "_type" : "tweet", "_id" : "1",
  "_source" :
  {
    "tweet" : "Estou no porto linux!", "name" : "BrixSat"
  }
}

Elasticsearch, how to remove?
DELETE /index

Request:
curl -XDELETE 'http://192.168.2.6:9200/twitter_dev'

Problems encountered

• Excessive memory allocation - give Elasticsearch only half of the machine's memory
• Node crash due to low disk space - 95% ~= 100%
• Errors related to "not fully read" messages
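The first two problems lend themselves to simple checks. The sketch below is a hypothetical monitoring helper, not part of Elasticsearch: the function names are ours, and the 0.95 threshold simply encodes the "95% ~= 100%" observation above.

```python
import shutil


def suggested_heap_mb(total_ram_mb):
    """Give Elasticsearch half of the machine's memory."""
    return total_ram_mb // 2


def disk_usage_ratio(path="/"):
    """Return the fraction of the disk holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0


def disk_nearly_full(ratio, threshold=0.95):
    """Treat 95% usage as effectively full."""
    return ratio >= threshold


print(suggested_heap_mb(16384))
print(disk_nearly_full(disk_usage_ratio("/")))
```

Running a check like this before the disk reaches the threshold leaves time to free space or add capacity.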

In summary!

Massive data analysis in real time!

Kibana, what is it?

Web interface for log analysis

Thank you!
References:
http://www.slideshare.net/shifa27/elasticsearch-26896932
http://www.slideshare.net/treasure-data/fluentd-meetup-in-japan-11410514
