2. Overview of ELK
• "ELK" is the formed for three open source projects: Elasticsearch, Logstash,
and Kibana. Elasticsearch is a search and analytics engine. Logstash is a
server side data processing pipeline that ingests data from multiple sources‑
simultaneously, transforms it, and then sends it to a "stash" like
Elasticsearch. Kibana lets users visualize data with charts and graphs in
Elasticsearch.
Beats is installed in source side working as data shipper, basically send data
from source to logstash or elasticsearch.
The Elastic Stack is the next evolution of the ELK Stack
3. What is Logstash?
• Logstash is an open source, server-side data processing pipeline
that ingests data from a multitude of sources simultaneously,
transforms it, and then sends it to your favorite “stash.”
• Logstash is a tool based on the pipes-and-filters pattern for gathering,
processing, and generating logs or events. It helps in centralizing
logs and events from different sources and analyzing them in real time.
• Logstash is written in JRuby, which runs on the JVM, so users can
run Logstash on many different platforms. It
collects different types of data like Logs, Packets, Events,
Transactions, Timestamp Data, etc., from almost every type of
source. The data source can be Social data, E-commerce, News
articles, CRM, Game data, Web trends, Financial data, Internet of
Things, Mobile devices, etc.
4. How Does Logstash Work?
• The Logstash event processing pipeline has three stages: inputs
→ filters → outputs.
• Inputs generate events, filters modify them, and outputs ship
them elsewhere. Inputs and outputs support codecs that enable
you to encode or decode the data as it enters or exits the pipeline
without having to use a separate filter.
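• As a minimal illustration of the three stages (a hedged sketch, separate from
the walkthrough later in these slides), the config below reads JSON lines from
standard input and prints the resulting events to standard output; a filter
block could be added between the two sections:
input {
  stdin { codec => "json" }       # decode each incoming line as JSON via a codec, no separate filter needed
}
output {
  stdout { codec => rubydebug }   # pretty-print each event as it leaves the pipeline
}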
• Inputs
• You use inputs to get data into Logstash. Some of the more
commonly-used inputs are:
• file: reads from a file on the filesystem, much like the UNIX
command tail -f
• syslog: listens on the well-known port 514 for syslog messages and parses
them according to the RFC3164 format
5. Logstash Input
redis: reads from a redis server, using both redis channels and redis lists.
beats: processes events sent by Beats.
jdbc: reads rows from any SQL database, such as Oracle or MySQL; separate
community input plugins exist for NoSQL databases such as MongoDB.
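A hedged sketch of reading from a SQL database with the jdbc input plugin (the
driver path, connection string, credentials, and table name are illustrative
placeholders; the plugin and the database's JDBC driver must be installed):
input {
  jdbc {
    jdbc_driver_library    => "/path/to/mysql-connector-java.jar"  # assumed driver location
    jdbc_driver_class      => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"   # hypothetical database
    jdbc_user              => "dbuser"
    jdbc_password          => "dbpassword"
    statement              => "SELECT * FROM access_log"           # hypothetical table
    schedule               => "* * * * *"                          # poll once a minute
  }
}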
6. Logstash Filter (grok)
• Filters are intermediary processing devices in the Logstash pipeline. You
can combine filters with conditionals to perform an action on an event if
it meets certain criteria. Some useful filters include:
• grok: parse and structure arbitrary text. Grok is currently the best way in
Logstash to parse unstructured log data into something structured and
queryable. With 120 patterns built-in to Logstash, it’s more than likely
you’ll find one that meets your needs!
• Sample logfile
• 2016-07-11T23:56:42.000+00:00 INFO [MySecretApp.com.Transaction.Manager]:Starting transaction for session -464410bf-37bf-475a-afc0-498e0199f008
• The main goal to accomplish with a grok filter is to break down the
logline into the following fields: timestamp, log level, class, and then the
rest of the message.
7. Logstash Filter (grok)
• grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
  }
• This will try to match the incoming log to the given pattern. In case of a
match, the log will be broken down into the specified fields, according
to the defined patterns in the filter. In case of a mismatch, Logstash will
add a tag called _grokparsefailure.
• In this case, the filter will match and result in the following output:
• {
    "message"   => "Starting transaction for session -464410bf-37bf-475a-afc0-498e0199f008",
    "timestamp" => "2016-07-11T23:56:42.000+00:00",
    "log-level" => "INFO",
    "class"     => "MySecretApp.com.Transaction.Manager"
  }
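• Note that the incoming event already carries a message field, so the grok
filter above generally keeps the original line alongside the parsed remainder.
If only the parsed remainder is wanted, grok's overwrite option can be used; a
hedged sketch:
grok {
  match     => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
  overwrite => [ "message" ]   # replace the original message with the parsed remainder
}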
8. Logstash Filter (mutate)
• mutate: perform general transformations on event fields. You can rename,
remove, replace, and modify fields in your events.
As its name implies, this filter lets you really massage log messages by
"mutating" the various fields. For example, use the filter to change fields,
join them together, rename them, and more (a further sketch follows the
example below).
• Using the log above as an example, using the lowercase configuration
option for the mutate plugin, we can transform the ‘log-level’ field into
lowercase:
• filter {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
    }
    mutate {
      lowercase => [ "log-level" ]
    }
  }
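• A further hedged sketch of the other mutate operations mentioned above (the
field names and values are only illustrative):
filter {
  mutate {
    rename       => { "log-level" => "severity" }    # rename a field
    replace      => { "class" => "app.%{class}" }    # overwrite a field, referencing event data
    remove_field => [ "timestamp" ]                  # drop a field entirely
  }
}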
9. Logstash Filter (drop , clone)
• drop: drop an event completely, for example, debug events.
• Drop filter. Drops everything that gets to this filter.
• This is best used in combination with conditionals, for example:
• filter { if [loglevel] == "debug" { drop { } } }
• clone: make a copy of an event, possibly adding or removing fields.
• The clone filter is for duplicating events. A clone will be created for each type in the
clone list. The original event is left unchanged. Created events are inserted into the
pipeline as normal events and will be processed by the remaining pipeline
configuration starting from the filter that generated them (i.e. this plugin).
• If this filter is successful, add any arbitrary fields to this event. Field names can be
dynamic and include parts of the event using the %{field}.
• Example:
  filter {
    clone {
      add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
    }
  }
• You can also add multiple fields at once:
  filter {
    clone {
      add_field => {
        "foo_%{somefield}" => "Hello world, from %{host}"
        "new_field"        => "new_static_value"
      }
    }
  }
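• A hedged sketch of the clones list itself (the names below are arbitrary):
each entry in clones produces one extra copy of the event, while the original
passes through unchanged:
filter {
  clone {
    clones => [ "for_archive", "for_metrics" ]   # one extra copy per name listed here
  }
}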
10. Logstash Filter (geoip) and Logstash Output
• geoip: a filter that adds information about the geographical location of IP
addresses (and also displays amazing charts in Kibana!)
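• A minimal geoip sketch, assuming the event has a field named clientip holding
the IP address to look up (the field name is an assumption, not part of the
walkthrough later in these slides):
filter {
  geoip {
    source => "clientip"   # hypothetical field containing the IP address
  }
}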
Output
• Outputs are the final phase of the Logstash pipeline. An event can pass
through multiple outputs (see the sketch after this list), but once all output
processing is complete, the event has finished its execution. Some commonly
used outputs include:
• elasticsearch: send event data to Elasticsearch. If you’re planning to save
your data in an efficient, convenient, and easily queryable format…
Elasticsearch is the way to go. Period. Yes, we’re biased :)
• file: write event data to a file on disk.
• graphite: send event data to graphite, a popular open source tool for storing
and graphing metrics.
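• Since an event can pass through multiple outputs, one output block can fan
the same event out to several destinations; a hedged sketch in which the host
and file path are placeholders:
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]           # placeholder Elasticsearch endpoint
  }
  file {
    path => "/var/log/logstash/events.log"       # placeholder path for a local copy
  }
}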
12. Beats
• Lightweight Data Shippers
• Beats is the platform for single-purpose data shippers. They send
data from hundreds or thousands of machines and systems to
Logstash or Elasticsearch.
13. Types of Beats
• FILEBEAT
Forget using SSH when you have tens, hundreds, or even thousands
of servers, virtual machines, and containers generating logs. Filebeat
helps you keep the simple things simple by offering a lightweight way
to forward and centralize logs and files.
• METRICBEAT
Collect metrics from your systems and services. From CPU to
memory, Redis to NGINX, and much more, Metricbeat is a
lightweight way to send system and service statistics.
• PACKETBEAT
Know what’s going on across your applications by tapping into data
traveling over the wire. Packetbeat is a lightweight network packet
analyzer that sends data to Logstash or Elasticsearch.
14. Types of Beats
• WINLOGBEAT
• Keep a pulse on what’s happening across your Windows-based
infrastructure. Winlogbeat live streams Windows event logs to
Elasticsearch and Logstash in a lightweight way.
• AUDITBEAT
• Collect your Linux audit framework data and monitor the integrity
of your files. Auditbeat ships these events in real time to the rest of
the Elastic Stack for further analysis.
• HEARTBEAT
• Monitor services for their availability with active probing. Given a
list of URLs, Heartbeat asks the simple question: Are you alive?
Heartbeat ships this information and response time to the rest of the
Elastic Stack for further analysis.
15. Example of Using beats, Logstash and Elasticsearch
• Machines
• Two VirtualBox virtual machines, each with Red Hat Linux 6 installed.
• Host information: machine 1: hostname: ansible, IP: 192.168.56.103
•                   machine 2: hostname: test2,   IP: 192.168.56.101
• Machine 1: Elasticsearch 6.2.2 (unzipped)
• Machine 2: Filebeat 6.2.2 and Logstash 6.2.2 (unzipped)
• Mission: collect data from the
  "/home/oracle/ELK/Filebeats/Apache.log" file using Filebeat,
  send it to Logstash for some filtering there, and then send the output
  to Elasticsearch.
16. Example of Using beats, Logstash and Elasticsearch
• Elasticsearch Config file
cluster.name: my-application → Use a descriptive name for your cluster
index.number_of_shards: 5 → Number of primary shards per index (5 is the default)
index.number_of_replicas: 0 → Number of replica copies per shard (0 = no replicas)
node.name: node-2 → Use a descriptive name for the node
path.data: /home/oracle/ELK/data → Path to the directory where data is stored (separate multiple locations with commas)
path.logs: /home/oracle/ELK/log → Path to the log files
path.repo: /home/oracle/ELK/backup → Path for backup snapshots
bootstrap.memory_lock: true → Lock the memory on startup
network.host: 192.168.56.103 → Set the bind address to a specific IP (IPv4 or IPv6)
discovery.zen.ping.unicast.hosts: ["192.168.56.103", "192.168.56.101"] → IP addresses of the cluster nodes
discovery.zen.minimum_master_nodes: 1
transport.host: localhost → Should be localhost or 127.0.0.1
transport.tcp.port: 9300
http.port: 9200
node.master: true
node.data: true
xpack.graph.enabled: true
xpack.logstash.enabled: true
xpack.ml.enabled: true
xpack.monitoring.enabled: true
xpack.watcher.enabled: true
xpack.security.enabled: false
17. Example of Using beats, Logstash and Elasticsearch
• Start the elasticsearch process
• nohup ./bin/elasticsearch &
• Create a Logstash pipeline config file for the Beats input
[02:01:34 oracle@test2 ~]$ cat /home/oracle/ELK/logstash/logstash_beats.cnf
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \[%{DATA:class}\]:%{GREEDYDATA:message}" }
  }
  mutate {
    lowercase => [ "log-level" ]
  }
}
18. Example of Using beats, Logstash and Elasticsearch
output {
  elasticsearch {
    hosts => "http://192.168.56.103:9200"
    index => 'apache_log'
    document_type => 'apache_log'
    #user => 'elastic'
    #password => 'changeme'
  }
  stdout { codec => rubydebug }
}
• Note: both Logstash and Filebeat run on the same machine (test2), so the Beats
input only needs a port, not an IP address.
• Start Logstash:
nohup ./ELK/logstash/logstash-6.2.2/bin/logstash -f /home/oracle/ELK/logstash/logstash_beats.cnf &
19. Example of Using beats, Logstash and Elasticsearch
Configure Filebeat (filebeat.yml)
# Paths that should be crawled and fetched. Glob based paths.
paths:
- "/home/oracle/ELK/Filebeats/Apache.log"
fields:
apache: true
#----------------------------- Logstash output --------------------------------
output.logstash:
hosts: ["192.168.56.101:5044"]
Start Filebeat (from /home/oracle/ELK/Filebeats/filebeat-6.2.2-linux-x86_64):
nohup ./filebeat &
20. Example of Using beats, Logstash and Elasticsearch
Manually insert data into Apache.log
[17:45:43 oracle@test2 Filebeats]$ pwd
/home/oracle/ELK/Filebeats
[17:46:01 oracle@test2 Filebeats]$ cat Apache_bk.log > Apache.log
2016-07-11T23:56:42.000+00:00 INFO [MySecretApp.com.Transaction.Manager]:Starting transaction for session -464410bf-37bf-475a-afc0-498e0199f008
2016-07-11T23:56:42.000+00:00 INFO [MySecretApp.com.Transaction.Manager]:Starting transaction for session -464410bf-37bf-475a-afc0-498e0199f008
2016-07-11T22:56:42.000+00:00 WARNING [MySecretApp.com.Transaction.Manager]:Starting transaction for session -564410bf-37bf-475a-afc0-498e0199f008
2016-07-11T22:56:42.000+00:00 WARNING [MySecretApp.com.Transaction.Manager]:Starting transaction for session -564410bf-37bf-475a-afc0-498e0199f008
21. Example of Using beats, Logstash and Elasticsearch
Check the Filebeat log
/home/oracle/ELK/Filebeats/filebeat-6.2.2-linux-x86_64/logs
2019-03-27T11:51:24.121+0900 INFO cfgfile/reload.go:219 Loading of config files completed.
2019-03-27T11:51:54.122+0900 INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":40,"time":45},"total":{"ticks":60,"time":70,"value":60},"user":{"ticks":20,"time":25}},"info":{"ephemeral_id":"bee1329a-06e3-4ff6-90fb-3d0d08925e7f","uptime":{"ms":30021}},"memstats":{"gc_next":4473924,"memory_alloc":2812344,"memory_total":2812344,"rss":11571200}},"filebeat":{"events":{"added":1,"done":1},"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"reloads":1},"output":{"type":"logstash"},"pipeline":{"clients":1,"events":{"active":0,"filtered":1,"total":1}}},"registrar":{"states":{"current":1,"update":1},"writes":1},"system":{"cpu":{"cores":1},"load":{"1":0.72,"15":0.32,"5":0.67,"norm":{"1":0.72,"15":0.32,"5":0.67}}}}}}
Check the Logstash log (partial rubydebug output)
],
"@timestamp" => 2019-03-27T02:53:14.137Z,
"host" => "test2",
"tags" => [
[0] "beats_input_codec_plain_applied"
],
"fields" => {
"apache" => true
},
"class" => "MySecretApp.com.Transaction.Manager",
"log-level" => "warning",
"prospector" => {
"type" => "log"
},
"source" => "/home/oracle/ELK/Filebeats/Apache.log"
}
22. Example of Using beats, Logstash and Elasticsearch
Check the Elasticsearch log
[2019-03-27T11:53:18,420][INFO ][o.e.c.m.MetaDataMappingService] [node-2]
[apache_log/fIaBuQomRhWsZk0If1zG0w] create_mapping [apache_log]
Check the Head Plugin in Browser
23. Summary and Upcoming
The main target of today's presentation was to make unstructured data
structured and to do some filtering on the Logstash side, as per our requirements.
In my previous session I talked about Elasticsearch.
In the next session I will take a deep dive into Kibana and
visualize Elasticsearch data through Kibana.
The main goal of this whole exercise is to visualize
data in Kibana, no matter what the source is.