Logstash::Intro
           @ARGV
Why use Logstash?

• We already have splunk, syslog-ng, chukwa,
  graylog2, scribe, flume and so on.
• But we want a free, lightweight, high-integrity framework for our logs:
•   not free --> splunk
•   heavy Java --> scribe, flume
•   loses data --> syslog
•   not flexible --> nxlog
How does logstash work?

• Ah, just like the others, logstash has input/filter/output plugins.
• Attention: logstash processes events, not (only) log lines!
• "Inputs generate events, filters modify them,
  outputs ship them elsewhere." -- [the life of an
  event in logstash]
• "events are passed from each phase using
  internal queues......Logstash sets each queue
  size to 20." -- [the life of an event in logstash]
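• A minimal sketch of that shape (the type name and the grep pattern here are just illustrative):
   – # inputs generate events, filters modify them, outputs ship them elsewhere
   – input  { stdin { type => "example" } }
   – filter { grep { type => "example" match => [ "@message", "ERROR" ] } }
   – output { stdout { } }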
Existing plugins
Most popular plugins (inputs)

•   amqp
•   eventlog
•   file
•   redis
•   stdin
•   syslog
•   ganglia
Most popular plugins (filters)

•   date
•   grep
•   grok
•   multiline
Most popular plugins (outputs)

•   amqp
•   elasticsearch
•   email
•   file
•   ganglia
•   graphite
•   mongodb
•   nagios
•   redis
•   stdout
•   zabbix
•   websocket
Usage in cluster - agent install

• Only an 'all in one' jar is available for download at http://logstash.net/
• All the source, including the Ruby and JRuby code, is at http://github.com/logstash/
• But we want a lightweight agent on the cluster nodes.
Usage in cluster - agent install

• Edit a Gemfile like this:
   –   source "http://ruby.taobao.org/"
   –   gem "cabin", "0.4.1"
   –   gem "bunny"
   –   gem "uuidtools"
   –   gem "filewatch", "0.3.3"
• Clone logstash (we only need bin and lib):
   – git clone https://github.com/chenryn/logstash.git
   – git checkout pure-ruby
• Install the gems:
   – gem install bundler
   – bundle
• Run:
   – ruby logstash/bin/logstash -f logstash/etc/logstash-agent.conf
Usage in cluster - agent configuration

  –   input {
  –     file {
  –       type => "nginx"
  –       path => ["/data/nginx/logs/access.log" ]
  –    }
  –   }
  –   output {
  –     redis {
  –       type => "nginx"
  –       host => "5.5.5.5"
  –       key => "nginx"
  –       data_type => "channel"
  –     }
  –   }
Usage in cluster - server install

• The server is just another agent that runs the filters and writes to storage.
• Message queue (RabbitMQ is too heavy, Redis is just enough):
  – yum install redis-server
  – service redis-server start
• Storage: mongo/elasticsearch/Riak
• Visualization: kibana/statsd/riemann/opentsdb
• Run:
  – java -jar logstash-1.1.0-monolithic.jar agent -f logstash/etc/server.conf
Usage in cluster - server configuration

  –   input {
  –     redis {
  –       type => "nginx"
  –       host => "5.5.5.5"
  –       data_type => "channel"
  –       key => "nginx"
  –     }
  –   }
  –   filter {
  –     grok {
  –       type => "nginx"
  –       pattern => "%{NGINXACCESS}"
  –       patterns_dir => ["/usr/local/logstash/etc/patterns"]
  –     }
  –   }
  –   output {
  –     elasticsearch {
  –       cluster => 'logstash'
  –       host => '10.5.16.109'
  –       port => 9300
  –     }
  –   }
Usage in cluster - grok

• jls-grok is a pattern-matching library written in Ruby (it runs under JRuby too).
• Lots of examples can be found at:
  https://github.com/logstash/logstash/tree/master/patterns

• Here are my "nginx" patterns:
   – NGINXURI %{URIPATH}(?:%{URIPARAM})*
   – NGINXACCESS [%{HTTPDATE}] %{NUMBER:code:int} %{IP:client} %{HOSTNAME} %{WORD:method} %{NGINXURI:req} %{URIPROTO}/%{NUMBER:version} %{IP:upstream}(:%{POSINT:port})? %{NUMBER:upstime:float} %{NUMBER:reqtime:float} %{NUMBER:size:int} "(%{URIPROTO}://%{HOST:referer}%{NGINXURI:referer}|-)" %{QS:useragent} "(%{IP:x_forwarder_for}|-)"
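• A quick sketch for checking the pattern by hand (assuming the patterns above are saved in /usr/local/logstash/etc/patterns): paste an access-log line on stdin and the captured fields are printed:
   – input  { stdin { type => "nginx" } }
   – filter {
   –   grok {
   –     type => "nginx"
   –     pattern => "%{NGINXACCESS}"
   –     patterns_dir => ["/usr/local/logstash/etc/patterns"]
   –   }
   – }
   – # debug => true prints every field the grok filter captured
   – output { stdout { debug => true } }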
Usage in cluster - elasticsearch

• ElasticSearch is a production-ready search engine built on Lucene, designed for cloud computing.
• More information at:
  – http://www.elasticsearch.cn/

• Logstash has an embedded ElasticSearch
  already!
• Attention: if you want to build your own distributed elasticsearch cluster, make sure the server version matches the client version bundled with logstash!
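• A quick way to check (host as used in these slides): ask the server for its version and compare it with the embedded ES version of your logstash release:
   – # the response contains "version" : { "number" : ... }
   – curl http://10.5.16.109:9200/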
Usage in cluster - elasticsearch

•   elasticsearch/config/elasticsearch.yml:
     –   cluster.name: logstash
     –   node.name: "ES109"
     –   node.master: true
     –   node.data: false
     –   index.number_of_replicas: 0
     –   index.number_of_shards: 1
     –   path.data: /data1/ES/data
     –   path.logs: /data1/ES/logs
     –   network.host: 10.5.16.109
     –   transport.tcp.port: 9300
     –   transport.tcp.compress: true
     –   gateway.type: local
     –   discovery.zen.minimum_master_nodes: 1
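• A quick sanity check after starting the node (host as configured above):
   – # cluster name, node count and shard allocation status
   – curl http://10.5.16.109:9200/_cluster/health?pretty=true
   – # per-index document counts and sizes, once logstash starts shipping events
   – curl http://10.5.16.109:9200/_status?pretty=true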
Usage in cluster - elasticsearch

• The embedded web front-end for ES is too simple, sometimes naïve~ Try Kibana and elasticsearch-head instead.
•   https://github.com/rashidkpc/Kibana
•   https://github.com/mobz/elasticsearch-head.git

• Attention: there is a bug around ES startup: ifdown your external network interface before starting ES and ifup it again afterwards. Otherwise your Ruby client cannot connect to the ES server!
Try it please!

• Ah, you don't want to install, install, install and install again?
• Here is a killer application:
   –   sudo zypper install virtualbox rubygems
   –   gem install vagrant
   –   git clone https://github.com/mediatemple/log_wrangler.git
   –   cd log_wrangler
   –   PROVISION=1 vagrant up
Other output example

• For monitoring (example):
  –   filter {
  –     grep {
  –       type => "linux-syslog"
  –       match => [ "@message","(error|ERROR|CRITICAL)" ]
  –       add_tag => [ "nagios-update" ]
  –       add_field => [ "nagios_host", "%{@source_host}", "nagios_service", "the name of your
      nagios service check" ]
  –     }
  –   }
  –   output{
  –     nagios {
  –     commandfile => "/usr/local/nagios/var/rw/nagios.cmd"
  –       tags => "nagios-update"
  –       type => "linux-syslog"
  –     }
  –    }
Other output example

• For metrics (example):
  – output {
  – statsd {
  –   increment => "apache.response.%{response}"
  –   count => [ "apache.bytes", "%{bytes}" ]
  – }
  – }
Advanced Questions

• Is Ruby 1.8.7 stable enough?
•   Try the Message::Passing module from CPAN; I love Perl~

• Is ElasticSearch fast enough?
•   Try Sphinx; see the report from the ELSA project:
     –    In designing ELSA, I tried the following components but found them too slow. Here they are ordered from fastest to
          slowest for indexing speeds (non-scientifically tested):
     1.   Tokyo Cabinet
     2.   MongoDB
     3.   TokuDB MySQL plugin
     4.   Elastic Search (Lucene)
     5.   Splunk
     6.   HBase
     7.   CouchDB
     8.   MySQL Fulltext
•   http://code.google.com/p/enterprise-log-search-and-archive/wiki/Documentation#Why_ELSA?
Advanced Testing

• How many events per second can ElasticSearch handle?
•   - Logstash::Output::Elasticsearch (HTTP) only indexes 200+ msg/sec with one thread.
•   - So I tried the _bulk API myself, using the Perl ElasticSearch::Transport::HTTPLite module (request format sketched below).
•   -- speed test result: 2500+ msg/sec
•   -- test records: http://chenlinux.com/2012/09/16/elasticsearch-bulk-index-speed-testing/
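• For reference, a _bulk body is just action/document pairs in newline-delimited JSON; a sketch with made-up values (the real test batched a few thousand pairs per request). Given a file bulk.json ending with a newline:
   – {"index":{"_index":"logstash-2012.09.18","_type":"nginx"}}
   – {"@timestamp":"2012-09-18T10:00:00+08:00","@message":"..."}
• send it in one request (--data-binary keeps the newlines intact):
   – curl -XPOST http://10.5.16.109:9200/_bulk --data-binary @bulk.json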




                           WHY?!
Maybe…

• Logstash uses an experimental module here: Logstash::Output::ElasticsearchHTTP uses ftw as its HTTP client, and it cannot handle a bulk size larger than 200!!
• So we suggest using multiple output blocks in agent.conf (see the sketch below).
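• One way to read that advice (a sketch; the type names and host are illustrative): give each event type its own elasticsearch_http output, so no single output has to push the whole stream through one ~200-event bulk buffer:
   – output {
   –   elasticsearch_http {
   –     type => "nginx"
   –     host => "10.5.16.109"
   –   }
   –   elasticsearch_http {
   –     type => "linux-syslog"
   –     host => "10.5.16.109"
   –   }
   – }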
Advanced ES Settings(1)--problems

• Kibana searches data through the facets APIs. But when you index URLs, they get auto-split on '/'~~
• Faceting on the ip field over 10,000,000 messages takes 0.1s, but on urls it just... ah, timeout!
• When you check your index sizes, you will find that indexed size : raw message length ≈ 10:1 !!
Advanced ES Settings(2)--solution

• Set a default _mapping for ElasticSearch with an index template! (sketch below)
• In fact, ES "stores" the indexed data and then "stores" the stored fields on top... Yes! If you don't set "store" : "no", all the data is stored twice.
• And ES ships many analyzers. They automatically split words by whitespace, path hierarchy, keyword, etc.
• So set "index" : "not_analyzed", and faceting over 100k+ URLs finishes within 1s.
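• A sketch of such a template (the field names follow the @fields layout used later in these slides; see the wiki links on the next slide for the full recommended settings):
   – curl -XPUT http://10.5.16.109:9200/_template/logstash -d '
   – {
   –   "template" : "logstash-*",
   –   "mappings" : {
   –     "_default_" : {
   –       "_all" : { "enabled" : false },
   –       "properties" : {
   –         "@fields" : {
   –           "type" : "object",
   –           "properties" : {
   –             "url"    : { "type" : "string", "index" : "not_analyzed" },
   –             "domain" : { "type" : "string", "index" : "not_analyzed" }
   –           }
   –         }
   –       }
   –     }
   –   }
   – }'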
Advanced ES Settings(2)--solution

• Optimize:
• Calling the _optimize API every day may reduce the index size a bit~ (example below)

• You can find these solutions at:
•   https://github.com/logstash/logstash/wiki/Elasticsearch-Storage-Optimization
•   https://github.com/logstash/logstash/wiki/Elasticsearch----Using-index-templates-&-dynamic-
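• For example (index name is illustrative; only optimize indices you are no longer writing to):
   – # merge yesterday's index down to a single segment
   – curl -XPOST 'http://10.5.16.109:9200/logstash-2012.09.17/_optimize?max_num_segments=1'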
Advanced Input -- question

• Now we know how to disable the _all field, but there is still duplicated content: @fields and @message!
• Logstash searches ES on the @message field by default, but Logstash::Filter::Grok by default captures variables into @fields from that very same @message!
• How do we solve this?
Advanced Input -- solution

• We know some other systems, like Message::Passing, have encode/decode steps in addition to input/filter/output.
• In fact logstash has them too~ it just calls them 'format'.
• So we can define the message format ourselves, using log_format in nginx.conf.

•   (example follows)
Advanced Input -- nginx.conf

   – log_format json '{"@timestamp":"$time_iso8601",'
                     '"@source":"$server_addr",'
                     '"@fields":{'
                     '"client":"$remote_addr",'
                     '"size":$body_bytes_sent,'
                     '"responsetime":$request_time,'
                     '"upstreamtime":$upstream_response_time,'
                     '"oh":"$upstream_addr",'
                     '"domain":"$host",'
                     '"url":"$uri",'
                     '"status":"$status"}}';
   – access_log /data/nginx/logs/access.json json;
• See
  http://cookbook.logstash.net/recipes/apache-json-logs/
Advanced Input -- json_event

• Now define input block with format:
     – input {
     –    stdin {
      –    type => "nginx"
      –    format => "json_event"
     –    }
     – }

• And start in command line:
     – tail -F /data/nginx/logs/access.json 
     – | sed 's/upstreamtime":-/upstreamtime":0/' 
     – | /usr/local/logstash/bin/logstash -f /usr/local/logstash/etc/agent.conf &
•   Attention: Upstreamtime may be “-” if status is 400.
Advanced Web GUI

• Write your own web GUI using the ElasticSearch RESTful API; a search looks like this:
  –   curl -XPOST http://es.domain.com:9200/logstash-2012.09.18/nginx/_search?pretty=1 -d '
      {
        "query": {
          "range": {
            "@timestamp": {
              "from": "now-1h",
              "to": "now"
            }
          }
        },
        "facets": {
          "curl_test": {
            "date_histogram": {
              "key_field": "@timestamp",
              "value_field": "url",
              "interval": "5m"
            }
          }
        },
        "size": 0
      }'
Additional Message::Passing demo

• I wrote a demo using the Message::Passing, Regexp::Log, ElasticSearch and other Perl modules that works much like the logstash setup shown here.
• See:
  – http://chenlinux.com/2012/09/16/message-passing-agent/
  – http://chenlinux.com/2012/09/16/regexp-log-demo-for-nginx/
  – http://chenlinux.com/2012/09/16/message-passing-filter-demo/
Reference

•   http://logstash.net/docs/1.1.1/tutorials/metrics-from-logs
•   http://logwrangler.mtcode.com/
•   https://www.virtualbox.org/wiki/Linux_Downloads
•   http://vagrantup.com/v1/docs/getting-started/index.html
•   http://www.elasticsearch.cn
•   http://search.cpan.org/~bobtfish/Message-Passing-0.010/lib/Message/Passing.pm