Fluentd - Flexible, Stable, Scalable

Fluentd - a data collector with support for many plugins
This is a talk about Fluentd at Taipei.py on Feb. 26th 2015.

  1. 1. Fluentd Flexible, Stable, Scalable Suiting @Taipei.py
  2. 2. Who am I? Suiting (@suitingtseng), Gogolook Inc., Data Team
  3. 3. Before
  4. 4. What is Fluentd? • Fluentd is an open source data collector, which lets you unify the data collection and consumption for a better use and understanding of data. • Treasure Data: td-agent
  6. 6. What is a log?
  7. 7. Log definition Time + Tag + Content
  8. 8. After
  9. 9. How? • Lightweight: C + Ruby + MessagePack • Pluggable architecture • Built-in Reliability
  10. 10. Input plugins • forward • tail • AWS Simple Queue Service • AWS CloudWatch
  11. 11. input: tail
          $ cat /etc/td-agent/conf.d
          <source>
            type tail
            path /var/log/nginx/access.log
            pos_file /var/log/td-agent/httpd-access.log.pos
            tag nginx.access
          </source>
          <match nginx.access>
            blah blah
          </match>
  12. 12. input: forward
          $ cat /etc/td-agent/conf.d
          <source>
            type forward
            port 24224
          </source>
          <match flask.index>
            blah blah
          </match>
  13. 13. input: forward
          $ cat ~/example.py
          from fluent import sender
          from fluent import event
          # Tag prefix 'flask' plus label 'index' gives the tag flask.index
          sender.setup('flask', host='localhost', port=24224)
          event.Event("index", {
              "user": "foo",
              "token": "bar",
              "action": "POST"
          })
  14. 14. Output plugins • forward • copy • Elasticsearch / MongoDB • statsd / InfluxDB / Graphite • S3 / GCS / BigQuery
  15. 15. output: elasticsearch
          $ cat /etc/td-agent/conf.d
          <source>
            foo bar
            tag nginx.access
          </source>
          <match nginx.access>
            type elasticsearch
            hosts es-host1,es-host2
            index_name nginx
            type_name access
            flush_interval 60s
          </match>
  16. 16. output: splunk
          $ cat /etc/td-agent/conf.d
          <source>
            foo bar
            tag nginx.access
          </source>
          <match nginx.access>
            type splunk
            hosts splunk-host1
          </match>
  17. 17. Filter plugins • grok • grep • record-modifier / record-reformer • geoip
  18. 18. Buffer types • Memory • File
  19. 19. Buffer example
          $ cat /etc/td-agent/conf.d
          <source>
            foo bar
            tag nginx.access
          </source>
          <match nginx.access>
            type splunk
            hosts splunk-host1
            buffer_chunk_limit 10m
            buffer_queue_limit 1000
            flush_interval 5m
          </match>
  20. 20. Scalability • Scale up: multi-process plugin • Scale out: out-forward plugin
  21. 21. [Diagram: three App + Fluentd nodes, one Fluentd node, four Elasticsearch nodes]
  22. 22. [Diagram: three App + Fluentd nodes, two Fluentd nodes, four Elasticsearch nodes]
  23. 23. [Diagram: three App + Fluentd nodes, three Fluentd nodes, four Elasticsearch nodes; labels: Load balance, Auto scaling group]
  24. 24. Stability • Auto retry • Persistent file buffer • At-most-once delivery
  25. 25. Message Delivery • At-most-once: data may be lost • At-least-once: data may be duplicated • Exactly-once: perfect
  26. 26. Idempotent • HTTP PUT • Maintain a unique id at the application level, or • Concatenate (instance-id, time, ….) as id
  27. 27. Gogolook use cases • MongoDB, nginx log • API, worker log • Monitor • Benchmark
  28. 28. Active users by day
  29. 29. System monitor
  30. 30. Queue monitor
  31. 31. Benchmark? [Diagram: App + Fluentd, Fluentd, DB]
  32. 32. Benchmark? [Diagram: App + Fluentd, Fluentd, DB, local files]
  34. 34. Q & A
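
Slides 25 and 26 suggest that duplicated deliveries become harmless when every write is keyed by a stable id. The following is a minimal sketch of that idea in plain Python, not the speaker's implementation. The `sequence` field, the `IdempotentStore` class, and the `deliver` helper are hypothetical names introduced only for illustration; the slides name just instance-id and time as id components and leave the rest elided.

```python
import hashlib


def make_event_id(instance_id: str, timestamp: float, sequence: int) -> str:
    """Build a deterministic id from fields that identify one event.

    `sequence` is an assumed per-instance counter, not something the slides specify.
    """
    raw = f"{instance_id}|{timestamp:.6f}|{sequence}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


class IdempotentStore:
    """Hypothetical stand-in for a destination with keyed writes (like HTTP PUT)."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, record):
        # Writing the same id again overwrites the entry instead of appending.
        self.docs[doc_id] = record


def deliver(store, events, retries=1):
    """Simulate duplicated delivery: every event is written 1 + retries times."""
    for _ in range(1 + retries):
        for instance_id, timestamp, sequence, record in events:
            store.put(make_event_id(instance_id, timestamp, sequence), record)


events = [
    ("i-0001", 1424937600.0, 0, {"action": "POST"}),
    ("i-0001", 1424937600.0, 1, {"action": "GET"}),
]
store = IdempotentStore()
deliver(store, events, retries=2)
print(len(store.docs))  # each event is stored once despite three deliveries
```

Because the id depends only on the event's own fields, a retried event maps to the same key and the destination ends up with one copy per event.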
