O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
Fluentd and Kafka
Hadoop / Spark Conference Japan 2016

Feb 8, 2016
Who are you?
• Masahiro Nakagawa
• github: @repeatedly
• Treasure Data Inc.
• Fluentd / td-agent developer
• Fluentd Enter...
Fluentd
• Pluggable streaming event collector
• Lightweight, robust and flexible
• Lots of plugins on rubygems
• Used by AW...
Popular case
App
Push
Push
Forwarder Aggregator Destination
• Distributed messaging system
• Producer - Broker - Consumer pattern
• Pull model, replication, etc











Apache Kaf...
Push vs Pull
• Push:
• Easy to transfer data to multiple destinations
• Hard to control stream ratio in multiple streams
•...
There are 2 ways
• fluent-plugin-kafka
• kafka-fluentd-consumer
fluent-plugin-kafka
• Input / Output plugin for kafka
• https://github.com/htgc/fluent-plugin-kafka
• in_kafka, in_kafka_gro...
Configuration example
<source>
@type kafka
topics web,system
format json
add_prefix kafka.
# more options
</source>
<match k...
kafka fluentd consumer
• Stand-alone kafka consumer for fluentd
• https://github.com/treasure-data/kafka-fluentd-consumer
• S...
Run consumer
• Edit log4j and fluentd-consumer properties
• Run following command:

$ java 

-Dlog4j.configuration=file:///pa...
Properties example
fluentd.tag.prefix=kafka.event.
fluentd.record.format=regexp # default is json
fluentd.record.pattern=(?<te...
With Fluentd example
<source>
@type forward
</source>
<source>
@type exec
command java -
Dlog4j.configuration=file:///path/t...
Conclusion
• Kafka is now becomes important component

on data platform
• Fluentd can communicate with Kafka
• Fluentd plu...
Próximos SlideShares
Carregando em…5
×

Fluentd and Kafka

12.269 visualizações

Publicada em

Talks at Hadoop / Spark Conference Japane 2016

Publicada em: Tecnologia
  • People used to laugh at me behind my back before I was in shape or successful. Once I lost a lot of weight, I was so excited that I opened my own gym, and began helping others. I began to get quite a large following of students, and finally, I didn't catch someone laughing at me behind my back any longer. CLICK HERE NOW ◆◆◆ http://t.cn/A6PnIVD3
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • Sex in your area is here: ♥♥♥ http://bit.ly/2F7hN3u ♥♥♥
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • Follow the link, new dating source: ♥♥♥ http://bit.ly/2F7hN3u ♥♥♥
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • 1 minute a day to keep your weight away! ◆◆◆ http://ishbv.com/1minweight/pdf
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui

Fluentd and Kafka

  1. 1. Fluentd and Kafka Hadoop / Spark Conference Japan 2016
 Feb 8, 2016
  2. 2. Who are you? • Masahiro Nakagawa • github: @repeatedly • Treasure Data Inc. • Fluentd / td-agent developer • Fluentd Enterprise support • I love OSS :) • D Language, MessagePack, The organizer of several meetups, etc…
  3. 3. Fluentd • Pluggable streaming event collector • Lightweight, robust and flexible • Lots of plugins on rubygems • Used by AWS, GCP, MS and more companies • Resources • http://www.fluentd.org/ • Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
  4. 4. Popular case App Push Push Forwarder Aggregator Destination
  5. 5. • Distributed messaging system • Producer - Broker - Consumer pattern • Pull model, replication, etc
 
 
 
 
 
 Apache Kafka App Push Pull Producer Broker DestinationConsumer
  6. 6. Push vs Pull • Push: • Easy to transfer data to multiple destinations • Hard to control stream ratio in multiple streams • Pull: • Easy to control stream flow / ratio • Should manage consumers correctly
  7. 7. There are 2 ways • fluent-plugin-kafka • kafka-fluentd-consumer
  8. 8. fluent-plugin-kafka • Input / Output plugin for kafka • https://github.com/htgc/fluent-plugin-kafka • in_kafka, in_kafka_group, out_kafka, out_kafka_buffered • Pros • Easy to use and output support • Cons • Performance is not primary
  9. 9. Configuration example <source> @type kafka topics web,system format json add_prefix kafka. # more options </source> <match kafka.**> @type kafka_buffered output_data_type msgpack default_topic metrics compression_codec gzip required_acks 1 </match> https://github.com/htgc/fluent-plugin-kafka#usage
  10. 10. kafka fluentd consumer • Stand-alone kafka consumer for fluentd • https://github.com/treasure-data/kafka-fluentd-consumer • Send cosumed events to fluentd’s in_forward • Pros • High performance and Java API features • Cons • Need Java runtime
  11. 11. Run consumer • Edit log4j and fluentd-consumer properties • Run following command:
 $ java 
 -Dlog4j.configuration=file:///path/to/log4j.properties 
 -jar path/to/kafka-fluentd-consumer-0.2.1-all.jar 
 path/to/fluentd-consumer.properties
  12. 12. Properties example fluentd.tag.prefix=kafka.event. fluentd.record.format=regexp # default is json fluentd.record.pattern=(?<text>.*) # for regexp format fluentd.consumer.topics=app.* # can use Java Rege fluentd.consumer.topics.pattern=blacklist # default is whitelist fluentd.consumer.threads=5 https://github.com/treasure-data/kafka-fluentd-consumer/blob/master/config/fluentd-consumer.properties
  13. 13. With Fluentd example <source> @type forward </source> <source> @type exec command java - Dlog4j.configuration=file:///path/to/ log4j.properties -jar /path/to/kafka- fluentd-consumer-0.2.1-all.jar /path/ to/config/fluentd- consumer.properties tag dummy format json </source> https://github.com/treasure-data/kafka-fluentd-consumer#run-kafka-consumer-for-fluentd-via-in_exec
  14. 14. Conclusion • Kafka is now becomes important component
 on data platform • Fluentd can communicate with Kafka • Fluentd plugin and kafka consumer • Building reliable and flexible data pipeline with
 Fluentd and Kafka

×