The document summarizes the new plugin API in Fluentd v0.14. Key points include:
- The v0.12 plugin API was fragmented and difficult to write tests for. The v0.14 API provides a unified architecture.
- The main plugin classes are Input, Filter, Output, Buffer, and plugins must subclass Fluent::Plugin::Base.
- The Output plugin supports both buffered and non-buffered processing. Buffering can be configured by tags, time, or custom fields.
- "Owned" plugins like Buffer are instantiated by primary plugins and can access owner resources. Storage is a new owned plugin for persistent storage.
- New test drivers emulate plugin
3. Topics
• Why Fluentd v0.14 has a new API set for plugins
• Compatibility of v0.12 plugins/configurations
• Plugin APIs: Input, Filter, Output & Buffer
• Storage Plugin, Plugin Helpers
• New Test Drivers for plugins
• Plans for v0.14.x & v1
5. Fluentd v0.12 Plugins
• No supports to write plugins by Fluentd core
• plugins creates threads, sockets, timers and event loops
• writing tests is very hard and messy with sleeps
• Fragmented implementations
• Output, BufferedOutput, ObjectBufferedOutput and TimeSlicedOutput
• Mixture of configuration parameters from output&buffer
• Uncontrolled plugin instance lifecycle (no "super" in start/shutdown)
• Imperfect buffering control and useless configurations
• the reason why fluent-plugin-forest exists and be used widely
6. Fluentd v0.12 Plugins
• Insufficient buffer chunking control
• only by size, without number of events in chunks
• Forcedly synchronized buffer flushing
• no way to flush-and-commit chunks asynchronously
• Ultimate freedom for using mix-ins
• everything overrides Plugin#emit ... (the only one entry point for
events to plugins)
• no valid hook points to get metrics or something else
• Bad Ruby coding rules and practices
• too many classes at "Fluent::*" in fluent/plugin, no "require", ...
9. Compatibility of plugins
• v0.12 plugins are subclass of Fluent::*
• Fluent::Input, Fluent::Filter, Fluent::Output, ...
• Compatibility layers for v0.12 plugins in v0.14
• Fluent::Compat::Klass -> Fluent::Klass (e.g., Input, Output, ...)
• it provides transformation of:
• namespaces, configuration parameters
• internal APIs, argument objects
• IT SHOULD WORK, except for :P
• 3rd party buffer plugin, part of test code
• "Engine.emit"
10. Compatibility of configurations
• v0.14 plugins have another set of parameters
• many old-fashioned parameters are removed
• "buffer_type", "num_threads", "timezone", "time_slice_format",
"buffer_chunk_limit", "buffer_queue_limit", ...
• Plugin helper "compat_parameters"
• transform parameters between v0.12 style
configuration and v0.14 plugin
v0.12 v0.14
convert
internally
11. FAQ:
Can we create plugins like this?
* it uses v0.14 API
* it runs on Fluentd v0.12
Impossible :P
13. v0.14 plugin classes
• All files MUST be in `fluent/plugin/*.rb` (in gems)
• or just a "*.rb" file in directory specified by "-r"
• All classes MUST be under Fluent::Plugin
• All plugins MUST be subclasses of Fluent::Plugin::Base
• All plugins MUST call `super` in methods overriding
default implementation (e.g., #configure, #start, #shutdown, ...)
14. Classes hierarchy (v0.12)
Fluent::Input F::Filter
F::Output
BufferedOutput
Object
Buffered
Time
Sliced
Multi
Output F::Buffer
F::Parser
F::Formatter
3rd party plugins
15. Classes hierarchy (v0.14)
F::P::Input F::P::Filter F::P::Output
Fluent::Plugin::Base
F::P::Buffer
F::P::Parser
F::P::Formatter
F::P::Storage
both of
buffered/non-buffered
F::P::
BareOutput
(not for 3rd party
plugins)
F::P::
MultiOutput
copy
roundrobin
16. Tour of New Plugin APIs:
Fluent::Plugin::Input
17. Fluent::Plugin::Input
• Nothing changed :)
• except for overall rules
• But it's much easier
to write plugins
than v0.12 :)
• fetch HTTP resource per
specified interval
• parse response body
with format specified in
config
• emit parse result
19. Tour of New Plugin APIs:
Fluent::Plugin::Filter
20. Fluent::Plugin::Filter
• Almost nothing changed :)
• Required:
#filter(tag, time, record)
#=> record | nil
• Optional:
#filter_stream(tag, es)
#=> event_stream
21. Tour of New Plugin APIs:
Fluent::Plugin::Output
22. Fluent::Plugin::Output
• Many things changed!
• Merged Output, BufferedOutput, ObjectBufferedOutput, TimeSlicedOutput
• Output plugins can be
• with buffering
• without buffering
• both (do/doesn't buffering by configuration)
• Buffers chunks events by:
• byte size, interval, tag
• number of records (new!)
• time (by any unit(new!): 30s, 5m, 15m, 3h, ...)
• any specified field in records (new!)
• any combination of above (new!)
24. Output Plugin:
Methods to be implemented
• Non-buffered: #process(tag, es)
• Buffered synchronous: #write(chunk)
• Buffered Asynchronous: #try_write(chunk)
• New feature for destinations with huge latency to write
chunks
• Plugins must call #commit_write(chunk_id) (otherwise,
#try_write will be retried)
• Buffered w/ custom format: #format(tag, time, record)
• Without this method, output uses standard format
25. implement?
#process
implement?
#process or #write or #try_write
NO error
YES
#prefer_buffered_processing
called (default true)
NO
non-buffered
YES
exists?
<buffer> section
YES implement?
#write or #try_write
error
NO
YES
implement?
#write or
#try_write
NO
NO
YES
false
implement?
#write and
#try_write
YES
#prefer_delayed_commit
called (default true)
implement?
#try_write
sync
buffered
async
buffered
26. In other words :P
• If users configure "<buffer>" section
• plugin try to do buffering
• Else if plugin implements both (buffering/non-buf)
• plugin call #prefer_buffer_processing to decide
• Else plugin does as implemented
• When plugin does buffering
If plugin implements both (sync/async write)
• plugin call #prefer_delayed_commit to decide
• Else plugin does as implemented
27. Delayed commit (1)
• high latency #write operations locks a flush thread for long time
(e.g., ACK in forward)
destination w/ high latency
#write
Output Plugin
send data send ACK
return #write
a flush thread locked
29. Use cases: delayed commit
• Forward protocol w/ ACK
• Distributed file systems or databases
• put data -> confirm to read data -> commit
• Submit tasks to job queues
• submit a job -> detect executed -> commit
30. Standard chunk format
• Buffering w/o #format method
• Almost same with ObjectBufferedOutput
• No need to implement #format always
• Implement it for performance/low-latency
• Tool to dump & read buffer chunks on disk w/
standard format
• To be implemented in v0.14.x :)
31. <buffer CHUNK_KEYS>
• comma-separated tag, time or ANY_KEYS
• Nothing specified: all events are in same chunk
• flushed when chunk is full
• (optional) "flush_interval" after first event in chunk
• tag: events w/ same tag are in same chunks
• time: buffer chunks will be split by timekey
• timekey: unit of time to be chunked (1m, 15m, 3h, ...)
• flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records
32. • comma-separated tag, time or ANY_KEYS
• Nothing specified: all events are in same chunk
• flushed when chunk is full
• (optional) "flush_interval" after first event in chunk
• tag: events w/ same tag are in same chunks
• time: buffer chunks will be split by timekey
• timekey: unit of time to be chunked (1m, 15m, 3h, ...)
• flushed after expiration of timekey unit + timekey_wait
• ANY_KEYS: any key names in records
<buffer CHUNK_KEYS>
BufferedOutput
TimeSlicedOutput
ObjectBufferedOutput
in v0.12
in v0.12
in v0.12
33. configurations:
flushing buffers
• flush_mode: lazy, interval, immediate
• default: lazy if "time" specified, otherwise interval
• flush_interval, flush_thread_count
• flush_thread_count: number of threads for flushing
• delayed_commit_timeout
• output plugin will retry #try_write when expires
34. Retries, Secondary
• Explicit timeout for retries:
• retry_timeout: timeout not to retry anymore
• retry_max_times: how many times to retry
• retry_type: "periodic" w/ fixed retry_wait
• retry_secondary_threshold (percentage)
• output will use secondary if specified percentage
of retry_timeout elapsed after first error
35. Buffer parameters
• chunk_limit_size
• maximum bytesize per chunks
• chunk_records_limit (default: not specified)
• maximum number of records per chunks
• total_limit_size
• maximum bytesize which a buffer plugin can use
• (optional) queue_length_limit: no need to specify
36. Chunk metadata
• Stores various information of buffer chunks
• key-values of chunking unit
• number of records
• created_at, modified_at
• `chunk.metadata`
• extract_placeholders(@path, chunk.metadata)
38. Classes hierarchy (v0.14)
F::P::Input F::P::Filter F::P::Output
Fluent::Plugin::Base
F::P::Buffer
F::P::Parser
F::P::Formatter
F::P::Storage
both of
buffered/non-buffered
F::P::
BareOutput
(not for 3rd party
plugins)
F::P::
MultiOutput
copy
roundrobin
39. Classes hierarchy (v0.14)
F::P::Input F::P::Filter F::P::Output
Fluent::Plugin::Base
F::P::Buffer
F::P::Parser
F::P::Formatter
F::P::Storage
both of
buffered/non-buffered
F::P::
BareOutput
(not for 3rd party
plugins)
F::P::
MultiOutput
copy
roundrobin"Owned" plugins
40. "Owned" plugins
• Primary plugins: Input, Output, Filter
• Instantiated by Fluentd core
• "Owned" plugins are owned by primary plugins
• Buffer, Parser, Formatter, Storage, ...
• It can refer owner's plugin id, logger, ...
• Fluent::Plugin.new_xxx("kind", parent:@input)
• "Owned" plugins can be configured by owner plugins
41. Owner plugins can control defaults of owned plugins
Fluentd provides standard way to configure owned
plugins
42. Tour of New Plugin APIs:
Fluent::Plugin::Storage
43. Storage plugins
• Pluggable Key-Value store for plugins
• configurable: autosave, persistent, save_at_shutdown
• get, fetch, put, delete, update (transactional)
• Various possible implementations
• built-in: local (json) on-disk / on-memory
• possible: Redis, Consul,
or whatever supports serialize/deserialize of json-like object
• To store states of plugins:
• counter values of data-counter plugin
• pos data of file plugin
• To load configuration dynamically for plugins:
• load configurations from any file systems
45. Plugin Helpers
• No more mixin!
• declare to use helpers by "helpers :name"
• Utility functions to support difficult things
• creating threads, timers, child processes...
• created timers will be stopped automatically in
plugin's shutdown sequence
• Integrated w/ New Test Drivers
• tests runs after helpers started everything requested
49. New Test Drivers
• Instead of old drivers Fluent::Test::*TestDriver
• Fluent::Test::Driver::Input, Output or Filter
• fully emulates actual plugin behavior
• w/ override SystemConfig
• capturing emitted events & error event streams
• inserting TestLogger to capture/test logs of plugins
• capturing "format" result of output plugins
• controlling "flush" timing of output plugins
• Running tests under control
• Plugin Helper integration
• conditions to keep/break running tests
• timeouts, number of emits/events to stop tests
• automatic start/shutdown call for plugins
51. New Features
• Symmetric multi processing
• to use 2 or more CPU cores!
• by sharing a configuration between all processes
• "detach_process" will be deprecated
• forward: TLS + authentication/authorization support
• secure-forward integration
• Buffer supports compression & forward it
• Plugin generator & template
52. New APIs
• Controlling global configuration from SystemConfig
• configured via <system> tag
• root buffer path + plugin id: remove paths from
each buffers
• process total buffer size control
• Counter APIs
• counting everything over processes via RPC
• creating metrics for a whole fluentd cluster
54. v1: stable version of v0.14
• v0.12 plugins will be still supported at v1.0.0
• deprecated, and will be obsoleted at v1.x
• Will be obsoleted:
• v0 (traditional) configuration syntax
• "detach_process" feature
• Q4 2016?
55. To Be Written by me :-)
• As soooooooooon as possible...
• Plugin developers' guide for
• Updating v0.12 plugins with v0.14 APIs
• Writing plugins with v0.14 APIs
• Writing tests of plugins with v0.14 APIs
• Users' guide for
• How to use buffering in general (w/ <buffer>)
• Updated plugin documents