Logging for Production Systems in The Container Era

Sadayuki Furuhashi 
Founder & Software Architect
DOCKER MOUNTAIN VIEW

A little about me…
Sadayuki Furuhashi
github: @frsyuki
A founder of Treasure Data, Inc., located in Silicon Valley.
An open-source hacker.
OSS projects I founded:
• Fluentd - Unified log collection infrastructure
• Embulk - Plugin-based ETL tool
• MessagePack - It's like JSON, but fast and small.

The Container Era
                       Server Era         Container Era
Service Architecture   Monolithic         Microservices
System Image           Mutable            Immutable
Managed By             Ops Team           DevOps Team
Local Data             Persistent         Ephemeral
Log Collection         syslogd / rsync    ?
Metrics Collection     Nagios / Zabbix    ?

How should log & metrics collection
be done in The Container Era?

The traditional logrotate + rsync on containers
[Diagram: the Applications in Containers A, B and C each write log files, which are copied to a Log Server.]
• Hard to analyze!! Complex text parsers
• High latency!! Must wait for a day
• Ephemeral!! Could be lost at any time

Small & many containers make storages overloaded
[Diagram: the Applications in Containers A to D on Servers 1 and 2 each connect directly to Kafka, Elasticsearch and HDFS.]
Too many connections from micro containers!

System images are immutable
[Diagram: the Applications in Containers A to D on Servers 1 and 2 each connect directly to Kafka, Elasticsearch and HDFS.]
Embedding destination IPs in ALL Docker images makes management hard

Combination explosion with microservices requires too many scripts for data integration
[Diagram: log sources and destinations wired together by one-off glue: scripts to parse data, a cron job for loading, a filtering script, aggregation scripts, a Tweet-fetching script, a syslog script and an rsync server.]

A solution: centralized log collection service
[Diagram: every log source sends to a single Log Service, which delivers to every destination.]

The centralized log collection service
We released it! (Apache License)

What's Fluentd?
An extensible & reliable data collection tool
• Simple core + variety of plugins
• Buffering, HA (failover), secondary output, etc.
• Like syslogd

How to collect logs from 
Docker containers

Text logging with --log-driver=fluentd
[Diagram: on the Server, the App's STDOUT / STDERR in the Container is sent to Fluentd.]

docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224

{
  "container_id": "ad6d5d32576a",
  "container_name": "myapp",
  "source": "stdout"
}
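
As a rough illustration of what the log driver does with each line the app prints, the sketch below wraps one STDOUT / STDERR line into a record like the one above. The `wrap_line` helper is hypothetical and only mimics the shape of the data; the `log` field carrying the message text is not on the slide and is an assumption based on Docker's documentation of the fluentd driver.

```python
def wrap_line(container_id, container_name, source, line):
    """Wrap one output line of a container into a structured record.

    Hypothetical helper for illustration only; Docker does this itself.
    """
    return {
        "container_id": container_id,
        "container_name": container_name,
        "source": source,            # "stdout" or "stderr"
        "log": line.rstrip("\n"),    # message text (assumed field, not on the slide)
    }


record = wrap_line("ad6d5d32576a", "myapp", "stdout", "hello world\n")
print(record)
```

The app itself stays unchanged: it keeps printing plain text, and the structure is added outside the container.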

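Metrics collection with fluent-logger
[Diagram: on the Server, the App in the Container sends events to Fluentd through the fluent-logger library.]

from fluent import sender
from fluent import event

# Set the tag prefix and the Fluentd address once at startup
sender.setup('app.events', host='localhost')

# Emit one event; the label 'purchase' is appended to the tag prefix
event.Event('purchase', {
    'user_id': 21, 'item_id': 321, 'value': 1
})

tag = app.events.purchase
{
  "user_id": 21,
  "item_id": 321,
  "value": 1
}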
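
The tag shown above is the prefix given to `sender.setup` joined with the label given to `event.Event`. A stdlib-only sketch of that composition follows; `build_event` is a hypothetical name, not part of fluent-logger, and the timestamp value is made up for the example.

```python
import time


def build_event(prefix, label, record, now=None):
    """Compose the full tag and pair it with a timestamp and the record."""
    tag = f"{prefix}.{label}" if label else prefix
    timestamp = int(now if now is not None else time.time())
    return tag, timestamp, record


tag, timestamp, record = build_event(
    "app.events", "purchase",
    {"user_id": 21, "item_id": 321, "value": 1},
    now=1454549631,  # fixed time so the example is reproducible
)
print(tag)  # app.events.purchase
```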

Logging methods for each purpose
• Collecting log messages
> --log-driver=fluentd
• Application metrics
> fluent-logger
• Access logs, logs from middleware
> Shared data volume
• System metrics (CPU usage, disk capacity, etc.)
> Fluentd's input plugins (Fluentd pulls this data periodically)
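
For the last item, the pull model can be pictured with a small Python sketch. This is not how Fluentd's input plugins are implemented; `disk_record` and `poll` are hypothetical helpers that only show the idea of sampling a system metric at a fixed interval.

```python
import shutil
import time


def disk_record(path="/"):
    """Build one metrics record describing disk capacity at `path`."""
    usage = shutil.disk_usage(path)
    ratio = usage.used / usage.total if usage.total else 0.0
    return {
        "path": path,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "used_ratio": round(ratio, 4),
    }


def poll(collect, interval_seconds, rounds):
    """Call `collect` every `interval_seconds`, `rounds` times in total."""
    records = []
    for i in range(rounds):
        records.append(collect())
        if i < rounds - 1:
            time.sleep(interval_seconds)
    return records


samples = poll(disk_record, interval_seconds=0.1, rounds=2)
print(samples[-1])
```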

Primitive deployment…
[Diagram: the Applications in Containers A to D on Servers 1 and 2 each connect directly to Kafka, Elasticsearch and HDFS.]
Too many connections from many containers!
Embedding destination IPs in ALL Docker images makes management hard

Source aggregation decouples config from apps
[Diagram: each server runs a local Fluentd; Containers A to D send to the Fluentd on their own server, which forwards to Kafka, Elasticsearch and HDFS.]
The destination is always localhost from the app's point of view

Destination aggregation makes storages scalable for high traffic
[Diagram: the Fluentd on Server 1 and Server 2 forwards to Aggregation server(s), with active / standby / load balancing.]

Aggregation servers
• Logging directly from microservices makes log storages overloaded.
> Too many RX connections
> Too frequent import API calls
• Aggregation servers make the logging infrastructure more reliable and scalable.
> Connection aggregation
> Buffering for less frequent import API calls
> Data persistency during downtime
> Automatic retry at recovery from downtime
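
The "buffering for less frequent import API calls" point can be sketched in a few lines of Python. This is only an illustration of batching, not Fluentd's buffer implementation; `BatchingBuffer` and the batch size of 100 are assumptions made for the example.

```python
class BatchingBuffer:
    """Collect records and hand them to `send_batch` in groups."""

    def __init__(self, send_batch, max_records):
        self.send_batch = send_batch
        self.max_records = max_records
        self.pending = []

    def append(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.max_records:
            self.flush()

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, []
            self.send_batch(batch)


api_calls = []                        # each element is one simulated import API call
buffer = BatchingBuffer(api_calls.append, max_records=100)
for i in range(250):                  # 250 records arriving from many containers
    buffer.append({"seq": i})
buffer.flush()                        # send whatever is left
print(len(api_calls))                 # 3 calls instead of 250
```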

Internal Architecture (simplified)
Input Plugin → Filter Plugin → Buffer Plugin → Output Plugin

An event consists of:
Time: 2012-02-04 01:33:51
Tag: myapp.buylog
Record:
{
  "user": "me",
  "path": "/buyItem",
  "price": 150,
  "referer": "/landing"
}
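
The three parts of an event map naturally onto a small data type. The sketch below uses the values from the slide; the `Event` class is a hypothetical Python model, not a Fluentd API.

```python
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    time: datetime   # when the event happened
    tag: str         # dot-separated name used for routing
    record: dict     # the JSON-like payload


event = Event(
    time=datetime.strptime("2012-02-04 01:33:51", "%Y-%m-%d %H:%M:%S"),
    tag="myapp.buylog",
    record={"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
print(event.tag.split("."))  # ['myapp', 'buylog']
```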

Architecture: Input Plugins
• Receive logs
• Or pull logs from data sources
• In non-blocking manner
Examples: HTTP+JSON (in_http), File tail (in_tail), Syslog (in_syslog), …

Architecture: Filter Plugins
• Transform logs
• Filter out unnecessary logs
• Enrich logs
Examples: Encrypt personal data, Convert IP to countries, Parse User-Agent, …
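
The three filter roles (transform, filter out, enrich) can be sketched as plain functions over a record. Everything here is hypothetical illustration in Python rather than a Fluentd plugin, and SHA-256 hashing is used as a simple stand-in for the "encrypt personal data" example.

```python
import hashlib


def drop_debug(record):
    """Filter out: returning None means the record is discarded."""
    return None if record.get("level") == "debug" else record


def mask_email(record):
    """Transform: replace the email with its SHA-256 digest (stand-in for encryption)."""
    if "email" in record:
        digest = hashlib.sha256(record["email"].encode("utf-8")).hexdigest()
        record = {**record, "email": digest}
    return record


def add_hostname(record, hostname="web-1"):
    """Enrich: attach extra context to the record."""
    return {**record, "hostname": hostname}


def apply_filters(record, filters):
    """Run the record through each filter in order, stopping if it is dropped."""
    for f in filters:
        record = f(record)
        if record is None:
            return None
    return record


chain = [drop_debug, mask_email, add_hostname]
kept = apply_filters({"level": "info", "email": "me@example.com"}, chain)
dropped = apply_filters({"level": "debug"}, chain)
print(kept, dropped)
```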

Architecture: Buffer Plugins
• Improve performance
• Provide reliability
• Provide thread-safety
Examples: Memory (buf_memory), File (buf_file)

Architecture: Output Plugins
• Write or send event logs
Examples: File (out_file), Amazon S3 (out_s3), MongoDB (out_mongo), …

Architecture: Buffer Plugins
[Diagram: Input appends events to a Chunk in the Buffer; queued Chunks are passed to Output.]

Divide & Conquer for retry
[Diagram: the stream is divided into batches; a batch that hits an error is retried.]
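
One way to read this slide: because the stream is cut into chunks, a failure only forces a retry of the chunk that failed, not of the whole stream. The Python sketch below illustrates that reading; the helper names and the simulated `flaky_send` destination are assumptions, not Fluentd code.

```python
def split_into_chunks(records, chunk_size):
    """Divide the stream into fixed-size batches."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def flush_chunks(chunks, send, max_attempts):
    """Send each chunk, retrying only the chunk that failed.

    Returns the chunks that still failed after `max_attempts`.
    """
    failed = []
    for chunk in chunks:
        for _ in range(max_attempts):
            try:
                send(chunk)
                break                # this chunk is done, move to the next one
            except IOError:
                pass                 # try the same chunk again
        else:
            failed.append(chunk)     # every attempt failed
    return failed


remaining_failures = {4: 1}          # the chunk starting with record 4 fails once
sent = []


def flaky_send(chunk):
    """Simulated destination with one temporary failure."""
    if remaining_failures.get(chunk[0], 0) > 0:
        remaining_failures[chunk[0]] -= 1
        raise IOError("temporary failure")
    sent.append(chunk)


chunks = split_into_chunks(list(range(10)), chunk_size=4)
failed = flush_chunks(chunks, flaky_send, max_attempts=3)
print(sent, failed)
```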

Divide & Conquer for recovery
[Diagram: when the destination returns an error or is overloaded, chunks stay queued in the Buffer (on-disk or in-memory); the queued chunks are drained on recovery, with flow control.]

Streaming from Apache/Nginx to Elasticsearch
[Diagram: in_tail reads /var/log/access.log; buf_file buffers events under /var/log/fluentd/buffer before they are sent to Elasticsearch.]

Error Handling and Recovery
[Diagram: in_tail reads /var/log/access.log; buf_file buffers events under /var/log/fluentd/buffer.]
• Buffering for any outputs
• Retrying automatically
> With exponential wait
> and persistence on a disk
> and secondary output
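
Retrying with exponential wait and falling back to a secondary output can be sketched as follows. This is a simplified Python model under assumed parameters (`initial_wait`, `max_retries`), not Fluentd's actual retry logic, and it leaves out on-disk persistence.

```python
import time


def flush_with_retry(chunk, primary, secondary, initial_wait, max_retries,
                     sleep=time.sleep):
    """Try `primary`; wait twice as long after each failure; then use `secondary`."""
    wait = initial_wait
    for attempt in range(max_retries + 1):
        try:
            primary(chunk)
            return "primary"
        except IOError:
            if attempt < max_retries:
                sleep(wait)
                wait *= 2            # exponential wait
    secondary(chunk)                 # give up on the primary destination
    return "secondary"


def always_down(chunk):
    """Simulated primary destination that never recovers."""
    raise IOError("destination unavailable")


waits, rescued = [], []
result = flush_with_retry(["a", "b"], always_down, rescued.append,
                          initial_wait=1.0, max_retries=3,
                          sleep=waits.append)  # record waits instead of sleeping
print(result, waits)
```

Passing `sleep=waits.append` keeps the example instant while still showing the wait schedule.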

Tailing & parsing files
[Diagram: your app writes events.log; in_tail reads the log file and keeps its read position in a pos file.]
• Read a log file
• Custom regexp
• Custom parser in Ruby
Supported built-in formats:
• apache
• apache_error
• apache2
• nginx
• json
• csv
• tsv
• ltsv
• syslog
• multiline
• none
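
To give a feel for the "custom regexp" option, here is a Python sketch that turns one access-log line into a record using named groups. The pattern and the sample line are made up for the example and are not the expression Fluentd uses for its built-in formats.

```python
import re

# Hypothetical pattern for a common access-log layout
ACCESS_LINE = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a record for a matching line, or None if it does not match."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["code"] = int(record["code"])   # numeric fields become numbers
    return record


line = '192.0.2.1 - - [04/Feb/2012:01:33:51 +0000] "GET /buyItem HTTP/1.1" 200 777'
record = parse_line(line)
print(record)
```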

Out to Multiple Locations
[Diagram: in_tail reads access.log into a buffer, which feeds several outputs.]
• Routing based on tags
• Copy to multiple storages
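
Tag-based routing with a copy to several storages can be sketched like this. The matcher is deliberately simplified (exact match, or a trailing `.**` meaning "this tag and anything below it"), and the store names are hypothetical; it is an illustration in Python, not Fluentd's router.

```python
def tag_matches(pattern, tag):
    """Simplified matching: 'a.**' matches 'a' and any deeper tag, else exact match."""
    if pattern.endswith(".**"):
        prefix = pattern[:-3]
        return tag == prefix or tag.startswith(prefix + ".")
    return pattern == tag


def route(tag, record, routes):
    """Deliver the record to every output of the first matching route."""
    for pattern, outputs in routes:
        if tag_matches(pattern, tag):
            for out in outputs:
                out(record)
            return True
    return False


search_store, archive_store, metrics_store = [], [], []
routes = [
    ("access.**", [search_store.append, archive_store.append]),  # copy to two storages
    ("metrics.**", [metrics_store.append]),
]
route("access.nginx", {"path": "/buyItem"}, routes)
route("metrics.cpu", {"usage": 0.4}, routes)
print(search_store, archive_store, metrics_store)
```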

Example configuration for a real-time + batch combo

Data partitioning by time on HDFS / S3
[Diagram: in_tail reads access.log into a buffer; the output writes time-sliced files.]
• Custom file formatter
• Slice files based on time
> 2016-01-01/01/access.log.gz
> …
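
Slicing by time amounts to deriving the object path from each event's timestamp. A minimal Python sketch, assuming an hourly `date/hour/` layout like the path on the slide; `partition_path` and `group_by_partition` are hypothetical helpers.

```python
from collections import defaultdict
from datetime import datetime


def partition_path(event_time, filename="access.log.gz"):
    """Build a date/hour path for the given timestamp."""
    return event_time.strftime("%Y-%m-%d/%H/") + filename


def group_by_partition(events):
    """Group (time, record) pairs by the file they belong to."""
    groups = defaultdict(list)
    for event_time, record in events:
        groups[partition_path(event_time)].append(record)
    return dict(groups)


events = [
    (datetime(2016, 1, 1, 1, 5), {"path": "/a"}),
    (datetime(2016, 1, 1, 1, 50), {"path": "/b"}),
    (datetime(2016, 1, 1, 2, 0), {"path": "/c"}),
]
groups = group_by_partition(events)
print(sorted(groups))
```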

3rd party input plugins
• dstat
• df
• AMQP
• munin
• jvmwatcher
• SQL

3rd party output plugins
• AMQP
• Graphite

Microsoft
Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources."
[Diagram: on a Linux computer, omsagent collects from Syslog, the operating system, Apache, MySQL and containers, alongside omsconfig (DSC), PS DSC providers and the OMI Server (CIM Server); through a firewall/proxy it uploads data to the OMS Service over HTTPS and pulls configuration over HTTPS.]

Atlassian
"At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline."
[Diagram labels: Kinesis, Elasticsearch cluster, Ingestion service]

Amazon Web Services
The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook's Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe.
[Diagram columns: Types of Data, Collect, Store]
Types of Data:
• Transactional: database reads & writes (OLTP), cache
• Search: logs, streams
• File: log files (/var/log), log collectors & frameworks
• Stream: log records, sensors & IoT data
Collect: Web Apps, Mobile Apps, Applications, Logging, IoT
Store: Database, Search, File Storage, Stream Storage
