2. History
● Ceilometer started in 2012
○ Original mission: provide an infrastructure to collect any
information needed regarding OpenStack projects
● Added alarming in 2013
○ Create rules based on threshold conditions that trigger actions when broken
● Added events in 2014
○ The state of an object in an OpenStack service at a point in time
● New mission
○ To reliably collect data on the utilization of the physical and virtual resources comprising deployed clouds, persist these data for subsequent retrieval and analysis, and trigger actions when defined criteria are met
5. Growing pains
● Too large of a scope - we did everything
● Too complex - must deploy everything
● Too much data - all data in one place
● Too few resources - handful of developers
● Too generic a solution - storage designed to handle any
scenario
● Good at nothing, average/bad at everything
7. Componentisation
● Split functionality into separate projects
○ Faster rate of change
○ Less expertise required per project
● Important functionality lives on
● Ceilometer - data gathering and transformation service
● Gnocchi - time series storage service
● Aodh - alarming service
● Panko - event focused storage service
● They all work together and separately
9. Gnocchi use cases
● Storage brick for a billing system
● Alarm-triggering or monitoring system
● Statistical analysis of usage data
10. Ceilometer to Gnocchi
● Ceilometer legacy storage captures full-resolution data
○ Each datapoint has: timestamp, measurement, IDs, resource metadata, metric metadata, etc…
● Gnocchi stores pre-aggregated data in a time series
○ Each datapoint has: timestamp, measurement… that’s it… and then it’s compressed
○ Resource metadata is an explicit subset AND not tied to measurement
○ Defined archival rules (sketched below)
■ Capture data at 1 min granularity for 1 day AND 3 hr granularity for 1 month AND ...
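
A minimal sketch of defining such archival rules through Gnocchi’s v1 REST API. The endpoint URL, policy name, and token are illustrative placeholders, not values from the talk; real deployments authenticate via Keystone:

import requests

# Hypothetical archive policy matching the rules above:
# 1-minute granularity kept for 1 day, 3-hour granularity kept for 1 month.
policy = {
    "name": "demo-policy",  # placeholder name
    "definition": [
        {"granularity": "0:01:00", "timespan": "1 day"},
        {"granularity": "3:00:00", "timespan": "30 days"},
    ],
}

resp = requests.post(
    "http://localhost:8041/v1/archive_policy",   # assumed Gnocchi endpoint
    json=policy,
    headers={"X-Auth-Token": "ADMIN_TOKEN"},     # placeholder auth token
)
resp.raise_for_status()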
16. MetricD Aggregation
[Diagram: MetricD computation workers (1) read the raw metric dump from metric storage, (2) compute new aggregates, and (3) write the computed aggregates and a backlog back to metric storage]
1. Get unprocessed datapoints
2. Compute new aggregations
a. Update sum, avg, min, max, etc… values based on the defined policy
3. Add datapoints to the backlog for the next computation
a. Delete datapoints not required for future aggregations
b. By default, only keep backlog for a single period
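
A simplified sketch of this loop in Python; the names and structures are illustrative, not Gnocchi’s actual implementation:

from collections import defaultdict

GRANULARITY = 60  # seconds per aggregation period, set by the archive policy

def process(backlog, new_points, aggregates):
    # backlog / new_points: lists of (timestamp, value) tuples
    # aggregates: dict mapping period start -> {aggregation: value}
    # 1. Combine the retained backlog with the unprocessed datapoints
    points = sorted(backlog + new_points)
    if not points:
        return []
    # 2. Recompute aggregations for every period the points fall into
    by_period = defaultdict(list)
    for ts, value in points:
        by_period[ts - ts % GRANULARITY].append(value)
    for period, values in by_period.items():
        aggregates[period] = {
            "sum": sum(values),
            "avg": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
    # 3. Keep only the last (possibly still-incomplete) period as backlog,
    #    matching the default single-period back window; older points can
    #    never affect future aggregations, so they are dropped
    last_period = max(by_period)
    return [(ts, v) for ts, v in points if ts >= last_period]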
17. Storage format
● Raw metric dump
○ [ (timestamp, value), (timestamp, value) ]
○ One object per write
● Backlog
○ { values: { timestamp: value, timestamp: value },
block_size: max number of points,
back_window: number of blocks to retain }
○ Binary serialised using msgpack
○ One object per metric
● Computed aggregates
○ { first_timestamp: first timestamp of block,
aggregation_method: sum, min, max, etc…,
max_size: max number of points,
sampling: granularity (60s, 300s, etc…),
timestamps: [ time1, time2, … ],
values: [ value1, value2, … ] }
○ Binary serialised using msgpack
○ Compressed with LZ4
○ Split into chunks to minimise transfer when updating large series
○ (potentially) multiple objects per aggregate per granularity per metric
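
An illustrative round-trip of the aggregate object layout above, serialised the same way (msgpack, then LZ4). The field values are made up and this mirrors the slide’s field list, not Gnocchi’s exact byte format:

import lz4.frame
import msgpack

aggregate = {
    "first_timestamp": 1500000000,   # first timestamp of the block
    "aggregation_method": "mean",
    "max_size": 1440,                # max number of points
    "sampling": 60,                  # granularity in seconds
    "timestamps": [1500000000, 1500000060, 1500000120],
    "values": [0.25, 0.31, 0.28],
}

# Pack to binary, then compress, as the slide describes
blob = lz4.frame.compress(msgpack.packb(aggregate))
print(len(blob), "bytes on disk")

# Round-trip to verify the object survives intact
restored = msgpack.unpackb(lz4.frame.decompress(blob))
assert restored["sampling"] == 60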
22. Ceilometer to Gnocchi
Ceilometer legacy storage
● A single datapoint averages ~1.5KB/point (MongoDB) or ~150B/point (SQL)
● For 1,000 VMs, capturing 10 metrics/VM, every minute: ~15MB/minute, ~900MB/hour, ~21GB/day, etc…
Gnocchi
● A single datapoint is AT MOST 9B/point
● For 1,000 VMs, capturing 10 metrics/VM, every minute: ~90KB/minute, ~5.4MB/hour, ~130MB/day, etc…
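
The estimates follow directly from the per-point sizes: 1,000 VMs × 10 metrics at one sample per minute is 10,000 points/minute. A quick check of the slide’s own figures:

# Quick check of the slide's arithmetic (per-point sizes from the slide)
points_per_minute = 1000 * 10          # 1,000 VMs x 10 metrics, each minute

legacy = points_per_minute * 1500      # ~1.5KB/point (MongoDB)
gnocchi = points_per_minute * 9        # at most 9B/point

print(legacy / 1e6, "MB/min legacy")           # ~15 MB/min
print(legacy * 1440 / 1e9, "GB/day legacy")    # ~21.6 GB/day
print(gnocchi / 1e3, "KB/min Gnocchi")         # ~90 KB/min
print(gnocchi * 1440 / 1e6, "MB/day Gnocchi")  # ~129.6 MB/day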