The Gnocchi Experiment
playing with timeseries
History
● Ceilometer started in 2012
○ Original mission: provide an infrastructure to collect any
information needed regarding OpenStack projects
● Added alarming in 2013
○ Create rules based on threshold conditions that, when broken,
trigger actions
● Added events in 2014
○ The state of an object in an OpenStack service at a point in time
● New mission
○ To reliably collect data on the utilization of the physical and
virtual resources comprising deployed clouds, persist these data for
subsequent retrieval and analysis, and trigger actions when defined
criteria are met
Ceilometer Architecture
[Diagram: OpenStack Services emit onto a Notification Bus consumed by
Notification Agents (Agent1…AgentN, each running a Pipeline); Polling
Agents (Agent1…AgentN, also with Pipelines) poll services directly.
Collectors (Collector1…CollectorN) persist Meters, Events, and Alarms
into the Databases, which an API exposes to External Systems; the
AlarmEvaluator and AlarmNotifier act on the stored data.]
this didn’t work.
Growing pains
● Too large a scope - we did everything
● Too complex - must deploy everything
● Too much data - all data in one place
● Too few resources - handful of developers
● Too generic a solution - storage designed to handle any
scenario
● Good at nothing, average/bad at everything
Ceilometer Architecture
[Diagram: the same pipeline, now split into components. OpenStack
Services feed the Notification Bus; Ceilometer's Notification Agents and
Polling Agents (Agent1…AgentN, with Pipelines) send data to Collectors
(Collector1…CollectorN). Metrics go to Gnocchi behind a MetricsAPI,
Events go to Panko behind an EventsAPI, and Alarms go to Aodh
(AlarmEvaluator, AlarmNotifier); External Systems consume the APIs.]
Componentisation
● Split functionality into its own projects
○ Faster rate of change
○ Less expertise required
● Important functionality lives on
● Ceilometer - data gathering and transformation service
● Gnocchi - time series storage service
● Aodh - alarming service
● Panko - event focused storage service
● They all work together and separately
Gnocchi
Gnocchi use cases
● Storage brick for a billing system
● Alarm-triggering or monitoring system
● Statistical usage of data
Ceilometer to Gnocchi
● Ceilometer legacy storage
captures full-resolution data
○ Each datapoint has:
Timestamp, measurement, IDs,
resource metadata, metric
metadata, etc…
● Gnocchi stores pre-aggregated
data in a time series
○ Each datapoint has:
Timestamp, measurement… that’s
it… and then it’s compressed
○ resource metadata is an
explicit subset AND not tied to
measurement
○ Defined archival rules
■ capture data at 1 min
granularity for 1 day AND
3 hr granularity for 1
month AND ...
Archive Policies
5 minute granularity for a day
1 day granularity for a year
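The two retention rules above can be written down as an archive-policy definition. A minimal sketch in Python, building the kind of policy body Gnocchi works with; the exact field names and helper here are illustrative, not Gnocchi's actual API:

```python
# Sketch: build an archive-policy definition matching the slide's example
# (5-minute granularity kept for a day, 1-day granularity kept for a year).
# Field names mirror Gnocchi's archive-policy concept; treat the exact
# shape as an assumption.

def archive_policy(name, definitions, aggregations=("mean", "min", "max")):
    """Return a dict describing an archive policy: for each (granularity,
    timespan) pair, timespan/granularity datapoints are retained."""
    return {
        "name": name,
        "aggregation_methods": list(aggregations),
        "definition": [
            {
                "granularity": gran_s,       # seconds between datapoints
                "timespan": span_s,          # how long to retain, in seconds
                "points": span_s // gran_s,  # datapoints stored per metric
            }
            for gran_s, span_s in definitions
        ],
    }

# 5 min for a day -> 288 points; 1 day for a year -> 365 points.
policy = archive_policy("low", [(300, 86400), (86400, 86400 * 365)])
```

Note how cheap this is to reason about: the policy alone bounds the number of stored points per metric, regardless of how fast samples arrive.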
How it all works...
Ceilometer
Raw sample
{
"user_id": "0d9d089b8f8340999fbe01354ef84643",
"resource_id": "a7c7cf84-5bf7-4838-a116-645ea376f4e0",
"timestamp": "2016-05-11T18:23:46.166000",
"meter": "disk.write.bytes",
"volume": 56114794496,
"source": "openstack",
"recorded_at": "2016-05-11T18:23:47.177000",
"project_id": "dec2b73655154e31be903fc93e575146",
"type": "cumulative",
"id": "7fbf56ca-17a5-11e6-a210-e8bdd1f62a56",
"unit": "B",
"metadata": {
"instance_host": "cloud03.wz",
"ephemeral_gb": "0",
"flavor.vcpus": "8",
"OS-EXT-AZ.availability_zone": "nova",
"memory_mb": "16384",
"display_name": "gord_dev",
"state": "active",
"flavor.id": "5",
"status": "active",
"ramdisk_id": "None",
"flavor.name": "m1.xlarge",
"disk_gb": "160",
"kernel_id": "None",
"image.id": "dba2c73c-3f11-45a1-998a-6a4ca2cf243e",
"flavor.ram": "16384",
"host":
"64fe410a8b602f69fe43a180c62b02d6c00e41c03caba40a092e2fb6",
"device": "['vda']",
"flavor.ephemeral": "0",
"image.name": "fedora-23-x86_64",
}
}
Separation of value
Resource
● Id
● User_id
● Project_id
● Start_timestamp: timestamp
● End_timestamp: timestamp
● Metadata: {attribute: value}
● Metric: list
Measurements
● [ (timestamp, value), ... ]
Metric
● Name
● archive_policy
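The separation above can be sketched as three plain data types: the resource carries the metadata and points at its metrics, while measurements are bare (timestamp, value) pairs. Class and field names follow the slide, not any actual Gnocchi code; the sample values are trimmed from the raw sample shown earlier:

```python
# Sketch of the value separation: resource metadata lives apart from the
# measurements, which carry nothing but a timestamp and a value.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Metric:
    name: str
    archive_policy: str
    measurements: List[Tuple[str, float]] = field(default_factory=list)

@dataclass
class Resource:
    id: str
    user_id: str
    project_id: str
    start_timestamp: str
    end_timestamp: Optional[str] = None
    metadata: Dict[str, str] = field(default_factory=dict)
    metrics: List[Metric] = field(default_factory=list)

vm = Resource(id="a7c7cf84", user_id="0d9d089b", project_id="dec2b736",
              start_timestamp="2016-05-11T18:23:46",
              metadata={"flavor.name": "m1.xlarge"})
vm.metrics.append(Metric("disk.write.bytes", "low",
                         [("2016-05-11T18:23:46", 56114794496.0)]))
```

Compare this with the raw Ceilometer sample: the metadata blob that was repeated on every datapoint now exists once, on the resource.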
Gnocchi Architecture
[Diagram: the API records resources in the Indexer and writes metric
data to Storage; MetricD computation workers process the stored data.]
MetricD Aggregation
[Diagram: (1) the API drops a raw metric dump into Metric Storage;
(2) MetricD computation workers pick it up; (3) they write back the
computed aggregates and a backlog.]
1. Get unprocessed datapoints
2. Compute new aggregations
a. Update sum, avg, min, max, etc…
values based on defined policy
3. Add datapoints to backlog for next
computation
a. Delete datapoints not required for
future aggregations
b. By default, only keep backlog for a
single period.
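The steps above can be sketched as a small processing loop: merge the retained backlog with the unprocessed points, recompute per-period aggregates, then keep only the most recent period's raw points for the next run. This is a simplification under stated assumptions; real MetricD handles multiple granularities, aggregation methods, and a configurable back window:

```python
# Sketch of the MetricD loop: aggregate per period, trim the backlog.

def process(backlog, new_points, period=300):
    """backlog/new_points are (timestamp, value) pairs; timestamps in
    seconds. Returns ({period_start: aggregates}, new_backlog)."""
    points = sorted(backlog + new_points)
    buckets = {}
    for ts, value in points:
        bucket = ts - ts % period            # start of the period
        buckets.setdefault(bucket, []).append(value)
    computed = {
        bucket: {"sum": sum(v), "avg": sum(v) / len(v),
                 "min": min(v), "max": max(v)}
        for bucket, v in buckets.items()
    }
    # By default only the most recent period is kept as backlog; older raw
    # points are deleted because no future aggregate will need them.
    last = max(buckets)
    new_backlog = [(ts, v) for ts, v in points if ts >= last]
    return computed, new_backlog

computed, backlog = process([], [(0, 1.0), (60, 3.0), (300, 5.0)])
```

Dropping raw points once they can no longer affect an aggregate is what keeps the storage footprint bounded.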
Storage format
Metric Storage
raw metric dump
computed aggregates
backlog
● [ (timestamp, value), (timestamp,value) ]
● One object per write
● { values: { timestamp: value, timestamp:value },
block_size: max number of points,
back_window: number of blocks to retain}
● Binary serialised using msgpack
● One object per metric
● { first_timestamp: first timestamp of block,
aggregation_method: sum, min, max, etc…,
max_size: max number of points,
sampling: granularity (60s, 300s, etc…),
timestamps: [ time1, time2, … ],
values: [value1, value2, … ]}
● Binary serialised using msgpack
● Compressed with LZ4
● Split into chunks to minimise transfer when updating large series
● (potentially) multiple objects per aggregate per granularity per metric
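The aggregate-block layout above can be exercised end to end. A sketch under stated assumptions: Gnocchi serialises with msgpack and compresses with LZ4, but here json and zlib stand in so the example needs only the standard library; the field values are illustrative:

```python
# Sketch of one computed-aggregate block. json + zlib stand in for
# Gnocchi's actual msgpack + LZ4; the structure is what matters.
import json
import zlib

block = {
    "first_timestamp": 1460016000,
    "aggregation_method": "mean",
    "max_size": 3600,                    # max number of points in the block
    "sampling": 300,                     # granularity in seconds
    "timestamps": [1460016000, 1460016300, 1460016600],
    "values": [0.190, 0.186, 0.183],
}

serialized = zlib.compress(json.dumps(block).encode())
restored = json.loads(zlib.decompress(serialized))
```

Parallel timestamp/value arrays compress far better than interleaved pairs, which is part of why the per-point cost ends up in single-digit bytes.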
Query path
[Diagram: "What's the cpu utilisation for VM1?" — the API resolves the
resource_id to a metric_id via the Resource Indexer, then reads the
measures (all granularities) from Metric Storage:]
+---------------------------+-------------+----------------+
| timestamp | granularity | value |
+---------------------------+-------------+----------------+
| 2016-04-07T00:00:00+00:00 | 86400.0 | 0.30323927544 |
| 2016-04-07T17:00:00+00:00 | 3600.0 | 1.2855184725 |
| 2016-04-07T18:00:00+00:00 | 3600.0 | 0.188613527791 |
| 2016-04-07T19:00:00+00:00 | 3600.0 | 0.188871232024 |
| 2016-04-07T20:00:00+00:00 | 3600.0 | 0.188876901916 |
| 2016-04-07T21:00:00+00:00 | 3600.0 | 0.189646641908 |
| 2016-04-07T21:10:00+00:00 | 300.0 | 0.190019839676 |
| 2016-04-07T21:15:00+00:00 | 300.0 | 0.186565358466 |
| 2016-04-07T21:20:00+00:00 | 300.0 | 0.183166934543 |
| 2016-04-07T21:25:00+00:00 | 300.0 | 0.179994544916 |
| 2016-04-07T21:30:00+00:00 | 300.0 | 0.186649908928 |
| 2016-04-07T21:35:00+00:00 | 300.0 | 0.193315212093 |
| 2016-04-07T21:40:00+00:00 | 300.0 | 0.193272093903 |
| 2016-04-07T21:45:00+00:00 | 300.0 | 0.196677374077 |
| 2016-04-07T21:50:00+00:00 | 300.0 | 0.193300189049 |
+---------------------------+-------------+----------------+
Query path
[Diagram: "What's the metadata for VM1?" — the API looks up the
resource_id in the Resource Indexer alone; Metric Storage is never
touched:]
+-----------------------+----------------------------------------------------------------+
| Field | Value |
+-----------------------+----------------------------------------------------------------+
| created_by_project_id | f7481a38d7c543528d5121fab9eb2b99 |
| created_by_user_id | 9246f424dcb341478067967f495dc133 |
| display_name | test3 |
| ended_at | None |
| flavor_id | 1 |
| host | 7f218c8350a86a71dbe6d14d57e8f74fa60ac360fee825192a6cf624 |
| id | e90974a6-31bf-4e47-8824-ca074cd9b47d |
| image_ref | 671375cc-177b-497a-8551-4351af3f856d |
| metrics | cpu.delta: 20cd1d71-de2f-43d5-90a8-b23ad31a7d04 |
| | cpu_util: 22cd22e7-e48e-4f21-887a-b1c6612b4c98 |
| | disk.iops: 9611a114-d37e-42e7-9b0c-0fb5e61d96c8 |
| | disk.latency: 6205c66f-2a5d-49c8-85e6-aa7572cfb34a |
| | disk.root.size: c9f9ca31-7e54-4dd7-81ad-129d86951dbc |
| | disk.usage: 4f29ca2e-d58f-40a9-94a7-15084233c1bb |
| original_resource_id | e90974a6-31bf-4e47-8824-ca074cd9b47d |
| project_id | 71bf402adea343609f2192ce998fa38e |
| revision_end | None |
| revision_start | 2016-04-07T17:32:33.245924+00:00 |
| server_group | None |
| started_at | 2016-04-07T17:32:25.740862+00:00 |
| type | instance |
| user_id | fd3eb127863b4177bf1abb38dda1f557 |
+-----------------------+----------------------------------------------------------------+
Zero computation at
query. Only lookup.
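The "zero computation" claim can be made concrete with a toy model of the two stores: the indexer maps a resource to its metric ids, and metric storage hands back already-computed aggregates verbatim. The ids and rows below are illustrative (taken from the sample output above), not a real deployment:

```python
# Sketch of the lookup-only query path: two dictionary reads, no math.
indexer = {
    "vm1": {"cpu_util": "22cd22e7"},                 # resource -> metric ids
}
storage = {
    "22cd22e7": [                                    # precomputed aggregates:
        ("2016-04-07T21:45:00", 300.0, 0.196677374077),  # (ts, granularity,
        ("2016-04-07T21:50:00", 300.0, 0.193300189049),  #  value)
    ],
}

def get_measures(resource, metric):
    metric_id = indexer[resource][metric]            # indexer lookup
    return storage[metric_id]                        # straight read, no compute

rows = get_measures("vm1", "cpu_util")
```

All aggregation cost was paid at write time by MetricD, so query latency reduces to index resolution plus object retrieval.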
Results (benchmark data, Gnocchi 1.3.x)
Ceilometer to Gnocchi
Ceilometer legacy storage
● Single datapoint averages to
~1.5KB/point (mongodb) or
~150B/point (SQL)
● For 1000 VM, capturing 10
metrics/VM, every minute:
~15MB/minute, ~900MB/hour,
~21GB/day, etc…
Gnocchi
● Single datapoint AT MOST is
9B/point
● For 1000 VM, capturing 10
metrics/VM, every minute:
~90KB/minute, ~5.4MB/hour,
~130MB/day, etc…