Identifying (and fixing) oslo.messaging & RabbitMQ issues
Michael Klishin, Pivotal
Dmitry Mescheryakov, Mirantis
What is oslo.messaging?
● Library for
○ building RPC clients/servers
○ emitting/handling notifications
● Supports several backends:
○ RabbitMQ
■ based on Kombu - the oldest and best-known driver (the one this talk covers)
■ based on Pika - recent addition
○ AMQP 1.0
Spawning a VM in Nova
[Diagram: a Client talks to nova-api over HTTP; the nova-api, nova-conductor, nova-scheduler and nova-compute processes talk to each other over RPC]
Examples
Internal:
● nova-compute sends a report to nova-conductor every minute
● nova-conductor sends a command to spawn a VM to nova-compute
● neutron-l3-agent requests router list from neutron-server
● …
External:
● Every OpenStack service sends notifications to Ceilometer
Where is RabbitMQ in this picture?
[Diagram: RPC between nova-conductor and nova-compute goes through RabbitMQ. Labels: request, queue compute.node-1.domain.tld; response, queue reply_b6686f7be58b4773a2e0f5475368d19a]
Spotting oslo.messaging logs
2016-04-15 11:16:57.239 16181 DEBUG nova.service [req-d83ae554-7ef5-4299-82ce-3f70b00b6490 - - - - -] Creating RPC server for service scheduler start /usr/lib/python2.7/dist-packages/nova/service.py:218
2016-04-15 11:16:57.258 16181 DEBUG oslo.messaging._drivers.pool [req-d83ae554-7ef5-4299-82ce-3f70b00b6490 - - - - -] Pool creating new connection create /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/pool.py:109
My favorite oslo.messaging exception
...
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 420, in _send
    result = self._waiter.wait(msg_id, timeout)
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 318, in wait
    message = self.waiters.get(msg_id, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 223, in get
    'to message ID %s' % msg_id)
MessagingTimeout: Timed out waiting for a reply to message ID 9e4a677887134a0cbc134649cd46d1ce
oslo.messaging operations
● Cast - fire an RPC request and forget about it
● Notify - same as Cast, only the message format is different
● Call - send an RPC request and receive a reply
Call raises a MessagingTimeout exception when a reply isn't received within a certain amount of time
Making a Call
1. Client -> request -> RabbitMQ
2. RabbitMQ -> request -> Server
3. Server processes the request and produces the response
4. Server -> response -> RabbitMQ
5. RabbitMQ -> response -> Client
If the process gets stuck at any step from 2 to 5, the client gets a MessagingTimeout exception.
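The five steps can be imitated in-process to see where the timeout comes from. The sketch below is only an illustration: it uses standard-library queues in place of RabbitMQ, and `call`, `serve_one` and the local `MessagingTimeout` class are hypothetical stand-ins, not the oslo.messaging API.

```python
import queue
import threading


class MessagingTimeout(Exception):
    """Local stand-in for the exception of the same name (not the real class)."""


def call(request_queue, payload, timeout):
    """Send a request, then block until a reply arrives or the timeout expires."""
    reply_queue = queue.Queue()  # plays the role of a per-client reply_<id> queue
    request_queue.put((payload, reply_queue))  # steps 1-2: client -> broker -> server
    try:
        return reply_queue.get(timeout=timeout)  # step 5: broker -> client
    except queue.Empty:
        # Nothing came back in time: any of steps 2-5 may have stalled.
        raise MessagingTimeout("Timed out waiting for a reply") from None


def serve_one(request_queue, handler):
    """Take one request, process it and send the reply back (steps 3-4)."""
    payload, reply_queue = request_queue.get()
    reply_queue.put(handler(payload))


if __name__ == "__main__":
    requests = queue.Queue()
    server = threading.Thread(target=serve_one, args=(requests, str.upper))
    server.start()
    print(call(requests, "ping", timeout=1.0))
    server.join()

    try:
        call(requests, "ping", timeout=0.1)  # no server is listening any more
    except MessagingTimeout as exc:
        print(exc)
```

The client cannot tell from the exception alone which step stalled, which is why the following slides turn to logs and broker state.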
Debug shows the truth
L3 Agent log
CALL msg_id: ae63b165611f439098f1461f906270de exchange: neutron topic: q-reports-plugin
received reply msg_id: ae63b165611f439098f1461f906270de
Neutron Server
received message msg_id: ae63b165611f439098f1461f906270de reply to: reply_df2405440ffb40969a2f52c769f72e30
REPLY msg_id: ae63b165611f439098f1461f906270de reply queue: reply_df2405440ffb40969a2f52c769f72e30
* Examples from Mitaka
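Matching msg_id values by hand gets tedious on a busy log. A minimal sketch of automating it, assuming the debug lines contain the `CALL msg_id:` and `received reply msg_id:` fragments shown above (the exact wording may differ between releases; `unanswered_calls` is a hypothetical helper name):

```python
import re

# Patterns follow the Mitaka examples on the slide; adjust them for other releases.
CALL_RE = re.compile(r"CALL msg_id: (?P<id>[0-9a-f]{32})")
REPLY_RE = re.compile(r"received reply msg_id: (?P<id>[0-9a-f]{32})")


def unanswered_calls(log_lines):
    """Return msg_ids seen in a CALL line but never in a 'received reply' line."""
    pending = []
    replied = set()
    for line in log_lines:
        match = CALL_RE.search(line)
        if match:
            pending.append(match.group("id"))
            continue
        match = REPLY_RE.search(line)
        if match:
            replied.add(match.group("id"))
    return [msg_id for msg_id in pending if msg_id not in replied]


if __name__ == "__main__":
    sample = [
        "CALL msg_id: ae63b165611f439098f1461f906270de exchange: neutron topic: q-reports-plugin",
        "received reply msg_id: ae63b165611f439098f1461f906270de",
        # Made-up second call that never gets a reply.
        "CALL msg_id: 00000000000000000000000000000000 exchange: neutron topic: q-reports-plugin",
    ]
    print(unanswered_calls(sample))
```

Any msg_id it returns can then be searched for in the server-side log to see whether the request was received at all.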
Enabling debug logging
[DEFAULT]
debug=true
default_log_levels=...,oslo.messaging=DEBUG,...
If you don’t have debug enabled
Examine the stack trace
Find which operation failed
Guess the destination service
Try to find correlating log entries around the time the request was made
File "/opt/stack/neutron/neutron/agent/dhcp/agent.py", line 571, in _report_state
self.state_rpc.report_state(ctx, self.agent_state, self.use_call)
File "/opt/stack/neutron/neutron/agent/rpc.py", line 86, in report_state
return method(context, 'report_state', **kwargs)
File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
Diagnosing issues through RabbitMQ
● # rabbitmqctl list_queues consumers name
0 consumers means that nobody is listening to the queue
● # rabbitmqctl list_queues messages consumers name
If a queue has consumers but messages keep accumulating in it, the corresponding service cannot process messages in time, is stuck in a deadlock, or the cluster is partitioned
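Both checks can be applied to saved command output. The sketch below assumes the output of `rabbitmqctl list_queues messages consumers name` is tab-separated with columns in the requested order; `classify_queues` and its `threshold` parameter are hypothetical, and a single snapshot only shows a backlog, not whether it is still growing.

```python
def classify_queues(output, threshold=100):
    """Split queues into those without consumers and those with a backlog.

    `output` is the text printed by
    `rabbitmqctl list_queues messages consumers name`.
    `threshold` is an arbitrary cut-off chosen for this sketch.
    """
    no_consumers = []
    backlog = []
    for line in output.splitlines():
        fields = line.split("\t")
        # Skip banner, header and footer lines that are not data rows.
        if len(fields) != 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        messages, consumers, name = int(fields[0]), int(fields[1]), fields[2]
        if consumers == 0:
            no_consumers.append(name)
        elif messages > threshold:
            backlog.append(name)
    return no_consumers, backlog


if __name__ == "__main__":
    # Illustrative output; the message counts are made up.
    sample = (
        "Listing queues ...\n"
        "0\t0\treply_b6686f7be58b4773a2e0f5475368d19a\n"
        "250\t1\tcompute.node-1.domain.tld\n"
        "3\t2\tscheduler\n"
    )
    orphaned, backed_up = classify_queues(sample)
    print("no consumers:", orphaned)
    print("backlog:", backed_up)
```

Running it on two snapshots taken a minute apart and comparing the message counts gives a better picture of accumulation than one snapshot.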
Checking RabbitMQ cluster for integrity
# rabbitmqctl cluster_status
Check that its output contains all the nodes in the cluster. You might find that your
cluster is partitioned.
Partitioning is a good reason for some messages to get stuck in queues.
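The same check can be scripted against captured output. This is a rough sketch that assumes the Erlang-term output format printed by RabbitMQ releases of the 3.6 era, with `running_nodes` and `partitions` entries; newer releases print a different layout, so treat the patterns as assumptions. The helper names and the sample text are illustrative.

```python
import re


def running_nodes(cluster_status_output):
    """Return the set of node names listed under running_nodes."""
    compact = re.sub(r"\s+", "", cluster_status_output)
    match = re.search(r"\{running_nodes,\[([^\]]*)\]\}", compact)
    if match is None:
        raise ValueError("no running_nodes entry found")
    return {name.strip("'") for name in match.group(1).split(",") if name}


def has_partitions(cluster_status_output):
    """Return True when the partitions entry is present and not empty."""
    compact = re.sub(r"\s+", "", cluster_status_output)
    if "{partitions," not in compact:
        raise ValueError("no partitions entry found")
    return "{partitions,[]}" not in compact


if __name__ == "__main__":
    # Made-up output in the assumed format.
    sample = """Cluster status of node rabbit@node1 ...
[{nodes,[{disc,[rabbit@node1,rabbit@node2,rabbit@node3]}]},
 {running_nodes,[rabbit@node2,rabbit@node1]},
 {partitions,[{rabbit@node1,[rabbit@node3]}]}]
"""
    expected = {"rabbit@node1", "rabbit@node2", "rabbit@node3"}
    print("missing:", sorted(expected - running_nodes(sample)))
    print("partitioned:", has_partitions(sample))
```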
How to fix such issues
For RabbitMQ issues including partitioning, see RabbitMQ docs
Restart of the affected services helps in most cases
Force close connections using `rabbitmqctl` or HTTP API
Never set amqp_auto_delete = true
Use a queue expiration policy instead, with a TTL of at least 1 minute
Starting from Mitaka, all queues that were auto-delete by default have been replaced with expiring ones
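One way to express the expiration advice is a RabbitMQ policy whose definition carries the `expires` key (in milliseconds). The sketch below only builds the `rabbitmqctl set_policy` argument list and does not run it; the helper name, the policy name and the queue pattern are hypothetical and need adapting to the deployment.

```python
import json


def expiry_policy_command(name, pattern, ttl_seconds, vhost="/"):
    """Build a `rabbitmqctl set_policy` argument list for a queue expiration policy."""
    if ttl_seconds < 60:
        # The slide recommends a TTL of at least 1 minute.
        raise ValueError("queue TTL should be at least 60 seconds")
    definition = json.dumps({"expires": ttl_seconds * 1000})  # milliseconds
    return [
        "rabbitmqctl", "set_policy",
        "-p", vhost,
        "--apply-to", "queues",
        name, pattern, definition,
    ]


if __name__ == "__main__":
    # Policy name and pattern are placeholders, not values from the talk.
    print(" ".join(expiry_policy_command("expire-idle", "^reply_", 600)))
```

The printed command still needs shell quoting around the pattern and the JSON definition before it is pasted into a terminal.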
Why not amqp_auto_delete?
[Diagram: nova-conductor sends a message to nova-compute through the RabbitMQ queue compute.node-1.domain.tld, which has auto-delete = true. Labels: message, auto-delete, network hiccup]
Queue mirroring is quite expensive
Our testing shows a 2x drop in throughput on a 3-node cluster with the 'ha-mode: all' policy compared with non-mirrored queues.
RPC can live without it
But notifications might be too important (if used for billing)
In the latter case, enable mirroring for notification queues only (example in Fuel)
Use different backends for RPC and Notifications
Different drivers
Same driver. For example:
RPC messages go through one RabbitMQ cluster
Notification messages go through another RabbitMQ cluster
Implementation (undocumented)
* Available starting from Mitaka
Part 2
Erlang VM process disappears
Syslog, kern.log, /var/log/messages: grep for “killed process”
“Cannot allocate 1117203264527168 bytes of memory (of type …)”: move to Erlang 17.5 or 18.3
Stats DB overload
Connections, channels, queues, and nodes emit stats on a timer
With a lot of those the stats DB collector can fall behind
`rabbitmqctl status` reports most RAM used by `mgmt_db`
You can reset it: `rabbitmqctl eval 'exit(erlang:whereis(rabbit_mgmt_db), please_terminate).'`
Resetting is a safe thing to do but may confuse your monitoring tools
A new, better parallelized event collector is coming in RabbitMQ 3.6.2
RAM usage
`rabbitmqctl status`
`rabbitmqctl list_queues name messages memory consumers`
rabbitmq_top
`rabbitmqctl list_connections | wc -l`
`rabbitmqctl list_channels | wc -l`
Reduce TCP buffer size: RabbitMQ Networking guide
To enforce a per-connection channel limit, use `rabbit.channel_max`.
Unresponsive nodes
`rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'`
Pivotal & Erlang Solutions contributed a few Mnesia deadlock fixes in Erlang/OTP 18.3.1 and 19.0
TCP connections are rejected
Ensure traffic on RabbitMQ ports is accepted by firewall
Ensure RabbitMQ listens on correct network interfaces
Check open file handles limit (defaults on Linux are completely inadequate)
TCP connection backlog size: rabbit.tcp_listen_options.backlog, net.core.somaxconn
Consult RabbitMQ logs for authentication and authorization errors
TLS connections fail
Deserves a talk of its own
See log files
`openssl s_client` (`man 1 s_client`)
`openssl s_server` (`man 1 s_server`)
Ensure peer CA certificate is trusted and verification depth is sufficient
Troubleshooting TLS on rabbitmq.com
Run Erlang 17.5 or 18.3.1
Message payload inspection
Message tracing: `rabbitmqctl trace_on -p my-vhost`, amq.rabbitmq.trace
rabbitmq_tracing
Tracing puts *very* high load on the system
Wireshark (tcpdump, …)
Higher than expected latency
Wireshark (tcpdump, …)
strace, DTrace, …
Erlang VM scheduler-to-core binding (pinning)
General remarks
Guessing is not effective (or efficient)
Use tools to gather more data
Always consult log files
Ask on rabbitmq-users
Thank you
@michaelklishin
rabbitmq-users

 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 

Último (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

Troubleshooting common oslo.messaging and RabbitMQ issues

  • 1. Identifying (and fixing) oslo.messaging & RabbitMQ issues
       Michael Klishin, Pivotal
       Dmitry Mescheryakov, Mirantis
  • 5. What is oslo.messaging?
       ● Library for
         ○ building RPC clients/servers
         ○ emitting/handling notifications
       ● Supports several backends:
         ○ RabbitMQ
           ■ based on Kombu - the oldest and most well known (and we will speak about it)
           ■ based on Pika - recent addition
         ○ AMQP 1.0
  • 6. Spawning a VM in Nova
       [Diagram: Client and several nova-api, nova-conductor, nova-scheduler, and nova-compute instances, with connections labeled HTTP and RPC]
  • 8. Examples
       Internal:
       ● nova-compute sends a report to nova-conductor every minute
       ● nova-conductor sends a command to spawn a VM to nova-compute
       ● neutron-l3-agent requests router list from neutron-server
       ● …
       External:
       ● Every OpenStack service sends notifications to Ceilometer
  • 9. Where is RabbitMQ in this picture?
       [Diagram: nova-conductor and nova-compute exchange an RPC request and response through RabbitMQ, via the queues compute.node-1.domain.tld and reply_b6686f7be58b4773a2e0f5475368d19a]
  • 11. Spotting oslo.messaging logs
       2016-04-15 11:16:57.239 16181 DEBUG nova.service [req-d83ae554-7ef5-4299-82ce-3f70b00b6490 - - - - -] Creating RPC server for service scheduler start /usr/lib/python2.7/dist-packages/nova/service.py:218
       2016-04-15 11:16:57.258 16181 DEBUG oslo.messaging._drivers.pool [req-d83ae554-7ef5-4299-82ce-3f70b00b6490 - - - - -] Pool creating new connection create /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/pool.py:109
  • 12. My favorite oslo.messaging exception
       ...
       File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 420, in _send
         result = self._waiter.wait(msg_id, timeout)
       File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 318, in wait
         message = self.waiters.get(msg_id, timeout=timeout)
       File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 223, in get
         'to message ID %s' % msg_id)
       MessagingTimeout: Timed out waiting for a reply to message ID 9e4a677887134a0cbc134649cd46d1ce
  • 13. oslo.messaging operations
       ● Cast - fire an RPC request and forget about it
       ● Notify - the same, only the format is different
       ● Call - send an RPC request and receive a reply
       Call throws a MessagingTimeout exception when a reply isn’t received within a certain amount of time.
  • 14. Making a Call
       1. Client -> request -> RabbitMQ
       2. RabbitMQ -> request -> Server
       3. Server processes the request and produces the response
       4. Server -> response -> RabbitMQ
       5. RabbitMQ -> response -> Client
       If the process gets stuck at any step from 2 to 5, the client gets a MessagingTimeout exception.
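
The five steps above can be modeled with a toy, in-process sketch. It does not use oslo.messaging or RabbitMQ: `request_queue`, `reply_queues`, `call`, and `serve` are hypothetical names, and `MessagingTimeout` is a local stand-in for the library exception of the same name. It only illustrates why a stall anywhere after step 1 shows up on the client as a timeout.

```python
import queue
import threading
import uuid


class MessagingTimeout(Exception):
    """Local stand-in for the oslo.messaging exception of the same name."""


def call(request_queue, reply_queues, payload, timeout):
    """Send a request and block until a reply arrives or the timeout expires."""
    msg_id = uuid.uuid4().hex
    reply_queue = queue.Queue()
    reply_queues[msg_id] = reply_queue
    # Step 1: the client hands the request to the "broker".
    request_queue.put({"msg_id": msg_id, "payload": payload})
    try:
        # Steps 2-5 happen elsewhere; the client sees a reply or nothing.
        return reply_queue.get(timeout=timeout)
    except queue.Empty:
        raise MessagingTimeout(
            "Timed out waiting for a reply to message ID %s" % msg_id)
    finally:
        reply_queues.pop(msg_id, None)


def serve(request_queue, reply_queues, handler):
    """Consume requests until a None sentinel is received."""
    while True:
        message = request_queue.get()            # step 2
        if message is None:
            return
        result = handler(message["payload"])     # step 3
        reply_queue = reply_queues.get(message["msg_id"])
        if reply_queue is not None:              # steps 4 and 5
            reply_queue.put(result)
```

With a `serve` thread running, `call` returns the handler's result; with no server consuming `request_queue`, the same `call` raises `MessagingTimeout` once the timeout elapses, and the client cannot tell which step stalled.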
  • 16. Debug shows the truth
       L3 Agent log
         CALL msg_id: ae63b165611f439098f1461f906270de exchange: neutron topic: q-reports-plugin
         received reply msg_id: ae63b165611f439098f1461f906270de
       Neutron Server
         received message msg_id: ae63b165611f439098f1461f906270de reply to: reply_df2405440ffb40969a2f52c769f72e30
         REPLY msg_id: ae63b165611f439098f1461f906270de reply queue: reply_df2405440ffb40969a2f52c769f72e30
       * Examples from Mitaka
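
Matching msg_id values by hand gets tedious on a busy node. A minimal sketch of automating it follows, assuming debug log lines contain the `CALL msg_id: …` and `received reply msg_id: …` fragments as rendered on the slide; the exact log wording may differ between releases, and `unanswered_calls` is a hypothetical helper name.

```python
import re

# Patterns follow the slide's rendering of Mitaka debug output (an assumption).
CALL_RE = re.compile(r"\bCALL msg_id: ([0-9a-f]{32})")
REPLY_RE = re.compile(r"\breceived reply msg_id: ([0-9a-f]{32})")


def unanswered_calls(log_lines):
    """Return msg_ids of CALLs that have no later 'received reply' line."""
    pending = {}
    for line in log_lines:
        call = CALL_RE.search(line)
        if call:
            pending[call.group(1)] = line
            continue
        reply = REPLY_RE.search(line)
        if reply:
            pending.pop(reply.group(1), None)
    # dicts keep insertion order, so results follow the order of the CALLs.
    return list(pending)
```

Each msg_id returned is a candidate to search for in the destination service's log, to see whether the request ever arrived there.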
  • 20. If you don’t have debug enabled
       ● Examine the stack trace
       ● Find which operation failed
       ● Guess the destination service
       ● Try to find correlating log entries around the time the request was made
       File "/opt/stack/neutron/neutron/agent/dhcp/agent.py", line 571, in _report_state
         self.state_rpc.report_state(ctx, self.agent_state, self.use_call)
       File "/opt/stack/neutron/neutron/agent/rpc.py", line 86, in report_state
         return method(context, 'report_state', **kwargs)
       File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
  • 21. Diagnosing issues through RabbitMQ
       ● # rabbitmqctl list_queues consumers name
         0 consumers means that nobody is listening on the queue.
       ● # rabbitmqctl list_queues messages consumers name
         If a queue has consumers but messages keep accumulating in it, the corresponding service cannot process messages in time, is stuck in a deadlock, or the cluster is partitioned.
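
The two checks above can be applied to captured command output. This is a minimal sketch, assuming the output of `rabbitmqctl list_queues messages consumers name` is tab-separated in the requested column order; `parse_list_queues`, `suspicious_queues`, and the `backlog_threshold` of 100 are hypothetical choices, not RabbitMQ defaults.

```python
def parse_list_queues(output):
    """Parse tab-separated `messages consumers name` rows, skipping banners."""
    rows = []
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) != 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # banner or header lines such as "Listing queues ..."
        rows.append((int(fields[0]), int(fields[1]), fields[2]))
    return rows


def suspicious_queues(rows, backlog_threshold=100):
    """Split queues into those nobody listens to and those backing up."""
    no_consumers = [name for _messages, consumers, name in rows
                    if consumers == 0]
    backlog = [name for messages, consumers, name in rows
               if consumers > 0 and messages >= backlog_threshold]
    return no_consumers, backlog
```

A single snapshot cannot show that messages are accumulating; comparing two runs taken a few seconds apart gives a clearer picture.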
  • 22. Checking RabbitMQ cluster for integrity
       # rabbitmqctl cluster_status
       Check that its output contains all the nodes in the cluster. You might find that your cluster is partitioned. Partitioning can easily cause some messages to get stuck in queues.
  • 24. How to fix such issues
       ● For RabbitMQ issues, including partitioning, see the RabbitMQ docs
       ● Restarting the affected services helps in most cases
       ● Force-close connections using `rabbitmqctl` or the HTTP API
  • 25. Never set amqp_auto_delete = true
       ● Use a queue expiration policy instead, with a TTL of at least 1 minute
       ● Starting with Mitaka, all queues that were auto-delete by default have been replaced with expiring ones
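
One way to express such an expiration policy is with `rabbitmqctl set_policy` and the `expires` policy key, which takes milliseconds. The sketch below only builds the argument list; the helper name, the policy name, and the queue-name pattern in the usage note are hypothetical, and the 60-second floor simply mirrors the advice above.

```python
import json


def queue_expiry_policy_args(name, pattern, ttl_seconds, vhost="/"):
    """Build a `rabbitmqctl set_policy` argument list for queue expiration."""
    if ttl_seconds < 60:
        raise ValueError("use a TTL of at least 1 minute")
    # The `expires` policy key is expressed in milliseconds.
    definition = json.dumps({"expires": int(ttl_seconds * 1000)})
    return ["rabbitmqctl", "set_policy", "-p", vhost,
            "--apply-to", "queues", name, pattern, definition]
```

For example, `queue_expiry_policy_args("expire-unused", "^reply_", 60)` yields a list that could be passed to `subprocess.run(..., check=True)` on a broker node. Which queues the pattern should match depends on the deployment.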
  • 27. Queue mirroring is quite expensive
       ● Our testing shows a 2x drop in throughput on a 3-node cluster with the ‘ha-mode: all’ policy compared with non-mirrored queues.
       ● RPC can live without it
       ● But notifications might be too important (if used for billing)
       ● In the latter case, enable mirroring for notification queues only (example in Fuel)
  • 30. Use different backends for RPC and Notifications
       ● Different drivers
       ● Same driver. For example:
         ○ RPC messages go through one RabbitMQ cluster
         ○ Notification messages go through another RabbitMQ cluster
       ● Implementation (non-documented)
       * Available starting from Mitaka
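
The slide does not show the configuration, so the following is a sketch under an assumption: oslo.messaging reads `transport_url` from `[DEFAULT]` for RPC and, when set, from `[oslo_messaging_notifications]` for notifications. The helper name and the broker URLs in the usage note are hypothetical.

```python
import configparser
import io


def render_split_transport_config(rpc_url, notification_url):
    """Render an INI fragment with separate RPC and notification transports."""
    # Interpolation is disabled so URLs containing '%' are written verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config["DEFAULT"] = {"transport_url": rpc_url}
    config["oslo_messaging_notifications"] = {"transport_url": notification_url}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Calling it with, say, `rabbit://user:pass@rpc-rabbit.example:5672/` and `rabbit://user:pass@notify-rabbit.example:5672/` produces two sections that point each kind of traffic at its own cluster.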
  • 36. Erlang VM process disappears
       ● Syslog, kern.log, /var/log/messages: grep for “killed process”
       ● “Cannot allocate 1117203264527168 bytes of memory (of type …)”: move to Erlang 17.5 or 18.3
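
The grep above can be narrowed to the Erlang VM. A minimal sketch, assuming the kernel's usual `Killed process <pid> (<name>)` wording and that the VM runs under a process name containing `beam` (typically `beam.smp`); `find_killed_processes` and `name_hint` are hypothetical names.

```python
import re

# Case-insensitive, like `grep -i "killed process"`.
KILLED_RE = re.compile(r"killed process (\d+) \(([^)]+)\)", re.IGNORECASE)


def find_killed_processes(lines, name_hint="beam"):
    """Return (pid, name) pairs for killed processes matching name_hint."""
    hits = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match and name_hint in match.group(2):
            hits.append((int(match.group(1)), match.group(2)))
    return hits
```

Feeding it the lines of syslog, kern.log, or /var/log/messages shows whether the OOM killer took down the broker rather than some other process.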
  • 46. Stats DB overload
       ● Connections, channels, queues, and nodes emit stats on a timer
       ● With a lot of those, the stats DB collector can fall behind
       ● `rabbitmqctl status` reports most RAM used by `mgmt_db`
       ● You can reset it: `rabbitmqctl eval 'exit(erlang:whereis(rabbit_mgmt_db), please_terminate).'`
       ● Resetting is safe but may confuse your monitoring tools
       ● A new, better-parallelized event collector is coming in RabbitMQ 3.6.2
  • 51. RAM usage
       ● `rabbitmqctl status`
       ● `rabbitmqctl list_queues name messages memory consumers`
       ● rabbitmq_top
       ● `rabbitmqctl list_connections | wc -l`
       ● `rabbitmqctl list_channels | wc -l`
       ● Reduce TCP buffer size: see the RabbitMQ Networking guide
       ● To enforce a per-connection channel limit, use `rabbit.channel_max`
  • 54. Unresponsive nodes
       ● `rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'`
       ● Pivotal & Erlang Solutions contributed a few Mnesia deadlock fixes in Erlang/OTP 18.3.1 and 19.0
  • 60. TCP connections are rejected
       ● Ensure traffic on RabbitMQ ports is accepted by the firewall
       ● Ensure RabbitMQ listens on the correct network interfaces
       ● Check the open file handles limit (defaults on Linux are completely inadequate)
       ● TCP connection backlog size: rabbit.tcp_listen_options.backlog, net.core.somaxconn
       ● Consult RabbitMQ logs for authentication and authorization errors
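
The limit that matters is the one applied to the running broker process, not to your shell. On Linux it can be read from `/proc/<pid>/limits`; the sketch below parses that file's text, assuming its usual `Max open files  <soft>  <hard>  files` row. `max_open_files` is a hypothetical helper name.

```python
def max_open_files(limits_text):
    """Return (soft, hard) open-file limits from /proc/<pid>/limits text.

    Numeric limits are returned as ints; 'unlimited' is kept as a string.
    Returns None if the row is missing.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return tuple(int(v) if v.isdigit() else v for v in (soft, hard))
    return None
```

Reading the file for the Erlang VM's PID and passing its contents to this function shows whether a raised limit actually reached the broker.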
  • 68. TLS connections fail
       ● Deserves a talk of its own
       ● See log files
       ● `openssl s_client` (`man 1 s_client`)
       ● `openssl s_server` (`man 1 s_server`)
       ● Ensure peer CA certificate is trusted and verification depth is sufficient
       ● Troubleshooting TLS on rabbitmq.com
       ● Run Erlang 17.5 or 18.3.1
  • 73. Message payload inspection
       ● Message tracing: `rabbitmqctl trace_on -p my-vhost`, amq.rabbitmq.trace
       ● rabbitmq_tracing
       ● Tracing puts *very* high load on the system
       ● Wireshark (tcpdump, …)
  • 77. Higher than expected latency
       ● Wireshark (tcpdump, …)
       ● strace, DTrace, …
       ● Erlang VM scheduler-to-core binding (pinning)
  • 82. General remarks
       ● Guessing is not effective (or efficient)
       ● Use tools to gather more data
       ● Always consult log files
       ● Ask on rabbitmq-users

Editor's Notes

  1. Casts don’t have a message ID, but they can be distinguished by a unique_id.
  2. It depends on which partition the sender and the listener are each connected to.