SlideShare uma empresa Scribd logo
1 de 47
Baixar para ler offline
Troubleshooting
streaming replication
Alexey Lesovsky
alexey.lesovsky@dataegret.com
dataegret.com
Quick introduction
• WAL and replication internals.
• Replication setup.
Troubleshooting tools
• 3rd party tools.
• Builtin tools.
Troubleshooting cases
• Symptoms and problems.
• Detection and solutions.
• Lessons learned.
02
03
01
Quick
introduction
01
Goals01
dataegret.com
Better understanding of streaming replication.
How to quickly find and fix problems.
https://goo.gl/Mm3ugt
Agenda01
dataegret.com
Write-Ahead Log.
Streaming replication internals.
Streaming replication setup.
Troubleshooting tools overview (3-rd party).
Troubleshooting tools overview (builtin).
Troubleshooting in practice.
Questions.
Write Ahead Log01
dataegret.com
Durability in ACID.
Almost all changes fixed in WAL.
pg_xlog/ (pg_wal/) directory in the DATADIR.
Synchronous WAL write by backends.
Asynchronous WAL write by WAL writer.
Recovery process relies on WAL.
Streaming replication internals01
dataegret.com
WAL Sender process.
WAL Receiver process.
Startup process (recovery).
Streaming replication vs. WAL archiving.
Streaming replication internals01
dataegret.com
WAL
Buffers
Storage
WAL
Sender
Network
Startup
Process
Storage
WAL
Receiver
Streaming replication setup01
dataegret.com
Master:
● postgresql.conf;
● Restart.
Standby:
● pg_basebackup;
● postgresql.conf;
● recovery.conf setup.
Master setup01
dataegret.com
wal_level, max_wal_senders, max_replication_slots.
archive_mode, archive_command.
wal_keep_segments.
wal_sender_timeout.
synchronous_standby_names.
Master setup pitfalls01
dataegret.com
wal_level, archive_mode, max_wal_senders,
max_replication_slots – require restart.
wal_keep_segments – requires extra storage space.
wal_sender_timeout – reduce that, if network is bad.
synchronous_standby_names – master freezes if standby fails.
Standby setup01
dataegret.com
hot_standby.
max_standby_streaming_delay, max_standby_archiving_delay.
hot_standby_feedback.
wal_receiver_timeout.
Standby setup pitfalls01
dataegret.com
hot_standby – enables SELECT queries.
max_standby_streaming_delay – increases max possible lag.
hot_standby_feedback:
● postpones vacuum;
● potential tables/indexes bloat.
wal_receiver_timeout – reduce that, if network is bad.
Recovery.conf01
dataegret.com
primary_conninfo and/or restore_command.
standby_mode.
trigger_file.
Recovery target:
● immediate;
● particular point/xid/timestamp;
● recovery_min_apply_delay.
Recovery.conf pitfalls01
dataegret.com
Any changes require restart.
Troubleshooting
tools
02
3rd party troubleshooting tools02
dataegret.com
Top (procps).
Iostat (sysstat), iotop.
Nicstat.
pgCenter.
Perf.
3rd party troubleshooting tools02
dataegret.com
Top (procps) – CPU usage, load average, mem/swap usage.
Iostat (sysstat), iotop – storage utilization, process IO.
Nicstat – network interfaces utilization.
pgCenter – replication stats.
Perf – deep investigations.
Builtin troubleshooting tools02
dataegret.com
Statistics views.
Auxiliary functions.
pg_xlogdump utility.
Builtin troubleshooting tools02
dataegret.com
Statistics view:
● pg_stat_replication;
● pg_stat_databases, pg_stat_databases_conflicts;
● pg_stat_activity;
● pg_stat_archiver.
Builtin troubleshooting tools02
dataegret.com
Auxiliary functions:
● pg_current_xlog_location, pg_last_xlog_receive_location;
● pg_xlog_location_dif;
● pg_xlog_replay_pause, pg_xlog_replay_resume;
● pg_is_xlog_replay_paused.
Builtin troubleshooting tools02
dataegret.com
pg_xlogdump:
● Decodes and displays XLOG for debugging;
● Can give wrong results when the server is running.
pg_xlogdump -f -p /xlog_96 
$(psql -qAtX -c "select pg_xlogfile_name(pg_current_xlog_location())")
Troubleshooting
cases
03
Troubleshooting cases03
dataegret.com
Replication lag.
pg_xlog/ bloat.
Long transactions and recovery conflicts.
Recovery process: 100% CPU usage.
Replication lag03
dataegret.com
Main symptom – answers differ between master and standbys.
Detection:
● pg_stat_replication and pg_xlog_location_dif();
● pg_last_xact_replay_timestamp().
Replication lag03
dataegret.com
# d pg_stat_replication
View "pg_catalog.pg_stat_replication"
Column | Type | Modifiers
------------------+--------------------------+-----------
pid | integer |
usesysid | oid |
usename | name |
application_name | text |
client_addr | inet |
client_hostname | text |
client_port | integer |
backend_start | timestamp with time zone |
backend_xmin | xid |
state | text |
sent_location | pg_lsn |
write_location | pg_lsn |
flush_location | pg_lsn |
replay_location | pg_lsn |
sync_priority | integer |
sync_state | text |
Replication lag03
dataegret.com
# SELECT
client_addr AS client, usename AS user, application_name AS name,
state, sync_state AS mode,
(pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending,
(pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write,
(pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush,
(pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay,
(pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag
FROM pg_stat_replication;
сlient | user | name | state | mode | pending | write | flush | replay | total_lag
----------+--------+-------------+-----------+-------+---------+-------+-------+--------+-----------
10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480
10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025
10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552
10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
Replication lag03
dataegret.com
# SELECT
client_addr AS client, usename AS user, application_name AS name,
state, sync_state AS mode,
(pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending,
(pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write,
(pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush,
(pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay,
(pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag
FROM pg_stat_replication;
сlient | user | name | state | mode | pending | write | flush | replay | total_lag
----------+--------+-------------+-----------+-------+---------+-------+-------+--------+-----------
10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480
10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025
10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552
10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
Replication lag03
dataegret.com
# SELECT
client_addr AS client, usename AS user, application_name AS name,
state, sync_state AS mode,
(pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending,
(pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write,
(pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush,
(pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay,
(pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag
FROM pg_stat_replication;
сlient | user | name | state | mode | pending | write | flush | replay | total_lag
----------+--------+-------------+-----------+-------+---------+-------+-------+--------+-----------
10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480
10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025
10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552
10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
Replication lag03
dataegret.com
# SELECT
client_addr AS client, usename AS user, application_name AS name,
state, sync_state AS mode,
(pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending,
(pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write,
(pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush,
(pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay,
(pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag
FROM pg_stat_replication;
сlient | user | name | state | mode | pending | write | flush | replay | total_lag
----------+--------+-------------+-----------+-------+---------+-------+-------+--------+-----------
10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480
10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025
10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552
10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
Replication lag03
dataegret.com
Network problems – nicstat.
Storage problems – iostat, iotop.
Recovery stucks – top, pg_stat_activity.
WAL pressure:
● pg_stat_activity, pg_stat_progress_vacuum;
● pg_xlog_location_dif().
Replication lag03
dataegret.com
Network/storage problems:
● check workload;
● upgrade hardware.
Recovery stucks – wait or cancel queries on standby.
WAL pressure:
● Reduce amount of work;
● Reduce amount of WAL:
● full_page_writes = of, wal_compression = on,
wal_log_hints = of;
● expand interval between checkpoints.
pg_xlog/ bloat03
dataegret.com
Main symptoms:
● unexpected increase in the usage of the disk space;
● abnormal size of pg_xlog/ directory.
pg_xlog/ bloat03
dataegret.com
Detection:
● du -csh;
● pg_replication_slots, pg_stat_archiver;
● errors in postgres logs.
pg_xlog/ bloat03
dataegret.com
Problems:
● Massive CRUD.
● Unused slot.
● Broken archive_command.
pg_xlog/ bloat03
dataegret.com
Solutions:
● check replication lag;
● reduce checkpoints_segments/max_wal_size,
wal_keep_segments;
● change reserved space ratio (ext filesystems);
● add an extra space (LVM, ZFS, etc);
● drop unused slot or fix slot consumer;
● fix WAL archiving;
● checkpoint, checkpoint, chekpoint.
Recovery conflicts03
dataegret.com
Main symptoms – errors in postgresql or application logs.
postgres.c:errdetail_recovery_conflict():
● User was holding shared bufer pin for too long.
● User was holding a relation lock for too long.
● User was or might have been using tablespace that must be dropped.
● User query might have needed to see row versions that must be removed.
● User transaction caused bufer deadlock with recovery.
● User was connected to a database that must be dropped.
Recovery conflicts03
dataegret.com
Detection:
● pg_stat_databases + pg_stat_databases_conflicts;
● postgresql logs.
Recovery conflicts03
dataegret.com
Problems:
● queries are cancelled too often;
● long transactions on a standby – check pg_stat_activity;
● huge apply lag – check pg_stat_replication.
Recovery conflicts03
dataegret.com
Solutions:
● increase streaming delay (potentially causes lag);
● enable hot_standby_feedback (potentially causes bloat);
● rewrite queries;
● setup dedicated standby for long queries.
Recovery 100% CPU usage03
dataegret.com
Main symptoms:
● huge apply lag;
● 100% CPU usage by recovery process.
Recovery 100% CPU usage03
dataegret.com
Detection:
● top – CPU usage;
● pg_stat_replication – amount of lag.
Recovery 100% CPU usage03
dataegret.com
Investigation:
● perf top/record/report (required debug symbols);
● pg_xlogdump.
Recovery 100% CPU usage03
dataegret.com
Solutions:
● depend on investigation' results;
● change problematic workload (if found).
Lessons learned03
dataegret.com
Streaming replication problems are always distributed.
There are many sources of problems:
● system resources, app/queries, workload.
Always use monitoring.
Learn how to use builtin tools.
Links
dataegret.com
PostgreSQL official documentation – The Statistics Collector
https://www.postgresql.org/docs/current/static/monitoring-stats.html
PostgreSQL Mailing Lists (general, performance, hackers)
https://www.postgresql.org/list/
PostgreSQL-Consulting company blog
http://blog.postgresql-consulting.com/
Thanks for watching!
dataegret.com alexey.lesovsky@dataegret.com

Mais conteúdo relacionado

Mais procurados

Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyAlexander Kukushkin
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaPostgreSQL-Consulting
 
MariaDB Galera Cluster presentation
MariaDB Galera Cluster presentationMariaDB Galera Cluster presentation
MariaDB Galera Cluster presentationFrancisco Gonçalves
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScaleMariaDB plc
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsCommand Prompt., Inc
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재PgDay.Seoul
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability Mydbops
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQLMydbops
 
InnoDB Performance Optimisation
InnoDB Performance OptimisationInnoDB Performance Optimisation
InnoDB Performance OptimisationMydbops
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Postgres Vision 2018: WAL: Everything You Want to Know
Postgres Vision 2018: WAL: Everything You Want to KnowPostgres Vision 2018: WAL: Everything You Want to Know
Postgres Vision 2018: WAL: Everything You Want to KnowEDB
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetLucian Oprea
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HAharoonm
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationMydbops
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceMariaDB plc
 
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLMySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLOlivier DASINI
 
HandsOn ProxySQL Tutorial - PLSC18
HandsOn ProxySQL Tutorial - PLSC18HandsOn ProxySQL Tutorial - PLSC18
HandsOn ProxySQL Tutorial - PLSC18Derek Downey
 

Mais procurados (20)

Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easy
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
 
MariaDB Galera Cluster presentation
MariaDB Galera Cluster presentationMariaDB Galera Cluster presentation
MariaDB Galera Cluster presentation
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScale
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQL
 
InnoDB Performance Optimisation
InnoDB Performance OptimisationInnoDB Performance Optimisation
InnoDB Performance Optimisation
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
PostgreSQL replication
PostgreSQL replicationPostgreSQL replication
PostgreSQL replication
 
Postgres Vision 2018: WAL: Everything You Want to Know
Postgres Vision 2018: WAL: Everything You Want to KnowPostgres Vision 2018: WAL: Everything You Want to Know
Postgres Vision 2018: WAL: Everything You Want to Know
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL Administration
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performance
 
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLMySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
 
HandsOn ProxySQL Tutorial - PLSC18
HandsOn ProxySQL Tutorial - PLSC18HandsOn ProxySQL Tutorial - PLSC18
HandsOn ProxySQL Tutorial - PLSC18
 

Destaque

RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
G1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningG1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningSimone Bordet
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Tier1 App
 
Row Pattern Matching in SQL:2016
Row Pattern Matching in SQL:2016Row Pattern Matching in SQL:2016
Row Pattern Matching in SQL:2016Markus Winand
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsBrendan Gregg
 
Shell,信号量以及java进程的退出
Shell,信号量以及java进程的退出Shell,信号量以及java进程的退出
Shell,信号量以及java进程的退出wang hongjiang
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsBrendan Gregg
 
Performance Tuning EC2 Instances
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 InstancesBrendan Gregg
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 

Destaque (11)

RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
G1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningG1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and Tuning
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
 
Row Pattern Matching in SQL:2016
Row Pattern Matching in SQL:2016Row Pattern Matching in SQL:2016
Row Pattern Matching in SQL:2016
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
 
Shell,信号量以及java进程的退出
Shell,信号量以及java进程的退出Shell,信号量以及java进程的退出
Shell,信号量以及java进程的退出
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
 
Performance Tuning EC2 Instances
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 Instances
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 

Semelhante a Troubleshooting streaming replication issues like lag, bloat and recovery conflicts

Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practiceAlexey Lesovsky
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Vladimir Kochetkov
 
Why you should be using structured logs
Why you should be using structured logsWhy you should be using structured logs
Why you should be using structured logsStefan Krawczyk
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLCommand Prompt., Inc
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and ArchitectureSidney Chen
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2PgTraining
 
2013 Collaborate - OAUG - Presentation
2013 Collaborate - OAUG - Presentation2013 Collaborate - OAUG - Presentation
2013 Collaborate - OAUG - PresentationBiju Thomas
 
Improving the performance of Odoo deployments
Improving the performance of Odoo deploymentsImproving the performance of Odoo deployments
Improving the performance of Odoo deploymentsOdoo
 
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...Lucidworks
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
 
Logging with Logback in Scala
Logging with Logback in ScalaLogging with Logback in Scala
Logging with Logback in ScalaKnoldus Inc.
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLCommand Prompt., Inc
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
 
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...Ontico
 
LOGBack and SLF4J
LOGBack and SLF4JLOGBack and SLF4J
LOGBack and SLF4Jjkumaranc
 
LOGBack and SLF4J
LOGBack and SLF4JLOGBack and SLF4J
LOGBack and SLF4Jjkumaranc
 

Semelhante a Troubleshooting streaming replication issues like lag, bloat and recovery conflicts (20)

Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practice
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible!
 
Why you should be using structured logs
Why you should be using structured logsWhy you should be using structured logs
Why you should be using structured logs
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
Gg steps
Gg stepsGg steps
Gg steps
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
 
2013 Collaborate - OAUG - Presentation
2013 Collaborate - OAUG - Presentation2013 Collaborate - OAUG - Presentation
2013 Collaborate - OAUG - Presentation
 
Improving the performance of Odoo deployments
Improving the performance of Odoo deploymentsImproving the performance of Odoo deployments
Improving the performance of Odoo deployments
 
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...
Large Scale Log Analytics with Solr: Presented by Rafał Kuć & Radu Gheorghe, ...
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
Logging with Logback in Scala
Logging with Logback in ScalaLogging with Logback in Scala
Logging with Logback in Scala
 
Php version 7
Php version 7Php version 7
Php version 7
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...
Peeking into the Black Hole Called PL/PGSQL - the New PL Profiler / Jan Wieck...
 
Osol Pgsql
Osol PgsqlOsol Pgsql
Osol Pgsql
 
LOGBack and SLF4J
LOGBack and SLF4JLOGBack and SLF4J
LOGBack and SLF4J
 
LOGBack and SLF4J
LOGBack and SLF4JLOGBack and SLF4J
LOGBack and SLF4J
 

Mais de Alexey Lesovsky

Отладка и устранение проблем в PostgreSQL Streaming Replication.
Отладка и устранение проблем в PostgreSQL Streaming Replication.Отладка и устранение проблем в PostgreSQL Streaming Replication.
Отладка и устранение проблем в PostgreSQL Streaming Replication.Alexey Lesovsky
 
Call of Postgres: Advanced Operations (part 5)
Call of Postgres: Advanced Operations (part 5)Call of Postgres: Advanced Operations (part 5)
Call of Postgres: Advanced Operations (part 5)Alexey Lesovsky
 
Call of Postgres: Advanced Operations (part 4)
Call of Postgres: Advanced Operations (part 4)Call of Postgres: Advanced Operations (part 4)
Call of Postgres: Advanced Operations (part 4)Alexey Lesovsky
 
Call of Postgres: Advanced Operations (part 3)
Call of Postgres: Advanced Operations (part 3)Call of Postgres: Advanced Operations (part 3)
Call of Postgres: Advanced Operations (part 3)Alexey Lesovsky
 
Call of Postgres: Advanced Operations (part 2)
Call of Postgres: Advanced Operations (part 2)Call of Postgres: Advanced Operations (part 2)
Call of Postgres: Advanced Operations (part 2)Alexey Lesovsky
 
Call of Postgres: Advanced Operations (part 1)
Call of Postgres: Advanced Operations (part 1)Call of Postgres: Advanced Operations (part 1)
Call of Postgres: Advanced Operations (part 1)Alexey Lesovsky
 
Troubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterTroubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterAlexey Lesovsky
 
PostgreSQL Streaming Replication
PostgreSQL Streaming ReplicationPostgreSQL Streaming Replication
PostgreSQL Streaming ReplicationAlexey Lesovsky
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedAlexey Lesovsky
 
PostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellPostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellAlexey Lesovsky
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.Alexey Lesovsky
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterAlexey Lesovsky
 
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumNine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumAlexey Lesovsky
 
Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practiceAlexey Lesovsky
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetAlexey Lesovsky
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Alexey Lesovsky
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).Alexey Lesovsky
 
Linux tuning for PostgreSQL at Secon 2015
Linux tuning for PostgreSQL at Secon 2015Linux tuning for PostgreSQL at Secon 2015
Linux tuning for PostgreSQL at Secon 2015Alexey Lesovsky
 

Mais de Alexey Lesovsky (20)

Отладка и устранение проблем в PostgreSQL Streaming Replication.
Отладка и устранение проблем в PostgreSQL Streaming Replication.Отладка и устранение проблем в PostgreSQL Streaming Replication.
Отладка и устранение проблем в PostgreSQL Streaming Replication.
 
Call of Postgres: Advanced Operations (part 5)
Call of Postgres: Advanced Operations (part 5)Call of Postgres: Advanced Operations (part 5)
Call of Postgres: Advanced Operations (part 5)
 
Call of Postgres: Advanced Operations (part 4)
Call of Postgres: Advanced Operations (part 4)Call of Postgres: Advanced Operations (part 4)
Call of Postgres: Advanced Operations (part 4)
 
Call of Postgres: Advanced Operations (part 3)
Call of Postgres: Advanced Operations (part 3)Call of Postgres: Advanced Operations (part 3)
Call of Postgres: Advanced Operations (part 3)
 
Call of Postgres: Advanced Operations (part 2)
Call of Postgres: Advanced Operations (part 2)Call of Postgres: Advanced Operations (part 2)
Call of Postgres: Advanced Operations (part 2)
 
Call of Postgres: Advanced Operations (part 1)
Call of Postgres: Advanced Operations (part 1)Call of Postgres: Advanced Operations (part 1)
Call of Postgres: Advanced Operations (part 1)
 
Troubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterTroubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenter
 
PostgreSQL Streaming Replication
PostgreSQL Streaming ReplicationPostgreSQL Streaming Replication
PostgreSQL Streaming Replication
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons Learned
 
PostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellPostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of Hell
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenter
 
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumNine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
 
Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practice
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication Cheatsheet
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Pgcenter overview
Pgcenter overviewPgcenter overview
Pgcenter overview
 
Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
 
Linux tuning for PostgreSQL at Secon 2015
Linux tuning for PostgreSQL at Secon 2015Linux tuning for PostgreSQL at Secon 2015
Linux tuning for PostgreSQL at Secon 2015
 

Último

Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 

Último (20)

Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 

Troubleshooting streaming replication issues like lag, bloat and recovery conflicts

  • 2. dataegret.com Quick introduction • WAL and replication internals. • Replication setup. Troubleshooting tools • 3rd party tools. • Builtin tools. Troubleshooting cases • Symptoms and problems. • Detection and solutions. • Lessons learned. 02 03 01
  • 4. Goals01 dataegret.com Better understanding of streaming replication. How to quickly find and fix problems. https://goo.gl/Mm3ugt
  • 5. Agenda01 dataegret.com Write-Ahead Log. Streaming replication internals. Streaming replication setup. Troubleshooting tools overview (3-rd party). Troubleshooting tools overview (builtin). Troubleshooting in practice. Questions.
  • 6. Write Ahead Log01 dataegret.com Durability in ACID. Almost all changes fixed in WAL. pg_xlog/ (pg_wal/) directory in the DATADIR. Synchronous WAL write by backends. Asynchronous WAL write by WAL writer. Recovery process relies on WAL.
  • 7. Streaming replication internals01 dataegret.com WAL Sender process. WAL Receiver process. Startup process (recovery). Streaming replication vs. WAL archiving.
  • 9. Streaming replication setup01 dataegret.com Master: ● postgresql.conf; ● Restart. Standby: ● pg_basebackup; ● postgresql.conf; ● recovery.conf setup.
  • 10. Master setup01 dataegret.com wal_level, max_wal_senders, max_replication_slots. archive_mode, archive_command. wal_keep_segments. wal_sender_timeout. synchronous_standby_names.
  • 11. Master setup pitfalls01 dataegret.com wal_level, archive_mode, max_wal_senders, max_replication_slots – require restart. wal_keep_segments – requires extra storage space. wal_sender_timeout – reduce that, if network is bad. synchronous_standby_names – master freezes if standby fails.
  • 13. Standby setup pitfalls01 dataegret.com hot_standby – enables SELECT queries. max_standby_streaming_delay – increases max possible lag. hot_standby_feedback: ● postpones vacuum; ● potential tables/indexes bloat. wal_receiver_timeout – reduce that, if network is bad.
  • 14. Recovery.conf01 dataegret.com primary_conninfo and/or restore_command. standby_mode. trigger_file. Recovery target: ● immediate; ● particular point/xid/timestamp; ● recovery_min_apply_delay.
  • 17. 3rd party troubleshooting tools02 dataegret.com Top (procps). Iostat (sysstat), iotop. Nicstat. pgCenter. Perf.
  • 18. 3rd party troubleshooting tools02 dataegret.com Top (procps) – CPU usage, load average, mem/swap usage. Iostat (sysstat), iotop – storage utilization, process IO. Nicstat – network interfaces utilization. pgCenter – replication stats. Perf – deep investigations.
  • 19. Builtin troubleshooting tools02 dataegret.com Statistics views. Auxiliary functions. pg_xlogdump utility.
  • 20. Builtin troubleshooting tools02 dataegret.com Statistics view: ● pg_stat_replication; ● pg_stat_databases, pg_stat_databases_conflicts; ● pg_stat_activity; ● pg_stat_archiver.
  • 21. Builtin troubleshooting tools02 dataegret.com Auxiliary functions: ● pg_current_xlog_location, pg_last_xlog_receive_location; ● pg_xlog_location_dif; ● pg_xlog_replay_pause, pg_xlog_replay_resume; ● pg_is_xlog_replay_paused.
  • 22. Builtin troubleshooting tools02 dataegret.com pg_xlogdump: ● Decodes and displays XLOG for debugging; ● Can give wrong results when the server is running. pg_xlogdump -f -p /xlog_96 $(psql -qAtX -c "select pg_xlogfile_name(pg_current_xlog_location())")
  • 24. Troubleshooting cases03 dataegret.com Replication lag. pg_xlog/ bloat. Long transactions and recovery conflicts. Recovery process: 100% CPU usage.
  • 25. Replication lag03 dataegret.com Main symptom – answers differ between master and standbys. Detection: ● pg_stat_replication and pg_xlog_location_dif(); ● pg_last_xact_replay_timestamp().
  • 26. Replication lag03 dataegret.com # d pg_stat_replication View "pg_catalog.pg_stat_replication" Column | Type | Modifiers ------------------+--------------------------+----------- pid | integer | usesysid | oid | usename | name | application_name | text | client_addr | inet | client_hostname | text | client_port | integer | backend_start | timestamp with time zone | backend_xmin | xid | state | text | sent_location | pg_lsn | write_location | pg_lsn | flush_location | pg_lsn | replay_location | pg_lsn | sync_priority | integer | sync_state | text |
  • 27. Replication lag03 dataegret.com # SELECT client_addr AS client, usename AS user, application_name AS name, state, sync_state AS mode, (pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending, (pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write, (pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush, (pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay, (pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag FROM pg_stat_replication; сlient | user | name | state | mode | pending | write | flush | replay | total_lag ----------+--------+-------------+-----------+-------+---------+-------+-------+--------+----------- 10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480 10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025 10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552 10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
  • 28. Replication lag03 dataegret.com # SELECT client_addr AS client, usename AS user, application_name AS name, state, sync_state AS mode, (pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending, (pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write, (pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush, (pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay, (pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag FROM pg_stat_replication; сlient | user | name | state | mode | pending | write | flush | replay | total_lag ----------+--------+-------------+-----------+-------+---------+-------+-------+--------+----------- 10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480 10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025 10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552 10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
  • 29. Replication lag03 dataegret.com # SELECT client_addr AS client, usename AS user, application_name AS name, state, sync_state AS mode, (pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending, (pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write, (pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush, (pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay, (pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag FROM pg_stat_replication; сlient | user | name | state | mode | pending | write | flush | replay | total_lag ----------+--------+-------------+-----------+-------+---------+-------+-------+--------+----------- 10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480 10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025 10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552 10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
  • 30. Replication lag03 dataegret.com # SELECT client_addr AS client, usename AS user, application_name AS name, state, sync_state AS mode, (pg_xlog_location_diff(pg_current_xlog_location(),sent_location) / 1024)::int as pending, (pg_xlog_location_diff(sent_location,write_location) / 1024)::int as write, (pg_xlog_location_diff(write_location,flush_location) / 1024)::int as flush, (pg_xlog_location_diff(flush_location,replay_location) / 1024)::int as replay, (pg_xlog_location_diff(pg_current_xlog_location(),replay_location))::int / 1024 as total_lag FROM pg_stat_replication; сlient | user | name | state | mode | pending | write | flush | replay | total_lag ----------+--------+-------------+-----------+-------+---------+-------+-------+--------+----------- 10.6.6.9 | repmgr | walreceiver | streaming | async | 0 | 0 | 0 | 410480 | 410480 10.6.6.7 | repmgr | walreceiver | streaming | async | 0 | 2845 | 95628 | 112552 | 211025 10.6.6.6 | repmgr | walreceiver | streaming | async | 0 | 0 | 3056 | 9496 | 12552 10.6.6.8 | repmgr | walreceiver | streaming | async | 847582 | 0 | 0 | 3056 | 850638
  • 31. Replication lag03 dataegret.com Network problems – nicstat. Storage problems – iostat, iotop. Recovery stucks – top, pg_stat_activity. WAL pressure: ● pg_stat_activity, pg_stat_progress_vacuum; ● pg_xlog_location_dif().
  • 32. Replication lag03 dataegret.com Network/storage problems: ● check workload; ● upgrade hardware. Recovery stucks – wait or cancel queries on standby. WAL pressure: ● Reduce amount of work; ● Reduce amount of WAL: ● full_page_writes = of, wal_compression = on, wal_log_hints = of; ● expand interval between checkpoints.
  • 33. pg_xlog/ bloat03 dataegret.com Main symptoms: ● unexpected increase in the usage of the disk space; ● abnormal size of pg_xlog/ directory.
  • 34. pg_xlog/ bloat03 dataegret.com Detection: ● du -csh; ● pg_replication_slots, pg_stat_archiver; ● errors in postgres logs.
  • 35. pg_xlog/ bloat03 dataegret.com Problems: ● Massive CRUD. ● Unused slot. ● Broken archive_command.
  • 36. pg_xlog/ bloat03 dataegret.com Solutions: ● check replication lag; ● reduce checkpoints_segments/max_wal_size, wal_keep_segments; ● change reserved space ratio (ext filesystems); ● add an extra space (LVM, ZFS, etc); ● drop unused slot or fix slot consumer; ● fix WAL archiving; ● checkpoint, checkpoint, chekpoint.
  • 37. Recovery conflicts03 dataegret.com Main symptoms – errors in postgresql or application logs. postgres.c:errdetail_recovery_conflict(): ● User was holding shared bufer pin for too long. ● User was holding a relation lock for too long. ● User was or might have been using tablespace that must be dropped. ● User query might have needed to see row versions that must be removed. ● User transaction caused bufer deadlock with recovery. ● User was connected to a database that must be dropped.
  • 38. Recovery conflicts03 dataegret.com Detection: ● pg_stat_databases + pg_stat_databases_conflicts; ● postgresql logs.
  • 39. Recovery conflicts03 dataegret.com Problems: ● queries are cancelled too often; ● long transactions on a standby – check pg_stat_activity; ● huge apply lag – check pg_stat_replication.
  • 40. Recovery conflicts03 dataegret.com Solutions: ● increase streaming delay (potentially causes lag); ● enable hot_standby_feedback (potentially causes bloat); ● rewrite queries; ● setup dedicated standby for long queries.
  • 41. Recovery 100% CPU usage03 dataegret.com Main symptoms: ● huge apply lag; ● 100% CPU usage by recovery process.
  • 42. Recovery 100% CPU usage03 dataegret.com Detection: ● top – CPU usage; ● pg_stat_replication – amount of lag.
  • 43. Recovery 100% CPU usage03 dataegret.com Investigation: ● perf top/record/report (required debug symbols); ● pg_xlogdump.
  • 44. Recovery 100% CPU usage03 dataegret.com Solutions: ● depend on investigation' results; ● change problematic workload (if found).
  • 45. Lessons learned03 dataegret.com Streaming replication problems are always distributed. There are many sources of problems: ● system resources, app/queries, workload. Always use monitoring. Learn how to use builtin tools.
  • 46. Links dataegret.com PostgreSQL official documentation – The Statistics Collector https://www.postgresql.org/docs/current/static/monitoring-stats.html PostgreSQL Mailing Lists (general, performance, hackers) https://www.postgresql.org/list/ PostgreSQL-Consulting company blog http://blog.postgresql-consulting.com/
  • 47. Thanks for watching! dataegret.com alexey.lesovsky@dataegret.com