1. Managing PostgreSQL with pgCenter
PgConf EU 2016, Estonia, Tallinn
Alexey Lesovsky
lesovsky@pgco.me
2. I am a PostgreSQL DBA:
● Linux administration, internals;
● and PostgreSQL of course.
Work in PostgreSQL-Consulting:
● Consulting, support, troubleshooting, profiling, training, etc...
https://goo.gl/NYRFQV
About me
4. + Most subsystems and objects have stats.
+ Getting stats is quite easy.
– Stats are provided as counters.
– No history: what happened X minutes ago?
– No built-in tools, only psql and hand-written queries.
PostgreSQL statistics
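Because the statistics are cumulative counters, a rate only appears once two snapshots are compared. A minimal Python sketch of that idea follows; the function name `counter_rates` and the sample values are illustrative assumptions, not pgCenter code.

```python
def counter_rates(prev, curr, interval_sec):
    """Turn two snapshots of cumulative counters into per-second rates.

    prev, curr: dicts mapping a counter name to its cumulative value.
    interval_sec: seconds elapsed between the two snapshots.
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    rates = {}
    for name, value in curr.items():
        # A counter missing from the first snapshot has no delta yet.
        delta = value - prev.get(name, value)
        # A negative delta means the counters were reset in between.
        rates[name] = max(delta, 0) / interval_sec
    return rates


# Hypothetical values of two pg_stat_database columns, sampled 10 s apart.
before = {"xact_commit": 1000, "tup_fetched": 50000}
after = {"xact_commit": 1720, "tup_fetched": 62000}
print(counter_rates(before, after, 10))
```

Keeping older snapshots around is what would provide the missing history; this sketch only compares the two most recent ones.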
6. Written in C, uses libpq and ncurses.
Supports PostgreSQL 9.x (9.0 ... 9.6).
Linux only.
Sources on Github.
Packages:
● ALT Linux;
● RedHat/CentOS/Fedora (pgdg, epel-testing);
● Ubuntu (Launchpad).
What is pgCenter
7. Top-like interface for viewing stats.
System resource utilization (cpu, memory, storage, networking).
PostgreSQL general utilization (connections, autovacuum, qps).
Common admin tasks.
Major features
8. The same options as psql:
● pgcenter -h 127.0.0.1 -p 5432 -U postgres -d mydb
● pgcenter -U postgres mydb
● pgcenter mydb
Environment variables:
● PGHOST, PGPORT, PGUSER, PGDATABASE, PGPASSWORD
Connections file (~/.pgcenterrc).
How to run pgCenter
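The way these sources combine can be sketched in Python. The sketch assumes the usual libpq order (command-line options first, then PG* environment variables, then built-in defaults). Whether pgCenter resolves them in exactly this order is an assumption, and the ~/.pgcenterrc file and PGPASSWORD are not modelled. The name `resolve_conn` is hypothetical.

```python
import getpass
import os


def resolve_conn(cli=None, env=None):
    """Resolve connection settings: options, then PG* variables, then defaults."""
    cli = cli or {}
    env = os.environ if env is None else env
    # As with libpq, the user name defaults to the operating-system user.
    user = cli.get("user") or env.get("PGUSER") or getpass.getuser()
    return {
        # An empty host stands for the local Unix socket in this sketch.
        "host": cli.get("host") or env.get("PGHOST") or "",
        "port": int(cli.get("port") or env.get("PGPORT") or 5432),
        "user": user,
        # The database name defaults to the user name.
        "dbname": cli.get("dbname") or env.get("PGDATABASE") or user,
    }


# Roughly "pgcenter -U postgres mydb" with PGHOST and PGPORT exported.
print(resolve_conn({"user": "postgres", "dbname": "mydb"},
                   {"PGHOST": "10.0.0.5", "PGPORT": "6432"}))
```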
24. N: open new connection in a new tab.
Ctrl+D: close current tab.
1..8: switch between tabs.
W: save the settings of open connections to the connfile.
General actions
29. C: show config menu.
E: edit config menu.
R: reload postgres service.
l: show log file.
-: cancel query using pid.
_: terminate backend using pid.
Del: cancel group of queries using mask.
Shift+Del: terminate group of backends using mask.
.: show current mask, >: set new mask.
p: open psql session.
Admin tasks
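The cancel and terminate actions correspond to two built-in PostgreSQL functions, pg_cancel_backend() and pg_terminate_backend(). The Python sketch below builds the SQL for the single-pid and group variants. Treating the mask as a set of pg_stat_activity states is an assumption made for illustration, and `group_action_sql` is a hypothetical helper, not pgCenter's implementation.

```python
# Single-backend actions; the pid is passed as a query parameter.
CANCEL_ONE = "SELECT pg_cancel_backend(%s)"        # cancels the current query
TERMINATE_ONE = "SELECT pg_terminate_backend(%s)"  # closes the whole session

# States this sketch accepts in a mask (values of pg_stat_activity.state).
ALLOWED_STATES = {"active", "idle", "idle in transaction"}


def group_action_sql(action, states):
    """Build a statement that cancels or terminates all backends in the given states."""
    func = {"cancel": "pg_cancel_backend",
            "terminate": "pg_terminate_backend"}[action]
    unknown = set(states) - ALLOWED_STATES
    if unknown:
        raise ValueError(f"unsupported states: {sorted(unknown)}")
    # Only whitelisted literals reach the SQL text.
    state_list = ", ".join(f"'{s}'" for s in sorted(states))
    return (
        f"SELECT {func}(pid) FROM pg_stat_activity "
        f"WHERE state IN ({state_list}) AND pid <> pg_backend_pid()"
    )


print(group_action_sql("cancel", ["idle in transaction"]))
```

The `pid <> pg_backend_pid()` condition keeps the session that issues the statement from acting on itself.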
33. B: open iostat.
I: open nicstat.
L: tail postgres log (show the latest log lines).
ERROR: cannot execute SELECT FOR UPDATE in a read-only transaction
ERROR: cannot execute SELECT FOR UPDATE in a read-only transaction
LOG: checkpoint starting: time
LOG: checkpoint complete: wrote 40 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 26 recycled; write=3.924 s, sync=0.
ERROR: cannot execute SELECT FOR UPDATE in a read-only transaction
Additional Information
34. Troubleshoot:
● quick overview, resource utilization and postgresql activity;
● autovacuum issues;
● replication problems;
● database anomalies;
● tables, indexes and functions;
● where is the space?
● bad company;
● queries investigation.
Troubleshoot
35. Quick overview:
● run pgCenter;
● check resource usage (cpu, mem, disk, net);
● check PostgreSQL usage;
● plan what to do next.
Troubleshoot
38. PostgreSQL general state:
● too many connections;
● too many bad connections (waiting, idle_in_xact);
● too many vacuums, long vacuum;
● too many queries, long transactions/queries.
Troubleshoot
40. conn1 [ok]: 127.0.0.1:5432 postgres@postgres (ver: 9.6.0, up 05:02:50)
activity: 44 total, 2 idle, 2 idle_in_xact, 9 active, 31 waiting, 0 others
autovacuum: 6/8 workers/max, 0 manual, 6 wraparound, 02:13:02 vac_maxtime
statements: 172 stmt/s, 5.390 stmt_avgtime, 00:18:37 xact_maxtime
too many active – check activity.
idle in transaction, waiting – check activity, then cancel or terminate them.
autovacuum close to the worker limit? – raise autovacuum_max_workers, tune the cost parameters.
check pg_stat_progress_vacuum and disk utilization.
long transactions – cancel or terminate.
Troubleshoot
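A summary like the "activity:" line above can be derived from pg_stat_activity. The Python sketch below counts backends per category from (state, wait_event) pairs. Counting an active backend with a wait event as "waiting" is an assumption about how the categories are split, and `summarize_activity` is a hypothetical name.

```python
def summarize_activity(rows):
    """Count backends per category from (state, wait_event) pairs."""
    keys = ["total", "idle", "idle_in_xact", "active", "waiting", "others"]
    counts = dict.fromkeys(keys, 0)
    for state, wait_event in rows:
        counts["total"] += 1
        if state == "idle":
            counts["idle"] += 1
        elif state == "idle in transaction":
            counts["idle_in_xact"] += 1
        elif state == "active" and wait_event is not None:
            counts["waiting"] += 1  # active, but blocked on something
        elif state == "active":
            counts["active"] += 1
        else:
            counts["others"] += 1
    return counts


# Hypothetical rows, as "SELECT state, wait_event FROM pg_stat_activity" might return.
sample = [
    ("idle", None),
    ("idle in transaction", None),
    ("active", None),
    ("active", "transactionid"),
    ("fastpath function call", None),
]
print(summarize_activity(sample))
```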
41. Replication problems:
● replication lag – queries return different results on the primary and the standbys;
● network utilization, errors;
● disk utilization, bandwidth.
Troubleshoot
52. relation           index                    idx_scan  idx_tup_read  idx_tup_fetch  idx_read  idx_hit
    public.job_bodies  job_bodies_pkey               850           850            850         0    14560
    public.job_bodies  job_bodies_refcount_idx       850           880            850         0    14960
    public.job_bodies  job_bodies_reftype_idx        170           255            170      1136     7024
    public.job_bodies  job_bodies_spc2_idx             0             0              0         0        0
    public.job_bodies  job_bodies_spc5_idx             0             0              0         0        0
zero idx_scan – unused indexes;
tip: before dropping them, check their usage on standbys.
Troubleshoot
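The standby tip can be sketched in Python: collect idx_scan per index from every server and keep only the indexes that were never scanned anywhere. The query uses the standard pg_stat_user_indexes view. The standby numbers and the `droppable_indexes` helper are hypothetical.

```python
# Run this on the primary and on each standby (statistics are per server).
UNUSED_INDEXES_SQL = """
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC
"""


def droppable_indexes(per_server_scans):
    """Return indexes whose idx_scan is zero on every server."""
    totals = {}
    for scans in per_server_scans.values():
        for index, idx_scan in scans.items():
            totals[index] = totals.get(index, 0) + idx_scan
    return sorted(index for index, total in totals.items() if total == 0)


scans = {
    # idx_scan values from the sample screen above.
    "primary": {
        "job_bodies_pkey": 850,
        "job_bodies_refcount_idx": 850,
        "job_bodies_reftype_idx": 170,
        "job_bodies_spc2_idx": 0,
        "job_bodies_spc5_idx": 0,
    },
    # Hypothetical standby where one "unused" index serves read queries.
    "standby1": {
        "job_bodies_pkey": 0,
        "job_bodies_refcount_idx": 12,
        "job_bodies_reftype_idx": 0,
        "job_bodies_spc2_idx": 0,
        "job_bodies_spc5_idx": 431,
    },
}
print(droppable_indexes(scans))
```

Indexes that back primary key or unique constraints should stay even when their idx_scan is zero.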
53. Functions usage:
● long running functions;
● run psql, edit the function with \ef funcname.
Troubleshoot
54. Where is the space, Postgres?
● check table sizes (with and without indexes);
● check table size changes;
● use filters to see interesting tables.
Troubleshoot
56. Bad company:
● long running queries or idle transactions;
● waiting queries/transactions;
● cancel queries or terminate backends using pid or mask;
● change the age threshold to hide uninteresting entries.
Troubleshoot
58. pid   cl_addr  cl_port  datname  usename   state   wait_etype  wait_event     xact_age  query_age  change_age  query
    6942           -1       shopdb   shop_app  active  Lock        transactionid  00:10:14  00:00:17   00:00:17    update >
    6930           -1       shopdb   shop_app  active                             00:08:17  00:00:12   00:00:12    update >
    3429           -1       shopdb   shop_app  active  Lock        transactionid  00:07:01  00:00:02   00:00:00    update >
    3857           -1       shopdb   shop_app  active                             00:00:00  00:00:00   00:00:00    select >
    5781           -1       shopdb   shop_bg   active                             00:03:29  00:00:01   00:00:01    select >
    6901           -1       shopdb   shop_bg   active                             00:01:10  00:00:01   00:00:01    select >
idle in transaction, waiting – cancel or terminate.
use age thresholds and filters.
Troubleshoot
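An age threshold is a simple filter over the age columns. The Python sketch below keeps only the backends whose xact_age is at or above a threshold, using the pid and xact_age values from the sample screen above. The helper names are hypothetical.

```python
def age_to_seconds(age):
    """Convert an 'HH:MM:SS' age string into seconds."""
    hours, minutes, seconds = (int(part) for part in age.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def older_than(rows, threshold, column="xact_age"):
    """Keep rows whose age column is at or above the threshold, oldest first."""
    limit = age_to_seconds(threshold)
    hits = [r for r in rows if age_to_seconds(r[column]) >= limit]
    return sorted(hits, key=lambda r: age_to_seconds(r[column]), reverse=True)


activity = [
    {"pid": 6942, "xact_age": "00:10:14"},
    {"pid": 6930, "xact_age": "00:08:17"},
    {"pid": 3429, "xact_age": "00:07:01"},
    {"pid": 3857, "xact_age": "00:00:00"},
    {"pid": 5781, "xact_age": "00:03:29"},
    {"pid": 6901, "xact_age": "00:01:10"},
]
# Transactions open for five minutes or longer.
print([r["pid"] for r in older_than(activity, "00:05:00")])  # [6942, 6930, 3429]
```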
59. Query investigation:
● CPU- or disk-hogging queries;
● most called queries;
● queries doing a lot of IO;
● pg_stat_statements;
● query reports;
● looking for a query example;
● run psql and EXPLAIN ANALYZE the query;
● rewrite a query, build an index, move a query to the standby, blame developers...
Troubleshoot
61. summary:
total_time: 01:03:47, cpu_time: 00:59:38, io_time: 00:04:09 (ALL: 100.00%, CPU: 93.47%, IO: 6.53%),
total queries: 258,798,086
query info:
usename: streamcast,
datname: outpost,
calls (relative to all queries): 86,250,226 (33.33%),
rows (relative to all queries): 86,250,226 (94.46%),
total time (relative to all queries): 00:58:14 (ALL: 91.3%, CPU: 97.7%, IO: 0.0%),
average time (only for this query): 0.04ms, cpu_time: 0.04ms, io_time: 0.00ms, (ALL: 100.0%, CPU: 100.0%, IO: 0.0%),
query text (id: 14a58a3b9f):
SELECT tags.tg_id FROM tags WHERE tags.id IN (?, ?, ?) AND (id NOT IN (?)) GROUP BY id HAVING count(distinct tg_id) > ? LIMIT ?
Troubleshoot
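The CPU/IO split in a report like this can be approximated from pg_stat_statements, which exposes total_time, blk_read_time and blk_write_time in milliseconds (the block timings are filled in only when track_io_timing is on). The Python sketch below assumes io_time is the sum of the two block timings and cpu_time is the remainder. That this matches how the report above was computed is an assumption, and the sample numbers are hypothetical.

```python
def time_breakdown(total_ms, blk_read_ms, blk_write_ms):
    """Split a query's total time into an IO part and a 'CPU' remainder."""
    io_ms = blk_read_ms + blk_write_ms
    cpu_ms = total_ms - io_ms  # everything that is not block IO
    if total_ms <= 0:
        return {"cpu_ms": 0.0, "io_ms": 0.0, "cpu_pct": 0.0, "io_pct": 0.0}
    return {
        "cpu_ms": cpu_ms,
        "io_ms": io_ms,
        "cpu_pct": round(100 * cpu_ms / total_ms, 2),
        "io_pct": round(100 * io_ms / total_ms, 2),
    }


# Hypothetical statement: 1000 ms in total, 200 ms reading and 50 ms writing blocks.
print(time_breakdown(1000.0, 200.0, 50.0))
```

The remainder is called CPU time for brevity, but it also includes lock waits and any other non-IO time.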
62. pgCenter has many features but:
● psql is always available – 'p' hotkey.
● use the help – 'h' hotkey.
In the end
63. pgCenter is useful:
● to check what's going on;
● for quick overview;
● to perform simple admin operations;
● to manage Postgres more easily and quickly (I hope).
In the end