Table partitioning in PostgreSQL + Rails

Agnieszka Figiel, Cambridge Ruby User Group Mar 2014

What is table partitioning in PostgreSQL?
1 logical table = n smaller physical tables

What are the benefits?
improved query performance for big tables *
* works best when the results come from a single partition

When to consider partitioning?
A big table that cannot be queried efficiently because its indexes are enormous as well.
Rule of thumb: the table does not fit into memory

How is data split?
By ranges:
year > 2010 AND year <= 2012
By lists of key values:
company_id = 5
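
To make the two strategies concrete, here is a minimal Ruby sketch that picks a partition for a given key. The partition names, bounds and helper methods are hypothetical and only serve as illustration; in PostgreSQL the split is expressed with CHECK constraints on the child tables, not in application code.

```ruby
# Hypothetical layout: partition names and bounds are made up for illustration.
# Range partitioning: "year > 2010 AND year <= 2012" is 2011..2012 for integer years.
RANGE_PARTITIONS = {
  "events_2009_2010" => (2009..2010),
  "events_2011_2012" => (2011..2012)
}.freeze

# List partitioning: each partition holds an explicit list of key values.
LIST_PARTITIONS = {
  "employees_company_5" => [5],
  "employees_company_6_7" => [6, 7]
}.freeze

# Return the name of the range partition covering the year, or nil if none does.
def range_partition_for(year)
  name, _range = RANGE_PARTITIONS.find { |_, range| range.cover?(year) }
  name
end

# Return the name of the list partition containing the id, or nil if none does.
def list_partition_for(company_id)
  name, _ids = LIST_PARTITIONS.find { |_, ids| ids.include?(company_id) }
  name
end
```

A key that matches no partition returns nil here; with real partitions such a row would need either a catch-all partition or an error.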

How does it work?
At the heart of it lie 2 mechanisms:
➔ table inheritance
➔ table CHECK constraints

Table inheritance: schema
CREATE TABLE A (value INT);
CREATE TABLE B () INHERITS (A);
CREATE TABLE C (extra_field TEXT) INHERITS (A);
Table "public.c"
   Column    |  Type   | Modifiers
-------------+---------+-----------
 value       | integer |
 extra_field | text    |
Inherits: a
NB: a table may inherit from multiple tables

Table inheritance: querying
INSERT INTO A VALUES (0);
INSERT INTO B VALUES (10);
INSERT INTO C VALUES (20, 'zonk');
SELECT * FROM A WHERE value < 20;
value
-------
0
10
SELECT * FROM ONLY A WHERE value < 20;
value
-------
0

What happened?
EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..69.38 rows=1290 width=4) (actual time=0.020..0.033 rows=2 loops=1)
  -> Seq Scan on a (cost=0.00..4.00 rows=80 width=4) (actual time=0.019..0.021 rows=1 loops=1)
       Filter: (value < 20)
  -> Seq Scan on b (cost=0.00..40.00 rows=800 width=4) (actual time=0.005..0.006 rows=1 loops=1)
  -> Seq Scan on c (cost=0.00..25.38 rows=410 width=4) (actual time=0.005..0.005 rows=0 loops=1)
       Rows Removed by Filter: 1

Table CHECK constraint
CREATE TABLE A (value INT);
CREATE TABLE B (CHECK (value < 20)) INHERITS (A);
CREATE TABLE C (CHECK (value >= 20)) INHERITS (A);
INSERT INTO B VALUES (10);
INSERT INTO C VALUES (20);
INSERT INTO C VALUES(5);
ERROR: new row for relation "c" violates check constraint "c_value_check"
DETAIL: Failing row contains (5).

Ta da!
EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..40.00 rows=801 width=4) (actual time=0.015..0.017 rows=1 loops=1)
  -> Seq Scan on a (cost=0.00..0.00 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=1)
  -> Seq Scan on b (cost=0.00..40.00 rows=800 width=4) (actual time=0.012..0.013 rows=1 loops=1)
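
The reason c disappears from the plan can be mimicked in a few lines of Ruby. This is only a toy model of constraint exclusion, not how the planner is implemented: each child table is described by the range its CHECK constraint allows, and a child is skipped when that range cannot overlap the query predicate. The `partitions_to_scan` helper is hypothetical.

```ruby
# Toy model: the range of values each child's CHECK constraint allows.
# b: CHECK (value < 20), c: CHECK (value >= 20)
PARTITIONS = {
  "b" => (-Float::INFINITY...20),
  "c" => (20..Float::INFINITY)
}.freeze

# Return the child tables that could hold rows matching "value < limit".
# A child is excluded when the smallest value it allows is not below the limit.
def partitions_to_scan(limit)
  PARTITIONS.select { |_, allowed| allowed.begin < limit }.keys
end

partitions_to_scan(20) # => ["b"]
partitions_to_scan(25) # => ["b", "c"]
```

The parent table a is not part of this model: it has no CHECK constraint, so it always appears in the plan, as in the output above.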

Gotcha #1
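Make sure constraint exclusion is enabled, otherwise the planner scans every partition:
SET constraint_exclusion = on; -- the default value, partition, is also sufficient for inherited tables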

Gotcha #2
All check constraints and not-null constraints on a parent table are
automatically inherited by its children.
Other types of constraints (unique, primary key, and foreign key constraints)
are not inherited.

Solution #2: create constraints or indexes in all partitions
CREATE TABLE A (id serial NOT NULL, value INT NOT NULL);
CREATE TABLE B (
  CONSTRAINT B_pkey PRIMARY KEY (id),
  CHECK (value < 20)
) INHERITS (A);
CREATE TABLE C (
  CONSTRAINT C_pkey PRIMARY KEY (id),
  CHECK (value >= 20)
) INHERITS (A);

Gotcha #3
There is no practical way to enforce uniqueness of a SERIAL id across
partitions.
INSERT INTO B (value) VALUES (10);
INSERT INTO C (value) VALUES (20);
INSERT INTO C VALUES (1, 30);
SELECT * FROM A;
id | value
----+-------
1 | 10
2 | 20
1 | 30

Solution #3
➔ accept possible duplicate ids and always filter by the partitioning key
➔ use UUIDs (uuid-ossp extension)
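
The slide suggests generating UUIDs in the database with uuid-ossp. As a swapped-in alternative for illustration, the sketch below generates a random (version 4) UUID on the application side with Ruby's standard library; the `new_row_id` helper name is hypothetical.

```ruby
require "securerandom"

# Hypothetical helper: produce an id that does not depend on a per-table
# SERIAL sequence. SecureRandom.uuid returns a random version 4 UUID string.
def new_row_id
  SecureRandom.uuid
end

id = new_row_id
puts id
```

Random UUIDs do not make duplicates impossible across partitions, only extremely unlikely; a unique index can still only be enforced per partition.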

Gotcha #4
Specifying that another table's column REFERENCES a(value) would allow the
other table to contain “a values”, but not “b or c values”. There is no good
workaround for this case.

How to insert / update rows?
➔ PostgreSQL docs recommend using triggers
➔ rules are also possible: more overhead than a trigger, but paid once per query rather than once per row, so they may suit bulk inserts

Trigger example
CREATE OR REPLACE FUNCTION a_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
  IF NEW.value < 20 THEN
    INSERT INTO b VALUES (NEW.*);
  ELSE
    INSERT INTO c VALUES (NEW.*);
  END IF;
  RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER insert_a_trigger
  BEFORE INSERT ON a
  FOR EACH ROW EXECUTE PROCEDURE a_insert_trigger();
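
With more than two partitions the routing function grows one branch per child table, so it can be convenient to generate it. The following is a minimal Ruby sketch under stated assumptions: `routing_trigger_sql` is a hypothetical helper, not part of any gem, and unlike the hand-written trigger above it adds an explicit ELSE branch that raises when no partition matches (similar in spirit to the example in the PostgreSQL docs).

```ruby
# Hypothetical helper: build the SQL for a routing trigger function.
# `routes` maps a child table name to the condition on NEW that selects it.
def routing_trigger_sql(parent, routes)
  branches = routes.each_with_index.map do |(child, condition), i|
    keyword = i.zero? ? "IF" : "ELSIF"
    "  #{keyword} #{condition} THEN\n    INSERT INTO #{child} VALUES (NEW.*);"
  end

  <<~SQL
    CREATE OR REPLACE FUNCTION #{parent}_insert_trigger()
    RETURNS TRIGGER AS $$
    BEGIN
    #{branches.join("\n")}
      ELSE
        RAISE EXCEPTION 'no partition of #{parent} accepts this row';
      END IF;
      RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;
  SQL
end

puts routing_trigger_sql("a", "b" => "NEW.value < 20", "c" => "NEW.value >= 20")
```

The generated text would still have to be executed against the database, for example from a migration, and the CREATE TRIGGER statement stays the same as above.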

Partitioned gem
https://github.com/fiksu/partitioned
class Company < ActiveRecord::Base; end
class ByCompanyId < Partitioned::ByForeignKey
  self.abstract_class = true
  belongs_to :company
  def self.partition_foreign_key
    return :company_id
  end
  partitioned do |partition|
    partition.index :id, :unique => true
  end
end
class Employee < ByCompanyId; end
employee = Employee.from_partition(1).find(1)

UUID in Rails
Rails 4 (in a migration):
enable_extension 'uuid-ossp'
create_table :documents, id: :uuid do |t|
  t.string :title
  t.string :author
  t.timestamps
end
Rails 3:
use the postgres_ext gem: https://github.com/dockyard/postgres_ext

Actual performance improvements?
➔ table with ~13 million rows
➔ partitioned by date into 8 partitions
➔ complex, long-running query
➔ baseline: 0.07 tps
➔ results in a single partition: 0.15 tps (roughly 2x the baseline)
➔ results spread over 2 partitions: about the same as baseline

References
➔ http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html
➔ Gregory Smith, PostgreSQL 9.0 High Performance
