Co-founder Peter Mattis presented this deck to the NYC PostgreSQL User Group on Nov. 4, 2015. It compares PostgreSQL to CockroachDB SQL: logical data structures, KV layers, data storage, online schema changes, and more.
19. @cockroachdb
INSERT INTO test VALUES (1, 'ball', 3.33);
PostgreSQL: Logical Data Storage
test (heap)
Tuple ID (Page# / Item#)    Row
(0, 1)                      (1, 'ball', 3.33)
20. @cockroachdb
INSERT INTO test VALUES (1, 'ball', 3.33);
PostgreSQL: Logical Data Storage
test (heap)
Tuple ID (Page# / Item#)    Row
(0, 1)                      (1, 'ball', 3.33)
test_pkey (btree)
Index Key    Tuple ID
1            (0, 1)
21. @cockroachdb
INSERT INTO test VALUES (1, 'ball', 3.33);
INSERT INTO test VALUES (2, 'glove', 4.44);
PostgreSQL: Logical Data Storage
test (heap)
Tuple ID (Page# / Item#)    Row
(0, 1)                      (1, 'ball', 3.33)
(0, 2)                      (2, 'glove', 4.44)
test_pkey (btree)
Index Key    Tuple ID
1            (0, 1)
2            (0, 2)
36. @cockroachdb
■ NULL indicates that a value does not exist
■ NULL is weird: NULL = NULL evaluates to NULL, not TRUE
■ CockroachDB: NULL values are not explicitly stored
CockroachDB: NULL Column Values
37. @cockroachdb
INSERT INTO test VALUES (1, 'ball', NULL);
CockroachDB: NULL Column Values
Key: /<table>/<index>/<key>/<column>    Value
/test/primary/1/name                    "ball"
38. @cockroachdb
INSERT INTO test VALUES (1, 'ball', NULL);
INSERT INTO test VALUES (2, NULL, NULL);
CockroachDB: NULL Column Values
Key: /<table>/<index>/<key>/<column>    Value
/test/primary/1/name                    "ball"
???                                     ???
39. @cockroachdb
INSERT INTO test VALUES (1, 'ball', NULL);
INSERT INTO test VALUES (2, NULL, NULL);
CockroachDB: NULL Column Values
Key: /<table>/<index>/<key>[/<column>]    Value
/test/primary/1                           Ø
/test/primary/1/name                      "ball"
/test/primary/2                           Ø
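The key layout on this slide can be sketched in a few lines of Python. This is an illustration only: `encode_row` is a hypothetical helper, the third column is called `price` purely as an assumption (the slides do not name it), keys are readable strings rather than a real binary encoding, and `None` stands in for the empty value Ø.

```python
# Hypothetical sketch: turn one SQL row into key-value pairs.
# Keys are readable strings here; a real encoding would be binary.

def encode_row(table, pkey_value, columns):
    """Return {key: value} for one row.

    columns maps each non-primary-key column name to its value
    (None represents SQL NULL).
    """
    kvs = {}
    # Sentinel key: records that the row exists even if every column is NULL.
    kvs[f"/{table}/primary/{pkey_value}"] = None
    for name, value in columns.items():
        if value is not None:  # NULL values are not stored at all
            kvs[f"/{table}/primary/{pkey_value}/{name}"] = value
    return kvs


store = {}
# "price" is an assumed name for the third column.
store.update(encode_row("test", 1, {"name": "ball", "price": None}))
store.update(encode_row("test", 2, {"name": None, "price": None}))

for key in sorted(store):
    print(key, store[key])
```

Without the sentinel key, row 2 would leave no trace in the map, which is the problem the "???" row on the previous slide points at.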
40. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
■ Multiple table rows with equal indexed values are not allowed
CockroachDB: Unique Indexes
41. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
CockroachDB: Unique Indexes
Key: /<table>/<index>/<key>    Value
/test/bar/"ball"               1
42. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
INSERT INTO test VALUES (2, 'glove', 3.33);
CockroachDB: Unique Indexes
Key: /<table>/<index>/<key>    Value
/test/bar/"ball"               1
/test/bar/"glove"              2
43. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
INSERT INTO test VALUES (2, 'glove', 3.33);
INSERT INTO test VALUES (3, 'glove', 4.44);
CockroachDB: Unique Indexes
Key: /<table>/<index>/<key>    Value
/test/bar/"ball"               1
/test/bar/"glove"              2
/test/bar/"glove"              3
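The last two rows above share the same key, which is what makes the third insert illegal for a unique index. One way to picture the check is a put that fails when the key already exists. The sketch below assumes a plain dict as the key-value map; `insert_unique` and `UniqueViolation` are hypothetical names, not CockroachDB APIs.

```python
# Hypothetical sketch: a unique index entry maps the indexed value to the
# primary key, so a second row with the same value hits an existing key.

class UniqueViolation(Exception):
    """Raised when an index key is already present."""


def insert_unique(store, table, index, indexed_value, pkey_value):
    key = f'/{table}/{index}/"{indexed_value}"'
    if key in store:
        raise UniqueViolation(f"duplicate key {key}")
    store[key] = pkey_value


store = {}
insert_unique(store, "test", "bar", "ball", 1)
insert_unique(store, "test", "bar", "glove", 2)

try:
    insert_unique(store, "test", "bar", "glove", 3)
    conflict = False
except UniqueViolation:
    conflict = True

print(conflict)
print(store)
```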
45. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (3, NULL, NULL);
CockroachDB: Unique Indexes (NULL Values)
Key: /<table>/<index>/<key>    Value
/test/bar/NULL                 3
46. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (3, NULL, NULL);
INSERT INTO test VALUES (4, NULL, NULL);
CockroachDB: Unique Indexes (NULL Values)
Key: /<table>/<index>/<key>    Value
/test/bar/NULL                 3
/test/bar/NULL                 4
47. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (3, NULL, NULL);
CockroachDB: Unique Indexes (NULL Values)
Key: /<table>/<index>/<key>[/<pkey>]    Value
/test/bar/NULL/3                        Ø
48. @cockroachdb
CREATE UNIQUE INDEX bar ON test (name);
INSERT INTO test VALUES (3, NULL, NULL);
INSERT INTO test VALUES (4, NULL, NULL);
CockroachDB: Unique Indexes (NULL Values)
Key: /<table>/<index>/<key>[/<pkey>]    Value
/test/bar/NULL/3                        Ø
/test/bar/NULL/4                        Ø
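The optional `[/<pkey>]` suffix can be sketched as a small function: when the indexed value is NULL, the primary key moves into the index key so that NULL rows never collide. `unique_index_entry` is a hypothetical helper, keys are readable strings, and `None` stands in for both SQL NULL and the empty value Ø.

```python
# Hypothetical sketch of the unique-index key layout with NULL handling.

def unique_index_entry(table, index, indexed_value, pkey_value):
    """Return the (key, value) pair stored for one row in a unique index."""
    if indexed_value is None:
        # NULL: append the primary key to the key, store an empty value.
        return f"/{table}/{index}/NULL/{pkey_value}", None
    # Non-NULL: the key is the indexed value, the value is the primary key.
    return f'/{table}/{index}/"{indexed_value}"', pkey_value


entries = dict(
    unique_index_entry("test", "bar", name, pkey)
    for pkey, name in [(1, "ball"), (3, None), (4, None)]
)

for key in sorted(entries):
    print(key, entries[key])
```

Rows 3 and 4 both have a NULL name, yet they produce distinct keys, so both inserts succeed.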
49. @cockroachdb
CREATE INDEX foo ON test (name);
■ Multiple table rows with equal indexed values are allowed
CockroachDB: Non-Unique Indexes
50. @cockroachdb
CREATE INDEX foo ON test (name);
■ Multiple table rows with equal indexed values are allowed
■ Primary key is a unique index
CockroachDB: Non-Unique Indexes
51. @cockroachdb
CREATE INDEX foo ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
CockroachDB: Non-Unique Indexes
Key: /<table>/<index>/<key>/<pkey>    Value
/test/foo/"ball"/1                    Ø
52. @cockroachdb
CREATE INDEX foo ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
INSERT INTO test VALUES (2, 'glove', 3.33);
CockroachDB: Non-Unique Indexes
Key: /<table>/<index>/<key>/<pkey>    Value
/test/foo/"ball"/1                    Ø
/test/foo/"glove"/2                   Ø
53. @cockroachdb
CREATE INDEX foo ON test (name);
INSERT INTO test VALUES (1, 'ball', 2.22);
INSERT INTO test VALUES (2, 'glove', 3.33);
INSERT INTO test VALUES (3, 'glove', 4.44);
CockroachDB: Non-Unique Indexes
Key: /<table>/<index>/<key>/<pkey>    Value
/test/foo/"ball"/1                    Ø
/test/foo/"glove"/2                   Ø
/test/foo/"glove"/3                   Ø
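Because the primary key is part of every non-unique index key, rows with equal indexed values sit next to each other in a sorted map, and a lookup becomes a scan over one key prefix. The sketch below assumes a sorted Python list as the map; `index_key` and `scan_prefix` are hypothetical helpers.

```python
import bisect

# Hypothetical sketch: non-unique index keys and a prefix scan over them.

def index_key(table, index, indexed_value, pkey_value):
    return f'/{table}/{index}/"{indexed_value}"/{pkey_value}'


def scan_prefix(sorted_keys, prefix):
    """Return every key in sorted_keys that starts with prefix."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches


rows = [(1, "ball"), (2, "glove"), (3, "glove")]
keys = sorted(index_key("test", "foo", name, pkey) for pkey, name in rows)

matches = scan_prefix(keys, '/test/foo/"glove"/')
# The primary key is the last path component of each matching key.
pkeys = [int(key.rsplit("/", 1)[1]) for key in matches]
print(pkeys)
```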
54. @cockroachdb
■ Keys and values are strings
■ NULL column values
■ Unique indexes
■ Non-unique indexes
CockroachDB: Logical Data Storage
55. @cockroachdb
Logical Data Storage
PostgreSQL                       CockroachDB
Keys are composite structures    Keys are strings
Heap storage for rows            Required primary key
Per-table heap/indexes           Monolithic map
59. @cockroachdb
Schema Change (the easy way)
1. Apologize for downtime
2. Lock table
3. Adjust table data (add column, populate index, etc.)
4. Unlock table
60. @cockroachdb
Schema Change (the MySQL way)
1. Create new table with altered schema
2. Capture changes from source to the new table
3. Copy rows from the source to the new table
4. Synchronize source and new table
5. Swap/rename source and new table
67. @cockroachdb
CockroachDB: CREATE INDEX
CREATE INDEX foo ON test (name);
1. Add index to TableDescriptor as delete-only
2. Wait for descriptor propagation
3. Mark index as write-only
4. Wait for descriptor propagation
5. Backfill index entries
6. Mark index as read-write
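The six steps above amount to a small state machine on the index descriptor. The sketch below only mirrors the ordering of the steps; `IndexState`, `create_index`, and the two callbacks are hypothetical names, and a dict stands in for the TableDescriptor.

```python
from enum import Enum

# Hypothetical sketch of the CREATE INDEX sequence as a state machine.

class IndexState(Enum):
    DELETE_ONLY = "delete-only"
    WRITE_ONLY = "write-only"
    READ_WRITE = "read-write"


def create_index(descriptor, index_name, wait_for_propagation, backfill):
    descriptor[index_name] = IndexState.DELETE_ONLY  # step 1
    wait_for_propagation()                           # step 2
    descriptor[index_name] = IndexState.WRITE_ONLY   # step 3
    wait_for_propagation()                           # step 4
    backfill(index_name)                             # step 5
    descriptor[index_name] = IndexState.READ_WRITE   # step 6


log = []
descriptor = {}
create_index(
    descriptor,
    "foo",
    # Record which state is being propagated at each wait.
    wait_for_propagation=lambda: log.append("wait:" + descriptor["foo"].value),
    backfill=lambda name: log.append("backfill:" + name),
)

print(log)
print(descriptor["foo"].value)
```

The recorded log shows that the backfill runs only after the write-only state has propagated, and that the index becomes readable last.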
This is a PostgreSQL meetup, so why should I care about CockroachDB?
The CockroachDB SQL grammar is based on the Postgres grammar.
Postgres is a SQL database. CockroachDB is a distributed SQL database.
Layered abstractions make it possible to deal with complexity
Higher levels can treat lower levels as functional black boxes
The top two layers are logical
Don’t stop at SQL!
The monolithic sorted map is the distributed layer
RocksDB sits at the physical layer
This is a brief overview of logical data storage in PostgreSQL. I’m using the term “logical” to refer to how SQL data is mapped down into PostgreSQL structures, as distinct from “physical” data storage, which is exactly how those structures are implemented.
The heap is unindexed storage for row tuples. Think of it as a hash table in which each row is given a unique ID at insertion time, although unfortunately it is not that simple.
Tuples (rows) are located by a tuple ID, which is composed of a page number and an item number within that page.
A B-tree stores values sorted by key.
The B-tree index key is a tuple of the columns in the index. The value is the row’s tuple ID.
An example will make this clearer.
Tuple IDs just happen to be ordered in this example; in general they are not. Tuple IDs are also an internal detail of a table and are not stable for external use.
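The heap and primary-key index from the PostgreSQL slides can be modeled roughly as follows. This is a toy sketch under loud assumptions: a dict stands in for the heap, a sorted list with `bisect` stands in for the B-tree, and every row lands on page 0. `insert` and `lookup` are hypothetical helpers, not PostgreSQL internals.

```python
import bisect

# Toy model: heap maps tuple ID -> row; the index maps key -> tuple ID.

heap = {}        # (page#, item#) -> row
index_keys = []  # sorted primary-key values (stand-in for the B-tree)
index_tids = []  # tuple IDs, parallel to index_keys


def insert(row, page=0):
    """Store row in the heap and add its first column to the index."""
    item = sum(1 for (p, _item) in heap if p == page) + 1
    tid = (page, item)
    heap[tid] = row
    pos = bisect.bisect_left(index_keys, row[0])
    index_keys.insert(pos, row[0])
    index_tids.insert(pos, tid)
    return tid


def lookup(key):
    """Find a row through the index: key -> tuple ID -> heap row."""
    pos = bisect.bisect_left(index_keys, key)
    if pos == len(index_keys) or index_keys[pos] != key:
        return None
    return heap[index_tids[pos]]


insert((1, "ball", 3.33))
insert((2, "glove", 4.44))
print(heap)
print(lookup(2))
```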
Row sentinel!
Value contains any primary key column that is not stored in index key.
Poll the audience about their experience with production schema changes.