This presentation covers the work I did for PostgreSQL during Google Summer of Code 2014. The project makes it possible to change an unlogged table to logged and vice versa. Project wiki page: https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014
I presented this work to Uniritter IT students in Canoas/RS (2015-05-18) and Porto Alegre/RS (FAPA - 2015-05-20).
5. But what does Wikipedia tell us?
A database is an integrated and organized
collection of logically related records or files or
data that are stored in a computer system
which consolidates records previously stored in
separate files into a common pool of data
records that provides data for many
applications.
Source: http://en.wikipedia.org/wiki/Database
7. OK, something simpler (originally in pt-br)!
“Databases are organized collections of data
that relate to one another so as to create some
meaning (information) and make a search or
study more efficient.” (Wikipedia, translated
from Portuguese)
Source: http://pt.wikipedia.org/wiki/Banco_de_dados
8. A database is a concept
A Relational Database Management System
is an implementation of this concept
10. Fabrízio de Royes Mello
● IT experience since 1993
○ Programming Languages (Basic, C, Clipper, Pascal,
PHP, Javascript, …)
○ Operating Systems (Windows “argh”, Unix and
Linux)
○ PostgreSQL, Firebird, MySQL, Oracle
○ Agile Methodologies (XP, Lean, Scrum, …)
○ …
11. Fabrízio de Royes Mello
● Bachelor in Information Systems in 2002
● Entrepreneur at http://timbira.com
● Agile Methodologies Specialization student 2014/2015
● PostgreSQL collaborator since 2008 (Brazilian
community and now the international one too)
13. … and nowadays
● PostgreSQL contributor (more than 27
patches as developer and/or reviewer)
● Brazilian Community
○ http://postgresql.org.br
○ http://listas.postgresql.org.br
● PostgreSQL Consultant at Timbira
○ http://timbira.com.br
● Judo practitioner
15. PostgreSQL (http://postgresql.org)
● The world’s most advanced open source database
● Runs on all major operating systems: Linux, UNIX (AIX,
BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and
Windows
● Fully ACID compliant (Atomicity, Consistency, Isolation
and Durability)
● Full support for foreign keys, joins, views, triggers, and
stored procedures (in multiple languages)
● Native programming interfaces for C/C++, Java, .Net,
Perl, Python, Ruby, Tcl, ODBC, among others.
16. PostgreSQL (http://postgresql.org)
● Before : born from INGRES
● 1986 : Project start (Berkeley)
● 1987 : First Postgres version
● 1991 : (v 3) with most of the current features
● 1993 : (v 4.2) last release by Berkeley
● 1994 : Andrew Yu and Jolly Chen release Postgres95
with support for the SQL language
● 1997 : (v 6) Name changes to PostgreSQL
● 2000 : (v 7) Support for foreign keys
17. PostgreSQL (http://postgresql.org)
● 2005 : (v 8) Native port to Windows, Tablespaces,
Savepoints, Point-In-Time-Recovery
● 2005 : (v 8.1) Two-phase Commit, Roles
● 2006 : (v 8.2) [Insert, Update, Delete] Returning,
improved OLTP and BI performance
● 2008 : (v 8.3) PL/pgSQL debugging, Tsearch2 and XML support
incorporated into the core, performance improvements
● 2009 : (v 8.4) Windowing Functions, Common Table
Expressions and Recursive Queries, Parallel Restore,
“pg_upgrade”
18. PostgreSQL (http://postgresql.org)
● 2010 : (v 9.0) Hot Standby and Streaming Replication
● 2011 : (v 9.1) Synchronous Replication, FDW
(SQL/MED), CREATE EXTENSION, Unlogged Tables
● 2012 : (v 9.2) Index-only Scans, Cascading Replication,
JSON, Range Types
● 2013 : (v 9.3) Materialized Views, Lateral Join, writable
FDW, Event Triggers, Background Workers
● 2014 : (v 9.4) JSONB, Logical Decoding, Dynamic
Background Workers
19. PostgreSQL (http://postgresql.org)
● 2015 : (v 9.5) INSERT … ON CONFLICT DO UPDATE
(upsert), IMPORT FOREIGN SCHEMA, ALTER TABLE
.. SET LOGGED, Parallel Infrastructure
● 2016 : (v 9.6) Parallel Query??? BDR (Bi-directional
Replication)???
21. FOSS (free and open source software) and me
● My first contact was using Linux in 1997
● I have been in love with this culture ever since
● In 1999 I met PostgreSQL, and I have known ever
since that it would be part of my life
● Because of this decision I had a lot of
trouble, including financial trouble…
● But here I am :-)
22. Google Summer of Code is a global program that
offers students stipends to
write code for open source
projects.
We have worked with the
open source community to
identify and fund exciting
projects for the upcoming
summer.
24. GSoC and PostgreSQL
● Since 2006
● Cool projects
○ Fast GiST index build
○ New phpPgAdmin Plugin Architecture (Brazilian)
○ pgAdmin database designer
○ Better indexing for ranges
○ Document collection Foreign-data Wrapper
25. And now my project ...
PostgreSQL 9.1 introduced a new kind of table:
Unlogged Tables
26. What does “Unlogged” mean?
First we need to know what “WAL” means
PostgreSQL is fully ACID compliant and, to guarantee
data integrity, it uses a standard method called
WAL (Write-Ahead Logging)
27. WAL (Write-Ahead Logging)
“In computer science, write-ahead logging (WAL) is a family
of techniques for providing atomicity and durability (two of
the ACID properties) in database systems.
In a system using WAL, all modifications are written to a log
before they are applied. Usually both redo and undo
information is stored in the log.”
http://en.wikipedia.org/wiki/Write-ahead_logging
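The idea in the quote above can be illustrated with a minimal sketch. This is a toy model in Python, not PostgreSQL's actual implementation: every change is appended to a log before it is applied, and after a simulated crash the data is rebuilt by replaying that log. The names `ToyStore`, `put`, `crash` and `recover` are hypothetical, and only redo information is kept.

```python
class ToyStore:
    """Toy key-value store with a redo-only write-ahead log."""

    def __init__(self):
        self.wal = []    # stands in for the durable log on disk
        self.data = {}   # stands in for pages that may not be flushed yet

    def put(self, key, value):
        # Write-ahead rule: record the change in the log first...
        self.wal.append((key, value))
        # ...and only then apply it to the data.
        self.data[key] = value

    def crash(self):
        # A crash loses whatever was not durable; the log survives.
        self.data = {}

    def recover(self):
        # Redo: replay the log in order to rebuild the lost state.
        for key, value in self.wal:
            self.data[key] = value


store = ToyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)
store.crash()
store.recover()
print(store.data)
```

After recovery the store holds the last value written for each key, because the log preserves the order of the changes.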
28. OK, and what does “Unlogged” mean?
● Unlogged means that the data written to
these tables is not written to the WAL.
● This makes writes to them really, really fast
compared to writes to regular tables.
29. So I’ll use it for all of my tables...
● However, you won’t want to do that, because:
● They are not crash-safe (an unlogged
table is automatically truncated after a crash
or unclean shutdown)
● And they are not replicated using SR (Streaming Replication)
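The trade-off can be sketched with a toy model. This is a simplification in Python, not how PostgreSQL is implemented: writes to a logged table also append a record to a log, writes to an unlogged table skip it, and after a simulated crash only what is in the log can be restored. `ToyTable`, `ToyServer` and their methods are hypothetical names.

```python
class ToyTable:
    def __init__(self, name, logged=True):
        self.name = name
        self.logged = logged
        self.rows = []


class ToyServer:
    def __init__(self):
        self.tables = {}   # table definitions survive a crash in this model
        self.wal = []      # (table name, row) records; survives a crash

    def create_table(self, name, logged=True):
        self.tables[name] = ToyTable(name, logged)

    def insert(self, name, row):
        table = self.tables[name]
        if table.logged:
            # Regular tables pay for a log record on every write.
            self.wal.append((name, row))
        # Unlogged tables skip the log, which is why their writes are cheaper.
        table.rows.append(row)

    def crash_and_recover(self):
        # Table contents are lost in the crash...
        for table in self.tables.values():
            table.rows = []
        # ...and only logged changes can be replayed.
        for name, row in self.wal:
            self.tables[name].rows.append(row)


server = ToyServer()
server.create_table("orders", logged=True)
server.create_table("sessions", logged=False)
server.insert("orders", {"id": 1})
server.insert("sessions", {"user": "ana"})
server.crash_and_recover()
print(server.tables["orders"].rows)    # the logged row is back
print(server.tables["sessions"].rows)  # the unlogged table comes back empty
```

The same model hints at the replication point: streaming replication ships WAL records, so rows that never reach the log have nothing to be shipped.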
30. But there are some cool use cases
● Speeding up ETL jobs
● Cache
● Session State
● Queues?!
● ...
31. And now we have the power to ...
● change from UNLOGGED to LOGGED
○ ALTER TABLE name SET LOGGED;
● change from LOGGED to UNLOGGED
○ ALTER TABLE name SET UNLOGGED;
32. Already committed
commit: f41872d0c1239d36ab03393c39ec0b70e9ee2a3c
author: Alvaro Herrera <alvherre@alvh.no-ip.org>
date: Fri, 22 Aug 2014 14:27:00 -0400
Implement ALTER TABLE .. SET LOGGED / UNLOGGED
This enables changing permanent (logged) tables to unlogged and
vice-versa.
(Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that
hopefully makes more sense than the original.)
Author: Fabrízio de Royes Mello
Reviewed by: Christoph Berg, Andres Freund, Thom Brown
Some tweaking by Álvaro Herrera
33. How it works
1. Acquire AccessExclusiveLock
2. Check dependencies
a. Cannot change temp tables
b. Check Foreign Keys
3. Change indexes “relpersistence”
4. Create new heap/toast with new relpersistence
5. Rewrite heap/toast
6. Rewrite indexes
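The steps above can be walked through with a toy model. This is a Python sketch under stated assumptions, not the committed C code: `Rel`, `set_persistence` and the dictionary-based catalog are hypothetical, locking is not modelled, and the foreign-key rule assumed here is that a permanent (logged) table may not reference an unlogged one.

```python
from dataclasses import dataclass, field


@dataclass
class Rel:
    name: str
    persistence: str                                 # "permanent", "unlogged" or "temp"
    rows: list = field(default_factory=list)
    indexes: dict = field(default_factory=dict)      # index name -> persistence
    references: list = field(default_factory=list)   # tables this one has FKs to


def set_persistence(catalog, name, new_persistence):
    """Toy version of the slide's steps; `catalog` maps table names to Rel."""
    rel = catalog[name]
    # Step 1 (AccessExclusiveLock) is skipped: this sketch is single-threaded.

    # Step 2a: temp tables cannot be changed.
    if rel.persistence == "temp":
        raise ValueError(f"cannot change temp table {name}")
    if rel.persistence == new_persistence:
        return rel  # assumption: nothing to rewrite in this case

    # Step 2b: foreign keys (assumed rule: permanent must not reference unlogged).
    if new_persistence == "permanent":
        for other in rel.references:
            if other != name and catalog[other].persistence == "unlogged":
                raise ValueError(f"{name} references unlogged table {other}")
    else:
        for other in catalog.values():
            if (other.name != name and name in other.references
                    and other.persistence == "permanent"):
                raise ValueError(f"{name} is referenced by logged table {other.name}")

    # Step 3: the indexes take the new persistence.
    new_indexes = {index: new_persistence for index in rel.indexes}
    # Steps 4 and 5: create a new heap with the new persistence and copy the rows.
    new_rel = Rel(name, new_persistence, list(rel.rows), new_indexes,
                  list(rel.references))
    # Step 6: index rebuilds are represented only by the new index entries.
    catalog[name] = new_rel
    return new_rel


catalog = {
    "customers": Rel("customers", "unlogged", rows=[(1, "Ana")],
                     indexes={"customers_pkey": "unlogged"}),
    "orders": Rel("orders", "unlogged", references=["customers"]),
}
try:
    set_persistence(catalog, "orders", "permanent")
except ValueError as err:
    print("refused:", err)
set_persistence(catalog, "customers", "permanent")
set_persistence(catalog, "orders", "permanent")
print(catalog["orders"].persistence, catalog["customers"].indexes)
```

In this model the referenced table has to become logged first; only then can the table holding the foreign key follow.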
34. New patch with refactoring
1. Acquire AccessExclusiveLock
2. Check dependencies
a. Cannot change temp tables
b. Check Foreign Keys
3. Create new heap/toast with new relpersistence
(pass down relpersistence to reindex_index)
4. Rewrite heap/toast
5. Rewrite indexes
36. Future work
● Don’t rewrite datafiles when wal_level =
minimal
● Unlogged Indexes on Regular Tables
● Unlogged Materialized Views (the feature was
reverted by Tom Lane because of its bad design)
38. Special thanks to
● Stephen Frost (mentor)
● Josh Berkus and Thom Brown (organizers)
● Christoph Berg (patch review)
● Álvaro Herrera (patch review and commit)
● Maristela Kohlrausch de Andrade (my
English teacher)