These are the slides Ilya used for his presentation at pgDay Asia 2016, "Lessons PostgreSQL learned from commercial
databases, and didn’t". The talk walks through some of the things PostgreSQL has done really well and some things it can still learn from other databases.
Lessons PostgreSQL learned from commercial databases, and didn’t
1. Lessons PostgreSQL learned from commercial
databases, and didn’t
Ilya Kosmodemiansky
ik@postgresql-consulting.com
2. Preamble
PostgreSQL is a great database!
• (You always need to say so if you are going to say PostgreSQL
lags behind commercial databases or has some limitations)
3. Preamble
PostgreSQL is a great database!
• The only open source database technology, massively used as
an alternative to commercial RDBMSs
• Moreover, 10 years ago it was seriously disputed (by some
people) whether PostgreSQL could outperform MySQL
• Moreover, 5 years ago any Oracle to Postgres migration
case study meant you would be 100% accepted to any
PostgreSQL conference
• Only PostgreSQL has made such impressive progress!
• Well, Linux did, but Linux is not a database system
4. What made that possible?
• Good initial architecture
• Well organized community work
• SQL close to standard
• Procedural languages
• Lots of things - you probably know those things if you are here
5. Did PostgreSQL learn something?
(from commercial databases)
• Well, not directly
• At least, this is the worst possible way to start a discussion on
[HACKERS]: “...we need this feature because Oracle has it...”
• Most likely people came from Oracle, did not find some of
their beloved tools and started to implement substitutes
6. Anecdotally
• The prominent Soviet aircraft designer Tupolev, being unofficially
accused of plagiarizing some of his models, used to say that all
beautiful aircraft look similar and that is why they can fly
• Tupolev’s ill-wishers believed that he definitely plagiarized that
formula as well, from some other aircraft designer...
• For aviation engineers, it was always obvious that internally
the airplanes were totally different
• Anyway, databases _are_ like aircraft: the common theory
beneath makes them look similar
7. That common theory was
Transactions
• If your data is important, use a database which supports ACID
transactions
• In PostgreSQL: MVCC implementation since version 6.5
(1999), WAL since 7.1 (2001)
• Adopting MVCC instead of a pure-locking scheduler was wise
(DB2 and MS SQL Server proved that over time)
• That made it possible to implement a reliable backup/recovery
mechanism and replication for high availability
• And that was actually a pivotal point, which started
PostgreSQL adoption in enterprise-level solutions
• Ironically, the current MVCC implementation has itself become
something of a limitation for Postgres
8. OK, hold on
What can actually stop you from choosing Postgres instead
of Oracle or DB2?
10. OK, hold on
What can actually stop you from choosing Postgres instead
of Oracle or DB2?
• Write performance - Yes, absolutely
• Database size - Yes, definitely
• Lack of diagnostics tools - Yes
• We need to run PostgreSQL in Microsoft environment - Yes
• Lack of qualified people - Maybe
• Lack of a built-in analog of RAC/PureScale - Yes and No
• We are talking about heavy workloads and comparing with
enterprise licenses
13. PostgreSQL uses buffered writes
• Effectively, one PostgreSQL process writes pages one by one to
the kernel buffer, and that buffer is flushed to disk later
• Besides double caching, this is slow and does not allow the use of
some cool features (O_ATOMIC)
• Oracle can bypass the kernel buffer using direct I/O. Moreover,
both Oracle’s database writer and log writer can swap threads
to write asynchronously
• That is a serious limitation for reaching high TPS figures on a
single instance
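To make the buffered-write path concrete, here is a minimal Python sketch (not PostgreSQL code). It writes pages one at a time with write(), which only copies them into the kernel page cache, and then calls fsync() to force them to storage. The 8 kB page size matches PostgreSQL's default block size; the helper name `write_pages_buffered` and the file name are made up for illustration.

```python
import os
import tempfile

PAGE_SIZE = 8192  # PostgreSQL's default block size


def write_pages_buffered(path, pages):
    """Write pages one by one through the kernel page cache, then flush.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    written = 0
    try:
        for page in pages:
            # write() returns once the data is copied into the kernel buffer;
            # nothing is guaranteed to be on disk yet.
            written += os.write(fd, page)
        # fsync() blocks until the kernel has flushed the file to storage.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def demo():
    """Write four pages to a temporary file and report the sizes."""
    pages = [bytes([i]) * PAGE_SIZE for i in range(4)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "relation.dat")
        written = write_pages_buffered(path, pages)
        return written, os.path.getsize(path)
```

With direct I/O (O_DIRECT on Linux) the copy into the kernel buffer is skipped, at the price of aligned buffers and the database doing all of its own caching.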
14. Huge database
• Same problem - double caching
• Storage overhead
• Backup performance and recovery time
• Autovacuum performance becomes an issue
15. Backup performance
• No built-in parallelism
• Level 0 plus PITR only
• Keeping undo information right in datafiles can be a problem
for incremental backups
16. Current MVCC implementation is a limitation itself
Nothing new, I only want to mention that it can be the largest
challenge for PostgreSQL in the next 20 years
• It solves only one problem, ”snapshot too old” (and modern
Oracle solves it better)
• Undo information spread throughout the datafiles brings a lot of
problems
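The point about undo information living inside the datafiles can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not PostgreSQL's actual implementation: every transaction is assumed to commit instantly, and the names `ToyHeap`, `upsert`, `read` and `vacuum` are invented for the example. Only `xmin` and `xmax` borrow PostgreSQL's terminology for the creating and superseding transaction IDs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    key: str
    value: int
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that superseded it, if any


class ToyHeap:
    """Toy model only: every transaction is assumed committed instantly."""

    def __init__(self):
        self.versions: List[TupleVersion] = []
        self.next_xid = 1

    def upsert(self, key, value):
        xid = self.next_xid
        self.next_xid += 1
        for v in self.versions:
            if v.key == key and v.xmax is None:
                v.xmax = xid  # the old version stays in the heap as "undo"
        self.versions.append(TupleVersion(key, value, xmin=xid))
        return xid

    def read(self, key, snapshot_xid):
        """Return the value visible to a snapshot that sees xids <= snapshot_xid."""
        for v in self.versions:
            if v.key != key:
                continue
            created = v.xmin <= snapshot_xid
            superseded = v.xmax is not None and v.xmax <= snapshot_xid
            if created and not superseded:
                return v.value
        return None

    def vacuum(self, oldest_snapshot_xid):
        """Drop versions that no remaining snapshot can see; return how many."""
        before = len(self.versions)
        self.versions = [
            v for v in self.versions
            if v.xmax is None or v.xmax > oldest_snapshot_xid
        ]
        return before - len(self.versions)
```

Three updates of one key leave three versions in the same heap. Old snapshots can still read the older values, but the space is reclaimed only when a vacuum pass removes them, which is why heavy update workloads cause storage overhead and make incremental backups harder.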
17. Lack of diagnostic tools
• OK, there are plenty of them
• Tools that require kernel developer experience, such as perf,
are not proper tools for a DBA
• Full-time PostgreSQL developers are not DBAs. We need to
explain to them what we need and why
• Adding wait information to pg_stat_activity is a good
example of such a joint effort
• And a good lesson learned from Oracle. Not the last, I hope
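As a rough illustration of why wait information matters to a DBA, the Python sketch below builds a wait profile from periodic samples of pg_stat_activity. The `wait_event_type` and `wait_event` columns exist in PostgreSQL 9.6 and later; the function name `wait_profile` and the idea of passing rows in as plain tuples are assumptions made for the example, and fetching the rows from a live server is left out.

```python
from collections import Counter

# Query a sampler would run periodically (columns exist in PostgreSQL 9.6+).
SAMPLE_QUERY = "SELECT wait_event_type, wait_event FROM pg_stat_activity"


def wait_profile(samples):
    """Count how often each (wait_event_type, wait_event) pair was sampled.

    Rows with no wait event (NULL in SQL, None here) are counted as
    "not waiting". Returns (label, count) pairs, most frequent first.
    """
    profile = Counter()
    for wait_event_type, wait_event in samples:
        if wait_event is None:
            profile["not waiting"] += 1
        else:
            profile[f"{wait_event_type}:{wait_event}"] += 1
    return profile.most_common()
```

Fed with rows collected every second or so, the top entries show where sessions spend their time, without reaching for perf.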
18. PostgreSQL performance on Windows
• Well, there is no such thing. By the way, Oracle performs well there
• At the same time, a lot of PostgreSQL runs on Windows
• Lack of enthusiasts for a proper port
• At the same time, we support various BSD and even Tru64
UNIX!
• Welcome to the world of open source!
19. Documentation
• Relatively small, but efficient, not over-engineered, covers all
topics well - at first glance
• No graphic diagrams. It seems much easier to decide on a
graphical format than to rework MVCC!
• No guidebooks. An application developer must read half of
the documentation to install Postgres in a test environment!
• OK, there is the PostgreSQL wiki, but it is not under release
control
20. In spite of all this
PostgreSQL is a great database!
• It is still relatively simple to start with and to live with
• It is safe. We have no listener, but we also have no thick books
about securing a listener from external attack.
• It learns fast
• Maybe it will change the global database market the way Linux
changed the global operating system market