Implementing Parallelism in PostgreSQL
Robert Haas | PGCon 2014
© 2013 EDB All rights reserved.

Parallelism: Why? (1)
• Between 1996 and 2004, single-threaded CPU performance on SPECint
  and SPECfp benchmarks increased by >50% per year. Between 2004 and
  2012, it increased by ~21% per year.
  – http://preshing.com/20120208/a-look-back-at-single-threaded-cpu-performance/
• Single-threaded 7-zip performance was only 39% faster on 2 x Intel
  Xeon L5640 (March 16, 2010; 2.7 GHz, 12 MB cache) than on 4 x AMD
  Opteron 880 (September 26, 2005; 2.4 GHz, 2 MB cache). That's only
  7.7% per year.
  – http://www.anandtech.com/show/6825/inside-anandtech-2013-cpu-performance
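The per-year figure for the 7-zip comparison can be checked by treating the 39% gain as compound growth between the two release dates quoted on the slide. This is a minimal sketch; the helper name `annualized_rate` and the 365.25-day year are assumptions made for illustration.

```python
from datetime import date


def annualized_rate(total_gain: float, start: date, end: date) -> float:
    """Compound annual growth rate for a total fractional gain between two dates."""
    years = (end - start).days / 365.25  # assumption: average year length
    return (1.0 + total_gain) ** (1.0 / years) - 1.0


# Figures from the slide: 39% faster, Opteron 880 (2005-09-26) vs. Xeon L5640 (2010-03-16).
rate = annualized_rate(0.39, date(2005, 9, 26), date(2010, 3, 16))
print(f"{rate:.2%} per year")
```

Under these assumptions the result lands within a tenth of a percentage point of the 7.7% quoted on the slide.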

Parallelism: Why? (2)
• Dell Configuration Tool (as of 2014-05-15):
  – 2x Intel® Xeon® E7-4890 v2 Processor 2.8GHz, 37.5M Cache,
    8.0 GT/s QPI, Turbo, 15 Core, 155W [add $7,735.68]
  – 2x Intel® Xeon® E7-8893 v2 Processor 3.4GHz, 37.5M Cache,
    8.0 GT/s QPI, Turbo, 6 Core, 155W [add $8,410.30]

Parallel Query: Inter-Node
Hash Join
  Join Cond: foo.x = bar.x
  -> Seq Scan on foo
       Filter: something_complicated
  -> Hash
       -> Seq Scan on bar
• One backend could run the Seq Scan and apply the filter condition;
  it could then stream the results to another backend to perform the
  Hash Join.
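The inter-node idea can be sketched as a producer that scans and filters, feeding a consumer that builds the hash table and probes it. This is only an illustration: Python threads and `queue.Queue` stand in for separate backends and a shared-memory queue, and the tables, the filter, and all function names are made up for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the tuple stream


def scan_and_filter(rows, predicate, out_q):
    """Producer: scan `rows` in order, apply the filter, stream the survivors."""
    for row in rows:
        if predicate(row):
            out_q.put(row)
    out_q.put(DONE)


def hash_join(in_q, inner_rows, key):
    """Consumer: build a hash table on the inner side, probe it with streamed rows."""
    table = {}
    for inner in inner_rows:
        table.setdefault(inner[key], []).append(inner)
    joined = []
    while (outer := in_q.get()) is not DONE:
        for inner in table.get(outer[key], []):
            joined.append((outer, inner))
    return joined


# Hypothetical stand-ins for the relations foo and bar.
foo = [{"x": i, "payload": i * i} for i in range(10)]
bar = [{"x": i, "tag": f"bar{i}"} for i in (2, 3, 5, 7, 11)]

q = queue.Queue(maxsize=4)  # bounded, so the producer blocks like a full pipe
producer = threading.Thread(
    target=scan_and_filter,
    args=(foo, lambda r: r["payload"] % 2 == 1, q),  # stand-in for something_complicated
)
producer.start()
result = hash_join(q, bar, "x")
producer.join()
```

The bounded queue means the scanning side stalls whenever the joining side falls behind, which is the behaviour a real tuple stream between backends would also need.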

Parallel Query: Intra-Node
Hash Join
  Join Cond: foo.x = bar.x
  -> Seq Scan on foo
       Filter: something_complicated
  -> Hash
       -> Seq Scan on bar
• Multiple backends could cooperate to perform a single Seq Scan, or
  a single Hash Join.
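One way several backends could cooperate on a single sequential scan is to claim block ranges from a shared counter until the relation is exhausted. The sketch below assumes that scheme; threads stand in for worker processes, and `BlockAllocator`, the chunk size, and the toy relation are hypothetical.

```python
import threading


class BlockAllocator:
    """Hands out consecutive block ranges of a relation to cooperating workers."""

    def __init__(self, nblocks, chunk=4):
        self.nblocks = nblocks
        self.chunk = chunk
        self.next_block = 0
        self.lock = threading.Lock()

    def next_range(self):
        """Return the next unclaimed range of blocks, or None when the scan is done."""
        with self.lock:
            if self.next_block >= self.nblocks:
                return None
            start = self.next_block
            end = min(start + self.chunk, self.nblocks)
            self.next_block = end
            return range(start, end)


def worker(alloc, relation, predicate, results):
    """Scan whatever block ranges this worker manages to claim."""
    local = []
    while (blocks := alloc.next_range()) is not None:
        for b in blocks:
            local.extend(t for t in relation[b] if predicate(t))
    results.append(local)  # list.append is atomic in CPython


# Toy relation: 25 blocks of 10 integer "tuples" each (values 0..249).
relation = [[b * 10 + i for i in range(10)] for b in range(25)]
alloc = BlockAllocator(len(relation))
results = []
workers = [
    threading.Thread(target=worker, args=(alloc, relation, lambda t: t % 7 == 0, results))
    for _ in range(3)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
matches = sorted(t for part in results for t in part)
```

Because each range is claimed exactly once, every block is scanned by exactly one worker regardless of how the workers interleave.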

Parallel Maintenance / DDL Commands
• CREATE INDEX
  – Parallel Heap Scan
  – Parallel Sort
• VACUUM
  – Parallel Heap Scan
  – Worker Per Index (suggestion from Andres and Heikki)

Architectural Overview (1)
• Processes, Not Threads
  – None of our fundamental subsystems are thread-safe (e.g.
    palloc/pfree, ereport, syscache, relcache, buffer manager).
  – Making them thread-safe would add synchronization overhead even
    in the single-threaded case, and would also introduce bugs.
• Started By Postmaster, Not Created via fork()
  – Can't fork() on Windows, where many of our users are.
  – Currently, all backends are direct children of the postmaster;
    seems preferable to keep it that way.

Architectural Overview (2)
• Shared Memory, Not Pipes or Files
  – Files would cause more system calls and more I/O.
  – Pipes are a good paradigm, but shared memory is more flexible.
  – We can use shared memory to emulate a pipe if we need to; see
    shm_mq. (This also dodges platform dependencies.)
• Dynamic Shared Memory, Not Main Segment
  – For an application such as parallel sort, we might need a LOT of
    memory, like a terabyte. We can't pre-reserve that!
• Dynamic Shared Memory Could Be At a Different Address in Every
  Process
  – There are no good, general techniques for mapping it at the same
    address everywhere.
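The two points about emulating a pipe and about segments landing at different addresses can be illustrated together: a byte ring buffer whose entire state lives inside one buffer and is addressed by offset, never by pointer. This is a simplified sketch and not the layout of shm_mq; a `bytearray` stands in for a dynamic shared memory segment, the `RingQueue` class is hypothetical, and no locking, memory barriers, or wakeups are modelled.

```python
import struct

HEADER = struct.Struct("<QQ")  # bytes_written, bytes_read (monotonic counters)


class RingQueue:
    """Single-producer, single-consumer byte pipe stored inside one buffer.

    All state lives in `seg` and is addressed by offset from its start,
    so the layout stays valid wherever the segment happens to be mapped.
    """

    def __init__(self, seg, init=False):
        self.seg = seg
        self.size = len(seg) - HEADER.size
        if init:
            HEADER.pack_into(seg, 0, 0, 0)

    def send(self, data: bytes) -> bool:
        written, read = HEADER.unpack_from(self.seg, 0)
        if len(data) > self.size - (written - read):
            return False  # not enough free space; a real sender would wait and retry
        for i, byte in enumerate(data):
            self.seg[HEADER.size + (written + i) % self.size] = byte
        HEADER.pack_into(self.seg, 0, written + len(data), read)
        return True

    def receive(self, n: int) -> bytes:
        written, read = HEADER.unpack_from(self.seg, 0)
        n = min(n, written - read)
        out = bytes(self.seg[HEADER.size + (read + i) % self.size] for i in range(n))
        HEADER.pack_into(self.seg, 0, written, read + n)
        return out


segment = bytearray(HEADER.size + 8)    # stand-in for a DSM segment with 8 data bytes
sender = RingQueue(segment, init=True)  # one process creates the queue
receiver = RingQueue(segment)           # another attaches to the same bytes

ok1 = sender.send(b"hello")
ok2 = sender.send(b"world")     # refused: only 3 bytes are free
first = receiver.receive(5)
ok3 = sender.send(b"world")     # accepted; the data wraps around the end
second = receiver.receive(16)
```

The receiver never shares a Python object with the sender beyond the raw bytes, which is the property that matters when each process maps the segment at its own address.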

What Do We Need To Build?
• Basic Facilities (done in 9.4)
• Plumbing (some work done/in progress)
• Parallel Environment (a little unpublished work done)
• Parallel Execution (some study/thought)
• Parallel Planning (no idea yet)

Basic Facilities
• Dynamic Background Workers (done in 9.4)
• Dynamic Shared Memory (done in 9.4)

Plumbing
• DSM Table of Contents (done in 9.4)
  – I just mapped this dynamic shared memory segment; how do I figure
    out what it contains?
• Message Queueing (done in 9.4)
  – How does a background worker send tuples, errors, notices, etc.
    to a user backend?
• Error Propagation (working on it)
  – Common infrastructure to make using message queueing easy.
• Shared Memory Allocator (early draft posted)
• Shared Hash Table (someday)
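The table-of-contents question ("I just mapped this segment; how do I figure out what it contains?") can be answered by a small directory at the start of the segment that maps agreed-upon keys to offsets. The sketch below assumes that design; it is not PostgreSQL's implementation, and the function names, key values, and fixed entry limit are all hypothetical.

```python
import struct

TOC_HEADER = struct.Struct("<II")  # number of entries, offset of the next free byte
TOC_ENTRY = struct.Struct("<QI")   # 64-bit key, offset of the chunk within the segment

MAX_ENTRIES = 8  # assumption: a small fixed-size directory
DATA_START = TOC_HEADER.size + MAX_ENTRIES * TOC_ENTRY.size


def create_toc(seg):
    """Initialise an empty directory at the start of the segment."""
    TOC_HEADER.pack_into(seg, 0, 0, DATA_START)


def insert_chunk(seg, key, payload: bytes) -> int:
    """Copy `payload` into the segment and record its offset under `key`."""
    nentries, free = TOC_HEADER.unpack_from(seg, 0)
    if nentries >= MAX_ENTRIES or free + len(payload) > len(seg):
        raise MemoryError("segment full")
    seg[free:free + len(payload)] = payload
    TOC_ENTRY.pack_into(seg, TOC_HEADER.size + nentries * TOC_ENTRY.size, key, free)
    TOC_HEADER.pack_into(seg, 0, nentries + 1, free + len(payload))
    return free


def lookup_chunk(seg, key):
    """Return the offset stored under `key`, or None if the key is absent."""
    nentries, _ = TOC_HEADER.unpack_from(seg, 0)
    for i in range(nentries):
        k, off = TOC_ENTRY.unpack_from(seg, TOC_HEADER.size + i * TOC_ENTRY.size)
        if k == key:
            return off
    return None


KEY_QUERY_TEXT, KEY_WORKER_COUNT = 1, 2  # hypothetical keys agreed on in advance

segment = bytearray(256)                 # stand-in for a freshly created DSM segment
create_toc(segment)
insert_chunk(segment, KEY_QUERY_TEXT, b"SELECT 1")
insert_chunk(segment, KEY_WORKER_COUNT, bytes([4]))

# A worker that has only the raw bytes can still locate its inputs by key.
off = lookup_chunk(segment, KEY_WORKER_COUNT)
worker_count = segment[off]
```

Storing offsets rather than addresses keeps the directory usable in every process that attaches to the segment.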

Parallel Environment
• Make the Background Worker Look Enough Like a Regular User Backend
  To Do Useful Work
  – Copy Relevant State (e.g. User, Database, Snapshot)
• Useful Work Doesn't Mean Everything
  – Some operations seem fundamentally unsafe in a parallel context
    (e.g. calling a user-defined function that sets a GUC).
  – Some operations could theoretically be made safe, but we might
    not bother (e.g. setseed() + random()).
  – Even if we share lots of state, arbitrary user-supplied code can
    never be assumed safe; unsafe functions must be labeled.
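Labeling unsafe functions could work roughly as follows: each function carries a safety marker, and a call made from inside a worker is refused if the marker says unsafe. The decorator, the exception, the per-process flag, and the `settings` dictionary below are all hypothetical; the slides do not specify a mechanism.

```python
import functools

_in_parallel_worker = False  # hypothetical per-process flag


class ParallelUnsafeError(RuntimeError):
    """Raised when a labeled-unsafe function is called inside a worker."""


def parallel_unsafe(func):
    """Label `func` as unsafe; calling it inside a worker raises an error."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if _in_parallel_worker:
            raise ParallelUnsafeError(f"{func.__name__} cannot run in a parallel worker")
        return func(*args, **kwargs)
    wrapper.parallel_safe = False
    return wrapper


settings = {"work_mem": "4MB"}  # stand-in for a backend's GUC state


@parallel_unsafe
def set_guc(name, value):
    """Stand-in for a user-defined function that changes a GUC."""
    settings[name] = value


def run_in_worker(func, *args):
    """Simulate executing `func` inside a parallel worker."""
    global _in_parallel_worker
    _in_parallel_worker = True
    try:
        return func(*args)
    finally:
        _in_parallel_worker = False
```

Calling `set_guc("work_mem", "64MB")` directly succeeds, while `run_in_worker(set_guc, "work_mem", "1GB")` raises `ParallelUnsafeError` and leaves the setting untouched.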

Parallel Environment: What To Copy
• User ID and Database
• GUCs
• Transaction State
• Current and Active Snapshot
• Combo CID Hash

Parallel Environment: What To Prohibit
• Sequence Operations
• Generation of Invalidation Messages
• Cursor Operations
• Large Object Manipulation
• LISTEN/NOTIFY
• Access to Temporary Buffers
• Prepared Statements

Parallel Environment: Lock Management
• Background Workers Can't Rely on User Backend To Hold Necessary
  Locks
  – The user backend might die or be killed before the background
    worker terminates.
• If Background Workers Re-Lock The Same Relations, Parallel Query
  Might Self-Deadlock
  – User backend locks X; another process queues for a conflicting
    lock on X; background worker tries to re-lock X.
• Probably Need a Concept of Locking Groups Inside the Lock Manager
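The self-deadlock scenario can be replayed in a toy lock manager with a fair FIFO wait queue. The sketch assumes two lock modes and one possible grouping rule (members of a lock group never block one another, and a member may skip the queue when its group already holds the lock); both are illustrative assumptions rather than a design taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Lock:
    holders: list = field(default_factory=list)  # (pid, mode) pairs currently granted
    waiters: list = field(default_factory=list)  # (pid, mode) pairs queued, FIFO


def conflicts(a, b):
    """Two requests conflict if either one is exclusive."""
    return a == "exclusive" or b == "exclusive"


def acquire(lock, pid, mode, group_of=None):
    """Return True if granted at once, False if the request has to queue."""
    group_of = group_of or {}
    my_group = group_of.get(pid, pid)  # a process without a group is its own group
    same_group_holds = any(group_of.get(h, h) == my_group for h, _ in lock.holders)
    blocked_by_holder = any(
        conflicts(mode, m) and group_of.get(h, h) != my_group
        for h, m in lock.holders
    )
    # Fair queueing: a newcomer waits behind any earlier conflicting waiter,
    # unless its own lock group already holds the lock.
    blocked_by_waiter = not same_group_holds and any(
        conflicts(mode, m) for _, m in lock.waiters
    )
    if blocked_by_holder or blocked_by_waiter:
        lock.waiters.append((pid, mode))
        return False
    lock.holders.append((pid, mode))
    return True


def scenario(group_of):
    """Replay the slide: leader locks X, another process queues, worker re-locks X."""
    x = Lock()
    assert acquire(x, "leader", "shared", group_of)
    assert not acquire(x, "other", "exclusive", group_of)  # queues behind the leader
    return acquire(x, "worker", "shared", group_of)


without_groups = scenario({})                                # worker queues: deadlock
with_groups = scenario({"leader": "g1", "worker": "g1"})     # worker is granted
```

Without groups the worker waits behind the exclusive request, the exclusive request waits for the leader, and the leader waits for the worker to finish, so nothing can progress.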

Parallel Execution
• This is the "easy" part.
• Parallel sorting algorithms are described in the literature and
  well-understood.
• For parallel sequential scan, workers grab blocks or block ranges
  in alternation.
• Amdahl's Law: If α is the fraction of running time a program spends
  executing serially, the maximum speedup from parallelism is 1/α.
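Amdahl's Law in its usual finite form gives the speedup for n workers as 1 / (α + (1 − α) / n), which approaches the 1/α bound as n grows. A short sketch, with the 10% serial fraction chosen purely as an example:

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup with `workers` processes when `serial_fraction` of the runtime is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


alpha = 0.10  # example value: 10% of the running time is serial
speedups = {n: amdahl_speedup(alpha, n) for n in (1, 2, 4, 8, 16, 1024)}
limit = 1.0 / alpha  # the bound quoted on the slide

for n, s in speedups.items():
    print(f"{n:>5} workers: {s:5.2f}x (limit {limit:.0f}x)")
```

Even with a serial fraction this small, adding workers past a few dozen buys very little, since the speedup can never exceed 1/α.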

Parallel Query Planning
• This is probably hard.
• Right now, we do costing based on estimating the page access costs
  (CPU and I/O) and tuple processing costs.
• For parallelism, need to consider worker startup costs and IPC
  costs.
• A plan that's a little cheaper for me might be much more expensive
  in total.
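The tension between "cheaper for me" and "more expensive in total" shows up even in a toy cost model that tracks two numbers per plan: the cost on the user backend's critical path and the cost summed over all processes. Every constant and formula below is a made-up assumption for illustration; none of it reflects PostgreSQL's actual cost parameters.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    elapsed: float  # cost on the critical path, as seen by the user backend
    total: float    # cost summed over every participating process


def serial_scan(pages, tuples, page_cost=1.0, tuple_cost=0.01):
    """Cost of scanning everything in a single backend."""
    cost = pages * page_cost + tuples * tuple_cost
    return PlanCost(elapsed=cost, total=cost)


def parallel_scan(pages, tuples, workers, out_tuples,
                  page_cost=1.0, tuple_cost=0.01,
                  worker_startup=1000.0, ipc_tuple_cost=0.1):
    """Cost of splitting the scan evenly across `workers` (hypothetical model).

    Assumes workers start concurrently, share the scan work equally, and
    ship `out_tuples` result tuples back to the user backend.
    """
    scan = pages * page_cost + tuples * tuple_cost
    ipc = out_tuples * ipc_tuple_cost
    elapsed = worker_startup + scan / workers + ipc
    total = workers * worker_startup + scan + ipc
    return PlanCost(elapsed=elapsed, total=total)


serial = serial_scan(pages=10_000, tuples=1_000_000)
parallel = parallel_scan(pages=10_000, tuples=1_000_000, workers=4, out_tuples=130_000)
```

With these invented numbers the parallel plan finishes slightly sooner for the user backend but consumes far more resources overall, which is exactly the trade-off a parallel-aware planner would have to weigh.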

Thanks.
• Any questions?