Robert Haas
Why does my query need a plan? Sequential scan vs. index scan. Join strategies. Join reordering. Joins you can't reorder. Join removal. Aggregates and DISTINCT. Using EXPLAIN. Row count and cost estimation. Things the query planner doesn't understand. Other ways the planner can fail. Parameters you can tune. Things that are nearly always slow. Redesigning your schema. Upcoming features and future work.
39. Each join strategy takes an “outer” relation and an “inner” relation and produces a result relation.
41. Nested loop: cost is roughly proportional to the product of the table sizes, which is bad if BOTH are large.
42. Nested Loop Example #1
    SELECT * FROM foo, bar WHERE foo.x = bar.x

    Nested Loop
      Join Filter: (foo.x = bar.x)
      -> Seq Scan on bar
      -> Materialize
           -> Seq Scan on foo

    This might be very slow!
43. Nested Loop Example #2
    SELECT * FROM foo, bar WHERE foo.x = bar.x

    Nested Loop
      -> Seq Scan on foo
      -> Index Scan using bar_pkey on bar
           Index Cond: (bar.x = foo.x)

    Nested loop with inner index-scan! Much better... though probably still not the best plan.
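The difference between the two nested loop examples can be sketched in a few lines of Python. This is only an illustration, not how PostgreSQL implements either plan: the tables are lists of dicts, the function names are made up, and a plain dict stands in for the bar_pkey index (a real B-tree lookup costs roughly logarithmic time, not constant time).

```python
def nested_loop_join(outer, inner, key):
    """Naive nested loop: compare every outer row with every inner row."""
    result, comparisons = [], 0
    for o in outer:
        for i in inner:
            comparisons += 1
            if o[key] == i[key]:
                result.append((o, i))
    return result, comparisons


def nested_loop_index_join(outer, inner_index, key):
    """Nested loop whose inner side is an index lookup instead of a full scan."""
    result, probes = [], 0
    for o in outer:
        probes += 1
        for i in inner_index.get(o[key], []):
            result.append((o, i))
    return result, probes


# Hypothetical data: 100 rows in each table, 50 of which match.
foo = [{"x": n} for n in range(100)]
bar = [{"x": n} for n in range(0, 200, 2)]

# Stand-in for the index on bar.x (an assumption for illustration only).
bar_index = {}
for row in bar:
    bar_index.setdefault(row["x"], []).append(row)

naive_rows, comparisons = nested_loop_join(bar, foo, "x")
indexed_rows, probes = nested_loop_index_join(foo, bar_index, "x")
```

With these inputs the naive version performs 100 x 100 = 10,000 comparisons, while the indexed version performs 100 probes, one per outer row; both return the same 50 matches.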
45. Merge join: put both input relations into sorted order (using a sort or an index scan) and scan through the two in parallel, matching up equal values.
47. Merge Join Example
    SELECT * FROM foo, bar WHERE foo.x = bar.x

    Merge Join
      Merge Cond: (foo.x = bar.x)
      -> Sort
           Sort Key: foo.x
           -> Seq Scan on foo
      -> Materialize
           -> Sort
                Sort Key: bar.x
                -> Seq Scan on bar
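The "sort both sides, then walk them in parallel" idea can be sketched as follows. This is a simplified illustration with hypothetical names and in-memory lists, not the executor's actual code; it sorts both inputs itself, where a real plan might read from an index that is already in order.

```python
def merge_join(left, right, key):
    """Join two lists of dicts on `key` by sorting and merging."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left side is behind, advance it
        elif lk > rk:
            j += 1          # right side is behind, advance it
        else:
            # Find the run of right-hand rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left-hand row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append((left[i], r))
                i += 1
            j = j_end
    return result


# Hypothetical inputs with duplicate keys on both sides.
foo = [{"x": 3}, {"x": 1}, {"x": 2}, {"x": 2}]
bar = [{"x": 2}, {"x": 2}, {"x": 4}, {"x": 3}]
joined = merge_join(foo, bar, "x")
```

After the sorts, each input is read once from front to back; only runs of duplicate keys on the right-hand side are revisited.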
49. Hash join: hash each row from the inner relation to create a hash table. Then, hash each row from the outer relation and probe the hash table for matches.
50. Very fast – but requires enough memory to store inner tuples. Can get around this using multiple “batches”.
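A minimal sketch of the build and probe phases, plus one simple way to picture "batches". The names and the partitioning scheme are assumptions for illustration: PostgreSQL's real batching spills partitions to temporary files and is considerably more involved than re-filtering lists in memory.

```python
from collections import defaultdict


def hash_join(outer, inner, key):
    """Build a hash table on the inner rows, then probe it with outer rows."""
    table = defaultdict(list)
    for row in inner:                      # build phase
        table[row[key]].append(row)
    result = []
    for o in outer:                        # probe phase
        for i in table.get(o[key], []):
            result.append((o, i))
    return result


def hash_join_batched(outer, inner, key, nbatches):
    """Split both inputs by hash value and join one partition at a time,
    so only one batch's share of the inner rows is in the table at once."""
    result = []
    for b in range(nbatches):
        outer_part = [r for r in outer if hash(r[key]) % nbatches == b]
        inner_part = [r for r in inner if hash(r[key]) % nbatches == b]
        result.extend(hash_join(outer_part, inner_part, key))
    return result


# Hypothetical data.
foo = [{"x": n} for n in range(20)]
bar = [{"x": n} for n in range(10, 30)]

single = hash_join(foo, bar, "x")
batched = hash_join_batched(foo, bar, "x", nbatches=4)
```

Rows with equal keys always hash to the same batch, so joining batch by batch finds the same matches as the single-pass version, just in a different order.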
79. If the planner overestimates the row count, it may choose a sequential scan instead of an index scan, or a merge or hash join instead of a nested loop.
80. Small values for LIMIT tilt the planner toward fast-start plans and magnify the effect of bad estimates.
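A rough arithmetic sketch of why a small LIMIT magnifies a bad row estimate. It assumes the cost of fetching the first few rows is interpolated linearly between a plan's startup cost and its total cost, which is a simplification of what the planner does; the function name and every cost figure below are invented for illustration.

```python
def limited_cost(startup_cost, total_cost, plan_rows, limit):
    """Estimated cost of fetching `limit` rows from a plan expected to
    return `plan_rows` rows, assuming rows arrive at an even rate."""
    if plan_rows <= 0 or limit >= plan_rows:
        return total_cost
    fraction = limit / plan_rows
    return startup_cost + (total_cost - startup_cost) * fraction


# Two hypothetical plans for the same query with LIMIT 10.
# Plan A: sort first (high startup cost, cheap to read afterwards).
# Plan B: fast-start index scan (almost no startup cost, high total cost).
estimated_rows = 10000
plan_a = limited_cost(1000.0, 1100.0, estimated_rows, 10)
plan_b = limited_cost(0.5, 5000.0, estimated_rows, 10)

# If only 10 rows really match, plan B has to run to completion.
actual_rows = 10
plan_b_actual = limited_cost(0.5, 5000.0, actual_rows, 10)
```

Under the 10,000-row estimate, plan B looks far cheaper than plan A, so the fast-start plan wins. If the estimate is a thousandfold too high and only 10 rows exist, plan B pays its full total cost of 5000, several times what plan A would have cost.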