Applied Database Systems - Tutorial 7

This will be the final tutorial that introduces a new topic. Next week's tutorial is reserved for questions regarding the assignment. This tutorial is about understanding query execution plans in PostgreSQL, and it can easily be completed together with last week's tutorial on indexing.

The tutorial consists of two parts: an introduction to reading query execution plans, and an exercise.

 Query Execution Plans

PostgreSQL devises a query plan for each query it is given. You can use the EXPLAIN command to see what query plan the planner creates for any query. The command syntax is:

EXPLAIN [ ANALYZE ] statement

EXPLAIN displays the execution plan that the PostgreSQL planner generates for the supplied statement. If you use EXPLAIN ANALYZE, the statement is actually executed and the measured run times are shown as well. Below is some basic information about how to read an execution plan output by EXPLAIN (most of the following text was taken from the PostgreSQL on-line documentation on using EXPLAIN, adapted to our Munros example).
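
For example, the following statement both plans and executes the query. The timing figures printed by EXPLAIN ANALYZE depend entirely on your machine, which is why no sample output is reproduced here:

-- EXPLAIN ANALYZE actually runs the statement: every plan node is
-- additionally annotated with the actual time spent, the actual row
-- count and the number of loops, and a total run time is appended
EXPLAIN ANALYZE SELECT * FROM munros;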

Here is a first example to see what the output looks like:

sXXXXXX=> EXPLAIN SELECT * FROM munros;

                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on munros  (cost=0.00..5.76 rows=276 width=44)

The result shows that, in order to answer the given query, a full table scan on relation munros is performed. The numbers quoted by EXPLAIN are (see the annotated plan line after this list):

  • Estimated start-up cost (Time expended before output scan can start, e.g., time to do the sorting in a sort node.)
  • Estimated total cost (If all rows were to be retrieved, which they may not be: for example, a query with a LIMIT clause will stop short of paying the total cost of the Limit plan node's input node). This is the value following the two dots (..).
  • Estimated number of rows output by this plan node (Again, only if executed to completion.)
  • Estimated average width (in bytes) of rows output by this plan node
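
Mapped onto the plan line from the first example, the four numbers read as follows:

Seq Scan on munros  (cost=0.00..5.76 rows=276 width=44)
                          ^     ^         ^         ^
                          |     |         |         estimated average row width (bytes)
                          |     |         estimated number of output rows
                          |     estimated total cost
                          estimated start-up cost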

The costs are measured in arbitrary units determined by the planner's cost parameters. Traditional practice is to measure the costs in units of disk page fetches. It's important to note that the cost of an upper-level node includes the cost of all its child nodes.
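
The relevant parameters can be inspected with SHOW; for instance (the values quoted in the comments are PostgreSQL's usual defaults and may differ in your installation):

-- two of the planner's cost parameters (defaults: 0.01 and 4.0)
SHOW cpu_tuple_cost;    -- CPU cost of processing one row
SHOW random_page_cost;  -- cost of a non-sequentially fetched disk page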

If you add a WHERE clause to the above query the resulting query execution plan will look like this:

sXXXXXX=> EXPLAIN SELECT heightm FROM munros WHERE mname = 'Ben Lui';
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on munros  (cost=0.00..6.45 rows=1 width=4)
   Filter: ((mname)::text = 'Ben Lui'::text)

In the output the WHERE clause is being applied as a "filter" condition; this means that the plan node checks the condition for each row it scans, and outputs only the ones that pass the condition. The estimate of output rows has gone down because of the WHERE clause. However, the scan will still have to visit all rows, so the cost hasn't decreased; in fact it has gone up a bit to reflect the extra CPU time spent checking the WHERE condition.

Now, let's see how an index can affect the query execution plan. We start with the following simple query:

sXXXXXX=> EXPLAIN SELECT mname FROM munros WHERE heightm > 1000;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on munros  (cost=0.00..6.45 rows=134 width=18)
   Filter: (heightm > 1000)
 

If we create an index on heightm the query execution plan will be as follows:

sXXXXXX=> CREATE INDEX idx_munros_heightm ON munros(heightm);
CREATE INDEX
sXXXXXX=> EXPLAIN SELECT mname FROM munros WHERE heightm > 1000;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on munros  (cost=0.00..6.45 rows=134 width=18)
   Filter: (heightm > 1000)

Surprisingly (?), nothing changed. However, if we make the condition more restrictive, the output will look like this:

sXXXXXX=> EXPLAIN SELECT mname FROM munros WHERE heightm > 1300;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on munros  (cost=2.02..5.19 rows=5 width=18)
   Recheck Cond: (heightm > 1300)
   ->  Bitmap Index Scan on idx_munros_heightm  (cost=0.00..2.02 rows=5 width=0)
         Index Cond: (heightm > 1300)

Here the planner has decided to use a two-step plan: the bottom plan node (starting at ->) visits index idx_munros_heightm to find the locations of rows matching the index condition. The upper plan node then actually fetches those rows from the table itself. Fetching the rows separately is much more expensive than reading them sequentially, but because not all the pages of the table have to be visited, this is still cheaper than a sequential scan. (The reason for using two levels of plan is that the upper plan node sorts the row locations identified by the index into physical order before reading them, so as to minimize the costs of the separate fetches. The "bitmap" mentioned in the node names is the mechanism that does the sorting.)

In our previous example (WHERE heightm > 1000) the estimated number of matching rows was high, which is why the query optimizer preferred a sequential scan over an index scan. If the WHERE condition is selective enough, the planner may switch completely to an index scan plan:

sXXXXXX=> EXPLAIN SELECT mname FROM munros WHERE heightm > 1400;
                                    QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using idx_munros_heightm on munros  (cost=0.00..5.41 rows=1 width=18)
   Index Cond: (heightm > 1400)

In this case the table rows are fetched in index order, which makes them even more expensive to read, but there are so few that the extra cost of sorting the row locations is not worth it. You'll most often see this plan type for queries that fetch just a single row, and for queries that have an ORDER BY clause that matches the index order.
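
For example, the following query will most likely be answered by such an index scan, because the requested order matches the index order and only a few rows are fetched (the plan is not reproduced here; try it yourself):

-- the ORDER BY matches idx_munros_heightm and the LIMIT keeps the
-- number of fetched rows small, so an index scan is the likely plan
EXPLAIN SELECT mname, heightm FROM munros ORDER BY heightm LIMIT 5;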

To see the effect an index can have on a query that joins two tables, we have to switch to larger tables. Consider the following example:

sXXXXXX=> EXPLAIN SELECT e.primacc, c.text FROM entry e, comment c
sXXXXXX-> WHERE e.primacc = c.primacc AND c.type = 'DISEASE';
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..9969.34 rows=1293 width=44)
   ->  Seq Scan on "comment" c  (cost=0.00..2178.43 rows=1293 width=44)
         Filter: (("type")::text = 'DISEASE'::text)
   ->  Index Scan using entry_pkey on entry e  (cost=0.00..6.01 rows=1 width=10)
         Index Cond: (e.primacc = "outer".primacc)

The query will be answered using a Nested-Loop-Join. The outer scan (first ->) is a sequential scan of comment in combination with a filter on type. For the inner scan (second ->), the primacc value of the current outer-scan row is plugged into the inner index scan (on the index that is automatically created on the primary key of entry) to produce an index condition like e.primacc = constant.
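
In other words, for each comment row that passes the filter, the inner index scan effectively executes a single-row lookup like the following (the accession 'P12345' is an invented value standing in for the current outer row's primacc):

-- hypothetical lookup performed once per qualifying comment row;
-- 'P12345' is a made-up placeholder value
SELECT * FROM entry e WHERE e.primacc = 'P12345';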

Question: Why is entry chosen as the inner relation and comment chosen as the outer relation in the Nested-Loop-Join?

Here is a final example showing yet another type of query plan:

sXXXXXX=> EXPLAIN SELECT e.primacc, f.type FROM entry e, feature f
sXXXXXX-> WHERE e.primacc = f.primacc;
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=5262.30..29118.66 rows=390463 width=20)
   Hash Cond: ("outer".primacc = "inner".primacc)
   ->  Seq Scan on feature f  (cost=0.00..8093.63 rows=390463 width=20)
   ->  Hash  (cost=4583.44..4583.44 rows=91944 width=10)
         ->  Seq Scan on entry e  (cost=0.00..4583.44 rows=91944 width=10)

This plan sequentially scans relation entry and hashes the primacc values into an in-memory hash table. It then does a sequential scan of feature, probing into the hash table for possible matches of f.primacc = e.primacc at each feature row.
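
If you are curious what the planner's second choice would be, you can temporarily disable the hash join method for your session (enable_hashjoin is a standard PostgreSQL planner setting; the alternative plan is not shown here):

-- disable hash joins for the current session, inspect the new plan,
-- then restore the default behaviour
SET enable_hashjoin = off;
EXPLAIN SELECT e.primacc, f.type FROM entry e, feature f
WHERE e.primacc = f.primacc;
RESET enable_hashjoin;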

 Exercise
 Question 1

Create an index on attribute type in comment and explain how the query execution plan changes from the example above:

SELECT e.primacc, c.text FROM entry e, comment c
WHERE e.primacc = c.primacc AND c.type = 'DISEASE';
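
A possible statement for creating this index is shown below (the name idx_comment_type is our choice; any valid index name will do). It is good practice to refresh the planner statistics afterwards, as noted in Question 4:

-- index name is an assumption, not prescribed by the exercise
CREATE INDEX idx_comment_type ON comment(type);
ANALYZE comment;  -- refresh the planner statistics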

Why is the index not used if you change the condition to the following?

SELECT e.primacc, c.text FROM entry e, comment c
WHERE e.primacc = c.primacc AND c.type = 'SUBCELLULAR LOCATION';
 Question 2

Consider the query given in Tutorial 6 - Query 1:

SELECT date, region, location
FROM ukweather
WHERE temperature > 30 AND time > '1500' AND conditions = 'Sunny';

Which of the following three indexes is (most likely to be) used when executing the query? Explain why.

  • CREATE INDEX idx_ukw_cond ON ukweather(conditions)
  • CREATE INDEX idx_ukw_temp ON ukweather(temperature)
  • CREATE INDEX idx_ukw_time ON ukweather(time)
 Question 3

Consider again the query given in Tutorial 6 - Query 1. Which of the following two indexes will give the better query performance?

  • CREATE INDEX idx_ukw_temp_time_cond ON ukweather(temperature, time, conditions)
  • CREATE INDEX idx_ukw_cond_time_temp ON ukweather(conditions, time, temperature)
 Question 4

Consider the query given in Tutorial 6 - Query 4. Create the following three sets of indexes independently (i.e., make sure there are no other indexes defined on ukweather; see the note after Index Set 3 below). Compare the resulting execution plans and explain why the last set of indexes apparently leads to the most efficient plan.

Note: Make sure that you run ANALYZE ukweather after creating each set of indexes.

Index Set 1:

CREATE INDEX idx_ukw_ldt ON ukweather(location, date, time);

Index Set 2:

CREATE INDEX idx_ukw_ldt ON ukweather(location, date, time);
CREATE INDEX idx_ukw_datetime ON ukweather(date, time);

Index Set 3:

CREATE INDEX idx_ukw_location ON ukweather(location);
CREATE INDEX idx_ukw_datetime ON ukweather(date, time);
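
To make sure that each set is tested independently, drop the indexes of the previous set before creating the next one. For example, when switching from Index Set 2 to Index Set 3:

-- remove the composite index from the previous set before creating
-- the single-column index on location
DROP INDEX idx_ukw_ldt;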

Alternative: Creating indexes on a large table like ukweather may take a long time. Here is a file that shows the execution plans for all three sets of indexes in case you want to save time.

End of Tutorial 7. Please send any corrections or suggestions to Heiko Müller.