PostgreSQL query plan explanation

PostgreSQL query plan explanation

{< author “Tereschenko M.” >}}

This article describes the main operations displayed in the query execution plans of PostgreSQL.

Seq Scan

The operation scans the entire table in the order in which it is stored on disk. Usually, the presence of this operation has a negative impact on performance, since it involves sequential reading to retrieve a large number of rows, resulting in slower processing. But there are exceptions, for example, using the Limit clause, which limits the number of records returned.

Index Scan

An index scan traverses an index, looks at the leaf nodes to find all matching records, and retrieves the matching data from the table. In most cases, it is good for performance because it directly accesses the required rows of data.

Bitmap Index Scan

While Index Scan fetches one tuple pointer at a time from the index and immediately moves to that tuple in the table, Bitmap Index Scan fetches all the tuple pointers from the index at once, sorts them using an in-memory bitmap data structure, and then scans the table tuples in the order in which the tuples are physically located.

Merge Join

A merge join combines two sorted lists. Both sides of the join must be sorted first.

Nested Loops

A nested loops joins two tables by selecting the result from one table and querying the other table for each row from the first. It is very common algorithm. It performs a fairly efficient join on relatively small data sets.

A nested loops join does not require the input data to be sorted.

Hash Join

A hash join loads candidate records from one side of the join into a hash table, which is then checked for each row on the other side of the join. It is used whenever other types of joins cannot be applied: if the data sets being joined are large enough and/or the data sets are not ordered by the join columns.

Sort

Sorts the set by the columns specified in the Sort Key. The sort operation requires large amounts of memory to materialize the intermediate result.

Aggregate

Appears in the plan if the query has an aggregate function used to calculate individual results from multiple input rows: COUNT, SUM, AVG, MAX, MIN, etc.

GroupAggregate

Groups a presorted set according to the GROUP BY clause. This operation does not buffer intermediate results.

HashAggregate

Uses a temporary hash table to group records. The HashAggregate operation does not require a pre-sorted data set, but instead uses large amounts of memory to materialize the intermediate result. The output is not ordered in any meaningful way.

Unique

Removes duplicate data. Uses virtually no memory: compares the value in the previous row with the current one and discards it if they are the same. The data must be pre-sorted.

Filter

Applies a filter to a set of rows.

Limit

Aborts the execution of operations when the required number of rows have been selected.

Append

Runs multiple sub-operations and returns all the rows they return as a single result. Used in queries containing UNION or UNION ALL.

HashSetOp

This operation is used by INTERSECT and EXCEPT (with the optional ALL modifier). It works by running Append sub-operations on a pair of subqueries, and then deciding which rows to return based on the result and the optional ALL modifier.

Materialize

The operation takes data from an underlying operation and places it in memory (or partially in memory) so that it can be used more quickly, or adds additional properties to it that the previous operation does not provide.

CTE Scan

Similar to Materialize. The operation executes part of a query and stores its output so that it can be used by another part (or parts) of the query.

SubPlan

Operation denotes a subquery that has references to the main query. Called to compute data from the subquery that depends on the current row.

InitPlan

Operation denotes a subquery that has no references to the main query. An operation appears in a plan when there is a part of the query that can (or should) be computed first, and it does not depend on anything in the rest of the query.

Subquery Scan

Operation denotes a subquery that is part of a UNION.