PostgreSQL CROSSTAB

In PostgreSQL, you can rotate a table using the CROSSTAB function. This function is passed a SQL query as a text parameter, which returns three columns:

  • row ID – this column contains values identifying the resulting (rotated) row;
  • category – unique values in this column determine the columns of the rotated table. Let’s note that, unlike PIVOT , these values themselves are of no significance; important is their number only, since it specifies the maximal possible number of columns;
  • category value – the actual category values. The values are put in columns from left to right, and the category names play no role – just their order determined by the way the query is sorted is important.

We’ll explain this using the Painting database as an example.

Let’s sum up the amount of paint of each color for each square:
select b_q_id, v_color, sum(b_vol) qty from utb join utv on b_v_id = v_id
where b_q_id between 12 and 16
group by b_q_id, v_color
order by b_q_id, case v_color when 'R' then 1 when 'G' then 2 else 3 end;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Here, we confined ourselves to squares with IDs in the range between 12 to 16 in order to, on the one hand, keep the output small, yet, on the other hand, retain its representativeness. The colors are sorted in RGB order. Here is the result set:

b_q_idv_colorQty
12R255
12G255
12B255
13B123
14R50
14B111
15R100
15G100
16G100
16B150

In CROSSTAB terms, the spray can numbers represent row IDs, and the colors correspond to categories. The result of the table rotation should be as follows:

squareRGB
12255255255
13123
1450111
15100100
16100150

Now, let’s try to write a query using CROSSTAB that will return the desired result:

select * from
crosstab(
$$select b_q_id, v_color, sum(b_vol) qty from utb join utv on b_v_id = v_id
where b_q_id between 12 and 16
group by b_q_id, v_color
order by b_q_id, case v_color when 'R' then 1 when 'G' then 2 else 3 end; $$
) as ct(square int, "R" bigint,"G" bigint,"B" bigint)
order by square;

Here, we have to list all columns specifying their types. Yet some of the category columns may be omitted. Let’s look at the result (you can check your queries using the console, and setting PostgreSQL for execution):

squareRGB
12255255255
13123
1450111
15100100
16100150

This result isn’t exactly what we expected. We recall that only the order of the categories matters. If the square has been painted with one color only, the value (total amount of paint) lands in the first category (named R in our query) no matter what color it actually was. Let’s rewrite our query to make it return values for all colors, and that in the order needed. Missing colors should be replaced by NULL. To achieve this, we’ll add for each square and color a row with a NULL in the paint amount field:

select b_q_id, v_color, sum(b_vol) qty from(
select b_q_id, v_color, b_vol from utb
join utv on b_v_id = v_id
union all --here addition is
select * from (select distinct b_q_id from utb) X
cross join (select 'R' color, null::smallint vol
                union all select 'G', null union all select 'B', null) Y
) X
where b_q_id between 12 and 16
group by b_q_id, v_color
order by b_q_id, case v_color when 'R' then 1 when 'G' then 2 else 3 end;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]
b_q_idv_colorQty
12R255
12G255
12B255
13R
13G
13B123
14R50
14G
14B111
15R100
15G100
15B
16R
16G100
16B150

Now the query below will return the required result.

select * from
crosstab(
$$select b_q_id, v_color, sum(b_vol) qty from(
select b_q_id, v_color, b_vol from utb
join utv on b_v_id = v_id
union all
select * from (select distinct b_q_id from utb) X
cross join (select 'R' color, null::smallint vol
               union all select 'G', null union all select 'B', null) Y
) X
where b_q_id between 12 and 16
group by b_q_id, v_color
order by b_q_id, case v_color when 'R' then 1 when 'G' then 2 else 3 end;$$
) as ct(square int, "R" bigint,"G" bigint,"B" bigint)
order by square;

You probably already start asking yourself whether there is some easier way to do that.

The answer is “yes”. As it turns out CROSSTAB can be passed another optional parameter – a query returning the category list in the same order used for the columns. Then, we can modify our first query as follows to make it yield the correct result:

select * from
crosstab(
$$select b_q_id, v_color, sum(b_vol) qty from utb join utv on b_v_id = v_id
where b_q_id between 12 and 16
group by b_q_id, v_color
order by b_q_id, case v_color when 'R' then 1 when 'G' then 2 else 3 end; $$,
$$select 'R' union all select 'G' union all select 'B';$$
) as ct(square int, "R" bigint,"G" bigint,"B" bigint)
order by square;

Since PostgreSQL allows the usage of table value constructors, the query

select 'R' union all select 'G' union all select 'B';

can be replaced by a shorter one:

values('R'),('G'),('B');

If, when running the query, you get an error message that crosstab is not recognized as a function, it means you don’t have the tablefunc module installed. This can be fixed by a simple command (starting with version 9.1)

CREATE
EXTENSION IF NOT EXISTS tablefunc;

that has to be executed for a given database. For versions prior to 9.1 you just need to copy the following file: share/contrib/tablefunc.sql to pgadmin, and execute it.