COUNT DISTINCT and window functions
We can easily calculate the total number of PCs for each manufacturer, as well as the number of unique models of this manufacturer in the PC table:
select maker, count(*) models, count(distinct pc.model) unique_models
from product p
join pc on p.model=pc.model
group by maker
order by maker;
🚫
[[ error ]]
[[ column ]] |
---|
NULL [[ value ]] |
maker | models | unique_models |
---|---|---|
A | 8 | 2 |
B | 3 | 1 |
E | 1 | 1 |
If we need to get details about each model, along with their total number for each manufacturer, we can use the window function:
select maker, pc.model, pc.price,
count(*) over(partition by maker) models
from product p
join pc on p.model=pc.model
order by maker, pc.model;
🚫
[[ error ]]
[[ column ]] |
---|
NULL [[ value ]] |
maker | model | price | models |
---|---|---|---|
A | 1232 | 600,00 | 8 |
A | 1232 | 400,00 | 8 |
A | 1232 | 350,00 | 8 |
A | 1232 | 350,00 | 8 |
A | 1233 | 600,00 | 8 |
A | 1233 | 950,00 | 8 |
A | 1233 | 980,00 | 8 |
A | 1233 | 970,00 | 8 |
B | 1121 | 850,00 | 3 |
B | 1121 | 850,00 | 3 |
B | 1121 | 850,00 | 3 |
E | 1260 | 350,00 | 1 |
Now imagine that we need to supplement this information with the number of unique models. An expected attempt
select maker, pc.model,pc.price,
count(*) over(partition by maker) models,
count(distinct pc.model) over(partition by maker) unique_models
from product p
join pc on p.model=pc.model
order by maker, pc.model;
🚫
[[ error ]]
[[ column ]] |
---|
NULL [[ value ]] |
fails:
The DISTINCT keyword is not allowed with the OVER clause.
The error message clearly describes the problem. The question is how to work around it.
Using a subquery
with cte as
(select maker, pc.model, pc.price,
count(*) over(partition by maker) models
from product p
join pc on p.model=pc.model
)
select maker, model, models,
(select count(distinct model)
from cte t
where t.maker=cte.maker) unique_models
from cte
order by maker,model;
🚫
[[ error ]]
[[ column ]] |
---|
NULL [[ value ]] |
maker | model | models | unique_models |
---|---|---|---|
A | 1232 | 8 | 2 |
A | 1232 | 8 | 2 |
A | 1232 | 8 | 2 |
A | 1232 | 8 | 2 |
A | 1233 | 8 | 2 |
A | 1233 | 8 | 2 |
A | 1233 | 8 | 2 |
A | 1233 | 8 | 2 |
B | 1121 | 3 | 1 |
B | 1121 | 3 | 1 |
B | 1121 | 3 | 1 |
E | 1260 | 1 | 1 |
Using DENSE_RANK
with cte as
(select maker, pc.model, pc.price,
count(*) over(partition by maker) models,
DENSE_RANK() over(partition by maker order by pc.model) drnk
from product p
join pc on p.model=pc.model
)
select maker, model, price, models,
max(drnk) over(partition by maker) unique_models
from cte
order by maker, model;
🚫
[[ error ]]
[[ column ]] |
---|
NULL [[ value ]] |
Here we take advantage of the fact that the last rank value, max(drnk)
, turns out to be equal to the number of unique models.