Exercise #15 (tips and solutions)
The incorrect solution 1.11.1 for this exercise can be corrected easily by using, instead of the model number, the proper column – the primary key code - for distinguishing PCs:
SELECT DISTINCT t.hd
FROM PC t
WHERE EXISTS (SELECT *
FROM PC
WHERE pc.hd = t.hd AND
pc.code <> t.code
);
[[ column ]] |
---|
[[ value ]] |
Since it is sufficient to check matching hard drive capacities for as little as two PCs, a self-join of the PC table with the same conditions can be used:
SELECT DISTINCT pc1.hd
FROM PC pc1, PC pc2
WHERE pc1.hd = pc2.hd AND
pc1.code <> pc2.code;
[[ column ]] |
---|
[[ value ]] |
However, the optimal solution would be using grouping with a filtering condition in a HAVING clause
SELECT
PC.hd FROM PC
GROUP BY hd
HAVING COUNT(hd) > 1;
[[ column ]] |
---|
[[ value ]] |
For the sake of completeness, here is another solution using a subquery with grouping, whose performance is inferior to that of the one above:
SELECT DISTINCT hd
FROM PC
WHERE (SELECT COUNT(hd)
FROM PC pc2
WHERE pc2.hd = pc.hd
) > 1;
[[ column ]] |
---|
[[ value ]] |
The reason for the low efficiency of the solutions with subqueries is that they all use a correlated subquery, i. e. a subquery to be re-executed for each row returned by the main query. The query using join is the worst-performing one. This is quite understandable, since join operations are very costly despite the fairly efficient algorithms of their implementation [ 5 ].