Exercise #15 (tips and solutions)

The incorrect solution 1.11.1 for this exercise can be corrected easily by using, instead of the model number, the proper column – the primary key code - for distinguishing PCs:

SELECT DISTINCT t.hd
FROM PC t
WHERE EXISTS (SELECT *
            FROM PC
            WHERE pc.hd = t.hd 
            AND pc.code <> t.code
            );

🚫

[[ error ]]

[[ column ]]
NULL [[ value ]]

Since it is sufficient to check matching hard drive capacities for as little as two PCs, a self-join of the PC table with the same conditions can be used:

SELECT DISTINCT pc1.hd
FROM PC pc1, PC pc2
WHERE pc1.hd = pc2.hd 
    AND pc1.code <> pc2.code;

🚫

[[ error ]]

[[ column ]]
NULL [[ value ]]

However, the optimal solution would be using grouping with a filtering condition in a HAVING clause

SELECT
PC.hd FROM PC
GROUP BY hd
HAVING COUNT(hd) > 1;

🚫

[[ error ]]

[[ column ]]
NULL [[ value ]]

For the sake of completeness, here is another solution using a subquery with grouping, whose performance is inferior to that of the one above:

SELECT PC.hd 
FROM PC
GROUP BY hd
HAVING COUNT(hd) > 1;

🚫

[[ error ]]

[[ column ]]
NULL [[ value ]]

The reason for the low efficiency of the solutions with subqueries is that they all use a correlated subquery, i. e. a subquery to be re-executed for each row returned by the main query. The query using join is the worst-performing one. This is quite understandable, since join operations are very costly despite the fairly efficient algorithms of their implementation [5].

Return to discussion of exercise #15

Solve this task at SQL-EX.RU

#Tips and Solutions #Correlated Subqueries #Self-Join #Exercise 15

Exercise #11 (tips and solutions)Exercise #16 (tips and solutions)