Exercise #15 (tips and solutions)

The incorrect solution 1.11.1 for this exercise can be corrected easily by using, instead of the model number, the proper column – the primary key code - for distinguishing PCs:

SELECT DISTINCT t.hd
FROM PC t
WHERE EXISTS (SELECT *
              FROM PC
              WHERE pc.hd = t.hd AND
                    pc.code <> t.code
              );
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Since it is sufficient to check matching hard drive capacities for as little as two PCs, a self-join of the PC table with the same conditions can be used:

SELECT DISTINCT pc1.hd
FROM PC pc1, PC pc2
WHERE pc1.hd = pc2.hd AND
      pc1.code <> pc2.code;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

However, the optimal solution would be using grouping with a filtering condition in a HAVING clause

SELECT
PC.hd FROM PC
GROUP BY hd
HAVING COUNT(hd) > 1;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

For the sake of completeness, here is another solution using a subquery with grouping, whose performance is inferior to that of the one above:

SELECT DISTINCT hd
FROM PC
WHERE (SELECT COUNT(hd)
       FROM PC pc2
       WHERE pc2.hd = pc.hd
       ) > 1;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The reason for the low efficiency of the solutions with subqueries is that they all use a correlated subquery, i. e. a subquery to be re-executed for each row returned by the main query. The query using join is the worst-performing one. This is quite understandable, since join operations are very costly despite the fairly efficient algorithms of their implementation [ 5 ].

Return to discussion of exercise #15

Solve this task at SQL-EX.RU