Exercise #8 (tips and solutions)

Let’s start with a couple of “natural” solutions that differ only in the predicate checking the maker doesn’t manufacture any laptops.

Solution 4.4.1. Predicate NOT IN

SELECT DISTINCT maker
FROM Product
WHERE type = 'PC' AND
      maker NOT IN (SELECT maker
                    FROM Product
                    WHERE type = 'Laptop'
                    );
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Solution 4.4.2. Predicate EXISTS (as usual for this predicate, it uses a correlated subquery)

SELECT DISTINCT maker
FROM Product AS pc_product
WHERE type = 'pc' AND
      NOT EXISTS (SELECT maker
                  FROM Product
                  WHERE type = 'laptop' AND
                        maker = pc_product.maker
                  );
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Now, there are a few more original solutions.

Solution 4.4.3. Using correlated subqueries with grouping

SELECT DISTINCT maker
FROM Product as p
WHERE (SELECT COUNT(1)
       FROM Product pt
       WHERE pt.type = 'PC' AND
             pt.maker = p.maker
       ) > 0 AND
      (SELECT COUNT(1)
       FROM Product pt
       WHERE pt.type = 'Laptop' AND
             pt.maker = p.maker
       ) = 0;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The subqueries check that the number of PC models by the manufacturer in the outer query is greater than zero, while the number of models of portable computers of the same maker is zero.

Please note the usage of COUNT (1). The standard defines two types of arguments for this function: “” and an expression. The usage of “” results in calculating the number of rows returned by the query. Using an expression counts the number of rows for which said expression has a value that is not NULL. Since in most cases, a column name acts as an expression, using a constant can be surprising for those not yet sufficiently familiar with the language. Obviously, a constant (1 in our case) can’t be NULL; thus, this expression is completely equivalent to COUNT (*).

On the example of this simple task, a great variety of solution approaches, owing to the flexibility of SQL, can be demonstrated.

Solution 4.4.4. Outer self-join

SELECT DISTINCT p.maker
FROM Product p LEFT JOIN
Product p1 ON p.maker = p1.maker AND
p1.type = 'Laptop'
WHERE p.type = 'PC' AND
  p1.maker IS NULL;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The Product table is joined to itself using LEFT JOIN on the condition the maker is the same, and the product type for the second table is  laptop. The p1.maker column will be NULL if a manufacturer doesn’t have any laptop models; this is used in the WHERE clause together with the condition the product type for the record checked is PC.

Solution 4.4.5. Grouping

SELECT maker
FROM (SELECT DISTINCT maker, type
FROM Product
WHERE type IN ('PC', 'Laptop')
) AS a
GROUP BY maker
HAVING COUNT(*) = 1 AND
  MAX(type) = 'PC';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

In the subquery, unique pairs (supplier, type) are selected with type being either PC or Laptop. Then, grouping by supplier is performed, whereby the rows grouped have to meet the following conditions:

  • COUNT(*) = 1 — which means, the maker has to manufacture only one product type of the selected (since printers have already been excluded by the IN predicate, it’s either PC or Laptop);
  • MAX(type) = ‘PC’ — this single manufactured product type is PC. Since the HAVING clause can’t contain columns not referred to by aggregate functions, it uses MAX (type); MIN(type) would work equally well.

With so many solution approaches, it’s natural to ask about their efficiency – that is, which query will be executed faster. Leading the field both in number of operations and estimated execution time is solution 4.4.5. The third solution has the weakest performance. Others are estimated to take about twice as much time to execute as the leader.

Note

In Management Studio (SQL Server), you can get the estimated execution time value and the query execution plan by first running the command

SET SHOWPLAN_ALL ON;

and then the queries to be analyzed. To return to normal mode, type

SET SHOWPLAN_ALL OFF;

If SQL Server isn’t installed on your computer, you can get the query execution plan directly from the site.

Return to discussion of exercise #8

Solve this task at SQL-EX.RU