CASE statement

Let the list of all the models of PC is required along with their prices. Besides that, if the model is not on sale (not in PC table), in the place of price must be the text “Not available”.

The list of all the PC models with their prices we can obtain running the query:

SELECT DISTINCT Product.model, price
FROM Product LEFT JOIN
PC ON Product.model = PC.model
WHERE product.type = 'pc';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Missing prices will be replaced by NULL values:

Modelprice
1121850
1232350
1232400
1232600
1233600
1233950
1233980
1260350
2111NULL
2112NULL

The CASE statement helps us to get required text instead of NULLs:

SELECT DISTINCT product.model,
CASE
WHEN price IS NULL
THEN 'Not available'
ELSE CAST(price AS CHAR(20))
END price
FROM Product LEFT JOIN
PC ON Product.model = PC.model
WHERE product.type = 'pc';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Depending on defined conditions, the CASE statement returns one of the possible values. The condition in above example is the checking for NULL. If this condition is satisfied, the text “Not available” will be returned; otherwise (ELSE) it will be the price.

One principal moment is here. As a table is always the result of the SELECT statement, all values from any column must be of the same data type (having regard to implicit type conversions). Then, we cannot combine character constant with a price (numeric type) within a single column. That is the reason why we use the type conversion to the price column to reduce its type to character. As a result, we get

modelprice
1121850
1232350
1232400
1232600
1233600
1233950
1233980
1260350
2111Not available
2112Not available

The CASE statement may be used in one of two syntax forms:

The first form

CASE < input expression >
   WHEN < when expression 1>
   THEN < return expression 1 >  
     
   WHEN < when expression N >  
   THEN < return expression N >  
[ELSE < return expression >]  
END

The second form

CASE  
   WHEN < predicate 1 >  
   THEN < return expression 1 >  
     
   WHEN < predicate N >  
   THEN < return expression N >  
[ELSE < return expression >]  
END

All WHEN clauses must be in the same syntax form, i.e. first and second forms cannot be mixed. When using the first syntax form, the WHEN condition is satisfied as soon as the value of when expression will be equal to the value of input expression. When using the second syntax form, the WHEN condition is satisfied as soon as the predicate evaluates to TRUE. When satisfying condition, the CASE statement returns the return expression from the corresponding THEN clause. If no WHEN expression is satisfied, the return expression from the ELSE clause will be used. If no ELSE clause is specified, a NULL value will be returned. If more than one condition are satisfied, the first return expression from them will be returned.

The above example uses the second form of the CASE statement.

It should be noted that checking for NULL could also be made using the standard function COALESCE, which is simpler. This function has arbitrary number of arguments and returns the first not-NULL expression among them. In the case of two arguments, COALESCE(A, B) is equivalent to the following CASE statement:

CASE
WHEN A IS NOT NULL
THEN A
ELSE B
END

When using the COALESCE function, the solution to the above example may be rewritten as follows

SELECT DISTINCT Product.model,
COALESCE(CAST(price as CHAR(20)),'Not available') price
FROM Product LEFT JOIN
PC ON Product.model = PC.model
WHERE Product.type = 'pc';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Usage of the first syntax form of the CASE statement can be demonstrated by the following example: Get all available PC models, their prices, and information about the most expensive and cheap models.

Example 5.10.1

Get all available PC models, their prices, and information about the most expensive and cheap models.
SELECT DISTINCT model, price,
CASE price
WHEN (SELECT MAX(price)
FROM PC
)
THEN 'Most expensive'
WHEN (SELECT MIN(price)
FROM PC
)
THEN 'Most cheap'
ELSE 'Mean price'
END comment
FROM PC
ORDER BY price;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The result set

modelpricecomment
1232350Most cheap
1260350Most cheap
1232400Mean price
1232600Mean price
1233600Mean price
1121850Mean price
1233950Mean price
1233980Most expensive

CASE operator can be used not only in SELECT clause. Here you can find other examples of its use.

Suggested exercises: 52, 59, 64, 65, 66, 68, 69, 74, 75, 83, 84, 869192, 93, 99, 110, 111, 113, 118,126, 128, 130, 131, 132

Let’s consider a few more examples.

Determine number of trips from Rostov to Moscow, and number of trips to other towns (not to Moscow).

Here we can use additional computed column for grouping by it.

SELECT flag, COUNT(*) qty FROM
(SELECT CASE WHEN town_to ='Moscow' THEN 'Moscow' ELSE 'Other' END flag
FROM Trip
WHERE town_from='Rostov'
) X
GROUP BY flag;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]
flagqty
Moscow4
Other2
Determine total number of trips from Rostov, and number of trips from Rostov to towns other than Moscow.

This problem requires to do grouping by two sets also, but now one set is a subset of another one. So exactly the same approach based on computed column is not fit here. In the previous task we had two non-intersecting sets to do grouping for each.

To solve this task we can calculate count over total set and use subquery to calculate the count over subset (second referense to a table), or we can use CASE statement in the conjunction  with aggregate function to avoid second reading of a table. Let’s check the estimation of optimizer for these two variants.

Use of subquery

SELECT COUNT(*) total,
(SELECT COUNT(*) FROM Trip
WHERE town_from='Rostov' AND town_to <> 'Moscow') non_moscow
FROM Trip
WHERE town_from='Rostov';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Use of CASE with aggregate function

SELECT COUNT(*) total_qty,
SUM(CASE WHEN town_to <>'Moscow' THEN 1 ELSE 0 END) non_moscow
FROM Trip
WHERE town_from='Rostov';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The result will be the same naturally:

totalnon_moscow
62

but the cost of the second query, as was expected,  is twice as less than of the first one.

You can compare the real run time of these queries on sufficient large amount of data.

The second variant can be rewritten in more short form with use of NULLIF function:

SELECT COUNT(*) total_qty,
COUNT(NULLIF(town_to, 'Moscow')) non_moscow
FROM Trip
WHERE town_from='Rostov';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

NULLIF function returns NULL, if its arguments are equal, or the first one otherwise.

The solution exploits the fact that the aggregate functions don’t take into account NULL values which arise in argument of COUNT function when the landing town is ‘Moscow’.

Beginning from version 2012, IIF function is available in SQL Server. This function is well known to those who use Access VBA and is an alternative to the CASE expression in standard SQL. The syntax of IIF function is

IIF(< condition>, < expression if condition is true>, < expression if condition is not true>)

    The function returns the value of expression in the second parameter, if the condition is evaluated as true, or the value of expression in the third parameter otherwise. So, the function

IIF(condition, expression_1, expression_2)

is equivalent to the following CASE statement:

CASE WHEN condition THEN expression_1 ELSE expression_2 END

Using IIF function, we can rewrite the solution of the first task as follows:

SELECT DISTINCT product.model,
    IIF(price IS NULL'Not available', CAST(price AS CHAR(20))) price
    FROM Product LEFT JOIN
    PC ON Product.model = PC.model
    WHERE product.type = 'PC';
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

    In the case where there are more than two variants of branching, you can use nested IIF functions. For example, solution for task 5.10.1 you can get with the following query:

SELECT DISTINCT model, price,
    IIF(price=(SELECT MAX(price) FROM PC), 'Most expensive',
        IIF(price=(SELECT MIN(price) FROM PC), 'Most cheap', 'Mean price')) comment
    FROM PC
    ORDER BY price;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

    If situation will go in the same manner, SWITCH statement could soon appear in T-SQL. :-)