Exercise #(-2) (tips and solutions)
When solving this exercise one typical mistake, which has place in the solution 3.6.2 also, is admitted by participants. But I’ll just simplify the formulation for you not to loose pleasure to do this exercise on your own. So,
We can determine the distribution of ships’ quantity by the years in such a way.
SELECT launched [year], COUNT(*) cnt
FROM Ships
GROUP BY launched;
[[ column ]] |
---|
[[ value ]] |
Note
In SQL Server, names of columns can be bounded by square brackets ([]) to avoid ambiguity. For example, ambiguity arises at use of blanks in names when the first word could be interpreted as a name of a column, and the second - as its alias. Good practice is to avoid usage of blanks in names, however, their use for formation of report’s headings is quite justified.
In our case ([year]) square brackets are applied in order to prevent messing with year() function which returns year from the argument of datetime data type.
Now we should leave from all the rows, returned by this query, only those that have maximal quantity (cnt), so that is:
cnt >= ALL(SELECT COUNT(*) cnt
FROM Ships
GROUP BY launched
)
Finally we get.
Solution 4.19.1
SELECT *
FROM (SELECT launched [year], COUNT(*) cnt
FROM Ships
GROUP BY launched
) x
WHERE cnt >= ALL(SELECT COUNT(*) cnt
FROM Ships
GROUP BY launched
);
[[ column ]] |
---|
[[ value ]] |
However, there’s a mistake. This mistake is not associated with a formal scheme of solution. It’s unquestioned. It usually happens when you solve exercises on the website, origin of the mistake is in the incorrect accounting of domain model’s features, namely its constraints. Database schema of the exercise we are considering adopts a fact that there may be ships with an unknown year of launching in the database. This is firstly due to nullable of the launched column and, second, for a head ship, which is present only in Outcomes table, launch year is unknown also.
Building ships is not a rabbit breeding :-). Ships are being built for years. That’s why if for a number of ships the launching year is unknown (NULL), then there’s a high probability that the amount of such ships will be higher than the amount of ships launched in any real year. The feature of grouping lies in the fact (mentioned in Standard) that NULL values treat as equal. Hence, all ships with unknown launching year will be summarized with a year NULL. I suppose that the result shouldn’t include this string, because the ‘unknown year’ doesn’t mean the ‘same year’. Certainly, it’s disputable. However, all disputes will come to permissibility of using of a specific NULL value in relational model. Discussions about this are being carried on from the times of creation of this model by Codd E.F., whom the idea of NULL value belongs to. However, as far as I know, no deserving alternative was offered.
Returning to our task, in token of measureless respect to Codd I will change the solution in this way:
Solution 4.19.2
SELECT * FROM (SELECT launched [year], COUNT(*) cnt
FROM Ships
WHERE launched IS NOT NULL
GROUP BY launched
) x
WHERE cnt >= ALL(SELECT COUNT(*) cnt
FROM Ships
WHERE launched IS NOT NULL
GROUP BY launched
);
[[ column ]] |
---|
[[ value ]] |
Checking for NULL values in the subquery of WHERE clause is not necessary if you’ll use function COUNT (launched) instead of function COUNT (*) because only the ships with known launch year in this case will be counted up:
WHERE cnt >= ALL(SELECT COUNT(launched) cnt
FROM Ships
GROUP BY launched
)
For all ships with unknown launch year the row with value 0 will be received so, if in a set there is no records, COUNT function returns this value. The last should not confuse us as the quantity of the ships in the main query is more than zero if there is at least one ship with known launch year. It is possible to act similarly in the main query also to allow us to receive more brief form of the solution:
Solution 4.19.3
SELECT * FROM (SELECT launched [year], COUNT(launched) cnt
FROM Ships
GROUP BY launched
) x
WHERE cnt >= ALL(SELECT COUNT(launched) cnt
FROM Ships
GROUP BY launched
);
[[ column ]] |
---|
[[ value ]] |
For the sake of justice it is necessary to note that the cost of the solution 4.19.3 in accordance with its execution plan in SQL Server 2000 slightly concedes (in the third significant digit) against the cost of the solution 4.19.2.
Note
It is possible to see cost of any query to educational databases and also the procedural plan of its execution on this page of the site.