ROW_NUMBER function
ROW_NUMBER() function numbers the rows extracted by a query. It can helps to perform more complex ordering of rows in the report, than allow the ORDER BY clause in SQL-92 Standard.
Before this function, to number rows extracted by a query, had to use a fairly complex algorithm intuitively incomprehensible, as described in the paragraph. The only advantage of that algorithm is that it works on almost all DBMS that support SQL-92 Standard.
Note
Naturally, the numbering can be performed by Procedural Languages, using the cursors and/or temporary tables. But we are talking about “pure” SQL.
The ROW_NUMBER function allows:
- specify the numbering, which differs from the order of rows in the result set;
- create a “non-through” numbering, i.e. select group of the total number of rows and number them separately for each group;
- use multiple methods of numbering, because actually the numbering does not depend on the ordering of a query strings.
Let us show the capability of ROW_NUMBER function on the examples.
Example 1
Solution
SELECT row_number() over(order by trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
[[ column ]] |
---|
[[ value ]] |
The OVER clause, which uses with ROW_NUMBER function, specifies the order of numbering rows. At the same time an additional ORDER BY clause is used, which has no relation to the order of output rows of the query. Considering the results it is obvious that the order of rows and the numbering, in the result set, are not coincide:
num | trip_no | id_comp |
---|---|---|
3 | 1181 | 1 |
4 | 1182 | 1 |
5 | 1187 | 1 |
6 | 1188 | 1 |
7 | 1195 | 1 |
8 | 1196 | 1 |
1 | 1145 | 2 |
2 | 1146 | 2 |
Note
Criteria id_comp < 3 is used only to reduce the size of a sample.
Of course, the numerical order can be changed, by rewriting the last line like here:
order by trip_no
(or, equivalently, order by num)
Or, conversely, to number rows in the order given by sort:
SELECT row_number() over(order by id_comp, trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp<3
ORDER BY id_comp, trip_no;
[[ column ]] |
---|
[[ value ]] |
num | trip_no | id_comp |
---|---|---|
1 | 1181 | 1 |
2 | 1182 | 1 |
3 | 1187 | 1 |
4 | 1188 | 1 |
5 | 1195 | 1 |
6 | 1196 | 1 |
7 | 1145 | 2 |
8 | 1146 | 2 |
What about renumbering of flights for each company separately? For this purpose we need to use PARTITION BY in the OVER clause.
PARTITION BY specifies a group of rows for which the independent numbering is performed. Group is defined by the equality of values in the column list specified in that construction.
Example 2
SELECT row_number() over(partition by id_comp order by id_comp,trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
[[ column ]] |
---|
[[ value ]] |
PARTITION BY id_comp means that flights of each company form a group, for which an independent numbering is performed. As a result, we obtain:
num | trip_no | id_comp |
---|---|---|
1 | 1181 | 1 |
2 | 1182 | 1 |
3 | 1187 | 1 |
4 | 1188 | 1 |
5 | 1195 | 1 |
6 | 1196 | 1 |
1 | 1145 | 2 |
2 | 1146 | 2 |
Lack of PARTITION BY clause, as it was in the first example, means that all rows in the result set form one single group.
Suggested exercises: 65, 97, 116, 125, 130, 137
MySQL has not ranking/window functions in its staff, but you can use variables in a SQL query directly. Particularly, the problem of numbering query result rows can be solved with using variables. Let’s demonstrate it by example being considered on the previous page.
SELECT @i:=@i+1 num,
trip_no, id_comp
FROM Trip, (select @i:=0) X
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
num | id_comp | trip_no |
---|---|---|
1 | 1 | 1181 |
2 | 1 | 1182 |
3 | 1 | 1187 |
4 | 1 | 1188 |
5 | 1 | 1195 |
6 | 1 | 1196 |
7 | 2 | 1145 |
8 | 2 | 1146 |
In the third line of the script, initializing of variable is accomplished. As a result, each row of Trip table will be joined the single-column row containing 0 (just Cartesian product).
In the first line of the script, variable is incrementing by 1, and this action takes place when calculating each next row in the sorting that is specified in ORDER BY clause. The numbering in this case corresponds to the sorting order.
If you omit the initialization of variable, you might obtain the correct result but without garantee. If you run the query (without initializing) once more in the current session of connection with database, you should obtain numbering which is starting from the maximal value of variable @i reached at the previous execution of the query.
We can get numbering of rows for each company separately, i.e. to imitate the behaviour of PARTITION BY in the query:
SELECT row_number() over(PARTITION BY id_comp ORDER BY id_comp,trip_no) num,
trip_no, id_comp
FROM Trip
WHERE ID_comp < 3
ORDER BY ID_comp, trip_no;
[[ column ]] |
---|
[[ value ]] |
Let’s introduce another variable for storing the company ID (@comp). When initializing variable, we assign unexisting ID (0, for example). Then for each row, the value of variable is comparing with company ID of the current row. If these values equal each other, @i is incrementing; if not - @i is resetting to 1. Finally, we assign the company ID from current row to variable @comp. The matter is that the comparison takes place before assigning, in so doing we compare the current value of company ID with the ID from previous row (in the given sorting order). Here the query is.
SELECT
case when @comp=id_comp then @i:=@i+1 else @i:=1 end num,
@comp:=id_comp id_comp, trip_no
FROM Trip, (select @i:=0, @comp:=0) X
WHERE ID_comp < 3
ORDER BY ID_comp, trip_no;
num | id_comp | trip_no |
---|---|---|
1 | 1 | 1181 |
2 | 1 | 1182 |
3 | 1 | 1187 |
4 | 1 | 1188 |
5 | 1 | 1195 |
6 | 1 | 1196 |
1 | 2 | 1145 |
2 | 2 | 1146 |
As we don’t follow SQL Standard here, let’s use MySQL IF function to make the query shorter.
SELECT
IF(@comp=id_comp, @i:=@i+1, @i:=1) num,
@comp:=id_comp id_comp, trip_no
FROM Trip, (select @i:=0, @comp:=0) X
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
To check the above queries, use console when selecting MySQL as target DBMS.