ROW_NUMBER function

ROW_NUMBER() function numbers the rows extracted by a query. It can helps to perform more complex ordering of rows in the report, than allow the ORDER BY clause in SQL-92 Standard.

Before this function, to number rows extracted by a query, had to use a fairly complex algorithm intuitively incomprehensible, as described in the paragraph. The only advantage of that algorithm is that it works on almost all DBMS that support SQL-92 Standard.

Note

Naturally, the numbering can be performed by Procedural Languages, using the cursors and/or temporary tables. But we are talking about “pure” SQL.

The ROW_NUMBER function allows:

  • specify the numbering, which differs from the order of rows in the result set;
  • create a “non-through” numbering, i.e. select group of the total number of rows and number them separately for each group;
  • use multiple methods of numbering, because actually the numbering does not depend on the ordering of a query strings.

Let us show the capability of ROW_NUMBER function on the examples.

Example 1

Number all flights from the Trip table in ascending order of their numbers. Order by {id_comp, trip_no}.

Solution

SELECT row_number() over(order by trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

The OVER clause, which uses with ROW_NUMBER function, specifies the order of numbering rows. At the same time an additional ORDER BY clause is used, which has no relation to the order of output rows of the query. Considering the results it is obvious that the order of rows and the numbering, in the result set, are not coincide:

numtrip_noid_comp
311811
411821
511871
611881
711951
811961
111452
211462

Note

Criteria id_comp < 3 is used only to reduce the size of a sample.

Of course, the numerical order can be changed, by rewriting the last line like here:

order by trip_no

(or, equivalently, order by num)

Or, conversely, to number rows in the order given by sort:

SELECT row_number() over(order by id_comp, trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp<3
ORDER BY id_comp, trip_no;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]
numtrip_noid_comp
111811
211821
311871
411881
511951
611961
711452
811462

What about renumbering of flights for each company separately? For this purpose we need to use PARTITION BY in the OVER clause.

PARTITION BY specifies a group of rows for which the independent numbering is performed. Group is defined by the equality of values in the column list specified in that construction.

Example 2

Renumber flights of each company separately in order of increasing numbers of flights.
SELECT row_number() over(partition by id_comp order by id_comp,trip_no) num,
trip_no, id_comp
FROM trip
WHERE ID_comp < 3
ORDER BY id_comp, trip_no;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

PARTITION BY id_comp means that flights of each company form a group, for which an independent numbering is performed. As a result, we obtain:

numtrip_noid_comp
111811
211821
311871
411881
511951
611961
111452
211462

Lack of PARTITION BY clause, as it was in the first example, means that all rows in the result set form one single  group.

Suggested exercises: 65, 97, 116, 125, 130, 137

MySQL has not ranking/window functions in its staff, but you can use variables in a SQL query directly. Particularly, the problem of numbering query result rows can be solved with using variables. Let’s demonstrate it by example being considered on the previous page.

SELECT @i:=@i+1 num,
     trip_no, id_comp
    FROM Trip, (select @i:=0) X
    WHERE ID_comp < 3
    ORDER BY id_comp, trip_no;
numid_comptrip_no
111181
211182
311187
411188
511195
611196
721145
821146

In the third line of the script, initializing of variable is accomplished. As a result, each row of Trip table will be joined the single-column row containing 0 (just Cartesian product).

In the first line of the script, variable is incrementing by 1, and this action takes place when calculating each next row in the sorting that is specified in ORDER BY clause. The numbering in this case corresponds to the sorting order.

If you omit the initialization of variable, you might obtain the correct result but without garantee. If you run the query (without initializing) once more in the current session of connection with database, you should obtain numbering which is starting from the maximal value of variable @i reached at the previous execution of the query.

We can get numbering of rows for each company separately, i.e. to imitate the behaviour of PARTITION BY in the query:

SELECT row_number() over(PARTITION BY id_comp ORDER BY id_comp,trip_no) num,
     trip_no, id_comp
    FROM Trip
    WHERE ID_comp < 3
    ORDER BY ID_comp, trip_no;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

 Let’s introduce another variable for storing the company ID (@comp). When initializing variable, we assign unexisting ID (0, for example). Then for each row, the value of variable is comparing with company ID of the current row. If these values equal each other, @i is incrementing; if not - @i is resetting to 1. Finally, we assign the company ID from current row to variable @comp.  The matter is that the comparison takes place before assigning, in so doing we compare the current value of company ID with the ID from previous row (in the given sorting order). Here the query is.

SELECT
case when @comp=id_comp then @i:=@i+1 else @i:=1 end num,
@comp:=id_comp id_comp, trip_no
    FROM Trip, (select @i:=0, @comp:=0) X
    WHERE ID_comp < 3
    ORDER BY ID_comp, trip_no;
numid_comptrip_no
111181
211182
311187
411188
511195
611196
121145
221146

As we don’t follow SQL Standard here, let’s use MySQL IF function to make the query shorter.

SELECT
IF(@comp=id_comp, @i:=@i+1, @i:=1) num,
@comp:=id_comp id_comp, trip_no
    FROM Trip, (select @i:=0, @comp:=0) X
    WHERE ID_comp < 3
    ORDER BY id_comp, trip_no;

To check the above queries, use console when selecting MySQL as target DBMS.