Random sampling of rows from a table in SQL Server page 2 |
||||||||||||||||||||||||||||||||||||||||
Now, let us employ the RAND function for our problem, namely, for producing two randomly selected models: We obtain two models ascendingly ordered by their numbers. That's where pseudorandomness shows up. Let's have a look at what kind of random numbers we've got: For brevity sake, I drop part of them:
I don't know which algorythm is used for random number computing, nevertheless I can state that the RAND function behaves monotonously within this range of model numbers. That is why we failed. Электронная документация SQL Server Books OnlineBOL gives an example of generating a sequence of random numbers using the system time in order to dynamically change initial value. Here it is:
However, in this form, the query can be run only in slow packages, so that its repeated execution be less frequent than the time unit used in the query (milliseconds). The use of this function will obviously not meet the condition. But multiplying the argument by some unique identifier will score a success (this approach was suggested by V.I. Gershovich):
But here is another problem, related to the argument of the RAND function being integer. If we exceed the maximum allowable value for the integer (for A database management system (DBMS) by Microsoft Corporation. SQL(Structured Query Language) is a database computer language designed for the retrieval and management of data in relational database management systems (RDBMS), database schema creation and modification, and database object access control management.SQL Server it makes 2^31-1 or 2 147 483 647), we'll get the following error message: Arithmetic overflow error converting expression to data type int. You can check it by running the above query on the website. The error appears for model numbers exceeding 2000. In such cases a normalizing factor is needed as well, for instance,
Here, model % 10 (remainder from division by 10) is incorporated, lest significants digits should be lost; otherwise, for models which numbers differ by less then ten equal values might be produced. The net solution looks like this (naturally, sorting should be performed by rnd rather then by model which we left for clearness of result). Solution 8.8.3
And now compare results:
Conclusion. The simpler and more reliable way of solving the problem in question is to use the NEWID() function which guarantees the uniqueness of values. However, these values are not numeric. Therefore, when it is a number that is needed, one should pay attention to the RAND() function.
|