loading..
Русский    English
11:46

Random sampling of rows from a table in SQL Server page 1

In due time we used random sampling of entries to create a list of questions for a test. We did it on the client with the help of basic language functions which generate pseudorandom numbers within a given range (for example, the RND function in Visual Basic).

It turned out, however, quite simple to arrange it on the server side. Both by similar tools (with the help of the RAND function in T-SQL) and on the basis of uniqueidentifier datatype which is called global unique udentifier and lools like:

  1. xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx,
where each x is a sexadecimal digit between 0-9 or a-f. For example,
EEA8BF3A-7633-477A-B8C1-8C60DC9AD20C.

For the solution of our problem it is essential that this unique identifier can be automatically generated by employing the NEWID function. Just write down in QA (Query Analyzer) or in MS (Management Studio)

Console
Execute
  1. SELECT NEWID();

or

Console
Execute
  1. SELECT TOP 1 NEWID() FROM PC
on the SQL-EX.RU website, and you will see everything. Every time the query being executed, you'll be getting different values. Therein lies the uniqueness of the identifier.

So, the idea is to add a computed column NEWID, by which the sorting should be performed, to the query yielding the required number of entries. Since generated values are arbitrary, we'll get a random fetch.

Assume that we need to select two random models from the Product table:

Solution 8.8.1

Console
Execute
  1. SELECT model
  2. FROM (SELECT TOP 2 model, NEWID() [id]
  3. FROM Product
  4. ORDER BY [id]
  5. ) x;
Or simpler

Siolution 8.8.2

Console
Execute
  1. SELECT TOP 2 model
  2. FROM Product
  3. ORDER BY NEWID();

Let us run this query several times:

--1--
1232
1433
--2--
1276
1260
--3--
1750
1298

You can carry on if you like.

Now let us come back to the "traditional" way. The RAND function generates pseudorandom floating point numbers between 0 and 1.

The term 'pseudorandom' denotes that such numbers are computed by means of a certain atithmetic algorythm. That is, with the initial conditions (input) being equal, the derived numbers will be the same. The initial conditions may be set explicitly, by supplying the function argument which can be any number of tinyint, int or smallint type, or implicitly. In the last case the argument is omitted and the initial value is selected by  SQL(Structured Query Language) is a database computer language designed for the retrieval and management of data in relational database management systems (RDBMS), database schema creation and modification, and database object access control management.SQL Server.

Let us execute the following query:

Console
Execute
  1. SELECT TOP 1 RAND(), RAND(350)
  2. FROM Product;

We will run it in MS. I got 0.0485421339242268 and 0.72009490018203537. It may be safely stated that the first number you get will be different, but the second may be the same, since in this case the initial value was specified (350).


Pages 1 2
Tags
aggregate functions Airport ALL AND AS keyword ASCII AVG Battles Bezhaev Bismarck C.J.Date calculated columns Cartesian product CASE cast CHAR CHARINDEX Chebykin check constraint classes COALESCE common table expressions comparison predicates Computer firm CONSTRAINT CONVERT correlated subqueries COUNT CROSS APPLY CTE data type conversion data types database schema DATEADD DATEDIFF DATENAME DATEPART DATETIME date_time functions DDL DEFAULT DEFAULT VALUES DELETE DISTINCT DML duplicates edge equi-join EXCEPT exercise (-2) More tags
The book was updated
month ago
©SQL-EX,2008 [Evolution] [Feedback] [About] [Links] [Team]
All right reserved.