Explicit join operations

Explicit join operation for two and more tables may be present in the FROM clause. Among the join operation series described in the SQL standard, the join-on-predicate operation is only supported by the number of database servers. A join-on-predicate syntax is:

FROM < table 1 >
[INNER]
{{LEFT | RIGHT | FULL } [OUTER]} JOIN < table 2>
[ON < predicate >]

A join type may be either the inner or one of the outers. The INNER and OUTER keywords may be omitted, because the outer join is uniquely defined by its type: LEFT, RIGHT, or FULL, whereas the inner join is simply referred to as JOIN.

A predicate specifies the condition of joining the rows from different tables. In so doing INNER JOIN means that the result set will only include those combinations of rows in two tables for which the predicate evaluates to TRUE. As a rule, the predicate specifies the equi-join on foreign and primary keys of the tables joined, although need not be so.

Example 5.6.3.

Find the maker, model number, and price for each computer in the database:
SELECT maker, Product.model AS model_1, PC.model AS model_2, price
FROM Product INNER JOIN
PC ON PC.model = Product.model
ORDER BY maker, model_1
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

In this example, the query returns only the row combinations from the PC and Product tables with identical model numbers.

For better control, the result includes the model number both in the PC table and in the Product table:

Makermodel_1model_2price
A12321232600
A12321232400
A12321232350
A12321232350
A12331233600
A12331233950
A12331233980
A12331233970
B11211121850
B11211121850
B11211121850
E12601260350

The LEFT JOIN implies that all the rows from the first (left) table are to be in the result set along with the rows for which the predicate evaluates to true. In so doing, the non-matching column values in the right table are returned as NULL values.

Suggested exercises: 6, 7, 9, 13, 14, 18, 19, 21, 23, 25, 26, 27, 28, 32, 36, 37, 39, 43, 48, 49, 50, 51, 52, 54, 57, 58, 66, 72, 76, 77, 79, 87, 88, 93, 94, 102, 114, 127

Example 5.6.4

Find all PC models, makers, and prices:
SELECT maker, Product.model AS model_1, pc.model AS model_2, price
FROM Product LEFT JOIN
PC ON PC.model = Product.model
WHERE type = 'pc'
ORDER BY maker, PC.model;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

It is essential that using the WHERE clause is indispensable for sampling only PC makers as compared with the previous example. Otherwise, laptop and printer models will also get into the result set. This condition is negligible for the above example because the rows only were joined with identical model numbers, and the PC table was between the two joined tables. Here is the result set:

Makermodel_1model_2price
A12321232600
A12321232400
A12321232350
A12321232350
A12331233600
A12331233950
A12331233980
B11211121850
B11211121850
B11211121850
E2111NULLNULL
E2112NULLNULL
E12601260350

Because models 2111 and 2112 in the Product table are absent in the PC table, the columns of the PC table contain NULL.

The RIGHT JOIN is the reverse of the LEFT JOIN, i.e. the result set will include all the rows from the second table and only those rows from the first table for which the join condition is met. In our case the left join

Product LEFT JOIN PC ON PC.model = Product.model

will be equivalent to the following right join

PC RIGHT JOIN Product ON PC.model = Product.model

Just the query

SELECT maker, Product.model AS model_1, PC.model AS model_2, price
FROM Product RIGHT JOIN
PC ON PC.model = Product.model
ORDER BY maker, PC.model;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]
gives the same results as those for inner join, because the right table (PC) has no models that would be missing from the left table (Product), which is reasonable for one-to-many relationship type that occurs between the PC and Product tables.

Finally, the result set with FULL JOIN will include not only the matching rows in the both tables but also all conceivable non-matching rows in initial tables. All the non-matching column values in both tables are returned as NULL values. A full join presents the combination of the left and right outer joins.

So, for the above mentioned A and B tables, the query

SELECT A.*, B.*
FROM A FULL JOIN
  B ON A.a = B.c;

gives the following result set:

AbCd
12NULLNULL
2124
NULLNULL33

Note this join is symmetrical, i.e. “A FULL JOIN B” is equivalent to “B FULL JOIN A”. Also, it should be noted that notation A.* implies “all columns from the А table”.

Suggested exercises: 16, 29, 30, 34, 46, 55, 56, 59, 60, 64, 69, 70, 74, 109, 113, 128

UNION JOIN

This join type have been introduced in SQL-92 language standard, but disappeared in later versions of SQL standard. Particularly, it is absent from SQL2003 (ANSI and ISO). As many other structures of SQL, UNION JOIN is excessive because it can be expressed as substraction of full outer join and inner join. Formally, we can write this expression as follows:

A UNION JOIN B :=
(A FULL JOIN B)
EXCEPT
(A INNER JOIN B)

If DBMS does not support FULL JOIN (MySQL), it can be obtained via union of left and right outer joins. So our formula takes the form

A UNION JOIN B :=
((A LEFT JOIN B)
UNION
(A RIGHT JOIN B))
EXCEPT
(A INNER JOIN B)

To demonstrate the case where this type of join could be useful, let’s consider the following task.

Find out the makers which produce printers but not PCs or produce PCs but not printers.

Having the ability to use UNION JOIN, we could solve the task as follows:

select * from
(select distinct maker from Product where type='pc') m_pc
UNION JOIN
(select distinct maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker;

Let’s use the formula. Full join of PC makers with printer makers gives us those who produce only one type of product as well as those who produce both types - PCs and printers.

select * from
(select distinct maker from Product where type='pc') m_pc
FULL JOIN
(select distinct maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

Now we’ll subtract from result obtained above those who produce the both types of products (inner join):

select m_pc.maker m1, m_printer.maker m2  from
(select maker from Product where type='pc') m_pc
FULL JOIN
(select maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker
EXCEPT
select * from
(select maker from Product where type='pc') m_pc
inner join
(select maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

In so doing I remove from the solution superfluous DISTINCT keywords because EXCEPT will exclude duplicate rows.
This is a single useful lesson here, as substraction operation (EXCEPT) can be replaced by a simple predicate:

where m_pc.maker IS NULL or m_printer.maker IS NULL

or even

m_pc.maker + m_printer.maker IS NULL

in view of the fact that concatenation with NULL gives NULL.

select * from
(select distinct maker from Product where type='pc') m_pc
FULL JOIN
(select distinct maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker
where m_pc.maker IS NULL or m_printer.maker IS NULL;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

 At last, to deduce the result in one column, let’s use COALESCE function:

select COALESCE(m_pc.maker, m_printer.maker) from
(select distinct maker from Product where type='pc') m_pc
FULL JOIN
(select distinct maker from Product where type='printer') m_printer
on m_pc.maker = m_printer.maker
where m_pc.maker IS NULL or m_printer.maker IS NULL;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]

It is needless to say that this is only one of possible methods to solve the task. The purpose of approach presented was to demonstrate a possible replacement of the deprecated join type.

I don’t know any DBMS which support UNION JOIN.

Commutative & Associative Laws and Joins

Inner and full outer joins are both commutative and associative, i.e. the following is fair for them:

A [FULL | INNER] JOIN B = B [FULL | INNER] JOIN A

and

(A [FULL | INNER] JOIN B) [FULL | INNER] JOIN С =
A [FULL | INNER] JOIN (B [FULL | INNER] JOIN С)

It is obvious that left/right joins are not commutative in view of

A LEFT JOIN B = B RIGHT JOIN A

But its are associative, for example:

(A LEFT JOIN B) LEFT JOIN C = A LEFT JOIN (B LEFT JOIN C)

From the practical point of view, associativity means that we might use no brackets defining the treatment order of joins.

However the law of associativity, which is fair for connections of the same type, is being broken when the joins of different types are used in a query. Let’s show this on example.

with a(a_id) as
(select * from (values('1'),('2'),('3')) x(y)),
b(b_id) as
(select * from (values('1'),('2'),('4')) x(y)),
c(c_id) as
(select * from (values('5'),('2'),('3')) x(y))
select a_id, b_id, c_id  from (a left join b on a_id=b_id) inner join c on b_id=c_id
union all
select '','',''
union all
select a_id, b_id, c_id  from  a left join (b inner join c on b_id=c_id) on a_id=b_id;
mssql
🚫
[[ error ]]
[[ column ]]
[[ value ]]
a_idb_idc_id
222
1NULLNULL
222
3NULLNULL

Results of two queries are being separated by blank row for convenience.

Let’s notice that in absence of brackets we shall receive the result conterminous with result of the first query, as joins will be carried out in the order that they are written in.