loading..
Русский    English
05:01

Querying graph database data

The original form of the Utb table can be restored by means of an ordinary SQL query:

  1. SELECT B.b_datetime, Q.q_id b_q_id, V.v_id b_v_id, B.b_vol 
  2. FROM utbG B JOIN utqG Q ON B.$to_id = Q.$node_id
  3. JOIN utvG V ON B.$from_id = V.$node_id;

However, to make graph navigation easier SQL Server introduces the MATCH function that defines a search pattern for nodes based on their relations. The pattern syntax resembles the one used by Cypher, the Neo4j’s graph query language.

In a simplified form, a pattern definition looks as follows:

  1. < node the edge originates FROM > - (< edge >) -> < node the edge terminates at >

The node is represented by the table name or alias. The use of aliases becomes indispensable if a table is used in a pattern more than once, e.g. in case of a self-join. The MATCH function patterns can be combined using the AND operator. Let’s continue with some examples. For a start, we rewrite the previous query, making it graph-oriented:

  1. SELECT B.b_datetime, Q.q_id b_q_id, V.v_id b_v_id, B.b_vol 
  2. FROM utbG B, utqG Q, utvG V
  3. WHERE match(V-(B)->Q);

Here, we use aliases for the sake of shortness only. We can rearrange the nodes, but the direction of the edge (from the spray can to the square) has to be retained:

  1. SELECT B.b_datetime, Q.q_id b_q_id, V.v_id b_v_id, B.b_vol 
  2. FROM utbG B, utqG Q, utvG V
  3. WHERE match(Q<-(B)-V);

Unlike in Cypher, the direction of the relationship has to be specified, in other words, the following function call

  1. match(V-(B)-Q);
will result in an error. Cypher is more flexible in this respect; the relationship direction may be omitted if it can be unambiguously identified; say, for a certain kind of relationship only one direction is possible.

Get the squares dyed with red paint. Display the square ID and the amount of red paint.

Relational schema

Console
Execute
  1. SELECT b_q_id, SUM(b_vol) qty
  2. FROM utB JOIN utV ON B_V_ID =V_ID
  3. WHERE V_COLOR ='R'
  4. GROUP BY b_q_id;

Graph-oriented schema

  1. SELECT q_id, SUM(b_vol) qty
  2. FROM utbG B, utvG V, utqG Q
  3. WHERE match(V-(B)->Q) AND V_COLOR ='R'
  4. GROUP BY q_id;

Get the squares dyed both with red and blue paint. Display the names of the squares.

Relational schema

Console
Execute
  1. SELECT q_name FROM utQ JOIN (
  2. SELECT b_q_id FROM utB JOIN utV ON B_V_ID =V_ID
  3. WHERE V_COLOR ='R'
  4. INTERSECT
  5. SELECT b_q_id FROM utB JOIN utV ON B_V_ID =V_ID
  6. WHERE V_COLOR ='B'
  7. ) X ON Q_ID=b_q_id;

Graph-oriented schema

  1. SELECT DISTINCT q_name
  2. FROM utbG B1,utbG B2, utvG VR, utvG VB, utqG Q
  3. WHERE
  4. match(VR-(B1)->Q<-(B2)-VB) AND VR.V_COLOR ='R' AND VB.V_COLOR ='B';

Note that you have to include the corresponding table into the FROM clause for each (generally different) node used in the pattern, as well as the relationship table. If the same node is implied, as the square in our case, we can use it for a bidirectional relationship. The predicate

  1. match(VR-(B1)->Q<-(B2)-VB)
could be rewritten as

  1. match(VR-(B1)->Q) AND match(VB-(B2)->Q)
or as

  1. match(VR-(B1)->Q AND VB-(B2)->Q)

DISTINCT is necessary in these queries, because without it, duplicate records will occur for squares painted by more than one red (and/or blue) spray cans.

Get squares dyed with all three colors.

Relational schema

The most straightforward approach would be INTERSECTing our previous example with an additional query. However, let’s present a more effective solution that uses the fact the number of distinct colors the square has been painted with is three:

Console
Execute
  1. SELECT q_name FROM utQ JOIN (
  2. SELECT b_q_id
  3. FROM utB JOIN utV ON B_V_ID =V_ID
  4. GROUP BY b_q_id
  5. HAVING COUNT(DISTINCT v_color)=3
  6. ) X ON q_id=b_q_id;

Graph-oriented schema

  1. SELECT DISTINCT Q.q_name
  2. FROM utbG B1,utbG B2,utbG B3, utvG VR, utvG VB, utvG VG, utqG Q
  3. WHERE
  4. match(VR-(B1)->Q<-(B2)-VB AND VG-(B3)->Q)
  5. AND VG.V_COLOR ='G' AND VR.V_COLOR ='R' AND VB.V_COLOR ='B';

Get the spray cans used for painting more than one square.

Relational schema

Console
Execute
  1. SELECT v_name FROM
  2. utB JOIN utV ON B_V_ID =V_ID
  3. GROUP BY b_v_id,v_name
  4. HAVING COUNT(DISTINCT b_q_id)>1;

Graph-oriented schema

  1. SELECT DISTINCT v_name
  2. FROM utbG B1,utbG B2, utvG V, utqG Q,utqG Q2
  3. WHERE
  4. match(Q2<-(B2)-V-(B1)->Q) AND Q.q_id <> Q2.q_id;

The condition match(Q2<-(B2)-V-(B1)->Q) states a spray can participated in two paint events, while Q.q_id <> Q2.q_id claims there were different squares involved.

Preliminary conclusions

At first glance, relational and graph-oriented queries don't seem to be radically different. Moreover, one can omit the graph-oriented syntax completely, keeping within SQL boundaries. However, the examples about aren’t very illustrative, since the advantages of graph databases become apparent for subject areas with complex relationships between the data entities.

Tags
aggregate functions Airport ALL AND AS keyword ASCII AVG Battles Bezhaev Bismarck C.J.Date calculated columns Cartesian product CASE cast CHAR CHARINDEX Chebykin check constraint classes COALESCE common table expressions comparison predicates Computer firm CONSTRAINT CONVERT correlated subqueries COUNT CROSS APPLY CTE data type conversion data types database schema DATEADD DATEDIFF DATENAME DATEPART DATETIME date_time functions DDL DEFAULT DEFAULT VALUES DELETE DISTINCT DML duplicates edge equi-join EXCEPT exercise (-2) More tags
The book was updated
month ago
продать monobank
©SQL-EX,2008 [Evolution] [Feedback] [About] [Links] [Team]
All right reserved.