Recursive Query In Sqlite With Cte

Mar 13th, 2020 - written by Kimserey with .

Few months ago we talked about the Basket of Apple problem where we represented a tree of rules to distribute apples in baskets. Trees are quite common in software, a simpler example is the example of managers and employees where employees can be managers and managers themselves can have managers. In today’s post, we will look at how we can represent a tree structure in SQLite and how we can retrieve the nodes of a tree using Common Table Expressions (CTE).

Tree in SQLite

Taking the example of managers and employees, we create the following table:

1
2
3
4
5
CREATE TABLE employee (
id INT PRIMARY KEY,
name TEXT NOT NULL,
manager_id INT REFERENCES employee(id)
)


We have a table of employees which might have managers. If an employee has a manager the manager_id will contain the id.

We can then insert the following data:

1
2
3
4
5
6
7
8
9
INSERT INTO employee (id, name, manager_id) VALUES (1, 'kim', NULL);
INSERT INTO employee (id, name, manager_id) VALUES (2, 'tom', 1);
INSERT INTO employee (id, name, manager_id) VALUES (3, 'sam', 1);
INSERT INTO employee (id, name, manager_id) VALUES (4, 'yoan', NULL);
INSERT INTO employee (id, name, manager_id) VALUES (5, 'jon', 2);
INSERT INTO employee (id, name, manager_id) VALUES (6, 'pier', 2);
INSERT INTO employee (id, name, manager_id) VALUES (7, 'zoe', 4);
INSERT INTO employee (id, name, manager_id) VALUES (8, 'ian', 1);
INSERT INTO employee (id, name, manager_id) VALUES (9, 'loic', 4);


This results in the following table content:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
> SELECT * FROM employee

+----+------+------------+
| id | name | manager_id |
+----+------+------------+
| 1  | kim  | <null>     |
| 2  | tom  | 1          |
| 3  | sam  | 1          |
| 4  | yoan | <null>     |
| 5  | jon  | 2          |
| 6  | pier | 2          |
| 7  | zoe  | 4          |
| 8  | ian  | 1          |
| 9  | loic | 4          |
+----+------+------------+


Now that we have a table filled with employees, we are able to retrieve employees separately but if we want to retrieve all the nodes for a specific root, we need to perform a recursive query where we select employees starting from the first node and recursively selection child nodes. A recursive query in SQLite can be executed using CTE.

CTE

With nodes stored in the database, we want to be able to retrieve all nodes and children nodes given a root.

For example givenid = 1, we want to retrieve all children and children of children, etc…

Getting children of children is inherently recursive.

To perform the recursion, we would:

1. Select the root,
2. Join the employee table where the root id is the manager id,
3. Join the employee table where the employee id from step 2 is the manager id
4. Loop on step 3 until result no longer returns result.

This assumes that no infinite loop are present in the nodes.

A CTE query is done using the WITH RECURSIVE expression in SQLite. The composition of the query can be done as followed:

1
2
3
4
5
6
7
8
WITH RECURSIVE [cte_table_name](columns...) AS (
[initial_select]

UNION or UNION ALL

[recursive_select - where cte_table_name can be used to join]
)
[query_on_cte_table]


The WITH RECURSIVE ... AS () creates a temporary table containing the result of the recursive operation, which can then be queried on as a regular table. The initial_select is the anchor selection setting the first roots for recursion, while the recursive_select is the selection executed during the recursion. The overal selection is a compound select where subsequent queries are assembled using either UNION or UNION ALL. The difference between UNION and UNION ALL would be weither duplicate row are removed or kept.

Following this construct we can come up with a recursive query to get all direct and indirect reports of (1, kim):

1
2
3
4
5
6
7
8
9
10
11
12
13
WITH RECURSIVE cte_employee (id, name, manager_id) AS (
SELECT e.id, e.name, e.manager_id
FROM employee e
WHERE e.id = 1

UNION ALL

SELECT e.id, e.name, e.manager_id
FROM employee e
JOIN cte_employee c ON c.id = e.manager_id
)

SELECT * FROM cte_employee;


This will yield the following result:

1
2
3
4
5
6
7
8
9
10
+----+------+------------+
| id | name | manager_id |
+----+------+------------+
| 1  | kim  | <null>     |
| 2  | tom  | 1          |
| 3  | sam  | 1          |
| 8  | ian  | 1          |
| 5  | jon  | 2          |
| 6  | pier | 2          |
+----+------+------------+


We can see that with a single query we are able to get the complete tree.

CTE uses a queue plus a single value table to execute the recursion. It takes the following approach to construct its content:

1. the initial_select is executed, the result of the select is put in the table and in a queue,
2. the first item is dequeued from the queue and the recursive_select is executed with the CTE table acting as if it was a table of a single value (the dequeued value),
3. the result of the recursive_select is saved on the table and placed in the queue,
4. step 2. is repeated until the queue is empty.

Taking our example we can decompose the steps with:

• initial_select is executed, and returns (1, kim) - places it in a queue and in the content table,
• then dequeues (1, kim), executes recursive_select acting as if cte_employee only contains (1, kim), returns (2, tom), (3, sam), (8, ian) - places the 3 values in the queue and in the content table,
• then dequeues (2, tom), executes recursive_select acting as if cte_employee only contains (2, tom), returns (5, jon), (6, pier) - places the 2 values in the queue and in the content table,
• then dequeues (3, sam), similarly executes the recursive_select, returns no result,
• then dequeues (8, ian), then (5, jon) and finally (6, pier),
• once the queue is empty returns the table content ready for the query.

And that concludes today’s post!

Conclusion

In today’s post we looked at how we could represent tree structures in SQLite tables and how we could retrieve nodes from our table using a recursive call done with Common Table Expression. We started by looking at what a tree would look like in a SQLite table, with roots and nodes, then we used a recursive CTE query to get the nodes where we looked at how we could define a recursive query and the approach SQLite takes to construct its content. Hope you liked this post and I see you on the next one!

External Sources

Designed, built and maintained by Kimserey Lam.