The CTEs are like temporary tables that exist only during the execution of the query.
WITH RECURSIVE cte_name [(col_1,..., col_n)] AS(
CTE_query_definition -- non-recursive term
UNION [ALL]
CTE_query definion -- recursive term
) SELECT * FROM cte_name;
a simple example:
WITH RECURSIVE nums (n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM nums WHERE n+1 <= 11
)
SELECT n FROM nums;
Another example using a function
CREATE OR REPLACE FUNCTION yildiz(n integer) RETURNS text AS $$
declare star text;
BEGIN
for counter in 1..n loop
star:=concat(star,'*');
end loop;
return star;
END;
$$ LANGUAGE plpgsql;
WITH RECURSIVE nums (n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM nums WHERE n+1 <= 4
)
SELECT yildiz(n) FROM nums;
Getting “manager tree” for some employee:
drop table if exists employees cascade;
CREATE TABLE employees (
id serial PRIMARY KEY,
name varchar(255) NOT NULL,
salary integer,
job varchar(255),
manager_id int,
FOREIGN KEY (manager_id) REFERENCES employees (id) ON DELETE CASCADE
);
INSERT INTO employees (id,name,manager_id,salary,job)
VALUES
(1, 'John', NULL, 10000, 'CEO'),
(2, 'Ben', 5, 1400, 'Junior Developer'),
(3, 'Barry', 6, 500, 'Intern'),
(4, 'George', 5, 1800, 'Developer'),
(5, 'James', 7, 3000, 'Manager'),
(6, 'Steven', 7, 2400, 'DevOps Engineer'),
(7, 'Alice', 1, 4200, 'VP'),
(8, 'Jerry', 1, 3500, 'Manager'),
(9, 'Adam', 8, 2000, 'Data Analyst'),
(10, 'Grace', 8, 2500, 'Developer'),
(11, 'Nicola',5,3000,'Developer'),
(12, 'Alexandra',5,2500,'System Engineer'),
(13, 'Dominic', 5,2750,'Accountant'),
(14, 'Leonard', 8,3450,'Finans Manager'),
(15, 'Eric', 4,2500,'Junior Developer'),
(16, 'Piers Paige', 8,4000,'Executive Director')
;
with recursive submanager as (
select id,name,job,manager_id,0 as level from employees where id=1
union
select e.id,e.name,e.job,e.manager_id,a.level+1 as level from employees e join submanager a on a.id=e.manager_id)
select * from submanager;
another example with string_agg function:
WITH RECURSIVE pathman AS (
SELECT id,name, manager_id, job, 0 AS level
FROM employees
WHERE id = 3 -- Barry, the intern
UNION ALL
SELECT e.id,e.name, e.manager_id, e.job, p.level + 1 AS level
FROM employees e
JOIN pathman p ON e.id = p.manager_id
)
SELECT STRING_AGG(name || '(' ||job||')',' > '
ORDER BY level ASC) AS path FROM pathman;
Recursion on Graphs
Graphs as a data structure are an ideal candidate for traversing using recursion.
drop table if exists Roads;
CREATE TABLE Roads ( src integer, dest integer,distance integer);
INSERT INTO Roads ( src, dest,distance) VALUES
(1, 2,4),(2, 3,3),(2, 4,5),(3, 4,2),(4, 1,7),(3, 5,1);
find paths as graph. For that, we will need a PostgreSQL special — ARRAY
data type. Arrays are not a common relational database feature but are very handy here for storing paths. If you are not familiar with this data type, then these are the things you need for understanding SQL below:
- Create array —
ARRAY[value_1, value_2]
- Concatenate array —
ARRAY[value_1, value_2] || ARRAY[value_3]
ALL
function –"x" != ALL(ARRAY["a", "b", "c"])
– compares"x"
to each value in array, results istrue
if all comparisons result intrue
WITH RECURSIVE paths (src, dest, distance, path) AS (
SELECT e.src, e.dest, e.distance,ARRAY[e.src, e.dest]
FROM roads e
UNION
SELECT p.src, e.dest,p.distance+e.distance, p.path || ARRAY[e.dest]
FROM paths p
JOIN roads e
ON p.dest = e.src AND e.dest != ALL(p.path)
)
SELECT * FROM paths order by src,dest,distance;
Recursive VIEW
CREATE RECURSIVE VIEW paths (src, dest, distance, path) AS (
SELECT e.src, e.dest, e.distance,ARRAY[e.src, e.dest]
FROM roads e
UNION
SELECT p.src, e.dest,p.distance+e.distance, p.path || ARRAY[e.dest]
FROM paths p
JOIN roads e
ON p.dest = e.src AND e.dest != ALL(p.path)
);
SELECT src, dest, path, distance FROM paths WHERE src = 4;
In Oracle the same is done by:
hierarchical_query_clause::=
SELECT last_name, employee_id, manager_id, LEVEL
FROM employees
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id
ORDER SIBLINGS BY last_name;
LAST_NAME EMPLOYEE_ID MANAGER_ID LEVEL
------------------------- ----------- ---------- ----------
King 100 1
Cambrault 148 100 2
Bates 172 148 3
Bloom 169 148 3
Fox 170 148 3
Kumar 173 148 3
Ozer 168 148 3
Smith 171 148 3
De Haan 102 100 2
Hunold 103 102 3
Austin 105 103 4
Ernst 104 103 4
Lorentz 107 103 4
Pataballa 106 103 4
Errazuriz 147 100 2
Ande 166 147 3
Banda 167 147 3
Connect By – An alternative approach
An alternative mechanism for querying tree structures is to use the CONNECT BY statement (or connectby in PostgreSQL). CONNECT BY is a standard feature of ORACLE RDBMS, but is not part of the PostgreSQL core functionality, and instead, it is included in the tablefunc Additional Supplied Module.
Assuming your PostgreSQL install has access to tablefunc, you can install its functions into your database with:
CREATE EXTENSION tablefunc;
The format of connectby is:
connectby(text relname, text keyid_fld, text parent_keyid_fld [, text orderby_fld ], text start_with, int max_depth [, text branch_delim ])
select e.id,e.name,e.manager_id,orderbyManager from
connectby('employees', 'id', 'manager_id', 'id','1', 0,'->')
as t(id int, manager_id int, level int,orderbyManager text,managerid int) join employees e on e.id=t.id;
So why would you potentially use connectby, rather than WITH RECURSIVE? You would probably only use it where you already had legacy code that you wished to remain consistent with. Some limited testing of a slightly more complex tree containing 4,000+ records seems to show performance that is in favour of the RECURSIVE CTE approach (550ms vs 850ms). Testing was by no means exhaustive, and your mileage may vary.