Recursive CTE

About

Recursive CTE (Common Table Expressions) is a way to perform recursive queries, meaning a query that calls itself until a stopping condition is met.

Useful when having a tree or graph-type data structure.

The Anchor member runs once and provides the initial dataset.
The Recursive member runs repeatedly, using results from previous iterations.
1. The process stops when the recursive part returns no new rows.

WITH RECURSIVE countdown(val) AS (
  SELECT 3 AS val -- Anchor member (base case)
  UNION [ALL]
  SELECT val - 1 FROM countdown WHERE val > 1 -- Recursive member (self-reference)
)
SELECT * FROM countdown;

Check differences of using UNION and UNION ALL.

Example

This example is prone to infinite loops, if for some reason, the employee is its own manager or if someones manager is someone that was already looked on (A -> B -> C -> A).

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name TEXT,
    manager_id INT REFERENCES employees(id)
);

-- Recursive query to get subordinates of employee with id = 1

WITH RECURSIVE emp_tree(id, name, manager_id) AS (
    -- Anchor: CEO or root person
    SELECT id, name, manager_id
    FROM employees
    WHERE id = 1

    UNION ALL

    -- Recursive part: find direct reports
    SELECT e.id, e.name, e.manager_id
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
)
SELECT * FROM emp_tree;

Union vs Union All

There is a difference in using UNION and UNION ALL in the recursive block.

Union All

The recursive part reuses all rows, including duplicates.
This is usually what you want.
It's faster because there's no sorting or deduplication.
You'll need to manually prevent infinite loops if needed. (e.g., via path tracking)

Union

PostgreSQL will remove duplicate rows at every recursion level.
This adds CPU cost due to hashing or sorting.
If you're not careful, it can also block valid recursion paths.

You may think this helps with cycle prevention, but:

It only deduplicates entire rows — so even small changes can create new "unique" rows.
It doesn’t guarantee correct logic in graphs/hierarchies.

Fixing infinite loops

To prevent cycles or infinite loops, build a path history with an array:

WITH RECURSIVE emp_tree(id, name, manager_id, path_ids) AS (
    SELECT id, name, manager_id, ARRAY[id] AS path_ids
    FROM employees
    WHERE id = 1

    UNION ALL

    SELECT e.id, e.name, e.manager_id, et.paths_ids || e.id -- build history path
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
    -- Don't revisit rows where the ID was already seen
    WHERE NOT e.id = ANY(et.path_ids)
)
SELECT * FROM emp_tree;

Limiting depth of recusion

WITH RECURSIVE emp_tree(id, name, manager_id, depth) AS (
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE id = 1

    UNION ALL

    SELECT e.id, e.name, e.manager_id, et.depth + 1
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
    -- Depth limit of 5 (Don't go deeper than 4 levels of subordinates)
    WHERE et.depth< 5
)
SELECT * FROM emp_tree;

PreviousFunctions NextViews

Last updated 3 days ago