Recursive CTE

About

Recursive CTE (Common Table Expressions) is a way to perform recursive queries, meaning a query that calls itself until a stopping condition is met.

Useful when having a tree or graph-type data structure.

  1. The Anchor member runs once and provides the initial dataset.

  2. The Recursive member runs repeatedly, using results from previous iterations.

    1. The process stops when the recursive part returns no new rows.

WITH RECURSIVE countdown(val) AS (
  SELECT 3 AS val -- Anchor member (base case)
  UNION [ALL]
  SELECT val - 1 FROM countdown WHERE val > 1 -- Recursive member (self-reference)
)
SELECT * FROM countdown;

Example

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name TEXT,
    manager_id INT REFERENCES employees(id)
);

-- Recursive query to get subordinates of employee with id = 1

WITH RECURSIVE emp_tree(id, name, manager_id) AS (
    -- Anchor: CEO or root person
    SELECT id, name, manager_id
    FROM employees
    WHERE id = 1

    UNION ALL

    -- Recursive part: find direct reports
    SELECT e.id, e.name, e.manager_id
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
)
SELECT * FROM emp_tree;

Union vs Union All

There is a difference in using UNION and UNION ALL in the recursive block.

Union All

  • The recursive part reuses all rows, including duplicates.

  • This is usually what you want.

  • It's faster because there's no sorting or deduplication.

  • You'll need to manually prevent infinite loops if needed. (e.g., via path tracking)

Union

  • PostgreSQL will remove duplicate rows at every recursion level.

  • This adds CPU cost due to hashing or sorting.

  • If you're not careful, it can also block valid recursion paths.

You may think this helps with cycle prevention, but:

  • It only deduplicates entire rows — so even small changes can create new "unique" rows.

  • It doesn’t guarantee correct logic in graphs/hierarchies.

Fixing infinite loops

To prevent cycles or infinite loops, build a path history with an array:

WITH RECURSIVE emp_tree(id, name, manager_id, path_ids) AS (
    SELECT id, name, manager_id, ARRAY[id] AS path_ids
    FROM employees
    WHERE id = 1

    UNION ALL

    SELECT e.id, e.name, e.manager_id, et.paths_ids || e.id -- build history path
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
    -- Don't revisit rows where the ID was already seen
    WHERE NOT e.id = ANY(et.path_ids)
)
SELECT * FROM emp_tree;

Limiting depth of recusion

WITH RECURSIVE emp_tree(id, name, manager_id, depth) AS (
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE id = 1

    UNION ALL

    SELECT e.id, e.name, e.manager_id, et.depth + 1
    FROM employees e
    INNER JOIN emp_tree et ON e.manager_id = et.id
    -- Depth limit of 5 (Don't go deeper than 4 levels of subordinates)
    WHERE et.depth< 5
)
SELECT * FROM emp_tree;

Last updated