Window Functions

About

Window functions allow you to run aggregate functions without grouping the data.

Specifying PARTITION BY works like creating aggregations like GROUP BY.

Windowing Functions

Window function summary:

Scope
Type
Function
Description

frame

computation

generic aggs.

e.g. SUM, AVG

row access

FIRST_VALUE

first frame value

row access

LAST_VALUE

last frame value

row access

NTH_VALUE

nth frame value

partition

row access

LAG

row before current

row access

LEAD

row after current

row access

ROW_NUMBER

current row number

ranking

CUME_DIST

cumulative distribution

ranking

DENSE_RANK

rank without gaps

ranking

NTILE

rank in n partitions

ranking

PERCENT_RANK

percent rank

ranking

RANK

rank with gaps

over

-- Will do average over everyone since not partition was given
SELECT
    col1,
    AVG(col2) OVER()
FROM table_name;

partition by

-- Will partition data differently for each result column
SELECT
    SUM(col1) OVER(PARTITION BY col1),
    AVG(col1) OVER(PARTITION BY col2)
FROM table_name;
-- Will partition only filtered data
SELECT
    SUM(col1) OVER(PARTITION BY col1 = 'value')
FROM table_name;

partition by with order by

-- Will partition with order
SELECT
    SUM(col1) OVER(PARTITION BY col1 ORDER BY col3)
FROM table_name;

lead & lag

Useful functions to compare data of CURRENT ROW with any other row in its PARTITION group. (LEAD going forward, and LAG going backwards)

-- LEAD(column_name, row_to_jump_ahead_to)
-- LAG(column_name, row_to_jump_behind_to)
-- Use the total_col of the next row on the current row
SELECT
    total_col,
    year_col,
    LEAD(total_col, 1) OVER(ORDER BY year_col)
FROM table_name;

/*
    total_col  year_col  LEAD(...)
    10         2000      15
    15         2001      5
    5          2002      6
    6          ...
*/
-- Calculate the diff of total_col between the curr row and next
SELECT
    total_col,
    year_col,
    total_col - LEAD(total_col, 1) OVER(ORDER BY year_col) as diff
FROM table_name;

/*
    total_col  year_col  diff
    10         2000      -5
    15         2001      10
    5          2002      -1
    6          ...
*/

first_value & last_value

Get the first, last or nth row value in a PARTITION group.

-- FIRST_VALUE(column_name)
-- LAST_VALUE(column_name)
-- NTH_VALUE(column_name, row_to_get)

Dynamic Windows

You can create sliding dynamic windows for the data by specifying window frames.

window frames

Window frames are used to indicate how many rows around the current view, the window function should include.

-- [ROWS | RANGE] BETWEEN rows_number PRECEDING AND rows_number FOLLOWING

If no range is specified, a window frame by default uses:

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW;

UNBOUNDED means, consider all the rows before (PRECEDING) / after (FOLLOWING) the current one.

Examples

-- Dynamic sliding windows of 3 rows (previous row, current, next row)
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    )
FROM table_name;
-- Dynamic sliding windows of 2 rows (previous row, next row) (Current row excluded)
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
        EXCLUDE CURRENT ROW
    )
FROM table_name;
-- Dynamic sliding windows of 2 rows (current row, next row)
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
    )
FROM table_name;
-- Dynamic sliding windows of n rows (everything before current line)
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN UNBOUNDED PRECEDING
    )
FROM table_name;
-- Dynamic sliding windows of n rows (everything after current line)
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN UNBOUNDED FOLLOWING
    )
FROM table_name;
-- Dynamic sliding windows of n rows (everything before current line and current line)
-- Also known as cumulative totals
SELECT
    SUM(col1) OVER(
        PARTITION BY col1
        ORDER BY col3
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    )
FROM table_name;

Last updated