Window Functions
About
Window functions allow you to run aggregate functions without grouping the data.
Specifying PARTITION BY
works like creating aggregations like GROUP BY
.
Window functions are powerful as it lets you aggregate data in different ways in the same query.
Windowing Functions
Window function summary:
frame
computation
generic aggs.
e.g. SUM
, AVG
row access
FIRST_VALUE
first frame value
row access
LAST_VALUE
last frame value
row access
NTH_VALUE
nth frame value
partition
row access
LAG
row before current
row access
LEAD
row after current
row access
ROW_NUMBER
current row number
ranking
CUME_DIST
cumulative distribution
ranking
DENSE_RANK
rank without gaps
ranking
NTILE
rank in n partitions
ranking
PERCENT_RANK
percent rank
ranking
RANK
rank with gaps
Window function never process rows outside their partitions. However, without PARTITION BY
the partition is the entire set.
over
-- Will do average over everyone since not partition was given
SELECT
col1,
AVG(col2) OVER()
FROM table_name;
partition by
-- Will partition data differently for each result column
SELECT
SUM(col1) OVER(PARTITION BY col1),
AVG(col1) OVER(PARTITION BY col2)
FROM table_name;
-- Will partition only filtered data
SELECT
SUM(col1) OVER(PARTITION BY col1 = 'value')
FROM table_name;
partition by with order by
-- Will partition with order
SELECT
SUM(col1) OVER(PARTITION BY col1 ORDER BY col3)
FROM table_name;
lead & lag
Useful functions to compare data of CURRENT ROW
with any other row in its PARTITION
group. (LEAD
going forward, and LAG
going backwards)
-- LEAD(column_name, row_to_jump_ahead_to)
-- LAG(column_name, row_to_jump_behind_to)
-- Use the total_col of the next row on the current row
SELECT
total_col,
year_col,
LEAD(total_col, 1) OVER(ORDER BY year_col)
FROM table_name;
/*
total_col year_col LEAD(...)
10 2000 15
15 2001 5
5 2002 6
6 ...
*/
-- Calculate the diff of total_col between the curr row and next
SELECT
total_col,
year_col,
total_col - LEAD(total_col, 1) OVER(ORDER BY year_col) as diff
FROM table_name;
/*
total_col year_col diff
10 2000 -5
15 2001 10
5 2002 -1
6 ...
*/
first_value & last_value
Get the first, last or nth row value in a PARTITION
group.
-- FIRST_VALUE(column_name)
-- LAST_VALUE(column_name)
-- NTH_VALUE(column_name, row_to_get)
Dynamic Windows
You can create sliding dynamic windows for the data by specifying window frames.
window frames
Window frames are used to indicate how many rows around the current view, the window function should include.
RANGE
can only be used with UNBOUNDED
values.
There is also a slight difference in how aggregations are treated in RANGE
when compared to ROWS
.
-- [ROWS | RANGE] BETWEEN rows_number PRECEDING AND rows_number FOLLOWING
If no range is specified, a window frame by default uses:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW;
UNBOUNDED
means, consider all the rows before (PRECEDING
) / after (FOLLOWING
) the current one.
Examples
-- Dynamic sliding windows of 3 rows (previous row, current, next row)
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
)
FROM table_name;
-- Dynamic sliding windows of 2 rows (previous row, next row) (Current row excluded)
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
EXCLUDE CURRENT ROW
)
FROM table_name;
-- Dynamic sliding windows of 2 rows (current row, next row)
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
)
FROM table_name;
-- Dynamic sliding windows of n rows (everything before current line)
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN UNBOUNDED PRECEDING
)
FROM table_name;
-- Dynamic sliding windows of n rows (everything after current line)
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN UNBOUNDED FOLLOWING
)
FROM table_name;
-- Dynamic sliding windows of n rows (everything before current line and current line)
-- Also known as cumulative totals
SELECT
SUM(col1) OVER(
PARTITION BY col1
ORDER BY col3
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
FROM table_name;
Last updated