Window functions are one of the most powerful features in SQL for performing complex analytical calculations. Unlike aggregate functions that collapse multiple rows into a single result, window functions allow you to perform calculations across a set of rows that are related to the current row—all while preserving the individual rows in your result set.
This lesson introduces the fundamental concepts of window functions and demonstrates how they can transform your data analysis capabilities.
A window function performs a calculation across a set of table rows that are somehow related to the current row. This set of rows is called a "window" or "window frame." The key difference from regular aggregate functions is that window functions do not cause rows to be grouped into a single output row—each row retains its identity.
Think of it like looking through a moving window as you scan through your data. For each row, you can see and calculate values based on related rows around it, but each row still appears separately in the result.
Key characteristics:
OVER clauseThe general syntax for a window function is:
window_function_name(expression) OVER (
[PARTITION BY partition_expression]
[ORDER BY sort_expression]
[window_frame_clause]
)
Components:
ROW_NUMBER, SUM, AVG)Let's start with one of the most commonly used window functions: ROW_NUMBER(). This function assigns a unique sequential number to each row within a partition.
SELECT
payment_id,
customer_id,
amount,
payment_date,
ROW_NUMBER() OVER (ORDER BY payment_date) AS row_num
FROM
payment
LIMIT 10;
This query assigns a sequential number to each payment ordered by payment date. The OVER (ORDER BY payment_date) clause tells SQL to:
payment_dateThe real power of window functions comes when you use PARTITION BY to create separate windows for different groups:
SELECT
customer_id,
amount,
payment_date,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY payment_date
) AS payment_number
FROM
payment
WHERE
customer_id IN (1, 2, 3)
ORDER BY
customer_id,
payment_date;
Here's what happens:
PARTITION BY customer_id creates a separate window for each customerpayment_dateROW_NUMBER() starts counting from 1 for each new customerVisualization:
Customer 1: Customer 2: Customer 3:
Row 1 ----\ Row 1 ----\ Row 1 ----\
Row 2 -----\ Row 2 -----\ Row 2 -----\
Row 3 ------\ Row 3 ------\ Row 3 ------\
... ... ...
Each customer has their own independent row numbering.
Window functions make it easy to identify the most recent record in each group:
WITH numbered_payments AS (
SELECT
customer_id,
amount,
payment_date,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY payment_date DESC
) AS recency_rank
FROM
payment
)
SELECT
customer_id,
amount,
payment_date
FROM
numbered_payments
WHERE
recency_rank = 1
ORDER BY
customer_id
LIMIT 10;
This query finds the most recent payment for each customer by:
recency_rank = 1 (the most recent)Window functions can also perform aggregations while keeping individual rows:
SELECT
customer_id,
amount,
payment_date,
SUM(amount) OVER (PARTITION BY customer_id) AS total_spent,
AVG(amount) OVER (PARTITION BY customer_id) AS avg_payment,
amount - AVG(amount) OVER (PARTITION BY customer_id) AS diff_from_avg
FROM
payment
WHERE
customer_id IN (1, 2, 3)
ORDER BY
customer_id,
payment_date;
For each payment, this query shows:
Notice how regular aggregate functions would require a GROUP BY and collapse the rows, but window functions let you keep all the detail while adding aggregated context.
It's important to understand the difference:
GROUP BY (Aggregate Functions):
SELECT
customer_id,
COUNT(*) AS payment_count,
SUM(amount) AS total_amount
FROM
payment
GROUP BY
customer_id;
Result: One row per customer
Window Functions:
SELECT
customer_id,
payment_id,
amount,
COUNT(*) OVER (PARTITION BY customer_id) AS payment_count,
SUM(amount) OVER (PARTITION BY customer_id) AS total_amount
FROM
payment;
Result: Every payment row preserved, with aggregate values added as additional columns
ROW_NUMBER()).GROUP BY, window functions do not collapse rows—they add calculated columns to your existing data.In the next lessons, we will explore more window functions like RANK(), DENSE_RANK(), NTILE(), and dive deeper into window frames and advanced analytical calculations.