SQL code copied to buffer
RU PT FR

Lesson 4.5: Advanced Aggregation with ROLLUP, CUBE, and GROUPING SETS in SQL

As reporting needs grow, regular GROUP BY is often not enough. For example, you may need to get all of the following at once:

  • detail by order status and customer;
  • intermediate totals by status;
  • totals by customer;
  • a grand total for the whole dataset.

You can write multiple queries and combine them with UNION ALL, but that is verbose and harder to maintain. SQL provides grouping extensions for these tasks: ROLLUP, CUBE, and GROUPING SETS.

Important: all practical examples in this lesson use SQL Server (AdventureWorks).

Syntax note: ROLLUP, CUBE, GROUPING SETS, and GROUPING() below are shown using SQL Server syntax. In MySQL, functionality is more limited and syntax is partly different (for example, WITH ROLLUP is commonly used, while CUBE and GROUPING SETS may be unavailable in classic form).

In this lesson, we will cover:

  • how ROLLUP, CUBE, and GROUPING SETS differ;
  • how subtotal and total rows are built;
  • how to distinguish generated total rows using GROUPING().

Why This Matters

Advanced aggregation helps you:

  • build multi-level reports in a single query;
  • reduce SQL duplication;
  • produce consistent detail, subtotals, and grand totals.

Core Idea

Assume we have sales data in SalesOrderHeader with dimensions Status, CustomerID and metric TotalDue.

Regular GROUP BY returns only one grouping level. Extended grouping constructs return multiple levels at once.

ROLLUP: Hierarchical Totals

ROLLUP builds a hierarchy from right to left in the column list.

Syntax

GROUP BY ROLLUP (col1, col2, col3)

Generated levels:

  • (col1, col2, col3) - detail;
  • (col1, col2) - subtotal over col3;
  • (col1) - subtotal over col2 and col3;
  • () - grand total.

Example: order amount totals by status and customer

SELECT
    Status,
    CustomerID,
    SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY ROLLUP (Status, CustomerID)
ORDER BY Status, CustomerID;

Result:

  • rows for each Status + CustomerID pair;
  • subtotal per Status;
  • overall grand total.

CUBE: All Dimension Combinations

CUBE builds aggregates for all possible combinations of listed columns.

Syntax

GROUP BY CUBE (col1, col2)

For two columns, levels are:

  • (col1, col2);
  • (col1);
  • (col2);
  • ().

For three columns, combinations are already $2^3 = 8$, so result size can grow quickly.

Example: order amount totals by status and customer across all slices

SELECT
    Status,
    CustomerID,
    SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY CUBE (Status, CustomerID)
ORDER BY Status, CustomerID;

Result: besides detail and grand total, you also get:

  • totals per Status;
  • totals per CustomerID.

GROUPING SETS: Precise Control of Levels

GROUPING SETS lets you explicitly list only the grouping levels you need.

Syntax

GROUP BY GROUPING SETS (
    (col1, col2),
    (col1),
    ()
)

Example: only required levels, no extra combinations

SELECT
    Status,
    CustomerID,
    SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY GROUPING SETS (
    (Status, CustomerID),
    (Status),
    ()
)
ORDER BY Status, CustomerID;

This is equivalent to multiple GROUP BY ... UNION ALL ... queries, but is more compact and usually better optimized.

Distinguishing Total Rows with GROUPING()

In generated total rows, dimension values often become NULL. The issue is that source data can also contain real NULL values.

GROUPING(column) helps distinguish them:

  • 0 - regular value from source data;
  • 1 - value generated by aggregation level.

Example with level flags

SELECT
    Status,
    CustomerID,
    SUM(TotalDue) AS total_amount,
    GROUPING(Status) AS g_status,
    GROUPING(CustomerID) AS g_customer
FROM SalesOrderHeader
GROUP BY ROLLUP (Status, CustomerID)
ORDER BY Status, CustomerID;

Practical report labeling pattern:

CASE
    WHEN GROUPING(Status) = 1 AND GROUPING(CustomerID) = 1 THEN 'GRAND TOTAL'
    WHEN GROUPING(CustomerID) = 1 THEN 'STATUS SUBTOTAL'
    ELSE 'DETAIL'
END AS row_type

When to Use What

  • Use ROLLUP when you need hierarchical totals (for example, year -> month -> day).
  • Use CUBE when you need all analytical slices across dimensions.
  • Use GROUPING SETS when you want strict control over exactly which levels are returned.

Practical Recommendations

  • Always check result size: CUBE can increase row count significantly.
  • Label row types (DETAIL, SUBTOTAL, GRAND TOTAL) for readability.
  • Add explicit ORDER BY so totals appear in a predictable order.
  • If you need aggregate filtering, combine with HAVING.

MySQL Example

Below is a MySQL example on the payment table using subtotals with WITH ROLLUP:

SELECT
    staff_id,
    customer_id,
    SUM(amount) AS total_amount
FROM
    payment
GROUP BY
    staff_id, customer_id WITH ROLLUP
ORDER BY
    GROUPING(staff_id),
    staff_id,
    GROUPING(customer_id),
    customer_id;

In this query:

  • detail is returned per staff_id + customer_id pair;
  • WITH ROLLUP adds subtotals per staff_id and a grand total;
  • ORDER BY GROUPING(...) places rows in a convenient order: details, subtotals, then grand total.

Important notes for MySQL:

  • WITH ROLLUP provides hierarchical totals, but not a full equivalent of CUBE/GROUPING SETS.
  • For more complex grouping combinations, you often need multiple queries with UNION ALL.
  • If your MySQL version does not support GROUPING(), sorting and total-row labeling are usually done with NULL checks.

Practical Usage

  1. Order amount report by status and customer with totals: ROLLUP (Status, CustomerID) gives detail, status subtotals, and grand total.

  2. Multidimensional sales analysis: CUBE (Status, CustomerID) gives all slice combinations across status and customer.

  3. Custom order amount report: GROUPING SETS lets you keep only the levels you need: detail + departmental subtotal + grand total.

Key Takeaways from This Lesson

  • ROLLUP, CUBE, and GROUPING SETS extend standard GROUP BY.
  • ROLLUP creates hierarchical totals, CUBE creates all combinations, GROUPING SETS creates only explicitly listed levels.
  • GROUPING() is essential for correctly interpreting generated total rows.
  • These tools help build flexible analytical reports on order amounts in a single query.

By mastering these constructs, you can design more powerful SQL reports without long UNION ALL chains.