In the previous lessons, we used "standalone" subqueries that could be run on their own. In this lesson, we introduce Correlated Subqueries - a more advanced type of subquery that depends on cells or values from the outer query.
A subquery is correlated when it refers to a column from a table in the outer query. Unlike a regular subquery, a correlated subquery cannot be executed independently of the outer query.
How it works:
WHERE (or SELECT) clause.Performance Note: Because a correlated subquery is potentially executed once for every row in the outer query, it can be slower than a JOIN or a regular subquery on very large datasets.
The most common use is to compare a row's value against a set of data related specifically to that row.
Scenario: Find all films that have a replacement cost higher than the average replacement cost of films in the same rating category (e.g., G, PG, R).
SELECT
title,
rating,
replacement_cost
FROM
film AS f1
WHERE
replacement_cost > (
SELECT AVG(replacement_cost)
FROM film AS f2
WHERE f1.rating = f2.rating
);
f1.rating = f2.rating links the inner query to the current row of the outer query.You can use correlated subqueries to retrieve descriptive data or aggregates for each row without using a GROUP BY clause.
Scenario: Show the list of categories and the title of the longest film in each category.
SELECT
c.name AS category_name,
(
SELECT f.title
FROM film f
JOIN film_category fc ON f.film_id = fc.film_id
WHERE fc.category_id = c.category_id
ORDER BY f.length DESC
LIMIT 1) AS longest_film_title
FROM
category AS c;
We saw the EXISTS operator in the previous lesson. EXISTS is almost always used with a correlated subquery.
Scenario: Find customers who have rented at least one film at a specific store (Store 1).
SELECT
first_name,
last_name
FROM
customer AS c
WHERE
EXISTS (
SELECT 1
FROM rental AS r
INNER JOIN inventory AS i ON r.inventory_id = i.inventory_id
WHERE r.customer_id = c.customer_id
AND i.store_id = 1
);