A SELF JOIN is not a different type of join keyword. Instead, it is a regular join (usually an INNER JOIN or LEFT JOIN) where a table is joined with itself. This is useful for querying hierarchical data or comparing rows within the same table.
To perform a self-join, you must treat one table as if it were two separate tables. To do this, you must use table aliases to give each instance of the table a unique name. Without aliases, the database won't know which column belongs to which instance of the table.
Visualization (Employee Hierarchy):
Imagine an employee table where each row has a manager_id that points to the employee_id of their supervisor.
Table A (Employees) Table B (Managers)
+----+-------+---------+ +----+-------+
| id | name | mgr_id | | id | name |
+----+-------+---------+ +----+-------+
| 1 | Alice | NULL | | 1 | Alice |
| 2 | Bob | 1 | <-> | 1 | Alice | (Bob's Manager is Alice)
| 3 | Carol | 1 | <-> | 1 | Alice | (Carol's Manager is Alice)
+----+-------+---------+ +----+-------+
SELECT
e.name AS employee_name,
m.name AS manager_name
FROM
employee AS e
LEFT JOIN
employee AS m ON e.manager_id = m.id;
employee AS e: The first instance (representing the employees).employee AS m: The second instance (representing the managers).ON e.manager_id = m.id: The condition that links them.Suppose we want to find pairs of films that have exactly the same length. We can join the film table to itself.
SELECT
f1.title AS film_1,
f2.title AS film_2,
f1.length
FROM
film AS f1
INNER JOIN
film AS f2 ON f1.length = f2.length
WHERE
f1.film_id <> f2.film_id -- Ensure we don't match a film with itself
LIMIT 10;
The condition f1.film_id <> f2.film_id is critical. Without it, every film would match itself (since it has the same length as itself).
If we want to see which customers live in the same city (based on address_id in this simplified example):
SELECT
c1.first_name AS cust_1_first,
c1.last_name AS cust_1_last,
c2.first_name AS cust_2_first,
c2.last_name AS cust_2_last,
c1.address_id
FROM
customer AS c1
INNER JOIN
customer AS c2 ON c1.address_id = c2.address_id
WHERE
c1.customer_id < c2.customer_id; -- Use '<' instead of '<>' to avoid duplicate pairs (A-B and B-A)
ON conditions to define the relationship between rows (e.g., hierarchy or shared attributes).WHERE filter conditions like id1 <> id2 or id1 < id2 to avoid matching a row with itself or returning redundant pairs.