The UNION and UNION ALL operators in SQL are used to combine the results of two or more SELECT queries into a single result set. Both operators allow you to merge the result sets, but they differ in how they handle duplicates.
The UNION operator combines the result sets of two or more SELECT queries and removes any duplicate rows. It ensures that the final result set contains only unique rows.
The basic syntax of the UNION operator is:
SELECT column1, column2, ... FROM table1 UNION SELECT column1, column2, ... FROM table2;
In this syntax:
Suppose you have two tables: Customers_2023 and Customers_2024, which store customer data for two different years. You want to get a list of unique customers from both tables. You can use the UNION operator:
SELECT CustomerID, Name FROM Customers_2023 UNION SELECT CustomerID, Name FROM Customers_2024;
This query will return a list of unique customers from both tables, removing any duplicate customers (based on the CustomerID and Name).
Suppose the Customers_2023 table contains:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie
And the Customers_2024 table contains:
CustomerID | Name -------------------- 2 | Bob 4 | David 5 | Eve
The result of the UNION query would be:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie 4 | David 5 | Eve
As shown, the duplicate row for "Bob" (CustomerID 2) is removed, and the final result contains only unique customer records.
The UNION ALL operator also combines the results of two or more SELECT queries, but it does not remove duplicate rows. All rows from each query are included in the result set, even if they are duplicates.
The basic syntax of the UNION ALL operator is:
SELECT column1, column2, ... FROM table1 UNION ALL SELECT column1, column2, ... FROM table2;
In this syntax:
Using the same Customers_2023 and Customers_2024 tables, if you want to retrieve all customers, including duplicates, you can use the UNION ALL operator:
SELECT CustomerID, Name FROM Customers_2023 UNION ALL SELECT CustomerID, Name FROM Customers_2024;
This query will return all rows from both tables, including any duplicate customers.
With the same data in Customers_2023 and Customers_2024, the result of the UNION ALL query would be:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie 2 | Bob 4 | David 5 | Eve
As shown, the duplicate row for "Bob" (CustomerID 2) is not removed, and the result includes all rows from both tables.
When working with large datasets, using UNION ALL can improve performance because it avoids the overhead of checking for and removing duplicate rows. However, if you need to eliminate duplicates in the result set, you should use UNION, even if it might be slower.
The UNION and UNION ALL operators are powerful tools for combining results from multiple SELECT queries. Use UNION when you want to remove duplicates and UNION ALL when you want to keep all results, including duplicates. Understanding when to use each operator can help optimize query performance and ensure that you get the desired results.