Snowflake UNION and UNION ALL: How to Combine Data Sets
In Snowflake, the UNION operator is a set operator used to combine the results of two distinct queries into a single result set. Useful for when you need to merge similar data stored across different tables or result sets.
How UNION Works
The UNION operator combines results from two queries:
UNION
: Eliminates duplicate rows (acts like DISTINCT).UNION ALL
: Retains duplicates, merging all results from both queries.
Usage Notes
- Column Count: Each query combined by the UNION must select an equal number of columns.
- Data Type Compatibility: Ensure that corresponding columns in each query have matching or compatible data types.
- Relevant Meaning: Columns should ideally represent similar data or concepts for meaningful results.
Syntax
Here’s the general structure for using UNION:
_10SELECT column1, column2 FROM table1_10UNION [ALL]_10SELECT column1, column2 FROM table2;
Examples
Union vs. Union ALL
Imagine we have two tables: Earthlings and Vogons. Both tables store names, and some names may appear in both tables.
Input
Table: Earthlings
name |
---|
Arthur Dent |
Ford Prefect |
Trillian |
Table: Vogons
name |
---|
Jeltz |
Ford Prefect |
Using UNION
The UNION
operation will combine results from both tables and eliminate duplicates:
_10SELECT name FROM Earthlings_10UNION_10SELECT name FROM Vogons;
This would result in the following, with duplicates removed:
name |
---|
Arthur Dent |
Ford Prefect |
Trillian |
Jeltz |
Using UNION ALL
The UNION ALL
operation, on the other hand, does not eliminate duplicates and shows all entries:
_10SELECT name FROM Earthlings_10UNION ALL_10SELECT name FROM Vogons;
This would result in:
name |
---|
Arthur Dent |
Ford Prefect |
Trillian |
Jeltz |
Ford Prefect |
Notice how Ford Prefect appears twice in the UNION ALL
result because it is found in both tables.