Snowflake Common Table Expressions (CTE)
What is a CTE?
A Common Table Expression (CTE) is like a temporary table that you can use during the execution of a single SQL statement. Think of it as a way to organize complex SQL queries. It's defined within a WITH
clause at the beginning of a SQL query, and it makes your SQL statement easier to read and maintain.
How does it work?
You start by naming your CTE and optionally specifying the names of its columns. Then, you write a SQL query (usually a SELECT
) that fills the CTE with data. This data acts just like a table, which you can use in the rest of your SQL statement.
Syntax example
_10WITH my_cte (column_1, column_2) AS (_10 SELECT column_a, column_b_10 FROM some_table_10)_10SELECT * FROM my_cte;
In this example:
my_cte
is the name of the CTE.column_1
,column_2
are the names of the columns in the CTE.- The CTE gets its data from a query that selects
column_A
andcolumn_B
fromsome_table
.
Common use cases
- Simplifying Complex Queries: Break down complicated queries into simpler parts. For example, you can use a CTE to perform complex filtering or calculations, and then use the result to do further queries.
- Recursive Queries: Handle hierarchical or recursive data, like finding all subordinates of a manager in an employee table.
Why use CTEs?
CTEs make your queries modular (easy to piece together) and maintainable. They help you organize your SQL code, making it clearer and easier to understand, especially when dealing with multiple steps of data processing.
đź’ˇ Key points to remember
- A CTE is only available during the execution of the query where it’s defined—it’s not stored in the database.
- Avoid using names that are the same as existing tables or SQL functions for your CTEs to prevent confusion.
- CTEs can be non-recursive (the default) or recursive, where they can reference themselves to handle data that's nested or hierarchical.
Examples
Suppose we have a table that logs each character's visit to various planets and the number of times those visits occurred. We'll first calculate the average number of visits per planet across all characters, and then summarize how each character's average compares to the overall average.
Using a single CTE
Input
Table: galactic_visits
character_id | character_name | planet | visit_count |
---|---|---|---|
1 | Arthur Dent | Earth | 5 |
1 | Arthur Dent | Vogon Ship | 2 |
2 | Ford Prefect | Betelgeuse | 3 |
2 | Ford Prefect | Earth | 4 |
3 | Zaphod Beeblebrox | Betelgeuse | 5 |
3 | Zaphod Beeblebrox | Magrathea | 1 |
Snowflake SQL Query
_10WITH planet_average AS (_10 SELECT planet, AVG(visit_count)::decimal(2,1) AS avg_visits_10 FROM galactic_visits_10 GROUP BY planet_10)_10SELECT planet, avg_visits AS average_visits_10FROM planet_average;
Output
planet | average_visits |
---|---|
Earth | 4.5 |
Vogon Ship | 2.0 |
Betelgeuse | 4.0 |
Magrathea | 1.0 |
Referencing a CTE within another CTE
Input
Table: galactic_visits
character_id | character_name | planet | visit_count |
---|---|---|---|
1 | Arthur Dent | Earth | 5 |
1 | Arthur Dent | Vogon Ship | 2 |
2 | Ford Prefect | Betelgeuse | 3 |
2 | Ford Prefect | Earth | 4 |
3 | Zaphod Beeblebrox | Betelgeuse | 5 |
3 | Zaphod Beeblebrox | Magrathea | 1 |
Snowflake SQL Query
_12WITH planet_average AS (_12 SELECT planet, AVG(visit_count)::decimal(2,1) AS avg_visits_12 FROM galactic_visits_12 GROUP BY planet_12),_12character_planet_comparison AS (_12 SELECT g.character_name, g.planet, g.visit_count, p.avg_visits_12 FROM galactic_visits g_12 JOIN planet_average p ON g.planet = p.planet_12)_12SELECT character_name, planet, visit_count AS individual_visits, avg_visits::int AS average_visits_12FROM character_planet_comparison;
Output
character_name | planet | individual_visits | average_visits |
---|---|---|---|
Arthur Dent | Earth | 5 | 4.5 |
Ford Prefect | Earth | 4 | 4.5 |
Arthur Dent | Vogon Ship | 2 | 2.0 |
Ford Prefect | Betelgeuse | 3 | 4.0 |
Zaphod Beeblebrox | Betelgeuse | 5 | 4.0 |
Zaphod Beeblebrox | Magrathea | 1 | 1.0 |