Snowflake MERGE: How to Synchronize Data Across Tables
The MERGE
function is a SQL command used to synchronize two tables or datasets. It can insert new records, update existing ones, or delete records based on the data in a second table. Think of it as an all-in-one tool for ensuring your main table is up to date with the latest changes.
What it does
MERGE
looks at two sets of data: one is your main dataset (the target), and the other is usually a newer dataset (the source) that contains updates, new additions, or deletions. Based on rules you define, it updates the target table by:
- Inserting new rows found in the source but not in the target.
- Updating existing rows in the target where corresponding matches are found in the source.
- Deleting rows in the target if specified conditions are met.
Syntax
Here’s the basic way you write a MERGE statement:
_10MERGE INTO <target_table> USING <source_table>_10ON <join_condition>_10WHEN MATCHED THEN_10 UPDATE SET <column> = <expression>_10 DELETE WHERE <condition>_10WHEN NOT MATCHED THEN_10 INSERT (<column_names>) VALUES (<values>)
where:
<target_table>
: The table you want to update.<source_table>
: The table that contains new or updated data.<join_condition>
: The condition used to match rows between the two tables.- WHEN MATCHED and WHEN NOT MATCHED: Define what actions to take when rows match or do not match the join condition.
Examples
Sync two tables
Imagine you have a table of registered conference attendees (target_table) and a new sign-up list (source_table). You want to make sure the attendee list is updated with any new sign-ups, correct existing attendee details, or remove cancellations:
Input
Table: conference_attendees
name | canceled | |
---|---|---|
john.doe@example.com | John Doe | FALSE |
jane.smith@example.com | Jane Smith | TRUE |
Table: new_signups
name | canceled | |
---|---|---|
john.doe@example.com | Jonathan Doe | FALSE |
alice.jones@example.com | Alice Jones | FALSE |
Snowflake SQL Query
_10MERGE INTO conference_attendees USING new_signups_10ON conference_attendees.email = new_signups.email_10WHEN MATCHED AND new_signups.canceled = FALSE THEN_10 UPDATE SET name = new_signups.name_10WHEN MATCHED AND new_signups.canceled = TRUE THEN_10 DELETE_10WHEN NOT MATCHED THEN_10 INSERT (email, name, canceled) VALUES (new_signups.email, new_signups.name, new_signups.canceled);
Output
Table: conference_attendees
name | canceled | |
---|---|---|
alice.jones@example.com | Alice Jones | FALSE |
john.doe@example.com | Jonathan Doe | FALSE |
jane.smith@example.com | Jane Smith | TRUE |
Explanation
- Update Existing Attendees: The MERGE command updates the name of
John Doe
toJonathan Doe
based on the updated information in new_signups. - Delete Canceled Registrations: If any attendee in new_signups is marked as canceled (no such case in this example), their record would be deleted from conference_attendees.
- Add New Attendees:
Alice Jones
, a new attendee from the new_signups table, is added to the conference_attendees.