Snowflake FIRST_VALUE: Retrieving First Values in Data Sets
FIRST_VALUE
is a window function that returns the first value in an ordered set of values within a partition. It’s typically used in reports or data analysis where you need to display the initial value in a dataset according to some sort of chronological order or based on a specific sorting order.
Syntax
_10FIRST_VALUE( <expr> ) [ IGNORE | RESPECT NULLS ]_10OVER ( [ PARTITION BY <expr1> ] ORDER BY <expr2> [ ASC | DESC ] [ <window_frame> ] )
where:
expr
: The column or expression that you want to evaluate.PARTITION BY expr1
: (Optional) Divides the result set into partitions to which the function is applied.ORDER BY expr2
: Defines the order of data within each partition. This is necessary for FIRST_VALUE to determine which value is the first.
Usage Notes
- If
IGNORE | RESPECT NULLS
is not specified, the default behavior is to respect nulls, which means that if the first value is null, null is returned. - You must define a window using
OVER()
which includes ordering; without it, the function can’t determine which value is first.
Common Use Cases
- Fetching the earliest purchase date for each customer.
- Getting the initial price at which a stock was traded on a particular day.
- Identifying the first logged event of a system or process.
Examples
Identify First Value in a Set
Let’s assume we have a table of customer purchases, and we want to identify the first purchase date for each customer.
Input
Table: customer_purchases
customer_id | purchase_date | amount |
---|---|---|
1 | 2021-01-01 | 100.00 |
1 | 2021-02-01 | 110.00 |
2 | 2021-01-15 | 200.00 |
2 | 2021-03-01 | 220.00 |
3 | 2021-02-01 | 300.00 |
Snowflake SQL Query
_10SELECT_10 customer_id,_10 FIRST_VALUE(purchase_date) OVER (PARTITION BY customer_id ORDER BY purchase_date) AS first_purchase_date_10FROM customer_purchases_10ORDER BY customer_id;
Output
customer_id | first_purchase_date |
---|---|
1 | 2021-01-01 |
1 | 2021-01-01 |
2 | 2021-01-15 |
2 | 2021-01-15 |
3 | 2021-02-01 |
Explanation
This query will return the first purchase date for each customer based on the chronological order of their purchases. The PARTITION BY
clause groups records by customer_id
, and the ORDER BY clause within the OVER()
function specifies that records should be ordered by purchase_date
. Thus, FIRST_VALUE
picks the earliest date from each group.