dbt utils: Enhance Your dbt Workflow
The dbt-utils (opens in a new tab) package is a powerful set of utility macros and functions that extend the core functionality of dbt. It simplifies many common tasks and enhances your ability to perform complex transformations and analyses within your dbt projects.
What is dbt utils?
dbt-utils
is a package developed by dbt Labs that provides a collection of reusable macros
and functions
. These utilities help streamline various data transformation tasks, making it easier to write clean, efficient, and maintainable dbt models. The package includes macros for generating SQL code, validating data quality, and simplifying repetitive tasks.
How to Install dbt Utils
To use dbt-utils
in your dbt project, you need to add it as a dependency in your packages.yml file and then install it using the dbt deps command.
- Add to packages.yml:
- Install the package:
_10dbt deps
Commonly Used Functions and Examples
1. date_spine
This macro generates the SQL needed to create a date spine, a continuous series of dates. The spine will include the start_date (if it aligns with the specified datepart), but will not encompass the end_date.
_10{{ dbt_utils.date_spine(_10 datepart="day",_10 start_date="cast('2019-01-01' as date)",_10 end_date="cast('2020-01-01' as date)"_10) }}
2. get_relations_by_pattern
This macro retrieves a list of relations that match a specified schema or table name pattern. It’s especially useful when used in conjunction with the union_relations
macro.
_10-- Retrieves relations matching the schema pattern and table pattern_10{% set relations = dbt_utils.get_relations_by_pattern('schema_pattern%', 'table_pattern') %}_10_10-- Example using the union_relations macro_10{% set ads_relations = dbt_utils.get_relations_by_pattern('ads%', 'clicks') %}_10{{ dbt_utils.union_relations(relations = ads_relations) }}
3. union_relations
This macro combines multiple relations using UNION ALL
, handling cases where columns may be in different orders or some columns are missing from some relations. It fills missing columns with NULL
and adds a new column (_dbt_source_relation
) to indicate the source of each record.
_10{{ dbt_utils.union_relations(_10 relations=[ref('my_model'), source('my_source', 'my_table')],_10 exclude=["_loaded_at"]_10) }}
4. generate_series
This macro generates a sequence of numbers up to a specified upper bound, providing a cross-database mechanism for creating a 1-indexed result set.
_10{{ dbt_utils.generate_series(upper_bound=1000) }}
5. generate_surrogate_key
This macro creates a hashed surrogate key from specified fields, providing a consistent way to generate unique keys across different databases.
_10{{ dbt_utils.generate_surrogate_key(['column_a', 'column_b']) }}
6. log_info
This macro logs a formatted message, complete with a timestamp, to the command line.
_10{{ dbt_utils.log_info("Informative log message") }}
Example output:
_1011:07:28 | 1 of 1 START table model analytics.fct_orders........................ [RUN]_1011:07:31 + Informative log message
Manage Sources and dbt Models in one place
Build end-to-end pipelines using a single framework.
Get Started