dbt Cheat Sheet

dbt (data build tool) is extensively utilized for data transformation in the analytics engineering field, operating on top of data in warehouse technologies. dbt enables data analysts and engineers to transform, test, and document data in the environment of the warehouse itself. This dbt Cheat Sheet offers a comprehensive reference to the most commonly used dbt commands, including their arguments, and graph and set operators.

Anatomy of a dbt command

A dbt command typically includes a primary action (such as run, test, or compile), which can be accompanied by a series of arguments and operators that help to selectively execute and manage your data transformations:

Command: The primary action dbt is to perform, like run or test.
Arguments: Options such as --select or --exclude that specify which models or tests to include or omit.
Graph Operators: Symbols like + or @ that define relationships and dependencies between models, allowing you to include ancestors (upstream models) or descendants (downstream models).
Set Operators: Used within arguments like --select to combine or intersect sets of models.

Example command:


_10# Run all models in models/staging, while excluding any models tagged as 'deprecated'
_10dbt run --select models/staging --exclude tag:deprecated

dbt Arguments

dbt supports a range of command-line arguments that enable selective and efficient execution of data transformation tasks. These arguments, such as --select, --exclude, --state, and --defer help manage and optimize dbt runs by targeting specific nodes or leveraging past run artifacts.

--select

Specifies the nodes (models, tests, seeds) to run. Use node names, tags, or paths.


_10dbt run --select sales_model

--exclude

Excludes specific nodes from the dbt run.


_10dbt run --exclude long_running_model

--resource-type

Filters nodes by type (model, seed, test, snapshot).


_10dbt ls --resource-type model

--selector

Executes nodes specified in a predefined YAML selector file.


_10dbt run --selector weekly_jobs

--state

Points to a directory with historical artifacts to compare changes.


_10dbt run --select state:modified+ --state ./path/to/artifacts

--defer

Uses results from previous runs for upstream dependencies not directly selected. Ideal for development to reuse production assets without recomputation. Pair with the --state flag to locate previous artifacts.


_10dbt run --select revenue_forecast+ --defer --state ./path/to/prod/artifacts

--empty

Supports building schema-only dry runs by limiting refs and sources to zero rows. Useful for validating dependencies and ensuring models build properly without expensive reads of input data.


_10dbt run --empty

-–full-refresh

Treats incremental models as table models, ideal for recreating them following schema changes or reprocessing entire models due to updates in logic.


_10dbt run --full-refresh

dbt Graph and Set Operators

The Graph and Set operators provide powerful methods for selectively running transformations based on complex relationships and criteria within your data models.

Graph Operators

The "plus" operator (+)

Selects all ancestors (upstream) or descendants (downstream) of a model, including the model itself. Place + at the start to select ancestors, at the end to select descendants, or both to select both.


_10# select `sales_model` and all its descendants
_10dbt run --select "sales_model+" 
_10
_10# select `user_engagement` and all its ancestors
_10dbt run --select "+user_engagement"
_10
_10# select `daily_metrics`, its ancestors, and descendants
_10dbt run --select "+daily_metrics+"

The "n-plus" operator (n+)

Adjusts the depth of the + operator to specify exact generations of ancestors or descendants.


_10# select `customer_demographics` and its descendants up to two levels down
_10dbt run --select "customer_demographics+2"
_10
_10# select `revenue_forecast` and its first-degree ancestors  
_10dbt run --select "1+revenue_forecast"
_10
_10# select `order_items`, two levels of ancestors, and three levels of descendants
_10dbt run --select "2+order_items+3"

The "at" operator (@)

Selects a model, all its descendants, and the necessary ancestors of those descendants, ensuring all can be built successfully.


_10# select `marketing_spend`, its descendants, and the ancestors of its descendants
_10dbt run --select "@marketing_spend"

Set Operators

Unions

Combines multiple model selections into one set; any model listed in any of the selections is included.


_10# run `inventory_levels`, all its ancestors, and `customer_retention`, and all its ancestors
_10dbt run --select "+inventory_levels +customer_retention"

Intersections

Selects only models that meet all criteria specified. Separate selections with commas to indicate intersection.


_10# run common ancestors of both `monthly_sales` and `quarterly_goals`
_10dbt run --select "+monthly_sales,+quarterly_goals"
_10
_10# select common descendants of `product_details` and `category_summaries` 
_10dbt run --select "product_details+,category_summaries+"
_10
_10# run models in the `marts/supply_chain` directory tagged `urgent`
_10dbt run --select "marts.supply_chain,tag:urgent"

dbt Commands

Essentials

Fundamental commands that are commonly used in daily operations and critical for routine dbt tasks.

dbt run Explore ↗

Executes models in a dbt project.

accepts: --select, --exclude, --selector, --resource-type, --defer, --empty, --full-refresh


_10# run `sales_model` and all its descendants
_10dbt run --select "sales_model+" 
_10
_10# Run all models tagged as 'daily' while excluding a specific model
_10dbt run --select tag:daily --exclude "long_running_model"

dbt test Explore ↗

Executes tests defined on models, sources, snapshots, and seeds. Supports running both data tests and schema tests.

accepts: --select, --exclude, --selector, --defer


_10# Run all tests in a project
_10dbt test
_10
_10# Run only unit tests
_10dbt test --select test_type:unit
_10
_10# Exclude specific tests from execution
_10dbt test --exclude "stale_data_check"

dbt seed Explore ↗

Loads data from CSV files into your data warehouse, creating or replacing static tables. Useful for small reference data sets or lookup tables.

accepts: --select, --exclude, --selector


_10# Load all seed files in the project
_10dbt seed
_10
_10# Load specific seed file with detailed output
_10dbt seed --select "regions" --show
_10
_10# Refresh all seed data while providing detailed logging
_10dbt seed --full-refresh --show

dbt snapshot Explore ↗

Executes jobs defined in a project to capture and track changes to a queried dataset over time (similar to SCD2). Useful for historical data tracking and change auditing.

accepts: --select, --exclude


_10# Run all snapshot jobs within a project
_10dbt snapshot
_10
_10# Execute a specific snapshot job
_10dbt snapshot --select "my_snapshot"
_10
_10# Exclude a snapshot from execution
_10dbt snapshot --exclude "old_snapshot"

dbt build Explore ↗

Executes models, runs tests, snapshots snapshots, and seeds seeds, following the dependencies defined in the DAG. Useful for complete builds of selected resources or the entire project.

accepts: --select, --exclude, --selector, --resource-type, --defer, --empty, --full-refresh


_10# Execute a complete build of all project components
_10dbt build
_10
_10# Execute a full-refresh build, reprocessing all incremental models
_10dbt build --full-refresh
_10
_10# Perform a schema-only dry run to validate model deps without processing data
_10dbt build --empty

dbt docs Explore ↗

Generates and serves documentation for your dbt project. It compiles resources and metadata into a searchable and interactive website.

accepts: --select, --no-compile, --empty-catalog


_11# Generate documentation for all project components
_11dbt docs generate
_11
_11# Serve the generated documentation locally on a custom port
_11dbt docs serve --port 8081
_11
_11# Generate documentation without re-compiling models
_11dbt docs generate --no-compile
_11
_11# Generate documentation with limited node inclusion
_11dbt docs generate --select "tag:critical"

Advanced

Commands that are used for specific scenarios that may require a more detailed understanding of dbt functionalities or involve complex operations.

dbt clone

Clones selected models from the specified state to the target schema; uses database-specific features like zero-copy cloning where available. Supports use cases such as blue/green deployments and development schema updates.

accepts: --state, --select, --full-refresh, --threads


_10# Clone all models from a specified state to your target schemas
_10dbt clone --state path/to/artifacts
_10
_10# Clone a model from a past state and refresh all relationships in the target env
_10dbt clone --select "target_model" --state path/to/artifacts --full-refresh
_10
_10# Clone models with increased parallel processing using 4 threads
_10dbt clone --state path/to/artifacts --threads 4

dbt run-operation

Executes a specific macro within the dbt project, allowing for the running of customized SQL operations directly through dbt.

accepts: --args


_10# Execute a macro to grant select permissions with specified arguments
_10dbt run-operation grant_select --args '{role: "analyst"}'
_10
_10# Run a maintenance macro to clean up old models, specifying days and dry_run status
_10dbt run-operation cleanup_old_models --args '{days: 30, dry_run: False}'
_10
_10# Perform a custom operation to update data partitions
_10dbt run-operation update_partitions --args '{partition_date: "2023-01-01"}'

dbt source freshness

Assesses the recency of data in source tables.

accepts: --select, --exclude, --selector, --output


_10# Check the freshness of all sources in the project
_10dbt source freshness
_10
_10# Validate freshness for a specific source table
_10dbt source freshness --select "source:google_ads.campaign"
_10
_10# Output freshness results to a custom JSON file
_10dbt source freshness --output custom_freshness_results.json

Helpers

Utility commands that assist with project setup, troubleshooting, and maintenance, which might not be used daily but are essential for project health and optimization.

dbt init

Initializes a new dbt project by creating necessary configuration files and directory structures.

accepts: --profile


_10# Initialize a new dbt project with a specific profile
_10dbt init my_new_project --profile my_profile
_10
_10# Create a new dbt project with default settings
_10dbt init my_fresh_project

dbt deps

Manages and updates project dependencies specified in the packages.yml file.

accepts: --no-partial-parse, --upgrade, --add-package, --source


_10# Update and reinstall all dependencies
_10dbt deps --upgrade
_10
_10# Add or update a specific package configuration
_10dbt deps --add-package dbt-labs/dbt_utils@1.2.0 --source git
_10
_10# Install a package from a local directory
_10dbt deps --add-package /path/to/local/package --source local

dbt ls (list)

Lists all the resources within a dbt project. It can filter resources based on type, tags, or custom selectors.

accepts: --resource-type, --select, --models (Similar to --select + --resource-type=model), --exclude, --selector, --output, --output-keys


_10# List all models in JSON format
_10dbt list --resource-type model --output json
_10
_10# List all sources with a specific tag in name format
_10dbt list --select tag:raw --output name
_10
_10# Exclude specific models and list the rest
_10dbt list --exclude "stg_orders" --output path

dbt clean

Removes all directories listed in the clean-targets section of the dbt_project.yml. Useful to clean up dbt_packages and target directories.

accepts: None


_10dbt clean

dbt compile

Generates SQL from project models without executing them. Stores the SQL files in target/ directory.

accepts: --select, --exclude, --selector, --inline


_10# Compile a specific staging model to review its SQL
_10dbt compile --select "stg_orders"
_10
_10# Compile an inline SQL statement referencing a raw customers table
_10dbt compile --inline "select * from {{ ref('raw_orders') }}"

dbt debug

Tests the database connection and displays configuration details for debugging.

accepts: --config-dir


_10# Run a basic debug to test connection and environment setup
_10dbt debug
_10
_10# Show configured location for the profiles.yml file
_10dbt debug --config-dir

dbt parse

Analyzes and validates your dbt project structure, checking for Jinja or YAML syntax errors.

accepts: --no-partial-parse


_10# Perform a standard parse of the dbt project
_10dbt parse
_10
_10# Force a complete re-parse of the entire project, ignoring any cached results
_10dbt parse --no-partial-parse

dbt retry

Re-executes the last dbt command starting from the point of failure.

accepts: None


_12# Retry the last dbt command after a failure
_12dbt retry
_12
_12# Example where the retry command successfully reruns a failed model after the error is fixed
_12dbt run --select "faulty_model"
_12# Assume "faulty_model" fails due to a syntax error
_12dbt retry # After fixing the syntax error
_12
_12# Example demonstrating idempotent behavior when no changes are made after a failure
_12dbt run --select "stable_model"
_12# Assume "stable_model" fails and no changes are made to address the failure
_12dbt retry # Will likely fail again without changes

dbt show

Compiles and executes the SQL definition of a model, test, analysis, or any dbt-SQL query, displaying the results directly in the terminal for preview.

accepts: --select, --inline, --limit


_10# Preview the first 5 rows from a specific model's output
_10dbt show --select "stg_customers"
_10
_10# Execute and display results of an arbitrary query from a model
_10dbt show --inline "select * from {{ ref('stg_products') }}"
_10
_10# Display a custom number of rows for a model's preview
_10dbt show --select "stg_orders" --limit 10