dbt Cheat Sheet

dbt (data build tool) is extensively utilized for data transformation in the analytics engineering field, operating on top of data in warehouse technologies. dbt enables data analysts and engineers to transform, test, and document data in the environment of the warehouse itself. This dbt Cheat Sheet offers a comprehensive reference to the most commonly used dbt commands, including their arguments, and graph and set operators.

Anatomy of a dbt command

A dbt command typically includes a primary action (such as run, test, or compile), which can be accompanied by a series of arguments and operators that help to selectively execute and manage your data transformations:

  • Command: The primary action dbt is to perform, like run or test.
  • Arguments: Options such as --select or --exclude that specify which models or tests to include or omit.
  • Graph Operators: Symbols like + or @ that define relationships and dependencies between models, allowing you to include ancestors (upstream models) or descendants (downstream models).
  • Set Operators: Used within arguments like --select to combine or intersect sets of models.

Example command:


_10
# Run all models in models/staging, while excluding any models tagged as 'deprecated'
_10
dbt run --select models/staging --exclude tag:deprecated

dbt Arguments

dbt supports a range of command-line arguments that enable selective and efficient execution of data transformation tasks. These arguments, such as --select, --exclude, --state, and --defer help manage and optimize dbt runs by targeting specific nodes or leveraging past run artifacts.

--select
Specifies the nodes (models, tests, seeds) to run. Use node names, tags, or paths.

_10
dbt run --select sales_model

--exclude
Excludes specific nodes from the dbt run.

_10
dbt run --exclude long_running_model

--resource-type
Filters nodes by type (model, seed, test, snapshot).

_10
dbt ls --resource-type model

--selector
Executes nodes specified in a predefined YAML selector file (opens in a new tab).

_10
dbt run --selector weekly_jobs

--state
Points to a directory with historical artifacts to compare changes.

_10
dbt run --select state:modified+ --state ./path/to/artifacts

--defer
Uses results from previous runs for upstream dependencies not directly selected. Ideal for development to reuse production assets without recomputation. Pair with the --state flag to locate previous artifacts.

_10
dbt run --select revenue_forecast+ --defer --state ./path/to/prod/artifacts

--empty
Supports building schema-only dry runs by limiting refs and sources to zero rows. Useful for validating dependencies and ensuring models build properly without expensive reads of input data.

_10
dbt run --empty

-–full-refresh
Treats incremental models as table models, ideal for recreating them following schema changes or reprocessing entire models due to updates in logic.

_10
dbt run --full-refresh

dbt Graph and Set Operators

The Graph and Set operators provide powerful methods for selectively running transformations based on complex relationships and criteria within your data models.

Graph Operators

The "plus" operator (+)
Selects all ancestors (upstream) or descendants (downstream) of a model, including the model itself. Place + at the start to select ancestors, at the end to select descendants, or both to select both.

_10
# select `sales_model` and all its descendants
_10
dbt run --select "sales_model+"
_10
_10
# select `user_engagement` and all its ancestors
_10
dbt run --select "+user_engagement"
_10
_10
# select `daily_metrics`, its ancestors, and descendants
_10
dbt run --select "+daily_metrics+"

The "n-plus" operator (n+)
Adjusts the depth of the + operator to specify exact generations of ancestors or descendants.

_10
# select `customer_demographics` and its descendants up to two levels down
_10
dbt run --select "customer_demographics+2"
_10
_10
# select `revenue_forecast` and its first-degree ancestors
_10
dbt run --select "1+revenue_forecast"
_10
_10
# select `order_items`, two levels of ancestors, and three levels of descendants
_10
dbt run --select "2+order_items+3"

The "at" operator (@)
Selects a model, all its descendants, and the necessary ancestors of those descendants, ensuring all can be built successfully.

_10
# select `marketing_spend`, its descendants, and the ancestors of its descendants
_10
dbt run --select "@marketing_spend"

Set Operators

Unions
Combines multiple model selections into one set; any model listed in any of the selections is included.

_10
# run `inventory_levels`, all its ancestors, and `customer_retention`, and all its ancestors
_10
dbt run --select "+inventory_levels +customer_retention"

Intersections
Selects only models that meet all criteria specified. Separate selections with commas to indicate intersection.

_10
# run common ancestors of both `monthly_sales` and `quarterly_goals`
_10
dbt run --select "+monthly_sales,+quarterly_goals"
_10
_10
# select common descendants of `product_details` and `category_summaries`
_10
dbt run --select "product_details+,category_summaries+"
_10
_10
# run models in the `marts/supply_chain` directory tagged `urgent`
_10
dbt run --select "marts.supply_chain,tag:urgent"

dbt Commands

Essentials

Fundamental commands that are commonly used in daily operations and critical for routine dbt tasks.

dbt run Explore ↗
Executes models in a dbt project.

accepts: --select, --exclude, --selector, --resource-type, --defer, --empty, --full-refresh


_10
# run `sales_model` and all its descendants
_10
dbt run --select "sales_model+"
_10
_10
# Run all models tagged as 'daily' while excluding a specific model
_10
dbt run --select tag:daily --exclude "long_running_model"

dbt test Explore ↗
Executes tests defined on models, sources, snapshots, and seeds. Supports running both data tests and schema tests.

accepts: --select, --exclude, --selector, --defer


_10
# Run all tests in a project
_10
dbt test
_10
_10
# Run only unit tests
_10
dbt test --select test_type:unit
_10
_10
# Exclude specific tests from execution
_10
dbt test --exclude "stale_data_check"

dbt seed Explore ↗
Loads data from CSV files into your data warehouse, creating or replacing static tables. Useful for small reference data sets or lookup tables.

accepts: --select, --exclude, --selector


_10
# Load all seed files in the project
_10
dbt seed
_10
_10
# Load specific seed file with detailed output
_10
dbt seed --select "regions" --show
_10
_10
# Refresh all seed data while providing detailed logging
_10
dbt seed --full-refresh --show

dbt snapshot Explore ↗
Executes jobs defined in a project to capture and track changes to a queried dataset over time (similar to SCD2). Useful for historical data tracking and change auditing.

accepts: --select, --exclude


_10
# Run all snapshot jobs within a project
_10
dbt snapshot
_10
_10
# Execute a specific snapshot job
_10
dbt snapshot --select "my_snapshot"
_10
_10
# Exclude a snapshot from execution
_10
dbt snapshot --exclude "old_snapshot"

dbt build Explore ↗
Executes models, runs tests, snapshots snapshots, and seeds seeds, following the dependencies defined in the DAG. Useful for complete builds of selected resources or the entire project.

accepts: --select, --exclude, --selector, --resource-type, --defer, --empty, --full-refresh


_10
# Execute a complete build of all project components
_10
dbt build
_10
_10
# Execute a full-refresh build, reprocessing all incremental models
_10
dbt build --full-refresh
_10
_10
# Perform a schema-only dry run to validate model deps without processing data
_10
dbt build --empty

dbt docs Explore ↗
Generates and serves documentation for your dbt project. It compiles resources and metadata into a searchable and interactive website.

accepts: --select, --no-compile, --empty-catalog


_11
# Generate documentation for all project components
_11
dbt docs generate
_11
_11
# Serve the generated documentation locally on a custom port
_11
dbt docs serve --port 8081
_11
_11
# Generate documentation without re-compiling models
_11
dbt docs generate --no-compile
_11
_11
# Generate documentation with limited node inclusion
_11
dbt docs generate --select "tag:critical"

Advanced

Commands that are used for specific scenarios that may require a more detailed understanding of dbt functionalities or involve complex operations.

dbt clone
Clones selected models from the specified state to the target schema; uses database-specific features like zero-copy cloning where available. Supports use cases such as blue/green deployments and development schema updates.

accepts: --state, --select, --full-refresh, --threads


_10
# Clone all models from a specified state to your target schemas
_10
dbt clone --state path/to/artifacts
_10
_10
# Clone a model from a past state and refresh all relationships in the target env
_10
dbt clone --select "target_model" --state path/to/artifacts --full-refresh
_10
_10
# Clone models with increased parallel processing using 4 threads
_10
dbt clone --state path/to/artifacts --threads 4

dbt run-operation
Executes a specific macro within the dbt project, allowing for the running of customized SQL operations directly through dbt.

accepts: --args


_10
# Execute a macro to grant select permissions with specified arguments
_10
dbt run-operation grant_select --args '{role: "analyst"}'
_10
_10
# Run a maintenance macro to clean up old models, specifying days and dry_run status
_10
dbt run-operation cleanup_old_models --args '{days: 30, dry_run: False}'
_10
_10
# Perform a custom operation to update data partitions
_10
dbt run-operation update_partitions --args '{partition_date: "2023-01-01"}'

dbt source freshness
Assesses the recency of data in source tables.

accepts: --select, --exclude, --selector, --output


_10
# Check the freshness of all sources in the project
_10
dbt source freshness
_10
_10
# Validate freshness for a specific source table
_10
dbt source freshness --select "source:google_ads.campaign"
_10
_10
# Output freshness results to a custom JSON file
_10
dbt source freshness --output custom_freshness_results.json

Helpers

Utility commands that assist with project setup, troubleshooting, and maintenance, which might not be used daily but are essential for project health and optimization.

dbt init
Initializes a new dbt project by creating necessary configuration files and directory structures.

accepts: --profile


_10
# Initialize a new dbt project with a specific profile
_10
dbt init my_new_project --profile my_profile
_10
_10
# Create a new dbt project with default settings
_10
dbt init my_fresh_project

dbt deps
Manages and updates project dependencies specified in the packages.yml file.

accepts: --no-partial-parse, --upgrade, --add-package, --source


_10
# Update and reinstall all dependencies
_10
dbt deps --upgrade
_10
_10
# Add or update a specific package configuration
_10
dbt deps --add-package dbt-labs/dbt_utils@1.2.0 --source git
_10
_10
# Install a package from a local directory
_10
dbt deps --add-package /path/to/local/package --source local

dbt ls (list)
Lists all the resources within a dbt project. It can filter resources based on type, tags, or custom selectors.

accepts: --resource-type, --select, --models (Similar to --select + --resource-type=model), --exclude, --selector, --output, --output-keys


_10
# List all models in JSON format
_10
dbt list --resource-type model --output json
_10
_10
# List all sources with a specific tag in name format
_10
dbt list --select tag:raw --output name
_10
_10
# Exclude specific models and list the rest
_10
dbt list --exclude "stg_orders" --output path

dbt clean
Removes all directories listed in the clean-targets section of the dbt_project.yml. Useful to clean up dbt_packages and target directories.

accepts: None


_10
dbt clean

dbt compile
Generates SQL from project models without executing them. Stores the SQL files in target/ directory.

accepts: --select, --exclude, --selector, --inline


_10
# Compile a specific staging model to review its SQL
_10
dbt compile --select "stg_orders"
_10
_10
# Compile an inline SQL statement referencing a raw customers table
_10
dbt compile --inline "select * from {{ ref('raw_orders') }}"

dbt debug
Tests the database connection and displays configuration details for debugging.

accepts: --config-dir


_10
# Run a basic debug to test connection and environment setup
_10
dbt debug
_10
_10
# Show configured location for the profiles.yml file
_10
dbt debug --config-dir

dbt parse
Analyzes and validates your dbt project structure, checking for Jinja or YAML syntax errors.

accepts: --no-partial-parse


_10
# Perform a standard parse of the dbt project
_10
dbt parse
_10
_10
# Force a complete re-parse of the entire project, ignoring any cached results
_10
dbt parse --no-partial-parse

dbt retry
Re-executes the last dbt command starting from the point of failure.

accepts: None


_12
# Retry the last dbt command after a failure
_12
dbt retry
_12
_12
# Example where the retry command successfully reruns a failed model after the error is fixed
_12
dbt run --select "faulty_model"
_12
# Assume "faulty_model" fails due to a syntax error
_12
dbt retry # After fixing the syntax error
_12
_12
# Example demonstrating idempotent behavior when no changes are made after a failure
_12
dbt run --select "stable_model"
_12
# Assume "stable_model" fails and no changes are made to address the failure
_12
dbt retry # Will likely fail again without changes

dbt show
Compiles and executes the SQL definition of a model, test, analysis, or any dbt-SQL query, displaying the results directly in the terminal for preview.

accepts: --select, --inline, --limit


_10
# Preview the first 5 rows from a specific model's output
_10
dbt show --select "stg_customers"
_10
_10
# Execute and display results of an arbitrary query from a model
_10
dbt show --inline "select * from {{ ref('stg_products') }}"
_10
_10
# Display a custom number of rows for a model's preview
_10
dbt show --select "stg_orders" --limit 10