dbt tags: Categorize and Manage Resources

In dbt, tags are a flexible way to categorize and manage different resources within your project, such as models, snapshots, and seeds. You can assign tags to these resources to facilitate selective execution of commands, making it easier to manage parts of your project during development and deployment.

Applying Tags

You can define tags in the dbt_project.yml file under the appropriate resource type. Tags can be a single string or a list of strings, depending on how many categories you want to apply to a resource.

dbt_project.yml


models:
  analytics:
    +tags: "sensitive_data"

    staging:
      +tags:
        - "nightly"

    marts:
      +tags:
        - "nightly"
        - "customer_data"

    reports:
      +tags:
        - "weekly"
        - "external"

Additionally, you can apply tags directly to individual models using the config block within the SQL file:

models/staging/stg_orders.sql


{{ config(
    tags=["revenue", "critical"]
) }}

SELECT ...

Using Tags for Selective Execution

Once tags are applied, you can use them to run specific parts of your project. For example, to execute all models with the weekly tag while excluding those tagged nightly:


$ dbt run --select tag:weekly --exclude tag:nightly
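If the same tag combination is run regularly, it can be saved as a named YAML selector instead of being retyped on the command line. The following is a minimal sketch, assuming a selectors.yml file at the project root; the selector name weekly_not_nightly and its description are hypothetical and not part of the original project.

```yaml
# selectors.yml (assumed to sit next to dbt_project.yml)
selectors:
  - name: weekly_not_nightly        # hypothetical selector name
    description: "Resources tagged weekly, minus anything tagged nightly"
    definition:
      union:
        # Include everything tagged "weekly"
        - method: tag
          value: weekly
        # Then remove anything that is also tagged "nightly"
        - exclude:
            - method: tag
              value: nightly
```

Under these assumptions, the selector would be invoked with dbt run --selector weekly_not_nightly, which should select the same resources as the command above.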

Tags in Seeds

Tags can also be applied to seeds to group and manage static data that supports your transformations:

dbt_project.yml


seeds:
  analytics:
    marketing_data:
      +tags: ["marketing", "daily"]
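Snapshots, mentioned at the start of this article, can be tagged in dbt_project.yml in the same way. This is a short sketch only: it assumes the project is named analytics, as in the examples above, and the "daily" tag is an illustrative choice rather than something taken from the original configuration.

```yaml
# dbt_project.yml (fragment)
snapshots:
  analytics:                 # project name, assumed to match the examples above
    +tags: ["daily"]         # illustrative tag applied to every snapshot in the project
```

With a configuration like this, dbt snapshot --select tag:daily would limit the run to the tagged snapshots.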

Hierarchical Accumulation of Tags

Tags accumulate across the hierarchy of your project configuration. For instance, if a parent node in your dbt_project.yml is tagged "sensitive_data", all child nodes inherit this tag in addition to any tags applied at their own level. Tags set lower in the hierarchy add to the inherited tags rather than replacing them:

Accumulation Example:

stg_orders.sql in staging ends up with tags: ["sensitive_data", "nightly", "revenue", "critical"]
dim_customers.sql in marts ends up with tags: ["sensitive_data", "nightly", "customer_data"]
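Accumulation also applies to tags set in a properties file, not only to those in dbt_project.yml or a config block. As a rough sketch, assuming a properties file at the hypothetical path models/marts/schema.yml and a made-up extra tag "finance":

```yaml
# models/marts/schema.yml (hypothetical path)
models:
  - name: dim_customers
    config:
      tags: ["finance"]      # hypothetical tag, added on top of inherited ones
```

Under this assumption, dim_customers would carry "finance" in addition to "sensitive_data", "nightly", and "customer_data" from dbt_project.yml, because tags are additive rather than overriding.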

Beyond Models and Seeds

Tags are not limited to models and seeds; they can also be applied to sources, exposures, and even specific columns within a table. This extended functionality allows for granular control over test execution and documentation exposure.

schema.yml


sources:
  - name: sales_data
    tags: ["source_level"]

    tables:
      - name: transaction_log
        tags: ["table_level"]

        columns:
          - name: transaction_id
            tags: ["column_level"]
            tests:
              - not_null:
                  tags: ["test_level"]

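In this setup, the not_null test tagged "test_level" can be executed using the tag selector (for example, dbt test --select tag:test_level). Because tags accumulate down to the test, the same test is also selected by the source-level, table-level, and column-level tags above it, which makes it possible to manage tests across different layers of your data architecture.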
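Exposures are mentioned above but not shown, so here is a hedged sketch of what a tagged exposure might look like. Every name in it (weekly_revenue_dashboard, the owner details, the dependency on dim_customers) is hypothetical. Depending on the dbt version, tags on an exposure may need to sit under a config: key rather than at the top level.

```yaml
# models/exposures.yml (hypothetical file)
exposures:
  - name: weekly_revenue_dashboard     # hypothetical exposure name
    type: dashboard
    owner:
      name: Analytics Team             # hypothetical owner
      email: analytics@example.com
    depends_on:
      - ref('dim_customers')           # assumes the marts model from the earlier example
    tags: ["weekly", "external"]       # may belong under config: in newer dbt versions
```

A tagged exposure can then be used as a selection target, for instance dbt ls --select tag:external to list the resources carrying that tag.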
