Selecting assets to run

Overview

The y42 build command materializes and tests assets in DAG order for selected assets or the entire space. The command supports asset selection syntax and various flags for customization.

Run the Y42 build command

Selecting assets to build

By default, y42 build executes all committed assets in the space. To build only a subset, use the --select and --exclude flags to add assets to, or remove assets from, the build DAG. Combined with graph and set operators, these flags let you target exactly the assets you want to run.

Anatomy of a build command

Build command

Command flags

  • --select / -s
  • --exclude
  • --vars
  • --stale / --no-stale
  • --selector
  • --resource-type
  • --full-refresh / -f
  • --fail-fast / -x
  • --no-fail-fast

Graph operators

  • + selects parents or children
  • @ selects an asset's children and all of those children's upstream assets
  • +N selects parents or children up to N edges away
  • * selects matched assets within a directory

Methods

  • source: selects assets that depend on a specified source
  • exposure: selects parents of a specified exposure
  • tag: selects assets that have a specific tag

Set operators

  • space-delimited - selects the union of options
  • comma-delimited - selects the intersection of options
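
The difference between the two set operators can be sketched in a few lines of Python. This is a toy resolver under stated assumptions: the MATCHES table, the asset names, and the "tag:nightly" style of writing a criterion are hypothetical and only illustrate how a union and an intersection differ.

```python
# Hypothetical lookup: which assets each selection criterion matches.
MATCHES = {
    "tag:nightly": {"stg_orders", "stg_customers", "revenue_report"},
    "tag:finance": {"revenue_report", "invoices"},
    "source:crm": {"stg_customers"},
}


def resolve(selection):
    """Resolve a selection string: spaces mean union, commas mean intersection."""
    selected = set()
    for group in selection.split():      # space-separated groups are unioned
        criteria = group.split(",")      # comma-separated criteria are intersected
        matched = set.intersection(*(MATCHES[c] for c in criteria))
        selected |= matched
    return selected


union = resolve("tag:nightly tag:finance")
intersection = resolve("tag:nightly,tag:finance")

print(sorted(union))         # ['invoices', 'revenue_report', 'stg_customers', 'stg_orders']
print(sorted(intersection))  # ['revenue_report']
```
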

y42 build --select model_1+ --vars '{key: value, date: 20180101}'
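
Because the --vars value contains spaces and braces, it has to reach the CLI as a single, correctly quoted argument. One way to avoid quoting mistakes when the command is generated by a script is to build it as an argument list, as in this minimal Python sketch. It uses only the flags from the example above; the commented-out subprocess call assumes the y42 CLI is installed and on PATH.

```python
import shlex

# Arguments taken from the example above; each list item is one shell argument.
args = [
    "y42", "build",
    "--select", "model_1+",
    "--vars", "{key: value, date: 20180101}",
]

# shlex.join adds the quoting a POSIX shell needs, so the result can be
# logged or pasted into a terminal.
command = shlex.join(args)
print(command)
# y42 build --select model_1+ --vars '{key: value, date: 20180101}'

# To actually run it (assumes the y42 CLI is installed and on PATH):
# import subprocess
# subprocess.run(args, check=True)
```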

Command flags

  • --select / -s: Selects all matched assets.

  • --exclude: Excludes all matched assets.

  • --stale: Filters for stale-only jobs, excluding assets that have not yet been materialized.

  • --no-stale: Excludes stale jobs, focusing on active ones only.

  • --vars: Overrides variables defined in your dbt_project.yml file. Accepts a YAML dictionary passed as a string, for example, '{my_variable: my_value}'.

  • --selector: Uses a selector name defined in selectors.yml.

  • --resource-type: Limits the command to specific resource types: [source, model, seed, all].

  • --fail-fast / -x: Stops execution at the first failure and cancels all remaining assets, including those that do not depend on the failed asset.

  • --no-fail-fast: Skips assets downstream of a failed asset, but continues executing assets that the failure does not affect. This is the default behavior when neither flag is set.

  • --full-refresh / -f: Performs a full import on every selected asset, useful for incremental sources or models that require a full update.
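
The contrast between --fail-fast and --no-fail-fast described above can be illustrated with a small simulation. This is a sketch only: the DAG, the status labels, and the run function are hypothetical, and the loop executes assets one at a time, whereas a real build may run independent assets in parallel.

```python
# Hypothetical DAG listed in execution order: asset -> direct parents.
PARENTS = {
    "stg_orders": [],
    "stg_customers": [],
    "orders_enriched": ["stg_orders"],
    "customer_summary": ["stg_customers"],
}


def run(failing, fail_fast=False):
    """Simulate a build; `failing` is the set of assets whose run fails."""
    status = {}
    stopped = False
    for asset, parents in PARENTS.items():  # dicts preserve insertion order
        if stopped:
            status[asset] = "cancelled"     # fail-fast already triggered
        elif any(status[p] != "ok" for p in parents):
            status[asset] = "skipped"       # an upstream asset did not succeed
        elif asset in failing:
            status[asset] = "failed"
            stopped = fail_fast             # with fail-fast, cancel everything else
        else:
            status[asset] = "ok"
    return status


print(run({"stg_orders"}))
# {'stg_orders': 'failed', 'stg_customers': 'ok',
#  'orders_enriched': 'skipped', 'customer_summary': 'ok'}

print(run({"stg_orders"}, fail_fast=True))
# {'stg_orders': 'failed', 'stg_customers': 'cancelled',
#  'orders_enriched': 'cancelled', 'customer_summary': 'cancelled'}
```

Without fail-fast, the customer branch still completes because it does not depend on the failed asset; with fail-fast, it is cancelled along with everything else.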

FAQ

What mode does the y42 build command trigger?

By default, y42 build runs incrementally. An incremental run is triggered if:

  • The model specifies an incremental materialization in its configuration. If no materialization type is specified, y42 build performs a full refresh of the model.
  • The source table supports incremental updates.

To override these defaults and force a full refresh, append the --full-refresh flag to the y42 build command.
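
As a rough summary of the rules in this answer, the decision for a single model could be modeled as below. This is a hedged sketch, not Y42's logic: the run_mode function is hypothetical, and it assumes that the materialization configuration that enables incremental runs is one named "incremental".

```python
def run_mode(materialization=None, full_refresh=False):
    """Return the mode a model would run in under the rules described above."""
    if full_refresh:
        return "full refresh"   # --full-refresh overrides the default
    if materialization == "incremental":
        return "incremental"    # assumption: this is the enabling configuration
    return "full refresh"       # no (or another) materialization type


print(run_mode("incremental"))                     # incremental
print(run_mode("incremental", full_refresh=True))  # full refresh
print(run_mode())                                  # full refresh
```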

What happens when I rename an asset?

Renaming an asset changes its lineage hash, so the next run requires a full refresh. To avoid this, revert to the asset's original name.

The DAG is running long or timing out. How can I improve the performance?

  • Review the model logic. Simplify by breaking down complex transformations or excessive joins into multiple models.
  • Consider using a different materialization type for your model and/or upstream models.
  • Optimize SQL queries. While specific optimizations depend on the warehouse, general improvements include using GROUP BY instead of DISTINCT, or UNION ALL instead of UNION.
  • Utilize warehouse-specific features, such as partitioning and clustering in BigQuery.

Note: The above solutions are particularly helpful in addressing common issues like the compilation memory exhausted error in Snowflake. However, different warehouses may have unique errors and solutions.
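
One caveat on the UNION ALL suggestion: UNION removes duplicate rows while UNION ALL does not, which is why UNION ALL is cheaper but only a safe substitute when the inputs cannot overlap or duplicates are acceptable. The sketch below demonstrates the difference with Python's built-in sqlite3 module; the table names and rows are made up, and SQLite stands in for the warehouse purely for illustration.

```python
import sqlite3

# In-memory database with two small, overlapping tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_2023 (customer TEXT);
    CREATE TABLE orders_2024 (customer TEXT);
    INSERT INTO orders_2023 VALUES ('ada'), ('bob');
    INSERT INTO orders_2024 VALUES ('bob'), ('cy');
""")

# UNION deduplicates the combined rows, which costs an extra step.
union_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION SELECT customer FROM orders_2024"
).fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION ALL SELECT customer FROM orders_2024"
).fetchall()

print(len(union_rows))      # 3: 'bob' appears once
print(len(union_all_rows))  # 4: 'bob' appears twice
```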