Branch environments

Branch environments let data teams apply Git branching strategies to their data as well as their code, covering both the pipeline logic and the data state within the warehouse.

What are branch environments?

Each Git branch within Y42 is automatically configured as a standalone environment, encompassing the data pipeline's code and the resultant state in the data warehouse.

When you run your data pipelines, assets are materialized as data artifacts (such as tables or views) in a shared, space-wide data warehouse schema. Y42 assigns a unique naming identifier to each table or view and links it to specific Git branches. This coupling of code and data means that every change in code maps to a dedicated virtual data warehouse environment.
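To make the idea of a unique, code-linked identifier more concrete, here is a minimal Python sketch. It is not Y42's actual naming scheme, which this page does not describe. The function `physical_table_name`, the `name__hash` format, and the choice to hash the asset's code are all assumptions made for illustration.

```python
import hashlib


def physical_table_name(asset_name: str, asset_code: str) -> str:
    """Hypothetical helper: derive a table name from an asset's name and code."""
    # Hashing the code means any change in logic yields a different identifier,
    # so a modified asset never collides with the table built from the old code.
    digest = hashlib.sha256(asset_code.encode("utf-8")).hexdigest()[:8]
    return f"{asset_name}__{digest}"


# The same asset with different code on two branches maps to two distinct tables.
main_table = physical_table_name("orders", "SELECT * FROM raw.orders")
feature_table = physical_table_name("orders", "SELECT id, amount FROM raw.orders")
```

Under this assumption, both tables can live side by side in the same shared schema, and each branch only needs to remember which identifier it points to.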

For an in-depth guide to the underlying mechanics, read more about Virtual Data Builds, the feature that makes branch environments possible.

How to use branch environments

Branch environments do not require additional setup. Unlike other systems where managing separate production, staging or development environments might involve complex configuration steps, Y42 simplifies environment management to the extent that it becomes almost invisible to the user.

Here's an example of a typical development workflow using branch environments:

Create a Git branch

Begin by creating a new branch — that's all you need to do to initiate a new branch environment in Y42.

Even though you're working on a new branch, you don't need to refresh your data pipelines to access the data. Your branch environment has immediate, read-only access to the tables or views that were previously materialized in its base branch (such as the main branch).

Develop and test

With your branch environment set up, you're now ready to develop new features, experiment with changes or test your assets. Your work in this branch will not affect production data or other versions of your pipelines.

When you modify an asset on this branch and materialize it, Y42 automatically detects your changes and creates a new table or view in the data warehouse. As such, the existing tables and views linked to the base branch are never overwritten.
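The behavior described in the last two steps (reading the base branch's tables without a refresh, then getting a new table on materialization) can be sketched as a small lookup model. This is an illustrative assumption, not Y42's implementation: the `BranchEnvironment` class, its `resolve` and `materialize` methods, and the table naming are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BranchEnvironment:
    """Hypothetical model: maps asset names to physical tables in the warehouse."""

    name: str
    base: Optional["BranchEnvironment"] = None
    tables: Dict[str, str] = field(default_factory=dict)
    builds: int = 0

    def resolve(self, asset: str) -> str:
        """Return the table this branch reads for an asset."""
        if asset in self.tables:
            return self.tables[asset]
        if self.base is not None:
            # Nothing built on this branch yet: fall back to the base branch.
            return self.base.resolve(asset)
        raise KeyError(asset)

    def materialize(self, asset: str) -> str:
        """Record a new table for this branch; the base's table is untouched."""
        self.builds += 1
        table = f"{asset}__{self.name}_{self.builds}"
        self.tables[asset] = table
        return table


main = BranchEnvironment("main")
main.materialize("orders")

feature = BranchEnvironment("feature", base=main)
before = feature.resolve("orders")     # reads the table built on main
after = feature.materialize("orders")  # creates a separate table for the branch
```

In this model, `main.resolve("orders")` returns the same table before and after the feature branch materializes its version, which mirrors the guarantee that base-branch tables are never overwritten.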

Merge changes

Once you're satisfied with the changes in your branch, you can merge them back to the main branch, following your standard Git workflow.

Deploy to production

Remember how a branch environment has immediate, read-only access to the tables and views of its base branch? The same mechanism applies to the main branch. Once you merge the changes into main, the new tables or views are immediately available in your production dataset.
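One way to picture why a merge can make data available without rebuilding anything is to treat each branch as a set of pointers from asset names to tables that already exist. The sketch below rests on that assumption. The `merge` function, the dictionaries, and the table names are hypothetical and do not represent Y42's API.

```python
from typing import Dict

# Hypothetical pointers from asset name to physical table, one mapping per branch.
main: Dict[str, str] = {"orders": "orders__a1", "customers": "customers__b2"}
feature: Dict[str, str] = {"orders": "orders__c3"}  # only the modified asset


def merge(target: Dict[str, str], source: Dict[str, str]) -> Dict[str, str]:
    """Return the target's pointers after a merge; no table is rebuilt or copied."""
    merged = dict(target)
    # Assets changed on the source branch now point at the tables it already built.
    merged.update(source)
    return merged


main = merge(main, feature)
```

After the merge, `main` points at the `orders` table that was materialized on the feature branch, while `customers` keeps its existing table.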

Configuring branch environments

For each branch, you can enable various features:

  • Publish data to dataset or schema: Push the data of all assets from the selected branch to a specified dataset or schema in your data warehouse. Learn more about publishing assets here.
  • Orchestration: Activate orchestration for the chosen branch.
  • Asset Health History: Enable the Asset Health History Dashboard for the specific branch.
Options available for each branch.
