Asset properties

Core properties

All assets have five core properties. These properties are specified in a YAML configuration, except for the data transformation logic of a model, which is expressed as a SQL query.

Asset type - models, sources or exposures

Name - Assets in a Y42 space share the same namespace. Each asset must have a unique name.

Materialization

Tests

Tags, Descriptions and Metadata

View all properties

models/stg_customers.yml

version: 2

models:
  - name: stg_customers
    meta:
      experts:
        users:
          - lea.dang@y42.com
    description: Staging model
    config:
      tags:
        - customer
    columns:
      - name: id
        data_type: FLOAT
        description: The id is unique
        tests:
          - unique
      - name: first_name
        data_type: STRING
      - name: last_name
        data_type: STRING
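
The example above sets a name, tests, tags, a description, and metadata, but no materialization. As a minimal sketch, assuming Y42 accepts the standard dbt `materialized` config and dbt's built-in `not_null` test in the same places dbt does (this page does not confirm either), the same model could be extended like this:

```yaml
version: 2

models:
  - name: stg_customers
    config:
      # Assumption: standard dbt materialization value (for example view or table)
      materialized: table
      tags:
        - customer
    columns:
      - name: id
        tests:
          - unique
          # Assumption: dbt's built-in not_null generic test
          - not_null
```

Only the `materialized` and `not_null` lines are new; everything else is taken from the example above.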

Asset types

There are three types of assets: source assets, dbt model assets, and exposure assets. All of them can be defined in the UI or in code, as shown below.

Define a new asset.

Source assets

Source assets are typically used to ingest or reference data from external sources and persist it in a database. This allows you to easily access and transform the imported data so that it can be used for analysis. A source asset and its tables are specified in the source YAML file. You can also define its schema along with several other options.

sources/shop_source.yml

version: 2

sources:
  - name: shop_source
    meta:
      config:
        type: source-bigquery
        connection: shop_integration
      experts:
        users:
          - lea.dang@y42.com
    tables:
      - name: raw_customers
        config:
          y42_table:
            publish_view: false
            import: raw_customers
            columns:
              - last_name
              - id
              - first_name
            schema:
              $schema: http://json-schema.org/draft-07/schema#
              type: object
              properties:
                last_name:
                  type: string
                id:
                  type: number
                  airbyte_type: integer
                first_name:
                  type: string
            group: sources_tests
            supported_sync_modes:
              - full_refresh
              - incremental
            default_cursor_field: []
            source_defined_cursor: null
            source_defined_primary_key: []
          experts:
            users:
              - lea.dang@y42.com
            teams: []
        columns:
          - name: id
            data_type: FLOAT
            tests:
              - unique
          - name: first_name
            data_type: STRING
          - name: last_name
            data_type: STRING
        description: Raw data
        tags:
          - shop
          - customers

Models / dbt assets

dbt model assets are used to define data transformations that modify the imported data in some way. This could be done by applying filters, aggregations, or other types of manipulations on the source data. The transformed results are then persisted in a database and can be used for further transformation or analysis.

While all other asset configurations are defined in .yml files, modeling logic is the exception: it is stored in .sql files.

stg_customers.sql

select id, first_name, last_name
from {{ source('src', 'src_customers') }}

models/stg_customers.yml

version: 2

models:
  - name: stg_customers
    meta:
      experts:
        users:
          - lea.dang@y42.com
    description: Staging model
    config:
      tags:
        - shop
        - customers
    columns:
      - name: id
        data_type: FLOAT
        description: The id is unique
        tests:
          - unique
      - name: first_name
        data_type: STRING
      - name: last_name
        data_type: STRING

Exposure assets

Exposures are data assets that reference a downstream use of your project's data, such as a dashboard, an application, a Python notebook, or a data science project. An exposure groups all relevant upstream assets together to define the data required for external use. With exposures, you can:

  • Group multiple assets together to check whether all upstream dependencies of a specific exposure have been refreshed, and refresh the data if needed.
  • Provide additional context to external data consumers in the exposure data catalog pages.

daily_purchases_update.yml

version: 2

exposures:
  - name: daily_purchases_update
    type: dashboard
    owner:
      name: Lea Dang
      email: lea.dang@y42.com

    depends_on:
      - ref('stg_purchases')
      - source('shop_source','purchases')

    maturity: high
    url: https://bi.tool/purchases-dashboard

    description: Daily updated dashboard of all purchases.
    tags:
      - purchases
    meta:
      experts:
        users:
          - lea.dang@y42.com
        teams: []
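
Exposures are not limited to dashboards. The following is a sketch only: the exposure name `customer_purchases_notebook` is hypothetical, and `type: notebook` and `maturity: low` are values from dbt's exposure specification that are assumed, not confirmed by this page, to be accepted by Y42. It groups the `stg_customers` and `mrt_purchases` models used elsewhere on this page.

```yaml
version: 2

exposures:
  - name: customer_purchases_notebook   # hypothetical name
    type: notebook                      # assumption: dbt exposure type
    owner:
      name: Lea Dang
      email: lea.dang@y42.com
    depends_on:
      - ref('stg_customers')
      - ref('mrt_purchases')
    maturity: low                       # assumption: dbt maturity value
    description: Notebook that analyzes customers and their purchases.
```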

Asset description

Edit an asset's description either directly in the code using markdown or through the markdown editor.

Edit the asset description using markdown

This action updates the .yml file, specifically the description property.

mrt_purchases.yml

version: 2

models:
  - name: mrt_purchases
    description: |-
      ### **Purchases asset**

      Joins purchasing staging data with products.

      ### Notes

      * IDs have been casted to INT64 data type upstream

      * LEFT JOIN to capture all purchasing

    config:
      tags:
        - verified
    meta:
      asset_status: verified
      tier: tier 1

Asset labels

You can categorize assets with labels to streamline management and monitoring. A label shows the current state of an asset at a glance. The available labels are:

  • No Status: The asset is newly created or its status has not been determined yet.
  • Issue: This label indicates that the asset is currently facing problems or irregularities that need attention.
  • Draft: The asset is in a preliminary stage and may be subject to further modifications.
  • Deprecated: This label is applied to an asset that is no longer recommended for use, generally because it has been replaced by a newer version or functionality.
  • Verified: This signifies that the asset has been reviewed and confirmed to be functioning as expected.

You can easily apply these labels to keep track of asset statuses and facilitate smoother workflow processes.

The asset label classification.
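
In code, the label appears to be stored under `meta.asset_status`, as in the `mrt_purchases.yml` example on this page, where it is set to `verified`. The sketch below applies a different label; the value `deprecated` is an assumption based on that pattern, and the model name `stg_customers_v1` is hypothetical.

```yaml
version: 2

models:
  - name: stg_customers_v1   # hypothetical model name
    meta:
      # Assumption: lowercase label name, by analogy with `verified`
      asset_status: deprecated
```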

Asset tiers

To prioritize and manage your assets more effectively, you can assign a tier to each of them. Tiers distinguish how critical each asset is, which helps you allocate resources and attention. The tier levels are as follows:

  • Tier 1 - Critical: Assets that are vital and form the backbone of the operation. Any disruption to these assets can have significant repercussions.
  • Tier 2 - Important: Assets that have a high value and contribute considerably to the overall function, but are not as critical as Tier 1 assets.
  • Tier 3 - Regular: These are the assets that perform standard functions and are necessary for routine operations but are not in the high priority bracket.
  • Tier 4 - No Tier: Assets that are currently not categorized or do not fall under any critical functionality or importance bracket.

The asset tier classification.
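
In code, the tier appears to be stored under `meta.tier`, as in the `mrt_purchases.yml` example on this page, where it is set to `tier 1`. The sketch below assigns a lower tier to the staging model; the value `tier 2` is an assumption based on that pattern and is not confirmed by this page.

```yaml
version: 2

models:
  - name: stg_customers
    meta:
      # Assumption: same "tier N" format as the `tier 1` value shown earlier
      tier: tier 2
```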