All columns anomalies

elementary.all_columns_anomalies

Overview

The all_columns_anomalies test runs column-level monitoring and anomaly detection on every column of a table, activating monitors according to each column's data type. Use column_anomalies to choose which monitors run, and exclude_prefix or exclude_regexp to skip columns whose names start with a given prefix or match a regular expression, so the test covers only the relevant columns.
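
As a rough sketch of that customization, the fragment below restricts the test to two monitors and skips columns by name. The model name login_events is borrowed from the example on this page. The prefix "tmp_" and the pattern ".*_raw$" are hypothetical, and whether exclude_regexp performs a full or partial match should be confirmed against the Elementary documentation.

```yaml
models:
  - name: login_events
    tests:
      - elementary.all_columns_anomalies:
          # assumption: an explicit list runs only these monitors
          # instead of the type-based defaults
          column_anomalies:
            - null_count
            - missing_count
          # hypothetical prefix: skip columns such as tmp_id or tmp_score
          exclude_prefix: "tmp_"
          # hypothetical pattern: skip columns whose names end in _raw
          exclude_regexp: ".*_raw$"
```

Monitors still apply by column type, so in this sketch missing_count would only run on string columns.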

Default monitors by type:

| Data quality metric | Column type |
| --- | --- |
| null_count | any |
| null_percent | any |
| min_length | string |
| max_length | string |
| average_length | string |
| missing_count | string |
| missing_percent | string |
| min | numeric |
| max | numeric |
| average | numeric |
| zero_count | numeric |
| zero_percent | numeric |
| standard_deviation | numeric |
| variance | numeric |

Opt-in monitors by type:

| Data quality metric | Column type |
| --- | --- |
| sum | numeric |
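
Because sum is opt-in, it presumably has to be requested by name. A minimal sketch, assuming opt-in monitors are enabled by listing them under column_anomalies and that an explicit list replaces the defaults, so any default monitor still wanted is listed alongside it:

```yaml
tests:
  - elementary.all_columns_anomalies:
      column_anomalies:
        - sum         # opt-in monitor, numeric columns only
        - null_count  # default monitor, kept explicitly (assumed necessary once a list is given)
```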

Test configuration


tests:
  - elementary.all_columns_anomalies:
      {{ parameters }}

where {{ parameters }}:

Example

login_events.yml

version: 2

models:
  - name: login_events
    config:
      y42:
        apiVersion: v3 # the apiVersion does not impact anomaly testing
      elementary:
        timestamp_column: "loaded_at"
    tests:
      - elementary.all_columns_anomalies:
          where_expression: "event_type in ('event_1', 'event_2') and country_name != 'unwanted country'"
          time_bucket:
            period: day
            count: 1
          # optional - change global sensitivity
          anomaly_sensitivity: 3.5