Skip to main content

constraints

Constraints are a feature of many data platforms. When specified, the platform will perform additional validation on data as it is being populated in a new table or inserted into a preexisting table. If the validation fails, the table creation or update fails, the operation is rolled back, and you will see a clear error message.

When enforced, a constraint guarantees that you will never see invalid data in the table materialized by your model. Enforcement varies significantly by data platform.

Constraints require the declaration and enforcement of a model contract.

Constraints are never applied on ephemeral models or those materialized as view. Only table and incremental models support applying and enforcing constraints.

Defining constraints

Constraints may be defined for a single column, or at the model level for one or more columns. As a general rule, we recommend defining single-column constraints directly on those columns.

If you are defining multiple primary_key constraints for a single model, those must be defined at the model level. Defining multiple primary_key constraints at the column level is not supported.

The structure of a constraint is:

  • type (required): one of not_null, unique, primary_key, foreign_key, check, custom
  • expression: Free text input to qualify the constraint. Required for certain constraint types, and optional for others.
  • name (optional): Human-friendly name for this constraint. Supported by some data platforms.
  • columns (model-level only): List of column names to apply the constraint over.

Platform-specific support

In transactional databases, it is possible to define "constraints" on the allowed values of certain columns, stricter than just the data type of those values. For example, Postgres supports and enforces all the constraints in the ANSI SQL standard (not null, unique, primary key, foreign key), plus a flexible row-level check constraint that evaluates to a boolean expression.

Most analytical data platforms support and enforce a not null constraint, but they either do not support or do not enforce the rest. It is sometimes still desirable to add an "informational" constraint, knowing it is not enforced, for the purpose of integrating with legacy data catalog or entity-relation diagram tools (dbt-core#3295). Some data platforms can optionally use primary or foreign key constraints for query optimization if you specify an additional keyword.

To that end, there are two optional fields you can specify on any filter:

  • warn_unenforced: False to skip warning on constraints that are supported, but not enforced, by this data platform. The constraint will be included in templated DDL.
  • warn_unsupported: False to skip warning on constraints that aren't supported by this data platform, and therefore won't be included in templated DDL.
  • PostgreSQL constraints documentation: here
models/constraints_example.sql
{{
config(
materialized = "table"
)
}}

select
1 as id,
'My Favorite Customer' as customer_name,
cast('2019-01-01' as date) as first_transaction_date
models/schema.yml
models:
- name: dim_customers
config:
contract:
enforced: true
columns:
- name: id
data_type: int
constraints:
- type: not_null
- type: primary_key
- type: check
expression: "id > 0"
- name: customer_name
data_type: text
- name: first_transaction_date
data_type: date

Expected DDL to enforce constraints:

target/run/.../constraints_example.sql
create table "database_name"."schema_name"."constraints_example__dbt_tmp"
(
id integer not null primary key check (id > 0),
customer_name text,
first_transaction_date date
)
;
insert into "database_name"."schema_name"."constraints_example__dbt_tmp"
(
id,
customer_name,
first_transaction_date
)
(
select
1 as id,
'My Favorite Customer' as customer_name,
cast('2019-01-01' as date) as first_transaction_date
);

Custom constraints

In dbt Cloud and dbt Core, you can use custom constraints on models for the advanced configuration of tables. Different data warehouses support different syntax and capabilities.

Custom constraints allow you to add configuration to specific columns. For example:

  • Set masking policies in Snowflake when using a Create Table As Select (CTAS).

  • Other data warehouses (such as Databricks and BigQuery have their own set of parameters that can be set for columns in their CTAS statements.

You can implement constraints in a couple of different ways:

Custom constraints with tags

Here's an example of how to implement tag-based masking policies with contracts and constraints using the following syntax:

models/constraints_example.yml

models:
- name: my_model
config:
contract:
enforced: true
materialized: table
columns:
- name: id
data_type: int
constraints:
- type: custom
expression: "tag (my_tag = 'my_value')" # A custom SQL expression used to enforce a specific constraint on a column.

Using this syntax requires configuring all the columns and their types as it’s the only way to send a create or replace <cols_info_with_masking> mytable as .... It’s not possible to do it with just a partial list of columns. This means making sure the columns and constraints fields are fully defined.

To generate a YAML with all the columns, you can use generate_model_yaml from dbt-codegen.

Custom constraints without tags

Alternatively, you can add a masking policy without tags:

models/constraints_example.yml

models:
- name: my_model
config:
contract:
enforced: true
materialized: table
columns:
- name: id
data_type: int
constraints:
- type: custom
expression: "masking policy my_policy"

0