This functionality is new in v1.5.
When the `contract` configuration is enforced, dbt will ensure that your model's returned dataset exactly matches the attributes you have defined in YAML:
- `name` and `data_type` for every column
- additional `constraints`, as supported for this materialization and data platform
This is to ensure that the people querying your model downstream—both inside and outside dbt—have a predictable and consistent set of columns to use in their analyses. Even a subtle change in data type, such as from `boolean` (true/false) to `integer` (0/1), could cause queries to fail in surprising ways.
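For instance, a downstream query that filters directly on a boolean column can break once that column quietly becomes an integer (`is_active` is a hypothetical column used only for illustration):

```sql
-- Works while is_active is boolean:
select count(*) from dim_customers where is_active

-- Once is_active becomes integer (0/1), many platforms reject the bare
-- predicate above, forcing downstream queriers to rewrite their SQL:
select count(*) from dim_customers where is_active = 1
```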
The `data_type` defined in your YAML file must match a data type your data platform recognizes. dbt does not do any type aliasing itself. If your data platform recognizes both `int` and `integer` as corresponding to the same type, then they will return a match.
When dbt is comparing data types, it will not compare granular details such as size, precision, or scale. We don't think you should sweat the difference between `varchar(256)` and `varchar(257)`, because it doesn't really affect the experience of downstream queriers. If you need a more precise assertion, it's always possible to accomplish by writing or using a custom test.
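As a sketch of such a custom test, a singular dbt test could assert an exact varchar size by querying the information schema. The file name, the expected length of 256, and the availability of `information_schema.columns` on your platform are all assumptions here:

```sql
-- tests/assert_customer_name_is_varchar_256.sql (hypothetical singular test)
-- The test fails if it returns any rows, i.e. if the declared length != 256
select
    column_name,
    character_maximum_length
from information_schema.columns
where lower(table_name) = 'dim_customers'
  and lower(column_name) = 'customer_name'
  and character_maximum_length != 256
```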
That said, on certain data platforms, you will need to specify a varchar size or numeric scale if you do not want it to revert to the default. This is most relevant for the `numeric` type on Snowflake, which defaults to a precision of 38 and a scale of 0 (zero digits after the decimal, i.e. rounded to an integer). To avoid this implicit coercion, specify your `data_type` with a nonzero scale, like `numeric(38, 6)`.
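For example, on Snowflake you might declare a contracted column with an explicit scale like this (`order_total` is a hypothetical column name):

```yaml
columns:
  - name: order_total
    data_type: numeric(38, 6)  # explicit scale; plain `numeric` would coerce to scale 0
```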
Here's an example of a model with an enforced contract:

```yaml
models:
  - name: dim_customers
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: int
        constraints:
          - type: not_null
      - name: customer_name
        data_type: string
```
Let's say your model is defined as:

```sql
select
  'abc123' as customer_id,
  'My Best Customer' as customer_name
```
When you `dbt run` your model, before dbt has materialized it as a table in the database, you will see this error:
```txt
20:53:45  Compilation Error in model dim_customers (models/dim_customers.sql)
20:53:45    This model has an enforced contract that failed.
20:53:45    Please ensure the name, data_type, and number of columns in your contract match the columns in your model's definition.
20:53:45
20:53:45    | column_name | definition_type | contract_type | mismatch_reason    |
20:53:45    | ----------- | --------------- | ------------- | ------------------ |
20:53:45    | customer_id | TEXT            | INT           | data type mismatch |
20:53:45
20:53:45    > in macro assert_columns_equivalent (macros/materializations/models/table/columns_spec_ddl.sql)
```
At present, model contracts are supported for:
- SQL models (not yet Python)
- Models materialized as `table`, `view`, and `incremental` (with `on_schema_change: append_new_columns`)
- The most popular data platforms — though support and enforcement of different constraint types vary by platform
Incremental models and `on_schema_change`

Why require that incremental models also set `on_schema_change`, and why to `append_new_columns`? Imagine:
- You add a new column to both the SQL and the YAML spec
- You don't set `on_schema_change`, or you set it to `ignore` (the default)
- dbt doesn't actually add that new column to the existing table — and the upsert/merge still succeeds, because it does that upsert/merge on the basis of the already-existing "destination" columns only (this is long-established behavior)
- The result is a delta between the YAML-defined contract and the actual table in the database, which means the contract is now incorrect!
Why `append_new_columns`, rather than `sync_all_columns`? Because removing existing columns is a breaking change for contracted models!
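Putting that together, a contracted incremental model might be configured like this (`fct_orders` is a hypothetical model name):

```yaml
models:
  - name: fct_orders
    config:
      materialized: incremental
      on_schema_change: append_new_columns  # add new columns, never drop existing ones
      contract:
        enforced: true
```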