dbt - Documentation

Creating new materializations

The model materializations you're familiar with, table, view, and incremental are implemented as macros in a package that's distributed along with dbt. You can check out the source for these materializations here. If you need to create your own materializations, reading these files is a good place to start. Continue reading below for a deep-dive into dbt materializations.

This is an advanced feature of dbt. Let us know if you need a hand! We're always happy to chat.

Materialization blocks

Materialization blocks make it possible for dbt to load custom materializations from packages. The materialization blocks work very much like macro blocks, with a couple of key exceptions. Materializations are defined as follows:

{% materialization [materialization name], [adapter="something" | default] %}
{% endmaterialization %}

Materializations can be given a name, and they can be tied to a specific adapter. dbt will pick the materialization tied to the currently-in-use adapter if one exists, or it will fall back to the default adapter. In practice, this looks like:

{% materialization my_materialization_name, default %}
 -- cross-adapter materialization... assume Redshift is not supported
{% endmaterialization %}

{% materialization my_materialization_name, adapter='redshift' %}
-- override the materialization for Redshift
{% endmaterialization %}

dbt's ability to dynamically pick the correct materialization based on the active database target is called multiple dispatch.

Anatomy of a materialization

Materializations are responsible for taking a dbt model sql statement and turning it into a transformed dataset in a database. As such, materializations generally take the following shape:

  1. Prepare the database for the new model
  2. Run pre-hooks
  3. Execute any sql required to implement the desired materialization
  4. Run post-model hooks
  5. Clean up the database as required
  6. Return the list of created relations

Each of these tasks are explained in sections below.

1. Prepare the database

Materializations are responsible for creating new tables or views in the database, or inserting/updating/deleting data from existing tables. As such, materializations need to know about the state of the database to determine exactly what sql they should run. Here is some pseudocode for the "setup" phase of the table materialization:

{% set target_relation = load_relation(this) %}
{% if target_relation is none %}
 {# table doesn't exist; nothing to do #}
{% elif target_relation.is_view %}
  {% do run_query('drop view ' ~ target_relation) %}
{% else %}
  {% do run_query('drop table ' ~ target_relation) %}
{% endif %}

This code sample demonstrates how logic in a materialization may need to account for the state of the database. Depending on the state of the database, some setup work may be required to make room for the target relation that the materialization is responsible for building.

2. Run pre-hooks

Pre- and post-hooks can be specified for any model -- be sure that your materialization plays nicely with these settings. Two variables, pre_hooks and post_hooks are automatically injected into the materialization context. Invoke these hooks at the appropriate time with:

{{ run_hooks(pre_hooks) }}

3. Executing SQL

Construct your materialization DML to account for the different permutations of table existence, materialization flags, etc. There are a number of adapter functions and context variables that can help you here. Be sure to consult the Reference section of this site for a full list of variables and functions at your disposal.

4. Run post-hooks

See the section above on pre-hooks for more information

5. Cleaning up

The "cleanup" phase of the materialization typically renames tables/views, drops old tables/views and commits the transaction opened in "preparation" step above. The table materialization, for instance, executes the following pseudocode:

  -- `COMMIT` happens here
  {{ adapter.commit() }}

  -- finally, drop the existing/backup relation after the commit
  {{ drop_relation_if_exists(backup_relation) }}

Be sure to commit the transaction in the cleanup phase of the materialization with {{ adapter.commit() }}

Materialization Configuration

Materializations support custom configuration. You might be familiar with some of these configs from materializations like and unique_key in incremental models or check_cols in snapshots.

Specifying configuration options

Materialization configurations can either be "optional" or "required". If a user fails to provide required configurations, then dbt will raise a compilation error. You can define these configuration options with the config.get and config.require functions. You may specify a default value for optional configurations using the default= keyword argument to config.get.

# optional
config.get('optional_config_name', default=false) 
# required

For more information on materialization configuration, see the config reference.

Creating new materializations

Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.