Materializations

Overview

Materializations are strategies for persisting dbt models in a warehouse. There are four types of materializations built into dbt. They are:

  • table
  • view
  • incremental
  • ephemeral

Configuring materializations

By default, dbt models are materialized as "views". Models can be configured with a different materialization by supplying the materialized configuration parameter as shown below.

dbt_project.yml
# The following dbt_project.yml configures a project that looks like this:
# .
# โ””โ”€โ”€ models
# โ”œโ”€โ”€ csvs
# โ”‚ย ย  โ”œโ”€โ”€ employees.sql
# โ”‚ย ย  โ””โ”€โ”€ goals.sql
# โ””โ”€โ”€ events
# โ”œโ”€โ”€ stg_event_log.sql
# โ””โ”€โ”€ stg_event_sessions.sql
name: my_project
version: 1.0.0
models:
my_project:
events:
# materialize all models in models/events as tables
materialized: table
csvs:
# this is redundant, and does not need to be set
materialized: view

Alternatively, materializations can be configured directly inside of the model sql files. This can be useful if you are also setting [Performance Optimization] configs for specific models (for example, Redshift specific configurations or BigQuery specific configurations).

models/events/stg_event_log.sql
{{ config(materialized='table', sort='timestamp', dist='user_id') }}
select *
from ...

Materializations

View

When using the view materialization, your model is rebuilt as a view on each run, via a create view as statement.

  • Pros: No additional data is stored, views always on top of source data will have the latest records in them.
  • Cons: Views that perform significant transformation, or are stacked on top of other views, are
  • Advice:
    • Generally start with views for your models, and only change to another materialization when you're noticing performance problems.
    • Views are best suited for models that do not do significant transformation, e.g. renaming, recasting columns.

Table

When using the table materialization, your model is rebuilt as a table on each run, via a create table as statement.

  • Pros: Tables are fast to query
  • Cons:
    • Tables can take a long time to rebuild, especially for complex transformations
    • New records in underlying source data are not automatically added to the table
  • Advice:
    • Use the table materialization for any models being queried by BI tools, to give your end user a faster experience
    • Also use the table materialization for any slower transformations that are used by many downstream models

Incremental

incremental models allow dbt to insert or update records into a table since the last time that dbt was run.

  • Pros: You can significantly reduce the build time by just transforming new records
  • Cons: Incremental models require extra configuration and are an advanced usage of dbt. Read more about using incremental models here.
  • Advice:
    • Incremental models are best for event-style data
    • Use incremental models when your dbt runs are becoming too slow (i.e. don't start with incremental models)

Ephemeral

ephemeral models are not directly built into the database. Instead, dbt will interpolate the code from this model into dependent models as a common table expression.

  • Pros:
    • You can still write reusable logic
    • Ephemeral models can help keep your data warehouse clean by reducing clutter (also consider splitting your models across multiple schemas by using custom schemas.
  • Cons:
    • You cannot select directly from this model.
    • Over use of the ephemeral materialization can also make queries harder to debug.
  • Advice: Use the ephemeral materialization for:
    • very light-weight transformations that are early on in your DAG
    • are only used in one or two downstream models, and
    • do not need to be queried directly