ClickHouse configurations
Models
Type | Supported? | Details |
---|---|---|
view materialization | YES | Creates a view. |
table materialization | YES | Creates a table. See below for the list of supported engines. |
incremental materialization | YES | Creates a table if it doesn't exist, and then writes only updates to it. |
View Materialization
A dbt model can be created as a ClickHouse view and configured using the following syntax:
- Project file

    models:
      <resource-path>:
        +materialized: view

- Config block

    {{ config(materialized = "view") }}
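
For illustration, a complete view model file might look like the sketch below. The file name, the upstream model `events`, and its columns are hypothetical and only serve to show where the config block sits relative to the query.

```sql
{# models/active_users.sql (hypothetical file name) #}
{{ config(materialized = "view") }}

{# `events` is an assumed upstream dbt model; the columns are illustrative #}
select
    user_id,
    max(event_time) as last_seen_at
from {{ ref('events') }}
group by user_id
```

Because a view stores only the query, this model is cheap to build, but the aggregation runs each time the view is queried.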
Table Materialization
A dbt model can be created as a ClickHouse table and configured using the following syntax:
- Project file

    models:
      <resource-path>:
        +materialized: table
        +order_by: [ <column-name>, ... ]
        +engine: <engine-type>
        +partition_by: [ <column-name>, ... ]

- Config block

    {{ config(
        materialized = "table",
        engine = "<engine-type>",
        order_by = [ "<column-name>", ... ],
        partition_by = [ "<column-name>", ... ],
        ...
    ) }}
Table Configuration
Option | Description | Required? |
---|---|---|
materialized | How the model will be materialized into ClickHouse. Must be table to create a table model. | Required |
engine | The table engine to use when creating tables. See list of supported engines below. | Optional (default: MergeTree() ) |
order_by | A tuple of column names or arbitrary expressions. This allows you to create a small sparse index that helps find data faster. | Optional (default: tuple() ) |
partition_by | A partition is a logical combination of records in a table by a specified criterion. The partition key can be any expression from the table columns. | Optional |
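
As a minimal sketch of how these options combine, the model below builds a MergeTree table ordered by date and user and partitioned by month. The model name, the upstream model `events`, and the column names are assumptions made for the example; `toYYYYMM` and `toDate` are standard ClickHouse functions.

```sql
{# models/daily_events.sql (hypothetical file name) #}
{{ config(
    materialized = "table",
    engine = "MergeTree()",
    order_by = ["event_date", "user_id"],
    partition_by = ["toYYYYMM(event_date)"]
) }}

{# `events` is an assumed upstream model with `event_time` and `user_id` columns #}
select
    toDate(event_time) as event_date,
    user_id,
    count() as event_count
from {{ ref('events') }}
group by event_date, user_id
```

Putting the columns that queries filter on most often first in `order_by` generally lets the sparse index skip more data.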
Incremental Materialization
A table model is rebuilt from scratch on every dbt run. This can be infeasible or extremely costly for large result sets or complex transformations. To reduce build time, a dbt model can instead be created as an incremental ClickHouse table, configured using the following syntax:
- Project file

    models:
      <resource-path>:
        +materialized: incremental
        +order_by: [ <column-name>, ... ]
        +engine: <engine-type>
        +partition_by: [ <column-name>, ... ]
        +unique_key: [ <column-name>, ... ]
        +inserts_only: [ True|False ]

- Config block

    {{ config(
        materialized = "incremental",
        engine = "<engine-type>",
        order_by = [ "<column-name>", ... ],
        partition_by = [ "<column-name>", ... ],
        unique_key = [ "<column-name>", ... ],
        inserts_only = [ True|False ],
        ...
    ) }}
Incremental Table Configuration
Option | Description | Required? |
---|---|---|
materialized | How the model will be materialized into ClickHouse. Must be incremental to create an incremental model. | Required |
unique_key | A tuple of column names that uniquely identify rows. For more details on uniqueness constraints, see here. | Required. If not provided, altered rows will be added twice to the incremental table. |
engine | The table engine to use when creating tables. See list of supported engines below. | Optional (default: MergeTree() ) |
order_by | A tuple of column names or arbitrary expressions. This allows you to create a small sparse index that helps find data faster. | Optional (default: tuple() ) |
partition_by | A partition is a logical combination of records in a table by a specified criterion. The partition key can be any expression from the table columns. | Optional |
inserts_only | If set to True, incremental updates are inserted directly into the target incremental table without creating an intermediate table. Read more about this configuration in our documentation. | Optional (default: False ) |
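
The sketch below shows one way an incremental model could use these options together with dbt's `is_incremental()` macro and `{{ this }}` relation, so that later runs only process new rows. The model, the upstream model `events`, and the column names are hypothetical.

```sql
{# models/events_incremental.sql (hypothetical file name) #}
{{ config(
    materialized = "incremental",
    engine = "MergeTree()",
    order_by = ["event_id"],
    unique_key = ["event_id"],
    inserts_only = False
) }}

{# `events` is an assumed upstream model; the columns are illustrative #}
select
    event_id,
    user_id,
    event_time
from {{ ref('events') }}

{% if is_incremental() %}
  {# On incremental runs, only select rows newer than those already in the target table #}
  where event_time > (select max(event_time) from {{ this }})
{% endif %}
```

On the first run the table does not exist yet, so `is_incremental()` is false and the full result set is loaded. Running `dbt run --full-refresh` rebuilds the table from scratch in the same way.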
Snapshot
dbt snapshots allow a record to be made of changes to a mutable model over time. This in turn allows point-in-time queries on models, where analysts can “look back in time” at the previous state of a model. This functionality is supported by the ClickHouse connector and is configured using the following syntax:
{{
    config(
        target_schema = "<schema_name>",
        unique_key = "<column-name>",
        strategy = "<strategy>",
        updated_at = "<updated_at_column-name>",
    )
}}
Snapshot Configuration
Option | Description | Required? |
---|---|---|
target_schema | The name of the ClickHouse database where the snapshot table will be created. | Required |
unique_key | A tuple of column names that uniquely identify rows. | Required. If not provided, altered rows will be added twice to the incremental table. |
strategy | Defines how dbt knows if a row has changed. More about dbt strategies here | Required |
updated_at | If using the timestamp strategy, the timestamp column to compare. | Only if using the timestamp strategy |
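
As a rough example, a snapshot using the timestamp strategy could be defined as follows. The snapshot name, the target database `snapshots`, the upstream model `orders`, and the column names are all assumptions for illustration.

```sql
{# snapshots/orders_snapshot.sql (hypothetical file name) #}
{% snapshot orders_snapshot %}

{{
    config(
        target_schema = "snapshots",
        unique_key = "order_id",
        strategy = "timestamp",
        updated_at = "updated_at"
    )
}}

{# `orders` is an assumed mutable model with `order_id` and `updated_at` columns #}
select * from {{ ref('orders') }}

{% endsnapshot %}
```

Snapshots are built with the `dbt snapshot` command rather than `dbt run`.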
Supported Table Engines
If you encounter issues connecting to ClickHouse from dbt with one of the above engines, please report an issue here.
Setting quote_columns
To prevent a warning, make sure to explicitly set a value for quote_columns in your dbt_project.yml. See the doc on quote_columns for more information.
seeds:
  +quote_columns: false  # or `true` if you have CSV column headers with spaces