dbt - Documentation

BigQuery specific configurations

Using table partitioning and clustering

Partition clause

BigQuery supports the use of a partition by clause to easily partition a table by a column or expression. This option can help decrease latency and cost when querying large tables. Note that the expression provided as the partition_by field must be of type Date, and that partition pruning only works when comparing using scalar/literal values (so selecting partitions using a subquery won't improve performance).

{{ config(materialized='table', partition_by='DATE(created_at)') }}

select
  user_id,
  event_name,
  created_at
  
from {{ ref('events') }}

Because BigQuery only prunes partitions when filters are provided literally, it may be useful to use a statement block to generate filter clauses:


{% call statement('get_date', fetch_result=True) %}

  SELECT
    min(created_at) as min_date
    
  FROM {{ ref('bigquery_table') }}

{% endcall %}

{% set min_date = load_result('get_date').table.columns['min_date'].values()[0] %}

select
  user_id,
  event_name,
  created_at

from {{ ref('bigquery_table') }}
where created_at >= DATE('{{ min_date }}')

Clustering Clause

BigQuery tables can be clustered to colocate related data. At present, BigQuery is only able to cluster partitioned tables, so be sure to use the cluster_by config in conjunction with the partition_by config.

Clustering on a single column:

{{
  config(
    materialized = "table",
    partition_by = "date(created_at)",
    cluster_by = "order_id",
  )
}}

select * from ...

Clustering on a multiple columns:

{{
  config(
    materialized = "table",
    partition_by = "date(created_at)",
    cluster_by = ["customer_id", "order_id"],
  )
}}

select * from ...

Persisting model descriptions

New in dbt v0.14.0

This functionality is new in dbt v0.14.0. For upgrading instructions, check out the docs

The persist_docs config can be used to persist the dbt description supplied for a model to the resulting BigQuery table or view. The persist_docs config is currently only supported on BigQuery for views and tables.

The persist_docs config can be specified in the dbt_project.yml file, or in a specific model.


models:
  # enable docs persistence for all models
  persist_docs:
    relation: true

or:

{{
  config(persist_docs={"relation": true})
}}

select ...

When the persist_docs option is configured appropriately, you'll be able to see your model descriptions in the BigQuery UI:

This description is generated by dbt

This description is generated by dbt

Managing KMS Encryption

Customer managed encryption keys can be configured for BigQuery tables using the kms_key_name model configuration.

Using KMS Encryption

To specify the KMS key name for a model (or a group of models), use the kms_key_name model configuration. The following example sets the kms_key_name for all of the models in the encrypted/ directory of your dbt project.


name: my_project
version: 1.0.0

...

models:
  my_project:
    encrypted:
      kms_key_name: 'projects/PROJECT_ID/locations/global/keyRings/test/cryptoKeys/quickstart'

BigQuery specific configurations


Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.