Databricks and Apache Iceberg
Databricks is built on Delta Lake and stores data in the Delta table format. Databricks does not support writing to external Iceberg catalogs. It can, however, create both managed Iceberg tables and Iceberg-compatible Delta tables by storing the table metadata in both Iceberg and Delta formats, making them readable from external clients. On the read side, Unity Catalog does support reading from external Iceberg catalogs.
When a dbt model is configured with the UniForm table properties, Databricks duplicates the Delta metadata as Iceberg-compatible metadata. This allows external Iceberg compute engines to read the table from Unity Catalog.
Example SQL:
```sql
{{ config(
    tblproperties = {
      'delta.enableIcebergCompatV2': 'true',
      'delta.universalFormat.enabledFormats': 'iceberg'
    }
) }}
```
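Put together, a complete model using UniForm might look like the following sketch (the model file name is hypothetical; `jaffle_shop_customers` is the same example model used later on this page):

```sql
-- models/uniform_customers.sql (hypothetical file name)
{{ config(
    materialized = 'table',
    tblproperties = {
      'delta.enableIcebergCompatV2': 'true',
      'delta.universalFormat.enabledFormats': 'iceberg'
    }
) }}

select * from {{ ref('jaffle_shop_customers') }}
```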
To set up Databricks for reading and querying external tables, configure Lakehouse Federation and establish the catalog as a foreign catalog. This is configured outside of dbt; once completed, the foreign catalog is available as another database you can query.
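Once the foreign catalog exists, you can reference its tables from dbt like any other database, for example as a source. A minimal sketch, assuming a hypothetical foreign catalog `my_foreign_catalog` with an `analytics.orders` table:

```yaml
# models/sources.yml -- the catalog, schema, and table names below are
# hypothetical; use the names of your foreign catalog and its tables.
version: 2

sources:
  - name: federated_iceberg
    database: my_foreign_catalog  # the foreign catalog created via Lakehouse Federation
    schema: analytics
    tables:
      - name: orders
```

Models can then query it with `{{ source('federated_iceberg', 'orders') }}`.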
We do not currently support the new Private Preview features of Databricks managed Iceberg tables.
dbt Catalog Integration Configurations for Databricks
The following table outlines the configuration fields required to set up a catalog integration for Iceberg-compatible tables in Databricks.
| Field | Description | Required | Accepted values |
|---|---|---|---|
| `name` | Name of the catalog on Databricks | Yes | e.g., `my_unity_catalog` |
| `catalog_type` | Type of catalog | Yes | `unity`, `hive_metastore` |
| `external_volume` | Storage location of your data | Optional | See Databricks documentation |
| `table_format` | Table format your dbt models will be materialized as | Optional | Defaults to `delta` unless overridden in your Databricks account |
| `adapter_properties` | Additional platform-specific properties | Optional | See below for accepted values |
Adapter Properties
These are additional configurations that can be nested under `adapter_properties` for more configurability.
| Field | Description | Required | Accepted values |
|---|---|---|---|
| `file_format` | File format for the data files backing your dbt models | Optional | e.g., `parquet` |
Example:

```yaml
adapter_properties:
  file_format: parquet
```
Configure catalog integration for managed Iceberg tables
- Create a
catalogs.yml
at the top level of your dbt project (at the same level as dbt_project.yml)
An example of Unity Catalog as the catalog:

```yaml
catalogs:
  - name: unity_catalog
    active_write_integration: unity_catalog_integration
    write_integrations:
      - name: unity_catalog_integration
        table_format: iceberg
        catalog_type: unity
        adapter_properties:
          file_format: parquet
```
- Add the `catalog_name` config parameter in either the SQL config (inside the `.sql` model file), a property file (in the model folder), or your `dbt_project.yml`.
An example of `iceberg_model.sql`:

```sql
{{
  config(
    materialized = 'table',
    catalog_name = 'unity_catalog'
  )
}}

select * from {{ ref('jaffle_shop_customers') }}
```
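The same configuration can instead live in a model properties file; a minimal sketch (the file path `models/schema.yml` is just an example):

```yaml
version: 2

models:
  - name: iceberg_model
    config:
      materialized: table
      catalog_name: unity_catalog
```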
- Execute the dbt model with `dbt run -s iceberg_model`.