Apache Hive configurations
Configuring tables
When materializing a model as table
, you may include several optional configs that are specific to the dbt-hive plugin, in addition to the standard model configs.
Option | Description | Required? | Example |
---|---|---|---|
partition_by | partition by a column, typically a directory per partition is created | No | partition_by=['name'] |
clustered_by | second level division of a partitioned column | No | clustered_by=['age'] |
file_format | underlying storage format of the table, see https://cwiki.apache.org/confluence/display/Hive/FileFormats for supported formats | No | file_format='PARQUET' |
location | storage location, typically an hdfs path | No | LOCATION='/user/etl/destination' |
comment | comment for the table | No | comment='this is the cleanest model' |
Incremental models
Supported modes for incremental model:
append
(default): Insert new records without updating or overwriting any existing data.insert_overwrite
: For new records, insert data. When used along with partition clause, update data for changed record and insert data for new records.
Example: Using partition_by config option
hive_partition_by.sql
{{
config(
materialized='table',
unique_key='id',
partition_by=['city'],
)
}}
with source_data as (
select 1 as id, "Name 1" as name, "City 1" as city,
union all
select 2 as id, "Name 2" as name, "City 2" as city,
union all
select 3 as id, "Name 3" as name, "City 2" as city,
union all
select 4 as id, "Name 4" as name, "City 1" as city,
)
select * from source_data
In the above example, a sample table is created with partition_by and other config options. One thing to note when using partition_by option is that the select query should always have the column name used in partition_by option as the last one, as can be seen for the city
column name used in the above query. If the partition_by clause is not the same as the last column in select statement, Hive will flag an error when trying to create the model.
Was this page helpful?
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
0