Apache Hive Profile

Overview of dbt-hive

Maintained by: Cloudera
Author: Cloudera
Source: GitHub
dbt Cloud: Currently unsupported

Connection Methods

dbt-hive can connect to Apache Hive and Cloudera Data Platform clusters. The Impyla library is used to establish connections to Hive.

dbt-hive supports two transport mechanisms:

  • binary
  • HTTP(S)

The default mechanism is binary. To use HTTP(S) transport instead, set the boolean option use_http_transport to true.
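As an illustration, the transport-related keys of one profile target might look like the fragment below. It is a sketch rather than a complete profile: the host name is hypothetical, and port 10001 and the path cliservice are HiveServer2's stock defaults for HTTP mode, which may not match your cluster. The http_path and use_ssl options are the ones shown in the LDAP profile on this page.

```yaml
# Fragment of one target under "outputs:" in ~/.dbt/profiles.yml (not a complete profile)
dev:
  type: hive
  host: hive.example.com      # hypothetical host name
  port: 10001                 # assumed: HiveServer2's default HTTP port (binary default is 10000)
  use_http_transport: true    # switch from the default binary transport to HTTP(S)
  http_path: cliservice       # assumed: HiveServer2's default HTTP path
  use_ssl: true               # use HTTPS rather than plain HTTP
```

Leaving use_http_transport out, or setting it to false, keeps the default binary transport.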

Authentication Methods

dbt-hive supports two authentication mechanisms:

  • insecure: no authentication is used; recommended only for testing.
  • ldap: authentication via LDAP.

Insecure

This method is only recommended if you have a local install of Hive and want to test out the dbt-hive adapter.

~/.dbt/profiles.yml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: hive
      host: localhost
      port: [port]
      schema: [schema name]

LDAP

LDAP allows you to authenticate with a username and password when Hive is configured with LDAP authentication. LDAP is supported over both the binary and HTTP(S) transport mechanisms.

This is the recommended authentication mechanism to use with Cloudera Data Platform (CDP).

~/.dbt/profiles.yml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: hive
      host: [host name]
      http_path: [optional, http path to Hive]
      port: [port]
      auth_type: ldap
      use_http_transport: [true / false]
      use_ssl: [true / false] # TLS should always be used with LDAP to ensure secure transmission of credentials
      username: [username]
      password: [password]
      schema: [schema name]
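
To keep credentials out of the profile file itself, the username and password can be read from environment variables with dbt's env_var function. The filled-in profile below is only a sketch: the host, schema, and the variable names DBT_HIVE_USER and DBT_HIVE_PASSWORD are hypothetical, and port 10000 is HiveServer2's default binary port, which may differ on your cluster.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: hive
      host: hive.example.com                           # hypothetical host
      port: 10000                                      # assumed: HiveServer2's default binary port
      auth_type: ldap
      use_http_transport: false                        # binary transport (the default)
      use_ssl: true                                    # keep TLS on so credentials are encrypted in transit
      username: "{{ env_var('DBT_HIVE_USER') }}"       # hypothetical variable name
      password: "{{ env_var('DBT_HIVE_PASSWORD') }}"   # hypothetical variable name
      schema: analytics                                # hypothetical schema
```

Both variables must be exported in the shell that runs dbt; env_var raises an error at parse time if a variable is missing and no default is given.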

Note: When creating a workload user in CDP, make sure the user has CREATE, SELECT, ALTER, INSERT, UPDATE, DROP, INDEX, READ, and WRITE permissions. If the user needs to execute GRANT statements, also configure the appropriate GRANT permissions for them. With Apache Ranger, permission to run GRANT is typically set using the "Delegate Admin" option. For more information, see grants and on-run-start & on-run-end.
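
Once the workload user is allowed to run GRANT statements, dbt's grants config can declare the privileges to apply when a model is built. The fragment below is a minimal sketch of a model properties file; the file path, the model name my_model, and the user reporting_user are all hypothetical, and whether the grant succeeds depends on how authorization is set up on your cluster.

```yaml
# models/schema.yml (hypothetical file)
version: 2

models:
  - name: my_model                   # hypothetical model
    config:
      grants:
        select: ['reporting_user']   # hypothetical user to receive SELECT
```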

Installation and Distribution

dbt's adapter for Apache Hive is managed in its own repository, dbt-hive. To use it, you must install the dbt-hive plugin.

Using pip

The following command installs the latest version of dbt-hive, along with the required versions of dbt-core and the Impyla driver used for connections.

pip install dbt-hive

Supported Functionality

Name                                              Supported
Materialization: Table                            Yes
Materialization: View                             Yes
Materialization: Incremental - Append             Yes
Materialization: Incremental - Insert+Overwrite   Yes
Materialization: Incremental - Merge              No
Materialization: Ephemeral                        No
Seeds                                             Yes
Tests                                             Yes
Snapshots                                         No
Documentation                                     Yes
Authentication: LDAP                              Yes
Authentication: Kerberos                          No
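
As a sketch of how the supported materializations might be selected, the dbt_project.yml fragment below uses dbt's standard materialized and incremental_strategy configs. The project name my_project and the folder names are hypothetical, and the strategy names append and insert_overwrite are assumed to follow the usual dbt spelling of the two incremental strategies listed as supported above.

```yaml
# dbt_project.yml fragment (project and folder names are hypothetical)
models:
  my_project:
    staging:
      +materialized: view
    marts:
      +materialized: incremental
      +incremental_strategy: insert_overwrite   # assumed name; "append" is the other supported strategy
```

Merge-based incremental models, ephemeral models, and snapshots are listed as unsupported, so projects ported from other adapters may need those models reconfigured.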