Apache Impala Profile

Overview of dbt-impala

Maintained by: Cloudera
Author: Cloudera
Source: GitHub
dbt Cloud: Currently unsupported
dbt Slack channel Link to channel

Connection Methods

dbt-impala can connect to Apache Impala and Cloudera Data Platform clusters.

The Impyla library is used to establish connections to Impala.

Two transport mechanisms are supported:

  • binary
  • HTTP(S)

The default mechanism is binary. To use HTTP transport instead, set the boolean option use_http_transport to true.
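
As a rough illustration of switching transports, the fragment below shows only the keys of a single output that relate to the connection. It is a sketch rather than a complete profile: authentication, dbname, and schema keys are omitted, and the bracketed values are placeholders in the same style as the full examples on this page.

```yaml
# Fragment of one output in ~/.dbt/profiles.yml (other keys omitted)
dev:
  type: impala
  host: [host name]
  port: [port]
  use_http_transport: true   # leave unset or false to keep the default binary transport
  http_path: [optional, http path to Impala]
```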

Authentication Methods

dbt-impala supports three authentication mechanisms:

  • insecure: No authentication is used; recommended only for testing.
  • ldap: Authentication via LDAP.
  • kerberos: Authentication via Kerberos (GSSAPI).

Insecure

This method is only recommended if you have a local install of Impala and want to test out the dbt-impala adapter.

~/.dbt/profiles.yml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: impala
      host: localhost
      port: 21050
      dbname: [db name] # this should be the same as the schema name provided below
      schema: [schema name]

LDAP

LDAP lets you authenticate with a username and password when Impala is configured with LDAP authentication. It is supported over both the binary and HTTP transport mechanisms.

This is the recommended authentication mechanism to use with Cloudera Data Platform.

~/.dbt/profiles.yml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: impala
      host: [host name]
      http_path: [optional, http path to Impala]
      port: [port]
      auth_type: ldap
      use_http_transport: [true / false]
      use_ssl: [true / false] # TLS should always be used with LDAP to ensure secure transmission of credentials
      username: [username]
      password: [password]
      dbname: [db name] # this should be the same as the schema name provided below
      schema: [schema name]
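
Because the LDAP profile carries a password, one option is to keep credentials out of the file by reading them from environment variables with dbt's env_var function. The sketch below assumes two hypothetical variable names, DBT_IMPALA_USER and DBT_IMPALA_PASSWORD, which you would export in your shell before running dbt; any names work as long as they match.

```yaml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: impala
      host: [host name]
      port: [port]
      auth_type: ldap
      use_ssl: true
      # Hypothetical variable names; export them before invoking dbt
      username: "{{ env_var('DBT_IMPALA_USER') }}"
      password: "{{ env_var('DBT_IMPALA_PASSWORD') }}"
      dbname: [db name]
      schema: [schema name]
```

If either variable is unset, dbt raises an error when it parses the profile, so a missing credential is caught before any connection attempt.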

Kerberos

The Kerberos authentication mechanism uses GSSAPI to share Kerberos credentials when Impala is configured with Kerberos Auth.

~/.dbt/profiles.yml
your_profile_name:
  target: dev
  outputs:
    dev:
      type: impala
      host: [hostname]
      port: [port]
      auth_type: GSSAPI
      kerberos_service_name: [kerberos service name]
      use_http_transport: true
      use_ssl: true # TLS should always be used to ensure secure transmission of credentials
      dbname: [db name] # this should be the same as the schema name provided below
      schema: [schema name]

Installation and Distribution

dbt's adapter for Apache Impala is managed in its own repository, dbt-impala. To use it, you must install the dbt-impala plugin.

Using pip

The following command installs the latest version of dbt-impala, along with the required version of dbt-core and the impyla driver used for connections.

pip install dbt-impala

Supported Functionality

Name                                               Supported
Materialization: Table                             Yes
Materialization: View                              Yes
Materialization: Incremental - Append              Yes
Materialization: Incremental - Insert+Overwrite    Yes
Materialization: Incremental - Merge               No
Materialization: Ephemeral                         No
Seeds                                              Yes
Tests                                              Yes
Snapshots                                          Yes
Documentation                                      Yes
Authentication: LDAP                               Yes
Authentication: Kerberos                           Yes
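
To show how one of the supported incremental materializations might be selected, here is a minimal dbt_project.yml sketch. The project name my_project and the folder name events are hypothetical, and the strategy value insert_overwrite is an assumption about how the adapter names the Insert+Overwrite strategy; check the adapter's configuration documentation before relying on it.

```yaml
# dbt_project.yml (fragment)
models:
  my_project:          # hypothetical project name
    events:            # hypothetical folder under models/
      +materialized: incremental
      # Assumed strategy name for Insert+Overwrite; Merge is listed as unsupported
      +incremental_strategy: insert_overwrite
```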