Dremio setup
- Maintained by: Dremio
- Authors: Dremio
- GitHub repo: dremio/dbt-dremio
- PyPI package: dbt-dremio
- Slack channel: db-dremio
- Supported dbt Core version: v1.8.0 and newer
- dbt Cloud support: Not Supported
- Minimum data platform version: Dremio 22.0
Installing dbt-dremio
Use pip to install the adapter. Before 1.8, installing the adapter would automatically install dbt-core and any additional dependencies. Beginning in 1.8, installing an adapter does not automatically install dbt-core, because adapter and dbt Core versions have been decoupled and installing an adapter should no longer overwrite an existing dbt-core installation.
Use the following command to install both packages:
python -m pip install dbt-core dbt-dremio
Configuring dbt-dremio
For Dremio-specific configuration, please refer to Dremio configs.
Follow the repository's link for OS dependencies.
Model contracts are not supported.
Prerequisites for Dremio Cloud
Before connecting from a dbt project to Dremio Cloud, follow these prerequisite steps:
- Ensure that you have the ID of the Sonar project that you want to use. See Obtaining the ID of a Project.
- Ensure that you have a personal access token (PAT) for authenticating to Dremio Cloud. See Creating a Token.
- Ensure that Python 3.9.x or later is installed on the system that you are running dbt on.
Prerequisites for Dremio Software
- Ensure that you are using version 22.0 or later.
- Ensure that Python 3.9.x or later is installed on the system that you are running dbt on.
- See Support Keys in the Dremio documentation for the steps.
- If you want to use TLS to secure the connection between dbt and Dremio Software, configure full wire encryption in your Dremio cluster. For instructions, see Configuring Wire Encryption.
Initializing a Project
- Run the command dbt init <project_name>.
- Select dremio as the database to use.
- Select one of these options to generate a profile for your project:
  - dremio_cloud for working with Dremio Cloud
  - software_with_username_password for working with a Dremio Software cluster and authenticating to the cluster with a username and a password
  - software_with_pat for working with a Dremio Software cluster and authenticating to the cluster with a personal access token
Next, configure the profile for your project.
Profiles
When you initialize a project, you create one of these three profiles. You must configure it before trying to connect to Dremio Cloud or Dremio Software.
- Profile for Dremio Cloud
- Profile for Dremio Software with Username/Password Authentication
- Profile for Dremio Software with Authentication Through a Personal Access Token
For descriptions of the configurations in these profiles, see Configurations.
Cloud

[project name]:
  outputs:
    dev:
      cloud_host: api.dremio.cloud
      cloud_project_id: [project ID]
      object_storage_source: [name]
      object_storage_path: [path]
      dremio_space: [name]
      dremio_space_folder: [path]
      pat: [personal access token]
      threads: [integer >= 1]
      type: dremio
      use_ssl: true
      user: [email address]
  target: dev

Software (Username/Password)

[project name]:
  outputs:
    dev:
      password: [password]
      port: [port]
      software_host: [hostname or IP address]
      object_storage_source: [name]
      object_storage_path: [path]
      dremio_space: [name]
      dremio_space_folder: [path]
      threads: [integer >= 1]
      type: dremio
      use_ssl: [true|false]
      user: [username]
  target: dev

Software (Personal Access Token)

[project name]:
  outputs:
    dev:
      pat: [personal access token]
      port: [port]
      software_host: [hostname or IP address]
      object_storage_source: [name]
      object_storage_path: [path]
      dremio_space: [name]
      dremio_space_folder: [path]
      threads: [integer >= 1]
      type: dremio
      use_ssl: [true|false]
      user: [username]
  target: dev
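As an illustration, a filled-in profile for Dremio Software with personal access token authentication might look like the sketch below. The project name, host, username, source, space, and folder names are hypothetical placeholders, as is the DREMIO_PAT environment variable. The token is read with dbt's env_var function so that it does not have to be stored in the file.

```yaml
# Hypothetical example: every value below is a placeholder chosen for illustration.
my_dremio_project:
  outputs:
    dev:
      type: dremio
      software_host: dremio.example.com
      port: 9047
      user: dbt_user
      # Read the token from an environment variable (the variable name is an assumption).
      pat: "{{ env_var('DREMIO_PAT') }}"
      # true assumes full wire encryption is configured in the cluster.
      use_ssl: true
      object_storage_source: my_datalake
      object_storage_path: dbt.staging
      dremio_space: analytics
      dremio_space_folder: dbt.marts
      threads: 4
  target: dev
```

After saving the profile, running dbt debug from the project directory is a quick way to check that dbt can reach the cluster with these settings.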
Configurations Common to Profiles for Dremio Cloud and Dremio Software
Configuration | Required? | Default Value | Description |
---|---|---|---|
type | Yes | dremio | Auto-populated when creating a Dremio project. Do not change this value. |
threads | Yes | 1 | The number of threads the dbt project runs on. |
object_storage_source | No | $scratch | The name of the filesystem in which to create tables, materialized views, tests, and other objects. The dbt alias is datalake. This name corresponds to the name of a source in the Object Storage section of the Datasets page in Dremio, for example "Samples". |
object_storage_path | No | no_schema | The path in the filesystem in which to create objects. The default is the root level of the filesystem. The dbt alias is root_path. Nested folders in the path are separated with periods. This value corresponds to the folder path shown for the source in the Datasets page in Dremio, for example "samples.dremio.com.Dremio University". |
dremio_space | No | @<username> | The name of the Dremio space in which to create views. The dbt alias is database. This value corresponds to the name of a space in the Spaces section of the Datasets page in Dremio. |
dremio_space_folder | No | no_schema | The folder in the Dremio space in which to create views. The default is the top level in the space. The dbt alias is schema. Nested folders are separated with periods. This value corresponds to the folder path shown for the space in the Datasets page in Dremio, for example Folder1.Folder2. |
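The fragment below sketches how the four location settings might fit together inside a target. The source, space, and folder names are hypothetical, apart from Folder1.Folder2, which is taken from the table above. It is meant only to show the period-separated notation for nested folders.

```yaml
# Fragment of a target in profiles.yml; names are hypothetical placeholders.
# Tables and other physical objects are created under the object storage source and path.
object_storage_source: my_datalake
object_storage_path: raw.sales        # folder "sales" nested inside folder "raw"
# Views are created under the space and space folder.
dremio_space: analytics
dremio_space_folder: Folder1.Folder2  # folder "Folder2" nested inside "Folder1"
```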
Configurations in Profiles for Dremio Cloud
Configuration | Required? | Default Value | Description |
---|---|---|---|
cloud_host | Yes | api.dremio.cloud | US Control Plane: api.dremio.cloud; EU Control Plane: api.eu.dremio.cloud |
user | Yes | None | Email address used as a username in Dremio Cloud |
pat | Yes | None | The personal access token to use for authentication. See Personal Access Tokens for instructions about obtaining a token. |
cloud_project_id | Yes | None | The ID of the Sonar project in which to run transformations. |
use_ssl | Yes | true | The value must be true. |
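For a project hosted on the EU control plane, the Cloud-specific settings might be combined as in this sketch. The email address, the all-zero project ID, and the DREMIO_CLOUD_PAT environment variable name are placeholders, not real values.

```yaml
# Hypothetical Dremio Cloud target on the EU control plane.
dev:
  type: dremio
  cloud_host: api.eu.dremio.cloud
  # Placeholder ID; use the ID of your own Sonar project.
  cloud_project_id: 00000000-0000-0000-0000-000000000000
  user: analyst@example.com
  # Token read from an environment variable (the variable name is an assumption).
  pat: "{{ env_var('DREMIO_CLOUD_PAT') }}"
  use_ssl: true   # must be true for Dremio Cloud
  threads: 1
```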
Configurations in Profiles for Dremio Software
Configuration | Required? | Default Value | Description |
---|---|---|---|
software_host | Yes | None | The hostname or IP address of the coordinator node of the Dremio cluster. |
port | Yes | 9047 | Port for Dremio Software cluster API endpoints. |
user | Yes | None | The username of the account to use when logging into the Dremio cluster. |
password | Yes, if you are not using the pat configuration. | None | The password of the account to use when logging into the Dremio cluster. |
pat | Yes, if you are not using the user and password configurations. | None | The personal access token to use for authenticating to Dremio. See Personal Access Tokens for instructions about obtaining a token. The use of a personal access token takes precedence if values for the three configurations user, password and pat are specified. |
use_ssl | Yes | true | Acceptable values are true and false. If the value is set to true, ensure that full wire encryption is configured in your Dremio cluster. See Prerequisites for Dremio Software. |
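A minimal sketch of a Dremio Software target that uses username and password authentication without TLS follows. The IP address, username, and DREMIO_PASSWORD environment variable name are hypothetical. The commented-out pat line is a reminder that a token, if present, takes precedence over the password.

```yaml
# Hypothetical Dremio Software target using username/password authentication.
dev:
  type: dremio
  software_host: 192.0.2.10   # placeholder address of the coordinator node
  port: 9047                  # default port from the table above
  user: dbt_user
  # Password read from an environment variable (the variable name is an assumption).
  password: "{{ env_var('DREMIO_PASSWORD') }}"
  # pat: "[personal access token]"   # if set together with user and password, the token is used
  use_ssl: false              # set to true only if full wire encryption is configured
  threads: 1
```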