Redshift setup
Overview of dbt-redshift
- Maintained by: dbt Labs
- Authors: core dbt maintainers
- GitHub repo: dbt-labs/dbt-redshift
- PyPI package:
dbt-redshift
- Slack channel: #db-redshift
- Supported dbt Core version: v0.10.0 and newer
- dbt Cloud support: Supported
- Minimum data platform version: n/a
Installing dbt-redshift
pip is the easiest way to install the adapter:
pip install dbt-redshift
Installing dbt-redshift
will also install dbt-core
and any other dependencies.
Configuring dbt-redshift
For Redshift-specific configuration, refer to Redshift Configuration.
For further info, refer to the GitHub repository: dbt-labs/dbt-redshift.
Authentication Methods
Password-based authentication
company-name:
target: dev
outputs:
dev:
type: redshift
host: hostname.region.redshift.amazonaws.com
user: username
password: password1
port: 5439
dbname: analytics
schema: analytics
threads: 4
keepalives_idle: 240 # default 240 seconds
connect_timeout: 10 # default 10 seconds
# search_path: public # optional, not recommended
sslmode: [optional, set the sslmode used to connect to the database (in case this parameter is set, will look for ca in ~/.postgresql/root.crt)]
ra3_node: true # enables cross-database sources
IAM Authentication
To set up a Redshift profile using IAM Authentication, set the method
parameter to iam
as shown below. Note that a password is not required when
using IAM Authentication. For more information on this type of authentication,
consult the Redshift Documentation
and boto3
docs
on generating user credentials with IAM Auth.
If you receive the "You must specify a region" error when using IAM
Authentication, then your aws credentials are likely misconfigured. Try running
aws configure
to set up AWS access keys, and pick a default region. If you have any questions,
please refer to the official AWS documentation on Configuration and credential file settings.
my-redshift-db:
target: dev
outputs:
dev:
type: redshift
method: iam
cluster_id: CLUSTER_ID
host: hostname.region.redshift.amazonaws.com
user: alice
iam_profile: data_engineer # optional
iam_duration_seconds: 900 # optional
autocreate: true # optional
db_groups: ['ANALYSTS'] # optional
# Other Redshift configs:
port: 5439
dbname: analytics
schema: analytics
threads: 4
keepalives_idle: 240 # default 240 seconds
connect_timeout: 10 # default 10 seconds
retries: 1 # default 1 retry on error/timeout when opening connections
# search_path: public # optional, but not recommended
sslmode: [optional, set the sslmode used to connect to the database (in case this parameter is set, will look for ca in ~/.postgresql/root.crt)]
ra3_node: true # enables cross-database sources
Specifying an IAM Profile
The iam_profile
config option for Redshift profiles is new in dbt v0.18.0
When the iam_profile
configuration is set, dbt will use the specified profile from your ~/.aws/config
file instead of using the profile name default
Redshift notes
sort
and dist
keys
Where possible, dbt enables the use of sort
and dist
keys. See the section on Redshift specific configurations.
keepalives_idle
If the database closes its connection while dbt is waiting for data, you may see the error SSL SYSCALL error: EOF detected
. Lowering the keepalives_idle
value may prevent this, because the server will send a ping to keep the connection active more frequently.
dbt's default setting is 240 (seconds), but can be configured lower (perhaps 120 or 60), at the cost of a chattier network connection.