
About job deployments

Running dbt in production means setting up a system to run a dbt job on a schedule, rather than running dbt commands manually from the command line. Your production dbt jobs should create the tables and views that your business intelligence tools and end users query. Before continuing, make sure you understand dbt's approach to managing environments.

In addition to setting up a schedule, there are other considerations when setting up dbt to run in production:

  • The complexity involved in creating a new dbt job or editing an existing one.
  • Setting up notifications if a step within your job returns an error code (for example, a model can't be built or a test fails).
  • Accessing logs to help debug any issues.
  • Pulling the latest version of your git repo before running dbt (continuous deployment).
  • Running your dbt project before merging code into master (continuous integration).
  • Allowing access for team members that need to collaborate on your dbt project.

Run dbt in production

If you want to run dbt jobs on a schedule, you can use tools such as dbt Cloud, Airflow, Prefect, Dagster, an automation server, or cron.

dbt Cloud

We've built dbt Cloud to empower data teams to easily run dbt in production. If you're interested in trying out dbt Cloud, you can sign up for an account.

dbt Cloud enables you to:

  • run your jobs on a schedule
  • view logs for any historical invocation of dbt
  • configure error notifications
  • render your project's documentation

In general, the dbt Cloud application deployment models fall into two categories: Multi Tenant and Single Tenant. These deployments are hosted on infrastructure managed by dbt Labs. Both models leverage AWS infrastructure as described in the Architecture section.

For more information on these deployment models, refer to our tenancy page.

If you’re interested in learning more about an Enterprise plan, please contact us.

Webhooks for your jobs

With webhooks in dbt Cloud, you can send events (notifications) about your dbt jobs to your other systems like Slack, PagerDuty, and so on. This can be useful for automating some of your workflows.
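A service that receives these events would typically confirm that a request really came from dbt Cloud before acting on it. The sketch below assumes that each webhook request carries, in its Authorization header, an HMAC-SHA256 hex digest of the raw request body keyed with the webhook's secret; confirm this against the dbt Cloud webhooks documentation before relying on it. The function name, the sample secret, and the sample payload are hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(secret: str, raw_body: bytes, auth_header: str) -> bool:
    """Return True if auth_header equals the HMAC-SHA256 hex digest of raw_body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), auth_header.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical secret and payload, for illustration only.
    secret = "example-secret"
    body = b'{"eventType": "job.run.completed"}'
    header = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    print(is_valid_signature(secret, body, header))         # matching body
    print(is_valid_signature(secret, body + b" ", header))  # tampered body
```

Only after the check passes would the receiver forward the event to Slack, PagerDuty, or another system.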

Airflow

If your organization is using Airflow, there are a number of ways you can run your dbt jobs, including:

  • Installing the dbt Cloud Provider to orchestrate dbt Cloud jobs. This package contains multiple Hooks, Operators, and Sensors to complete various actions within dbt Cloud.
    (Figures: Airflow DAG using DbtCloudRunJobOperator; dbt Cloud job triggered by Airflow.)
  • Invoking dbt Core jobs through the BashOperator. In this case, be sure to install dbt into a virtual environment to avoid issues with conflicting dependencies between Airflow and dbt.

For more details on both of these methods, including example implementations, check out this guide.
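For the BashOperator route, the command string needs to call the dbt executable inside the dedicated virtual environment rather than whatever is on Airflow's own path. Here is a minimal sketch of building such a string with only the Python standard library. The helper name and the example paths are hypothetical, and the resulting string is meant to be passed as the operator's bash_command argument.

```python
import shlex
from pathlib import PurePosixPath


def dbt_bash_command(venv_dir: str, project_dir: str, *dbt_args: str) -> str:
    """Build a shell command that runs dbt from a specific virtual environment."""
    dbt_executable = PurePosixPath(venv_dir) / "bin" / "dbt"
    parts = [str(dbt_executable), *dbt_args, "--project-dir", project_dir]
    # shlex.quote keeps paths with spaces or shell metacharacters intact.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Hypothetical paths; adjust to your deployment.
    print(dbt_bash_command("/opt/dbt-venv", "/opt/analytics", "run"))
```

Calling the executable by its full path avoids having to activate the environment inside the task, which keeps Airflow's and dbt's dependencies separate.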

Prefect

If your organization is using Prefect, the way you will run your jobs depends on the dbt version you're on, and whether you're orchestrating dbt Cloud or dbt Core jobs.

The available options are described below.

(Figure: Prefect DAG using a dbt Cloud job run flow.)

On Prefect 2

dbt Cloud

Use the trigger_dbt_cloud_job_run_and_wait_for_completion flow. While the job executes, you can use the Prefect user interface (UI) to poll dbt Cloud and see whether the job completes without failures.

(Figure: dbt Cloud job triggered by Prefect.)
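The idea of waiting on a job run can be sketched independently of Prefect. The loop below repeatedly asks a caller-supplied function for the run's status until the run reaches a terminal state. The numeric status codes are an assumption about how dbt Cloud reports run status (10 for success, 20 for error, 30 for cancelled), and the function and parameter names are hypothetical. In practice, prefect-dbt handles this for you.

```python
import time
from typing import Callable

# Assumed dbt Cloud run status codes; verify against the dbt Cloud API docs.
SUCCESS, ERROR, CANCELLED = 10, 20, 30
TERMINAL_STATUSES = {SUCCESS, ERROR, CANCELLED}


def wait_for_run(
    get_status: Callable[[], int],
    poll_seconds: float = 10.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status until the run finishes; return True only on success."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status == SUCCESS
        sleep(poll_seconds)
    raise TimeoutError(f"run did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Simulated statuses: two non-terminal values, then success.
    statuses = iter([1, 3, 10])
    print(wait_for_run(lambda: next(statuses), sleep=lambda _: None))
```

The status getter and the sleep function are injectable so the loop can be exercised without network access.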

dbt Core

Use the trigger_dbt_cli_command task.

For details on both of these methods, see prefect-dbt docs.

On Prefect 1

dbt Cloud

Trigger dbt Cloud jobs with the DbtCloudRunJob task. Running this task will generate a markdown artifact viewable in the Prefect UI. The artifact will contain links to the dbt artifacts generated as a result of the job run.

dbt Core

Use the DbtShellTask to schedule, execute, and monitor your dbt runs. Use the supported ShellTask to execute dbt commands through the shell.

Dagster

If your organization is using Dagster, you can use the dagster_dbt library to integrate dbt commands into your pipelines. This library supports the execution of dbt through dbt Cloud, dbt CLI and the dbt RPC server. Running dbt from Dagster automatically aggregates metadata about your dbt runs. Check out the example pipeline for details.

Automation servers

Automation servers, such as CodeDeploy, GitLab CI/CD, Bamboo, and Jenkins, can be used to schedule bash commands for dbt. They also provide a UI for viewing command-line logs and integrate with your git repository.

Cron

Cron is a decent way to schedule bash commands. However, while it may seem like an easy way to schedule a job, writing code to cover all of the additional features of a production deployment often makes this route more complex than the other options listed here.
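To give a sense of that extra code, here is a minimal sketch of a wrapper script that cron could invoke. It runs a list of commands in order, logs each one, stops at the first non-zero exit code, and calls a notification hook. The function names, the notify hook, and the example steps are hypothetical. The runner is injectable so the logic can be exercised without dbt installed.

```python
import logging
import subprocess
from typing import Callable, Sequence

logger = logging.getLogger("dbt_job")


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_job(
    steps: Sequence[Sequence[str]],
    notify: Callable[[str], None],
    run: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run each step in order; stop and notify on the first non-zero exit code."""
    for step in steps:
        command = " ".join(step)
        logger.info("running: %s", command)
        code = run(step)
        if code != 0:
            notify(f"step '{command}' failed with exit code {code}")
            return code
    return 0


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Simulated runner so the sketch works without dbt installed:
    # "dbt run" succeeds and "dbt test" fails with exit code 1.
    fake_run = lambda cmd: 1 if cmd[-1] == "test" else 0
    exit_code = run_job([["dbt", "run"], ["dbt", "test"]], notify=print, run=fake_run)
    print("exit code:", exit_code)
```

Even this sketch leaves out log retention, pulling the latest code, and access control, which the other options in this list provide out of the box.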
