
Quickstart with dbt Mesh


    Introduction

    dbt Mesh is a framework that helps organizations scale their teams and data assets effectively. It promotes governance best practices and breaks large projects into manageable sections for faster data development. dbt Mesh is available for dbt Cloud Enterprise accounts.

    This guide teaches you how to set up a multi-project design using the foundational concepts of dbt Mesh and how to implement a data mesh in dbt Cloud:

    • Set up a foundational project called “Jaffle | Data Analytics”
    • Set up a downstream project called “Jaffle | Finance”
    • Add model access, versions, and contracts
    • Set up a dbt Cloud job that is triggered on completion of an upstream job

    For more information on why data mesh is important, read this post: What is data mesh? The definition and importance of data mesh.


    Prerequisites

    To leverage dbt Mesh, you need the following:

    • A dbt Cloud account on the Enterprise plan
    • Projects running dbt v1.6 or later (cross-project references require it)

    This guide assumes you have experience with or fundamental knowledge of dbt. Take the dbt Fundamentals course first if you are brand new to dbt.

    Create and configure two projects

    In this section, you'll create two new, empty projects in dbt Cloud to serve as your foundational and downstream projects:

    • Foundational projects (or upstream projects) typically contain core models and datasets that serve as the base for further analysis and reporting.
    • Downstream projects build on these foundations, often adding more specific transformations or business logic for dedicated teams or purposes.

    For example, the always-enterprising and fictional account "Jaffle Labs" will create two projects for their data analytics and finance teams: Jaffle | Data Analytics and Jaffle | Finance.

    Create two new dbt Cloud projects named 'Jaffle | Data Analytics' and 'Jaffle | Finance'

    To create a new project in dbt Cloud:

    1. From Account settings, click + New Project.
    2. Enter a project name and click Continue.
      • Use "Jaffle | Data Analytics" for one project
      • Use "Jaffle | Finance" for the other project
    3. Select your data platform, then click Next to set up your connection.
    4. In the Configure your environment section, enter the Settings for your new project.
    5. Click Test Connection. This verifies that dbt Cloud can access your data platform account.
    6. Click Next if the test succeeded. If it fails, you might need to go back and double-check your settings.
      • For this guide, make sure you create a single development environment and a single deployment environment per project.
        • For "Jaffle | Data Analytics", set the default database to jaffle_da.
        • For "Jaffle | Finance", set the default database to jaffle_finance.
    Navigate to 'Account settings' and then click + 'New Project' to create new projects in dbt Cloud
    7. Continue the prompts to complete the project setup. Once configured, each project should have:
      • A data platform connection
      • New git repo
      • One or more environments (such as development, deployment)

    Create a production environment

    In dbt Cloud, each project can have one deployment environment designated as "Production". You must set up a "Production" or "Staging" deployment environment for each project you want to "mesh" together. This enables you to leverage dbt Explorer in the later steps of this guide.

    To set a production environment:

    1. Navigate to Deploy -> Environments, then click Create New Environment.
    2. Select Deployment as the environment type.
    3. Under Set deployment type, select the Production button.
    4. Select the dbt version.
    5. Continue filling out the fields as necessary in the Deployment connection and Deployment credentials sections.
    6. Click Test Connection to confirm the deployment connection.
    7. Click Save to create a production environment.
    Set your production environment as the default environment in your Environment Settings

    Set up a foundational project

    This upstream project is where you build your core data assets. This project will contain the raw data sources, staging models, and core business logic.

    dbt Cloud enables data practitioners to develop in their tool of choice and comes equipped with a local dbt Cloud CLI or in-browser dbt Cloud IDE.

    In this section of the guide, you will set the "Jaffle | Data Analytics" project as your foundational project using the dbt Cloud IDE.

    1. First, navigate to the Develop page to verify your setup.
    2. Click Initialize dbt project if you’ve started with an empty repo:
    Initialize repo
    3. Delete the models/example folder.
    4. Navigate to the dbt_project.yml file and rename the project (line 5) from my_new_project to analytics.
    5. In your dbt_project.yml file, remove lines 39-42 (the my_new_project model reference). A sketch of the result follows this list.
    6. In the File Explorer, hover over the project directory and click the ..., then select Create folder.
    7. Create two new folders: models/staging and models/core.
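
    If you want to sanity-check those dbt_project.yml edits, the relevant parts of the file should now look roughly like this sketch of the default scaffold (your generated file may differ slightly):

      dbt_project.yml (abridged sketch)

      name: 'analytics'   # renamed from 'my_new_project' on line 5
      version: '1.0.0'
      profile: 'default'

      # The generated models: block that referenced my_new_project
      # (lines 39-42) has been deleted.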

    Staging layer

    Now that you've set up the foundational project, let's start building the data assets. Set up the staging layer as follows:

    1. Create a new YAML file models/staging/sources.yml.

    2. Declare the sources by copying the following into the file and clicking Save.

      models/staging/sources.yml

      version: 2

      sources:
        - name: jaffle_shop
          description: This is a replica of the Postgres database used by our app
          database: raw
          schema: jaffle_shop
          tables:
            - name: customers
              description: One record per customer.
            - name: orders
              description: One record per order. Includes cancelled and deleted orders.
    3. Create a models/staging/stg_customers.sql file to select from the customers table in the jaffle_shop source.

      models/staging/stg_customers.sql

      select
          id as customer_id,
          first_name,
          last_name

      from {{ source('jaffle_shop', 'customers') }}
    4. Create a models/staging/stg_orders.sql file to select from the orders table in the jaffle_shop source.

      models/staging/stg_orders.sql

      select
          id as order_id,
          user_id as customer_id,
          order_date,
          status

      from {{ source('jaffle_shop', 'orders') }}
    5. Create a models/core/fct_orders.sql file to build a fact table with customer and order details.

      models/core/fct_orders.sql

      with customers as (
          select *
          from {{ ref('stg_customers') }}
      ),

      orders as (
          select *
          from {{ ref('stg_orders') }}
      ),

      customer_orders as (
          select
              customer_id,
              min(order_date) as first_order_date
          from orders
          group by customer_id
      ),

      final as (
          select
              o.order_id,
              o.order_date,
              o.status,
              c.customer_id,
              c.first_name,
              c.last_name,
              co.first_order_date,
              -- Note that we've used a macro for this so that the appropriate DATEDIFF syntax is used for each respective data platform
              {{ dbt_utils.datediff('first_order_date', 'order_date', 'day') }} as days_as_customer_at_purchase
          from orders o
          left join customers c using (customer_id)
          left join customer_orders co using (customer_id)
      )

      select * from final
    6. Navigate to the Command bar and execute dbt build.
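
    Note that fct_orders.sql calls the dbt_utils.datediff macro, so the dbt_utils package must be installed in this project for dbt build to succeed. If your scaffold doesn't already include it, add a packages.yml like the following (the pinned version mirrors the one used later in this guide) and run dbt deps before building:

      packages.yml

      packages:
        - package: dbt-labs/dbt_utils
          version: 1.1.1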

    Before a downstream team can leverage assets from this foundational project, you need to first:

    • Create and define at least one model as “public”
    • Run a deployment job successfully
      • Note: Enable the Generate docs on run toggle for this job to update dbt Explorer. Once the job has run, you can click Explore from the upper menu bar and see your lineage, tests, and documentation coming through successfully.

    Define a public model and run first job

    In the previous section, you arranged your basic building blocks; now let's integrate dbt Mesh.

    Although the Finance team requires the fct_orders model for analyzing payment trends, other models, particularly those in the staging layer used for data cleansing and joining, are not needed by downstream teams.

    To make fct_orders publicly available:

    1. In the models/core/core.yml file, add an access: public property to the fct_orders model by adding and saving the following:

      models/core/core.yml

      version: 2

      models:
        - name: fct_orders
          access: public
          description: "Customer and order details"
          columns:
            - name: order_id
              data_type: number
              description: ""

            - name: order_date
              data_type: date
              description: ""

            - name: status
              data_type: varchar
              description: "Indicates the status of the order"

            - name: customer_id
              data_type: number
              description: ""

            - name: first_name
              data_type: varchar
              description: ""

            - name: last_name
              data_type: varchar
              description: ""

            - name: first_order_date
              data_type: date
              description: ""

            - name: days_as_customer_at_purchase
              data_type: number
              description: "Days between this purchase and customer's first purchase"

    Note: By default, model access is set to "protected", which means a model can only be referenced within its own project. Learn more about access types and model groups here.
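
    For contrast, here's a hypothetical sketch of spelling out that default on a staging model; you rarely need to write this, since protected is already the default:

      models:
        - name: stg_orders
          access: protected  # the default: only models in this same project can ref it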

    2. Navigate to the dbt Cloud IDE Lineage tab to see the model noted as Public, below the model name.
    Jaffle | Data Analytics lineage
    3. Go to Version control and click the Commit and Sync button to commit your changes.
    4. Merge your changes to the main or production branch.

    Create and run a dbt Cloud job

    Before a downstream team can leverage assets from this foundational project, you need to create a production environment and run a deployment job successfully.

    To run your first deployment job, create a new dbt Cloud job.

    1. Click Deploy and then Jobs.
    2. Click Create job and then Deploy job.
    3. Select the Generate docs on run option. This will reflect the state of this project in the Explore section.
     Select the 'Generate docs on run' option when configuring your dbt Cloud job.
    4. Then, click Run now to trigger the job.

      Trigger a job by clicking the 'Run now' button.
    5. After the run is complete, click Explore from the upper menu bar. You should now see your lineage, tests, and documentation coming through successfully.

    Reference a public model in your downstream project

    In this section, you will set up the downstream project, "Jaffle | Finance", and add a cross-project reference to the fct_orders model from the foundational project. Navigate to the Develop page to set up the project:

    1. If you’ve also started with a new git repo, click Initialize dbt project under the Version control section.
    2. Delete the models/example folder.
    3. Navigate to the dbt_project.yml file and remove lines 39-42 (the my_new_project model reference).
    4. In the File Explorer, hover over the project directory, click the ..., then select Create file.
    5. Name the file dependencies.yml.
    Create file in the dbt Cloud IDE.
    6. Add the upstream analytics project and the dbt_utils package. Click Save.

      dependencies.yml

      packages:
        - package: dbt-labs/dbt_utils
          version: 1.1.1

      projects:
        - name: analytics

    Staging layer

    Now that you've set up the downstream project, let's start building its data assets. Set up the staging layer as follows:

    1. Create a new YAML file models/staging/sources.yml and declare the sources by copying the following into the file and clicking Save.

      models/staging/sources.yml

      version: 2

      sources:
        - name: stripe
          database: raw
          schema: stripe
          tables:
            - name: payment
    2. Create models/staging/stg_payments.sql to select from the payment table in the stripe source.

      models/staging/stg_payments.sql

      with payments as (
          select * from {{ source('stripe', 'payment') }}
      ),

      final as (
          select
              id as payment_id,
              -- these columns are quoted because they are case-sensitive in the raw source
              "orderID" as order_id,
              "paymentMethod" as payment_method,
              amount,
              created as payment_date
          from payments
      )

      select * from final

    Reference the public model

    You're now set to add a model that explores how payment types vary throughout a customer's journey. This helps determine whether coupon gift cards decrease with repeat purchases, as our marketing team anticipates, or remain consistent.

    1. To reference the public model, create models/core/agg_customer_payment_journey.sql with the following logic:

      models/core/agg_customer_payment_journey.sql

      with stg_payments as (
          select * from {{ ref('stg_payments') }}
      ),

      fct_orders as (
          select * from {{ ref('analytics', 'fct_orders') }}
      ),

      final as (
          select
              days_as_customer_at_purchase,
              -- we use the pivot macro in the dbt_utils package to create columns that total payments for each method
              {{ dbt_utils.pivot(
                  'payment_method',
                  dbt_utils.get_column_values(ref('stg_payments'), 'payment_method'),
                  agg='sum',
                  then_value='amount',
                  prefix='total_',
                  suffix='_amount'
              ) }},
              sum(amount) as total_amount
          from fct_orders
          left join stg_payments using (order_id)
          group by 1
      )

      select * from final
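
      To demystify the pivot: at compile time, dbt_utils.get_column_values queries the distinct payment_method values, and dbt_utils.pivot expands each one into an aggregate expression. Assuming illustrative values like credit_card and coupon, the rendered SQL is roughly:

        sum(case when payment_method = 'credit_card' then amount else 0 end) as total_credit_card_amount,
        sum(case when payment_method = 'coupon' then amount else 0 end) as total_coupon_amount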

    2. Notice the cross-project ref at work! When you add the ref, the dbt Cloud IDE's auto-complete feature recognizes the public model as available.

      Cross-project ref autocomplete in the dbt Cloud IDE
    3. This automatically resolves (or links) to the correct database, schema, and table/view set by the upstream project.

      Cross-project ref compile
    4. You can also see this connection displayed in the live Lineage tab.

      Cross-project ref lineage
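
    Behind the scenes, the two-argument ref resolves against the metadata of the upstream project's Production (or Staging) environment rather than your own project. Given the jaffle_da default database configured earlier, the compiled SQL looks something like this sketch (the schema name shown is illustrative; it comes from the upstream deployment credentials):

      select * from jaffle_da.analytics.fct_orders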

    Add model versions and contracts

    How can you enhance resilience and add guardrails to this type of multi-project relationship? You can adopt best practices from software engineering by:

    1. Defining model contracts: Set up model contracts in dbt to define a set of upfront "guarantees" that describe the shape of your model. While building the model, dbt verifies that the transformation produces a dataset matching its contract; if not, the build fails.
    2. Defining model versions: Use model versions to manage updates and handle breaking changes systematically.

    Set up model contracts

    As part of the Data Analytics team, you may want to ensure the fct_orders model is reliable for downstream users, like the Finance team.

    1. Navigate to models/core/core.yml and under the fct_orders model, add a data contract to enforce reliability:
    models:
      - name: fct_orders
        access: public
        description: "Customer and order details"
        config:
          contract:
            enforced: true
    2. Test what would happen if this contract were violated. In models/core/fct_orders.sql, comment out the o.status line in the final CTE and click Build to try building the model.
      • If the contract is breached, the build fails, as seen in the command bar history.
        The data contract was breached and the dbt build run failed.

    Set up model versions

    In this section, you will set up model versions as the Data Analytics team upgrades the fct_orders model, while offering backward compatibility and a migration notice to the downstream Finance team.

    1. Rename the existing model file from models/core/fct_orders.sql to models/core/fct_orders_v1.sql.

    2. Create a new file models/core/fct_orders_v2.sql and adjust the schema:

      • Comment out orders.status
      • Add a new field, is_return, to indicate whether an order was returned (a sketch of the change follows).
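
      A minimal sketch of the change in the final CTE of models/core/fct_orders_v2.sql, relative to v1. The guide doesn't specify how is_return is derived, so the status values used below are an assumption:

        final as (
            select
                o.order_id,
                o.order_date,
                -- o.status,  -- removed in v2
                o.status in ('returned', 'return_pending') as is_return,  -- assumed derivation of the new flag
                c.customer_id,
                c.first_name,
                c.last_name,
                co.first_order_date,
                {{ dbt_utils.datediff('first_order_date', 'order_date', 'day') }} as days_as_customer_at_purchase
            from orders o
            left join customers c using (customer_id)
            left join customer_orders co using (customer_id)
        )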
    3. Then, add the following to your models/core/core.yml file:

      • The is_return column
      • The two model versions
      • A latest_version to indicate which version is the latest (and should be used by default, unless specified otherwise)
      • A deprecation_date on version 1 to indicate when the downstream team should migrate off of it
    4. It should now read as follows:

    models/core/core.yml

    version: 2

    models:
      - name: fct_orders
        access: public
        description: "Customer and order details"
        latest_version: 2
        config:
          contract:
            enforced: true
        columns:
          - name: order_id
            data_type: number
            description: ""

          - name: order_date
            data_type: date
            description: ""

          - name: status
            data_type: varchar
            description: "Indicates the status of the order"

          - name: is_return
            data_type: boolean
            description: "Indicates if an order was returned"

          - name: customer_id
            data_type: number
            description: ""

          - name: first_name
            data_type: varchar
            description: ""

          - name: last_name
            data_type: varchar
            description: ""

          - name: first_order_date
            data_type: date
            description: ""

          - name: days_as_customer_at_purchase
            data_type: number
            description: "Days between this purchase and customer's first purchase"

        # Declare the versions, and highlight the diffs
        versions:
          - v: 1
            deprecation_date: 2024-06-30 00:00:00.00+00:00
            columns:
              # This means: use the 'columns' list from above, but exclude is_return
              - include: all
                exclude: [is_return]

          - v: 2
            columns:
              # This means: use the 'columns' list from above, but exclude status
              - include: all
                exclude: [status]
    5. Verify how dbt compiles the ref statement based on the updates. Open a new file, add the following select statements, and click Compile. Note how each ref is compiled to the specified version (or the latest version if not specified).

    select * from {{ ref('fct_orders', v=1) }}
    select * from {{ ref('fct_orders', v=2) }}
    select * from {{ ref('fct_orders') }}
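
    By default, dbt aliases each versioned relation with a _v<version> suffix, so the three statements compile to something like the following (database and schema names depend on your environment; note that the unpinned ref resolves to the latest version):

    select * from jaffle_da.analytics.fct_orders_v1
    select * from jaffle_da.analytics.fct_orders_v2
    select * from jaffle_da.analytics.fct_orders_v2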

    Add a dbt Cloud job in the downstream project

    Before proceeding, make sure you commit and merge your changes in both the “Jaffle | Data Analytics” and “Jaffle | Finance” projects.

    A member of the Finance team would like to schedule a dbt Cloud job for their customer payment journey analysis immediately after the data analytics team refreshes their pipelines.

    1. In the “Jaffle | Finance” project, go to the Jobs page by navigating to Deploy and then Jobs.

    2. Then click Create job and then Deploy job.

    3. Add a name for the job, then scroll down to the Job completion section.

    4. In the Job completion section, configure the job to Run when another job finishes and select the upstream job from the “Jaffle | Data Analytics” project.

      Trigger job on completion
    5. Click Save and verify the job is set up correctly.

    6. Go to the “Jaffle | Data Analytics” jobs page. Select the Daily job and click Run now.

    7. Once this job completes successfully, go back to the “Jaffle | Finance” jobs page. You should see that the Finance team’s job was triggered automatically.

    This simplifies the process of staying in sync with the upstream tables and removes the need for more sophisticated orchestration skills, such as coordinating jobs across projects via an external orchestrator.

    View deprecation warning

    To find out how long the Finance team has to migrate from fct_orders_v1 to fct_orders_v2, follow these steps:

    1. In the “Jaffle | Finance” project, navigate to the Develop page.
    2. Edit the cross-project ref to use v=1 in models/core/agg_customer_payment_journey.sql:
    models/core/agg_customer_payment_journey.sql

    with stg_payments as (
        select * from {{ ref('stg_payments') }}
    ),

    fct_orders as (
        select * from {{ ref('analytics', 'fct_orders', v=1) }}
    ),

    final as (
        select
            days_as_customer_at_purchase,
            -- we use the pivot macro in the dbt_utils package to create columns that total payments for each method
            {{ dbt_utils.pivot(
                'payment_method',
                dbt_utils.get_column_values(ref('stg_payments'), 'payment_method'),
                agg='sum',
                then_value='amount',
                prefix='total_',
                suffix='_amount'
            ) }},
            sum(amount) as total_amount
        from fct_orders
        left join stg_payments using (order_id)
        group by 1
    )

    select * from final
    3. In the dbt Cloud IDE, go to Version control to commit and merge the changes.
    4. Go to the Deploy and then Jobs page.
    5. Click Run now to run the Finance job. The agg_customer_payment_journey model will build and display a deprecation date warning.
    The model will display a deprecation date warning.

    View lineage with dbt Explorer

    Use dbt Explorer to view the lineage across projects in dbt Cloud. Navigate to the Explore page for each of your projects; you should now see the lineage seamlessly across projects.

    View 'Jaffle | Data Analytics' lineage with dbt Explorer

    What's next

    Congratulations 🎉! You're ready to bring the benefits of dbt Mesh to your organization. You've learned:

    • How to establish a foundational project, "Jaffle | Data Analytics."
    • How to create a downstream project, "Jaffle | Finance."
    • How to implement model access, versions, and contracts.
    • How to set up dbt Cloud jobs triggered by upstream job completions.
