Skip to main content

Quickstart for the dbt Cloud Semantic Layer and Snowflake

Updated
Semantic Layer
Snowflake
dbt Cloud
Quickstart
Intermediate
Menu

    Introduction

    The dbt Semantic Layer, powered by MetricFlow, simplifies the setup of key business metrics. It centralizes definitions, avoids duplicate code, and ensures easy access to metrics in downstream tools. MetricFlow helps manage company metrics easier, allowing you to define metrics in your dbt project and query them in dbt Cloud with MetricFlow commands.

    📹 Learn about the dbt Semantic Layer with on-demand video courses!

    Explore our dbt Semantic Layer on-demand course to learn how to define and query metrics in your dbt project.

    Additionally, dive into mini-courses for querying the dbt Semantic Layer in your favorite tools: Tableau, Hex, and Mode.

    This quickstart guide is designed for dbt Cloud users using Snowflake as their data platform. It focuses on building and defining metrics, setting up the dbt Semantic Layer in a dbt Cloud project, and querying metrics in Google Sheets.

    For users on different data platforms

    If you're using a data platform other than Snowflake, this guide is also applicable to you. You can adapt the setup for your specific platform by following the account setup and data loading instructions detailed in the following tabs for each respective platform.

    The rest of this guide applies universally across all supported platforms, ensuring you can fully leverage the dbt Semantic Layer.

    Open a new tab and follow these quick steps for account setup and data loading instructions:

    Prerequisites

    • You need a dbt Cloud Trial, Team, or Enterprise account for all deployments. Contact your representative for Single-tenant setup; otherwise, create an account using this guide.

    • Have the correct dbt Cloud license and permissions based on your plan:

       More info on license and permissions
    • Production and development environments must be on dbt version 1.6 or higher. Alternatively, set your environment to "versionless" by selecting Keep on latest version to always get the latest updates.

    • Create a trial Snowflake account:

      • Select the Enterprise Snowflake edition with ACCOUNTADMIN access. Consider organizational questions when choosing a cloud provider, refer to Snowflake's Introduction to Cloud Platforms.
      • Select a cloud provider and region. All cloud providers and regions will work so choose whichever you prefer.
    • Basic understanding of SQL and dbt. For example, you've used dbt before or have completed the dbt Fundamentals course.

    What you'll learn

    This guide will cover the following topics:

    Create new Snowflake worksheet and set up environment

    1. Log in to your trial Snowflake account.
    2. In the Snowflake user interface (UI), click + Worksheet in the upper right corner.
    3. Select SQL Worksheet to create a new worksheet.

    Set up Snowflake environment

    The data used here is stored as CSV files in a public S3 bucket and the following steps will guide you through how to prepare your Snowflake account for that data and upload it.

    Create a new virtual warehouse, two new databases (one for raw data, the other for future dbt development), and two new schemas (one for jaffle_shop data, the other for stripe data).

    1. Run the following SQL commands one by one by typing them into the Editor of your new Snowflake SQL worksheet to set up your environment.

    2. Click Run in the upper right corner of the UI for each one:

    -- Create a virtual warehouse named 'transforming'
    create warehouse transforming;

    -- Create two databases: one for raw data and another for analytics
    create database raw;
    create database analytics;

    -- Within the 'raw' database, create two schemas: 'jaffle_shop' and 'stripe'
    create schema raw.jaffle_shop;
    create schema raw.stripe;

    Load data into Snowflake

    Now that your environment is set up, you can start loading data into it. You will be working within the raw database, using the jaffle_shop and stripe schemas to organize your tables.

    1. Create customer table. First, delete all contents (empty) in the Editor of the Snowflake worksheet. Then, run this SQL command to create the customer table in the jaffle_shop schema:

      create table raw.jaffle_shop.customers
      ( id integer,
      first_name varchar,
      last_name varchar
      );

      You should see a ‘Table CUSTOMERS successfully created.’ message.

    2. Load data. After creating the table, delete all contents in the Editor. Run this command to load data from the S3 bucket into the customer table:

      copy into raw.jaffle_shop.customers (id, first_name, last_name)
      from 's3://dbt-tutorial-public/jaffle_shop_customers.csv'
      file_format = (
      type = 'CSV'
      field_delimiter = ','
      skip_header = 1
      );

      You should see a confirmation message after running the command.

    3. Create orders table. Delete all contents in the Editor. Run the following command to create…

      create table raw.jaffle_shop.orders
      ( id integer,
      user_id integer,
      order_date date,
      status varchar,
      _etl_loaded_at timestamp default current_timestamp
      );

      You should see a confirmation message after running the command.

    4. Load data. Delete all contents in the Editor, then run this command to load data into the orders table:

      copy into raw.jaffle_shop.orders (id, user_id, order_date, status)
      from 's3://dbt-tutorial-public/jaffle_shop_orders.csv'
      file_format = (
      type = 'CSV'
      field_delimiter = ','
      skip_header = 1
      );

      You should see a confirmation message after running the command.

    5. Create payment table. Delete all contents in the Editor. Run the following command to create the payment table:

      create table raw.stripe.payment
      ( id integer,
      orderid integer,
      paymentmethod varchar,
      status varchar,
      amount integer,
      created date,
      _batched_at timestamp default current_timestamp
      );

      You should see a confirmation message after running the command.

    6. Load data. Delete all contents in the Editor. Run the following command to load data into the payment table:

      copy into raw.stripe.payment (id, orderid, paymentmethod, status, amount, created)
      from 's3://dbt-tutorial-public/stripe_payments.csv'
      file_format = (
      type = 'CSV'
      field_delimiter = ','
      skip_header = 1
      );

      You should see a confirmation message after running the command.

    7. Verify data. Verify that the data is loaded by running these SQL queries. Confirm that you can see output for each one, like the following confirmation image.

      select * from raw.jaffle_shop.customers;
      select * from raw.jaffle_shop.orders;
      select * from raw.stripe.payment;
      The image displays Snowflake's confirmation output when data loaded correctly in the Editor.The image displays Snowflake's confirmation output when data loaded correctly in the Editor.

    Connect dbt Cloud to Snowflake

    There are two ways to connect dbt Cloud to Snowflake. The first option is Partner Connect, which provides a streamlined setup to create your dbt Cloud account from within your new Snowflake trial account. The second option is to create your dbt Cloud account separately and build the Snowflake connection yourself (connect manually). If you want to get started quickly, dbt Labs recommends using Partner Connect. If you want to customize your setup from the very beginning and gain familiarity with the dbt Cloud setup flow, dbt Labs recommends connecting manually.

    Using Partner Connect allows you to create a complete dbt account with your Snowflake connection, a managed repository, environments, and credentials.

    1. In the Snowflake UI, click on the home icon in the upper left corner. In the left sidebar, select Data Products. Then, select Partner Connect. Find the dbt tile by scrolling or by searching for dbt in the search bar. Click the tile to connect to dbt.

      Snowflake Partner Connect BoxSnowflake Partner Connect Box

      If you’re using the classic version of the Snowflake UI, you can click the Partner Connect button in the top bar of your account. From there, click on the dbt tile to open up the connect box.

      Snowflake Classic UI - Partner ConnectSnowflake Classic UI - Partner Connect
    2. In the Connect to dbt popup, find the Optional Grant option and select the RAW and ANALYTICS databases. This will grant access for your new dbt user role to each selected database. Then, click Connect.

      Snowflake Classic UI - Connection BoxSnowflake Classic UI - Connection Box
      Snowflake New UI - Connection BoxSnowflake New UI - Connection Box
    3. Click Activate when a popup appears:

    Snowflake Classic UI - Actviation WindowSnowflake Classic UI - Actviation Window
    Snowflake New UI - Activation WindowSnowflake New UI - Activation Window
    1. After the new tab loads, you will see a form. If you already created a dbt Cloud account, you will be asked to provide an account name. If you haven't created an account, you will be asked to provide an account name and password.
    dbt Cloud - Account Infodbt Cloud - Account Info
    1. After you have filled out the form and clicked Complete Registration, you will be logged into dbt Cloud automatically.

    2. From your Account Settings in dbt Cloud (using the gear menu in the upper right corner), choose the "Partner Connect Trial" project and select snowflake in the overview table. Select Edit and update the Database field to analytics and the Warehouse field to transforming.

    dbt Cloud - Snowflake Project Overviewdbt Cloud - Snowflake Project Overview
    dbt Cloud - Update Database and Warehousedbt Cloud - Update Database and Warehouse

    Set up a dbt Cloud managed repository

    If you used Partner Connect, you can skip to initializing your dbt project as Partner Connect provides you with a managed repository. Otherwise, you will need to create your repository connection.

    When you develop in dbt Cloud, you can leverage Git to version control your code.

    To connect to a repository, you can either set up a dbt Cloud-hosted managed repository or directly connect to a supported git provider. Managed repositories are a great way to trial dbt without needing to create a new repository. In the long run, it's better to connect to a supported git provider to use features like automation and continuous integration.

    To set up a managed repository:

    1. Under "Setup a repository", select Managed.
    2. Type a name for your repo such as bbaggins-dbt-quickstart
    3. Click Create. It will take a few seconds for your repository to be created and imported.
    4. Once you see the "Successfully imported repository," click Continue.

    Initialize your dbt project and start developing

    This guide assumes you use the dbt Cloud IDE to develop your dbt project and define metrics. However, the dbt Cloud IDE doesn't support using MetricFlow commands to query or preview metrics (support coming soon).

    To query and preview metrics in your development tool, you can use the dbt Cloud CLI to run the MetricFlow commands.

    Now that you have a repository configured, you can initialize your project and start development in dbt Cloud using the IDE:

    1. Click Start developing in the dbt Cloud IDE. It might take a few minutes for your project to spin up for the first time as it establishes your git connection, clones your repo, and tests the connection to the warehouse.
    2. Above the file tree to the left, click Initialize your project. This builds out your folder structure with example models.
    3. Make your initial commit by clicking Commit and sync. Use the commit message initial commit. This creates the first commit to your managed repo and allows you to open a branch where you can add a new dbt code.
    4. You can now directly query data from your warehouse and execute dbt run. You can try this out now:
      • Delete the models/examples folder in the File Explorer.
      • Click + Create new file, add this query to the new file, and click Save as to save the new file:
        select * from raw.jaffle_shop.customers
      • In the command line bar at the bottom, enter dbt run and click Enter. You should see a dbt run succeeded message.

    Build your dbt project

    The next step is to build your project. This involves adding sources, staging models, business-defined entities, and packages to your project.

    Add sources

    Sources in dbt are the raw data tables you'll transform. By organizing your source definitions, you document the origin of your data. It also makes your project and transformation more reliable, structured, and understandable.

    You have two options for working with files in the dbt Cloud IDE:

    • Create a new branch (recommended) Create a new branch to edit and commit your changes. Navigate to Version Control on the left sidebar and click Create branch.
    • Edit in the protected primary branch If you prefer to edit, format, or lint files and execute dbt commands directly in your primary git branch, use this option. The dbt Cloud IDE prevents commits to the protected branch so you'll be prompted to commit your changes to a new branch.

    Name the new branch build-project.

    1. Hover over the models directory and click the three-dot menu (...), then select Create file.
    2. Name the file staging/jaffle_shop/src_jaffle_shop.yml , then click Create.
    3. Copy the following text into the file and click Save.
    models/staging/jaffle_shop/src_jaffle_shop.yml
    version: 2

    sources:
    - name: jaffle_shop
    database: raw
    schema: jaffle_shop
    tables:
    - name: customers
    - name: orders
    tip

    In your source file, you can also use the Generate model button to create a new model file for each source. This creates a new file in the models directory with the given source name and fill in the SQL code of the source definition.

    1. Hover over the models directory and click the three dot menu (...), then select Create file.
    2. Name the file staging/stripe/src_stripe.yml , then click Create.
    3. Copy the following text into the file and click Save.
    models/staging/stripe/src_stripe.yml
    version: 2

    sources:
    - name: stripe
    database: raw
    schema: stripe
    tables:
    - name: payment

    Add staging models

    Staging models are the first transformation step in dbt. They clean and prepare your raw data, making it ready for more complex transformations and analyses. Follow these steps to add your staging models to your project.

    1. In the jaffle_shop sub-directory, create the file stg_customers.sql. Or, you can use the Generate model button to create a new model file for each source.
    2. Copy the following query into the file and click Save.
    models/staging/jaffle_shop/stg_customers.sql
      select
    id as customer_id,
    first_name,
    last_name
    from {{ source('jaffle_shop', 'customers') }}
    1. In the same jaffle_shop sub-directory, create the file stg_orders.sql
    2. Copy the following query into the file and click Save.
    models/staging/jaffle_shop/stg_orders.sql
      select
    id as order_id,
    user_id as customer_id,
    order_date,
    status
    from {{ source('jaffle_shop', 'orders') }}
    1. In the stripe sub-directory, create the file stg_payments.sql.
    2. Copy the following query into the file and click Save.
    models/staging/stripe/stg_payments.sql
    select
    id as payment_id,
    orderid as order_id,
    paymentmethod as payment_method,
    status,
    -- amount is stored in cents, convert it to dollars
    amount / 100 as amount,
    created as created_at


    from {{ source('stripe', 'payment') }}
    1. Enter dbt run in the command prompt at the bottom of the screen. You should get a successful run and see the three models.

    Add business-defined entities

    This phase involves creating models that serve as the entity layer or concept layer of your dbt project, making the data ready for reporting and analysis. It also includes adding packages and the MetricFlow time spine that extend dbt's functionality.

    This phase is the marts layer, which brings together modular pieces into a wide, rich vision of the entities an organization cares about.

    1. Create the file models/marts/fct_orders.sql.
    2. Copy the following query into the file and click Save.
    models/marts/fct_orders.sql
    with orders as  (
    select * from {{ ref('stg_orders' )}}
    ),


    payments as (
    select * from {{ ref('stg_payments') }}
    ),


    order_payments as (
    select
    order_id,
    sum(case when status = 'success' then amount end) as amount


    from payments
    group by 1
    ),


    final as (


    select
    orders.order_id,
    orders.customer_id,
    orders.order_date,
    coalesce(order_payments.amount, 0) as amount


    from orders
    left join order_payments using (order_id)
    )


    select * from final

    1. In the models/marts directory, create the file dim_customers.sql.
    2. Copy the following query into the file and click Save.
    models/marts/dim_customers.sql
    with customers as (
    select * from {{ ref('stg_customers')}}
    ),
    orders as (
    select * from {{ ref('fct_orders')}}
    ),
    customer_orders as (
    select
    customer_id,
    min(order_date) as first_order_date,
    max(order_date) as most_recent_order_date,
    count(order_id) as number_of_orders,
    sum(amount) as lifetime_value
    from orders
    group by 1
    ),
    final as (
    select
    customers.customer_id,
    customers.first_name,
    customers.last_name,
    customer_orders.first_order_date,
    customer_orders.most_recent_order_date,
    coalesce(customer_orders.number_of_orders, 0) as number_of_orders,
    customer_orders.lifetime_value
    from customers
    left join customer_orders using (customer_id)
    )
    select * from final
    1. In your main directory, create the file packages.yml.
    2. Copy the following text into the file and click Save.
    packages.yml
    packages:
    - package: dbt-labs/dbt_utils
    version: 1.1.1
    1. In the models directory, create the file metrics/metricflow_time_spine.sql in your main directory.
    2. Copy the following query into the file and click Save.
    models/metrics/metricflow_time_spine.sql
    {{
    config(
    materialized = 'table',
    )
    }}
    with days as (
    {{
    dbt_utils.date_spine(
    'day',
    "to_date('01/01/2000','mm/dd/yyyy')",
    "to_date('01/01/2027','mm/dd/yyyy')"
    )
    }}
    ),
    final as (
    select cast(date_day as date) as date_day
    from days
    )
    select * from final

    1. Enter dbt run in the command prompt at the bottom of the screen. You should get a successful run message and also see in the run details that dbt has successfully built five models.

    Create semantic models

    Semantic models contain many object types (such as entities, measures, and dimensions) that allow MetricFlow to construct the queries for metric definitions.

    • Each semantic model will be 1:1 with a dbt SQL/Python model.
    • Each semantic model will contain (at most) 1 primary or natural entity.
    • Each semantic model will contain zero, one, or many foreign or unique entities used to connect to other entities.
    • Each semantic model may also contain dimensions, measures, and metrics. This is what actually gets fed into and queried by your downstream BI tool.

    In the following steps, semantic models enable you to define how to interpret the data related to orders. It includes entities (like ID columns serving as keys for joining data), dimensions (for grouping or filtering data), and measures (for data aggregations).

    1. In the metrics sub-directory, create a new file fct_orders.yml.
    2. Add the following code to that newly created file:
    models/metrics/fct_orders.yml
    semantic_models:
    - name: orders
    defaults:
    agg_time_dimension: order_date
    description: |
    Order fact table. This table’s grain is one row per order.
    model: ref('fct_orders')

    The following sections explain dimensions, entities, and measures in more detail, showing how they each play a role in semantic models.

    • Entities act as unique identifiers (like ID columns) that link data together from different tables.
    • Dimensions categorize and filter data, making it easier to organize.
    • Measures calculates data, providing valuable insights through aggregation.

    Entities

    Entities are a real-world concept in a business, serving as the backbone of your semantic model. These are going to be ID columns (like order_id) in our semantic models. These will serve as join keys to other semantic models.

    Add entities to your fct_orders.yml semantic model file:

    models/metrics/fct_orders.yml
    semantic_models:
    - name: orders
    defaults:
    agg_time_dimension: order_date
    description: |
    Order fact table. This table’s grain is one row per order.
    model: ref('fct_orders')
    # Newly added
    entities:
    - name: order_id
    type: primary
    - name: customer_id
    type: foreign

    Dimensions

    Dimensions are a way to group or filter information based on categories or time.

    Add dimensions to your fct_orders.yml semantic model file:

    models/metrics/fct_orders.yml
    semantic_models:
    - name: orders
    defaults:
    agg_time_dimension: order_date
    description: |
    Order fact table. This table’s grain is one row per order.
    model: ref('fct_orders')
    entities:
    - name: order_id
    type: primary
    - name: customer_id
    type: foreign
    # Newly added
    dimensions:
    - name: order_date
    type: time
    type_params:
    time_granularity: day

    Measures

    Measures are aggregations performed on columns in your model. Often, you’ll find yourself using them as final metrics themselves. Measures can also serve as building blocks for more complicated metrics.

    Add measures to your fct_orders.yml semantic model file:

    models/metrics/fct_orders.yml
    semantic_models:
    - name: orders
    defaults:
    agg_time_dimension: order_date
    description: |
    Order fact table. This table’s grain is one row per order.
    model: ref('fct_orders')
    entities:
    - name: order_id
    type: primary
    - name: customer_id
    type: foreign
    dimensions:
    - name: order_date
    type: time
    type_params:
    time_granularity: day
    # Newly added
    measures:
    - name: order_total
    description: The total amount for each order including taxes.
    agg: sum
    expr: amount
    - name: order_count
    expr: 1
    agg: sum
    - name: customers_with_orders
    description: Distinct count of customers placing orders
    agg: count_distinct
    expr: customer_id
    - name: order_value_p99 ## The 99th percentile order value
    expr: amount
    agg: percentile
    agg_params:
    percentile: 0.99
    use_discrete_percentile: True
    use_approximate_percentile: False

    Define metrics

    Metrics are the language your business users speak and measure business performance. They are an aggregation over a column in your warehouse that you enrich with dimensional cuts.

    There are different types of metrics you can configure:

    • Conversion metrics Track when a base event and a subsequent conversion event occur for an entity within a set time period.
    • Cumulative metrics Aggregate a measure over a given window. If no window is specified, the window will accumulate the measure over all of the recorded time period. Note that you must create the time spine model before you add cumulative metrics.
    • Derived metrics Allows you to do calculations on top of metrics.
    • Simple metrics Directly reference a single measure without any additional measures involved.
    • Ratio metrics Involve a numerator metric and a denominator metric. A constraint string can be applied to both the numerator and denominator or separately to the numerator or denominator.

    Once you've created your semantic models, it's time to start referencing those measures you made to create some metrics:

    Add metrics to your fct_orders.yml semantic model file:

    models/metrics/fct_orders.yml
    semantic_models:
    - name: orders
    defaults:
    agg_time_dimension: order_date
    description: |
    Order fact table. This table’s grain is one row per order
    model: ref('fct_orders')
    entities:
    - name: order_id
    type: primary
    - name: customer_id
    type: foreign
    dimensions:
    - name: order_date
    type: time
    type_params:
    time_granularity: day
    measures:
    - name: order_total
    description: The total amount for each order including taxes.
    agg: sum
    expr: amount
    - name: order_count
    expr: 1
    agg: sum
    - name: customers_with_orders
    description: Distinct count of customers placing orders
    agg: count_distinct
    expr: customer_id
    - name: order_value_p99
    expr: amount
    agg: percentile
    agg_params:
    percentile: 0.99
    use_discrete_percentile: True
    use_approximate_percentile: False
    # Newly added
    metrics:
    # Simple type metrics
    - name: "order_total"
    description: "Sum of orders value"
    type: simple
    label: "order_total"
    type_params:
    measure: order_total
    - name: "order_count"
    description: "number of orders"
    type: simple
    label: "order_count"
    type_params:
    measure: order_count
    - name: large_orders
    description: "Count of orders with order total over 20."
    type: simple
    label: "Large Orders"
    type_params:
    measure: order_count
    filter: |
    {{ Dimension('order_id__order_total_dim') }} >= 20
    # Ratio type metric
    - name: "avg_order_value"
    label: "avg_order_value"
    description: "average value of each order"
    type: ratio
    type_params:
    numerator: order_total
    denominator: order_count
    # Cumulative type metrics
    - name: "cumulative_order_amount_mtd"
    label: "cumulative_order_amount_mtd"
    description: "The month to date value of all orders"
    type: cumulative
    type_params:
    measure: order_total
    grain_to_date: month
    # Derived metric
    - name: "pct_of_orders_that_are_large"
    label: "pct_of_orders_that_are_large"
    description: "percent of orders that are large"
    type: derived
    type_params:
    expr: large_orders/order_count
    metrics:
    - name: large_orders
    - name: order_count

    Add second semantic model to your project

    Great job, you've successfully built your first semantic model! It has all the required elements: entities, dimensions, measures, and metrics.

    Let’s expand your project's analytical capabilities by adding another semantic model in your other marts model, such as: dim_customers.yml.

    After setting up your orders model:

    1. In the metrics sub-directory, create the file dim_customers.yml.
    2. Copy the following query into the file and click Save.
    models/metrics/dim_customers.yml
    semantic_models:
    - name: customers
    defaults:
    agg_time_dimension: most_recent_order_date
    description: |
    semantic model for dim_customers
    model: ref('dim_customers')
    entities:
    - name: customer
    expr: customer_id
    type: primary
    dimensions:
    - name: customer_name
    type: categorical
    expr: first_name
    - name: first_order_date
    type: time
    type_params:
    time_granularity: day
    - name: most_recent_order_date
    type: time
    type_params:
    time_granularity: day
    measures:
    - name: count_lifetime_orders
    description: Total count of orders per customer.
    agg: sum
    expr: number_of_orders
    - name: lifetime_spend
    agg: sum
    expr: lifetime_value
    description: Gross customer lifetime spend inclusive of taxes.
    - name: customers
    expr: customer_id
    agg: count_distinct

    metrics:
    - name: "customers_with_orders"
    label: "customers_with_orders"
    description: "Unique count of customers placing orders"
    type: simple
    type_params:
    measure: customers

    This semantic model uses simple metrics to focus on customer metrics and emphasizes customer dimensions like name, type, and order dates. It uniquely analyzes customer behavior, lifetime value, and order patterns.

    Test and query metrics

    To work with metrics in dbt, you have several tools to validate or run commands. Here's how you can test and query metrics depending on your setup:

    • dbt Cloud IDE users Currently, running MetricFlow commands directly in the dbt Cloud IDE isn't supported, but is coming soon. You can view metrics visually through the DAG in the Lineage tab without directly running commands.
    • dbt Cloud CLI users The dbt Cloud CLI enables you to run MetricFlow commands to query and preview metrics directly in your command line interface.
    • dbt Core users Use the MetricFlow CLI for command execution. While this guide focuses on dbt Cloud users, dbt Core users can find detailed MetricFlow CLI setup instructions in the MetricFlow commands page. Note that to use the dbt Semantic Layer, you need to have a Team or Enterprise account.

    Alternatively, you can run commands with SQL client tools like DataGrip, DBeaver, or RazorSQL.

    dbt Cloud IDE users

    You can view your metrics in the dbt Cloud IDE by viewing them in the Lineage tab. The dbt Cloud IDE Status button (located in the bottom right of the editor) displays an Error status if there's an error in your metric or semantic model definition. You can click the button to see the specific issue and resolve it.

    Once viewed, make sure you commit and merge your changes in your project.

    Validate your metrics using the Lineage tab in the IDE.Validate your metrics using the Lineage tab in the IDE.

    dbt Cloud CLI users

    This section is for dbt Cloud CLI users. MetricFlow commands are integrated with dbt Cloud, which means you can run MetricFlow commands as soon as you install the dbt Cloud CLI. Your account will automatically manage version control for you.

    Refer to the following steps to get started:

    1. Install the dbt Cloud CLI (if you haven't already). Then, navigate to your dbt project directory.
    2. Run a dbt command, such as dbt parse, dbt run, dbt compile, or dbt build. If you don't, you'll receive an error message that begins with: "ensure that you've ran an artifacts....".
    3. MetricFlow builds a semantic graph and generates a semantic_manifest.json file in dbt Cloud, which is stored in the /target directory. If using the Jaffle Shop example, run dbt seed && dbt run to ensure the required data is in your data platform before proceeding.
    Run dbt parse to reflect metric changes

    When you make changes to metrics, make sure to run dbt parse at a minimum to update the dbt Semantic Layer. This updates the semantic_manifest.json file, reflecting your changes when querying metrics. By running dbt parse, you won't need to rebuild all the models.

    1. Run dbt sl --help to confirm you have MetricFlow installed and that you can view the available commands.

    2. Run dbt sl query --metrics <metric_name> --group-by <dimension_name> to query the metrics and dimensions. For example, to query the order_total and order_count (both metrics), and then group them by the order_date (dimension), you would run:

      dbt sl query --metrics order_total,order_count --group-by order_date
    3. Verify that the metric values are what you expect. To further understand how the metric is being generated, you can view the generated SQL if you type --compile in the command line.

    4. Commit and merge the code changes that contain the metric definitions.

    Run a production job

    Once you’ve committed and merged your metric changes in your dbt project, you can perform a job run in your deployment environment in dbt Cloud to materialize your metrics. The deployment environment is only supported for the dbt Semantic Layer currently.

    1. In dbt Cloud, create a new deployment environment or use an existing environment on dbt 1.6 or higher.
      • Note Deployment environment is currently supported (development experience coming soon)
    2. To create a new environment, navigate to Deploy in the navigation menu, select Environments, and then select Create new environment.
    3. Fill in your deployment credentials with your Snowflake username and password. You can name the schema anything you want. Click Save to create your new production environment.
    4. Create a new deploy job that runs in the environment you just created. Go back to the Deploy menu, select Jobs, select Create job, and click Deploy job.
    5. Set the job to run a dbt build and select the Generate docs on run checkbox.
    6. Run the job and make sure it runs successfully.
    What’s happening internally?
    • Merging the code into your main branch allows dbt Cloud to pull those changes and build the definition in the manifest produced by the run.
    • Re-running the job in the deployment environment helps materialize the models, which the metrics depend on, in the data platform. It also makes sure that the manifest is up to date.
    • The Semantic Layer APIs pull in the most recent manifest and enables your integration to extract metadata from it.

    Set up dbt Semantic Layer

    You must be part of the Owner group and have the correct license and permissions to set up the Semantic Layer at the environment and project level.

    • Enterprise plan:
      • Developer license with Account Admin permissions, or
      • Owner with a Developer license, assigned Project Creator, Database Admin, or Admin permissions.
    • Team plan: Owner with a Developer license.
    • Free trial: You are on a free trial of the Team plan as an Owner, which means you have access to the dbt Semantic Layer.

    1. Select environment

    Select the environment where you want to enable the Semantic Layer:

    1. Navigate to Account settings in the navigation menu.
    2. On the Settings left sidebar, select the specific project you want to enable the Semantic Layer for.
    3. In the Project details page, navigate to the Semantic Layer section. Select Configure Semantic Layer.
    Semantic Layer section in the 'Project Details' pageSemantic Layer section in the 'Project Details' page
    1. In the Set Up Semantic Layer Configuration page, select the deployment environment you want for the Semantic Layer and click Save. This provides administrators with the flexibility to choose the environment where the Semantic Layer will be enabled.
    dbt Cloud Enterprise can skip to Add more credentials

    dbt Cloud Enterprise plans can add multiple credentials and have a different set up. Skip to Add more credentials for more configuration details.

    2. Add a credential and create service tokens

    The dbt Semantic Layer uses service tokens for authentication which are tied to an underlying data platform credential that you configure. The credential configured is used to execute queries that the Semantic Layer issues against your data platform. This credential controls the physical access to underlying data accessed by the Semantic Layer, and all access policies set in the data platform for this credential will be respected.

    dbt Cloud Enterprise plans can add multiple credentials and map those to service tokens. Refer to Add more credentials for more information.

    1. In the Set Up Semantic Layer Configuration page, enter the credentials specific to your data platform that you want the Semantic Layer to use.

      • Use credentials with minimal privileges. The Semantic Layer requires read access to the schema(s) containing the dbt models used in your semantic models for downstream applications
      • Note, environment variables such as {{env_var('DBT_WAREHOUSE')}, aren't supported in the dbt Semantic Layer yet. You must use the actual credentials.
        Enter the credentials specific to your data platform that you want the Semantic Layer to use and select the deployment environment.Enter the credentials specific to your data platform that you want the Semantic Layer to use and select the deployment environment.
    2. Create a Service Token after you add the credential.

      • Enterprise plans: Name and generate a service token on the credential page directly.
      • Team plans: You can return to the Project Details page and click the Generate a Service Token button.
    3. Name the token and save it. Once the token is generated, you won't be able to view this token again so make sure to record it somewhere safe.

    info

    Teams plans can create multiple service tokens that map to one underlying credential. Adding multiple credentials for tailored access is available for Enterprise plans.

    Book a free live demo to discover the full potential of dbt Cloud Enterprise.

    3. View connection detail

    1. Go back to the Project details page for connection details to connect to downstream tools.

    2. Copy and share the environment ID, service token, host, as well as the service token name to the relevant teams for BI connection set up. If your tool uses the GraphQL API, save the GraphQL API host information instead of the JDBC URL.

      For info on how to connect to other integrations, refer to Available integrations.

    After configuring, you'll be provided with the connection details to connect to you downstream tools.After configuring, you'll be provided with the connection details to connect to you downstream tools.

    4. Add more credentials enterprise

    dbt Cloud Enterprise plans can optionally add multiple credentials and map them to service tokens, offering more granular control and tailored access for different teams, which can then be shared to relevant teams for BI connection setup. These credentials control the physical access to underlying data accessed by the Semantic Layer.

    We recommend configuring credentials and service tokens to reflect your teams and their roles. For example, create tokens or credentials that align with your team's needs, such as providing access to finance-related schemas to the Finance team.

    Note that:

    • Admins can link multiple service tokens to a single credential within a project, but each service token can only be linked to one credential per project.
    • When you send a request through the APIs, the service token of the linked credential will follow access policies of the underlying view and tables used to build your semantic layer requests.
    • Environment variables, like {{env_var('DBT_WAREHOUSE')} aren't supported the dbt Semantic Layer yet. You must use the actual credentials instead.

    To add multiple credentials and map them to service tokens:

    1. After configuring your environment, on the Credentials & service tokens page click the Add Semantic Layer credential button to configure the credential for your data platform.

    2. On the Create New Semantic Layer Credential page, you can create multiple credentials and map them to a service token.

    3. In the Add credentials section, fill in the data platform's credential fields. We recommend using “read-only” credentials.

      Add credentials and map them to a service token. Add credentials and map them to a service token.
    4. In the Map new service token section, map a service token to the credential you configured in the previous step. dbt Cloud automatically selects the service token permission set you need (Semantic Layer Only and Metadata Only).

      • To add another service token, click Add service token under the Linked service tokens section.
    5. Click Save to link the service token to the credential. Remember to copy and save the service token securely, as it won't be viewable again after generation.

      Manage multiple credentials and link them to service tokens for more granular control.Manage multiple credentials and link them to service tokens for more granular control.
    6. To delete a credential, go back to the Semantic Layer & Credentials page. Select Delete credential to remove a credential and click Save.

      When you delete a credential, any service tokens mapped to that credential in the project will no longer work and will break for any end users.

    Additional configuration

    The following are the additional flexible configurations for Semantic Layer credentials.

    • Unlink a service token from the credential by clicking Unlink under the Linked service tokens section. If you try to query the Semantic Layer with an unlinked credential, you'll experience an error in your BI tool because no valid token is mapped.

    Manage from service token page

    View credential from service token

    • View your Semantic Layer credential directly by navigating to the API tokens and then Service tokens page.
    • Select the service token to view the credential it's linked to. This is useful if you want to know which service tokens are mapped to credentials in your project.

    Create a new service token

    • From the Service tokens page, create a new service token and map it to the credential(s) (assuming the semantic layer permission exists). This is useful if you want to create a new service token and directly map it to a credential in your project.
    • Make sure to select the correct permission set for the service token (Semantic Layer Only and Metadata Only).
    Create a new service token and map credentials directly on the separate 'Service tokens page'.Create a new service token and map credentials directly on the separate 'Service tokens page'.

    Query the Semantic Layer

    This page will guide you on how to connect and use the following integrations to query your metrics:

    The dbt Semantic Layer enables you to connect and query your metric with various available tools like Google Sheets, Hex, Tableau, and more.

    Query metrics using other tools such as first-class integrations, Semantic Layer APIs, and exports to expose tables of metrics and dimensions in your data platform and create a custom integration with tools like PowerBI.

    Connect and query with Google Sheets

    The Google Sheets integration allows you to query your metrics using Google Sheets. This section will guide you on how to connect and use the Google Sheets integration.

    To query your metrics using Google Sheets:

    1. Make sure you have a Gmail account.
    2. To set up Google Sheets and query your metrics, follow the detailed instructions on Google Sheets integration.
    3. Start exploring and querying metrics!
      • Query a metric, like order_total, and filter it with a dimension, like order_date.
      • You can also use the group_by parameter to group your metrics by a specific dimension.
    Use the dbt Semantic Layer's Google Sheet integration to query metrics with a Query Builder menu.Use the dbt Semantic Layer's Google Sheet integration to query metrics with a Query Builder menu.

    Connect and query with Hex

    This section will guide you on how to use the Hex integration to query your metrics using Hex. Select the appropriate tab based on your connection method:

    1. Navigate to the Hex login page.
    2. Sign in or make an account (if you don’t already have one).
    • You can make Hex free trial accounts with your work email or a .edu email.
    1. In the top left corner of your page, click on the HEX icon to go to the home page.

    2. Then, click the + New project button on the top right.

      Click the '+ New project' button on the top rightClick the '+ New project' button on the top right
    3. Go to the menu on the left side and select Data browser. Then select Add a data connection.

    4. Click Snowflake. Provide your data connection a name and description. You don't need to your data warehouse credentials to use the Semantic Layer.

      Select 'Data browser' and then 'Add a data connection' to connect to Snowflake.Select 'Data browser' and then 'Add a data connection' to connect to Snowflake.
    5. Under Integrations, toggle the dbt switch to the right to enable the dbt integration.

      Click on the dbt toggle to enable the integration. Click on the dbt toggle to enable the integration.
    6. Enter the following information:

      • Select your version of dbt as 1.6 or higher
      • Enter your environment id
      • Enter your service token
      • Make sure to click on the Use Semantic Layer toggle. This way, all queries are routed through dbt.
      • Click Create connection in the bottom right corner.
    7. Hover over More on the menu shown in the following image and select dbt Semantic Layer.

      Hover over 'More' on the menu and select 'dbt Semantic Layer'.Hover over 'More' on the menu and select 'dbt Semantic Layer'.
    8. Now, you should be able to query metrics using Hex! Try it yourself:

      • Create a new cell and pick a metric.
      • Filter it by one or more dimensions.
      • Create a visualization.

    What's next

    Great job on completing the comprehensive dbt Semantic Layer guide 🎉! You should hopefully have gained a clear understanding of what the dbt Semantic Layer is, its purpose, and when to use it in your projects.

    You've learned how to:

    • Set up your Snowflake environment and dbt Cloud, including creating worksheets and loading data.
    • Connect and configure dbt Cloud with Snowflake.
    • Build, test, and manage dbt Cloud projects, focusing on metrics and semantic layers.
    • Run production jobs and query metrics with our available integrations.

    For next steps, you can start defining your own metrics and learn additional configuration options such as exports, fill null values, and more.

    Here are some additional resources to help you continue your journey:

    0