About dbt projects

A dbt project tells dbt about the context of your project and how to transform your data (build your datasets). By design, dbt enforces the top-level structure of a project, such as the dbt_project.yml file, the models directory, and the snapshots directory. Within those top-level directories, you can organize your project in any way that meets the needs of your organization and data pipeline.

At a minimum, all a project needs is the dbt_project.yml project configuration file. dbt supports a number of different resources, so a project may also include:

Resource | Description
models | Each model lives in a single file and contains logic that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation.
snapshots | A way to capture the state of your mutable tables so you can refer to it later.
seeds | CSV files with static data that you can load into your data platform with dbt.
data tests | SQL queries that you can write to test the models and resources in your project.
macros | Blocks of code that you can reuse multiple times.
docs | Docs for your project that you can build.
sources | A way to name and describe the data loaded into your warehouse by your Extract and Load tools.
exposures | A way to define and describe a downstream use of your project.
metrics | A way for you to define metrics for your project.
groups | Groups enable collaborative node organization in restricted collections.
analysis | A way to organize analytical SQL queries in your project such as the general ledger from your QuickBooks.
semantic models | Semantic models define the foundational data relationships in MetricFlow and the dbt Semantic Layer, enabling you to query metrics using a semantic graph.
saved queries | Saved queries organize reusable queries by grouping metrics, dimensions, and filters into nodes visible in the dbt DAG.
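
Several of these resources are declared or documented in YAML properties files that live alongside your models. As a rough sketch, a properties file that names one source and describes one model might look like the following. The file path, the source name raw_shop, the table orders, and the model stg_orders are all hypothetical and used only for illustration.

```yaml
# models/staging/_staging.yml (hypothetical path and names)
version: 2

sources:
  - name: raw_shop            # hypothetical source loaded by an Extract and Load tool
    description: Raw tables loaded from the shop application database.
    tables:
      - name: orders

models:
  - name: stg_orders          # assumes a model file named stg_orders.sql exists
    description: One row per order, lightly cleaned from the raw orders table.
```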

When building out the structure of your project, you should consider these impacts on your organization's workflow:

  • How people will run dbt commands: for example, by selecting a path
  • How people will navigate the project: as developers in the IDE or as stakeholders reading the docs
  • How people will configure models: some bulk configurations are easier to set at the directory level, so people don’t have to remember to add a config block to each new model
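
To illustrate the last point, directory-level configuration is set under the models key in dbt_project.yml. The sketch below assumes a project named jaffle_analytics with staging and marts subdirectories under models. These names are hypothetical, and the materializations shown are only one reasonable choice.

```yaml
# Excerpt from dbt_project.yml (project and directory names are hypothetical)
models:
  jaffle_analytics:          # must match the project's `name`
    staging:                 # applies to every model under models/staging/
      +materialized: view
    marts:                   # applies to every model under models/marts/
      +materialized: table
```

With a layout like this, a new model dropped into models/staging/ picks up the view materialization without needing its own config block.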

Project configuration

Every dbt project includes a project configuration file called dbt_project.yml. It defines the directory of the dbt project and other project configurations.

Edit dbt_project.yml to set up common project configurations such as:

YAML key | Value description
name | Your project’s name in snake case
version | Version of your project
require-dbt-version | Restrict your project to only work with a range of dbt Core versions
profile | The profile dbt uses to connect to your data platform
model-paths | Directories where your model and source files live
seed-paths | Directories where your seed files live
test-paths | Directories where your test files live
analysis-paths | Directories where your analyses live
macro-paths | Directories where your macros live
snapshot-paths | Directories where your snapshots live
docs-paths | Directories where your docs blocks live
vars | Project variables you want to use for data compilation

For complete details on project configurations, see dbt_project.yml.
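
As a minimal sketch, a dbt_project.yml that sets each of the keys listed above could look like this. The project name, profile name, version range, directory names, and variable are illustrative assumptions, not required values.

```yaml
# dbt_project.yml (illustrative sketch; all values are hypothetical)
name: jaffle_analytics                 # snake case
version: "1.0.0"
require-dbt-version: [">=1.5.0", "<2.0.0"]
profile: jaffle_analytics              # must match a profile you have configured

model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
docs-paths: ["docs"]

vars:
  start_date: "2020-01-01"             # hypothetical project variable
```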

Project subdirectories

You can use the Project subdirectory option in dbt Cloud to specify a subdirectory in your git repository that dbt should use as the root directory for your project. This is helpful when you have multiple dbt projects in one repository or when you want to organize your dbt project files into subdirectories for easier management.

To use the Project subdirectory option in dbt Cloud, follow these steps:

  1. Click on the cog icon on the upper right side of the page and click on Account Settings.

  2. Under Projects, select the project you want to configure as a project subdirectory.

  3. Select Edit on the lower right-hand corner of the page.

  4. In the Project subdirectory field, add the name of the subdirectory. For example, if your dbt project files are located in a subdirectory called <repository>/finance, you would enter finance as the subdirectory.

    • You can also reference nested subdirectories. For example, if your dbt project files are located in <repository>/teams/finance, you would enter teams/finance as the subdirectory. Note: You do not need a leading or trailing / in the Project subdirectory field.
  5. Click Save when you've finished.

After configuring the Project subdirectory option, dbt Cloud will use it as the root directory for your dbt project. This means that dbt commands, such as dbt run or dbt test, will operate on files within the specified subdirectory. If there is no dbt_project.yml file in the Project subdirectory, you will be prompted to initialize the dbt project.

New projects

You can create new projects and share them with other people by making them available in a hosted Git repository on a service such as GitHub, GitLab, or Bitbucket.

After you set up a connection with your data platform, you can initialize your new project in dbt Cloud and start developing. Or, run dbt init from the command line to set up your new project.

During project initialization, dbt creates sample model files in your project directory to help you start developing quickly.

Sample projects

If you want to explore dbt projects in more depth, you can clone dbt Labs' Jaffle Shop project on GitHub. It's a runnable project that contains sample configurations and helpful notes.

If you want to see what a mature, production project looks like, check out the GitLab Data Team public repo.
