About dbt projects

A dbt project tells dbt the context of your project and how to transform your data (build your datasets). By design, dbt enforces the top-level structure of a dbt project, such as the dbt_project.yml file, the models directory, the snapshots directory, and so on. Within those top-level directories, you can organize your project in any way that meets the needs of your organization and data pipeline.

At a minimum, all a project needs is the dbt_project.yml project configuration file. dbt supports a number of different resources, so a project may also include:

  • models: Each model lives in a single file and contains logic that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation.
  • snapshots: A way to capture the state of your mutable tables so you can refer to it later.
  • seeds: CSV files with static data that you can load into your data platform with dbt.
  • tests: SQL queries that you can write to test the models and resources in your project.
  • macros: Blocks of code that you can reuse multiple times.
  • docs: Docs for your project that you can build.
  • sources: A way to name and describe the data loaded into your warehouse by your Extract and Load tools.
  • exposures: A way to define and describe a downstream use of your project.
  • metrics: A way for you to define metrics for your project.
  • analysis: A way to organize analytical SQL queries in your project such as the general ledger from your QuickBooks.

When building out the structure of your project, you should consider these impacts to your organization's workflow:

  • How would people run dbt commands? For example, by selecting a path.
  • How would people navigate within the project? Whether as developers in the IDE or as stakeholders reading the docs.
  • How would people configure the models? Some bulk configurations are easier to apply at the directory level, so people don’t have to remember to set everything in a config block with each new model.
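As a rough illustration of the directory-level configuration mentioned in the last point, the fragment below sketches a `models:` block in dbt_project.yml. The project name `my_project` and the `staging` and `marts` directories are hypothetical placeholders, not names that dbt requires.

```yaml
# Fragment of dbt_project.yml (illustrative sketch, not a complete file).
# Assumption: the project is named my_project and its models directory
# contains two hypothetical subdirectories, staging/ and marts/.
models:
  my_project:
    staging:
      # Every model under models/staging/ is built as a view
      # unless a model overrides this in its own config block.
      +materialized: view
    marts:
      # Every model under models/marts/ is built as a table.
      +materialized: table
```

With a layout like this, a new model picks up its configuration from the directory it is placed in. The same directories can also serve as a selection path, for example `dbt run --select models/staging`, assuming that path exists in your project.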

Project configuration

Every dbt project includes a project configuration file called dbt_project.yml. It defines the directory of the dbt project and other project configurations.

Edit dbt_project.yml to set up common project configurations such as:

| YAML key | Value description |
| --- | --- |
| name | Your project’s name in snake case |
| version | Version of your project |
| require-dbt-version | Restrict your project to only work with a range of dbt Core versions |
| profile | The profile dbt uses to connect to your data platform |
| model-paths | Directories where your model and source files live |
| seed-paths | Directories where your seed files live |
| test-paths | Directories where your test files live |
| analysis-paths | Directories where your analyses live |
| macro-paths | Directories where your macros live |
| snapshot-paths | Directories where your snapshots live |
| docs-paths | Directories where your docs blocks live |
| vars | Project variables you want to use for data compilation |

For complete details on project configurations, see dbt_project.yml.
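To show how these keys fit together, here is a minimal sketch of a dbt_project.yml. All values are illustrative assumptions: the project name, profile name, version range, directory names, and the `start_date` variable are placeholders rather than values dbt prescribes.

```yaml
# dbt_project.yml (illustrative sketch; every value below is a placeholder)
name: my_project                  # hypothetical project name, in snake case
version: "1.0.0"                  # version of this project

# Assumed version range; adjust to the dbt Core versions you actually support.
require-dbt-version: [">=1.5.0", "<2.0.0"]

profile: my_profile               # hypothetical profile used to connect to the data platform

# Directories where each resource type lives (names are assumptions).
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
docs-paths: ["docs"]

# Project variables available during compilation.
vars:
  start_date: "2020-01-01"        # hypothetical variable
```

Only the keys from the list above are shown. A real project file may contain others, so treat this as a starting point rather than a complete reference.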

New projects

You can create new projects and share them with other people by making them available on a hosted git repository such as GitHub, GitLab, or Bitbucket.

After you set up a connection with your data platform, you can initialize your new project in dbt Cloud and start developing. Or, run dbt init from the command line to set up your new project.

During project initialization, dbt creates sample model files in your project directory to help you start developing quickly.

Sample projects

If you want to explore dbt projects in more depth, you can clone dbt Labs’ Jaffle Shop on GitHub. It's a runnable project that contains sample configurations and helpful notes.

If you want to see what a mature, production project looks like, check out the GitLab Data Team public repo.