Now that you've seen how we style our dbt projects, it's time to build your own. Feel free to copy this guide and use it as a template for your own project. If you do, we'd love to hear about it! Reach out to us on the Community Forum or Slack to share your style guide. We recommend co-locating your style guide with your code to make sure contributors can easily follow it. If you're using GitHub, you can add your style guide to your repository's wiki, or include it in your README.
Lastly, to ensure your style guide's automated rules are being followed without additional mental overhead to your team, you can use pre-commit hooks to automatically check your code for style violations (and often fix them automagically) before it's committed. This is a great way to make sure your style guide is followed by all contributors. We recommend implementing this once you've settled on and published your style guide, and your codebase is conforming to it. This will ensure that all future commits follow the style guide. You can find an excellent set of open source pre-commit hooks for dbt from the community here in the dbt-checkpoint project.
Style guide template
# dbt Example Style Guide
## SQL Style
- Use lowercase keywords.
- Use trailing commas.
## Model Organization
Our models (typically) fit into two main categories:\
- Staging — Contains models that clean and standardize data.
- Marts — Contains models which combine or heavily transform data.
Things to note:
- There are different types of models that typically exist in each of the above categories. See Model Layers for more information.
- Read How we structure our dbt projects for an example and more details around organization.
## Model Layers
- Only models in `staging` should select from sources.
- Models not in the `staging` folder should select from refs.
## Model File Naming and Coding
- All objects should be plural.
Example: `stg_stripe__invoices.sql` vs. `stg_stripe__invoice.sql`
- All models should use the naming convention `<type/dag_stage>_<source/topic>__<additional_context>`. See this article for more information.
- Models in the **staging** folder should use the source's name as the `<source/topic>` and the entity name as the `additional_context`.
- Schema, table, and column names should be in `snake_case`.
- Limit the use of abbreviations that are related to domain knowledge. An onboarding employee will understand `current_order_status` better than `current_os`.
- Use names based on the _business_ rather than the source terminology.
- Each model should have a primary key to identify the unique row and should be named `<object>_id`. For example, `account_id`. This makes it easier to know what `id` is referenced in downstream joined models.
- For `base` or `staging` models, columns should be ordered in categories, where identifiers are first and date/time fields are at the end.
- Date/time columns should be named according to these conventions:
- Timestamps: `<event>_at`
- Dates: `<event>_date`
- Booleans should be prefixed with `is_` or `has_`.
Example: `is_active_customer` and `has_admin_access`
- Price/revenue fields should be in decimal currency (for example, `19.99` for $19.99; many app databases store prices as integers in cents). If a non-decimal currency is used, indicate this with suffixes. For example, `price_in_cents`.
- Avoid using reserved words (such as these for Snowflake) as column names.
- Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id`.
## Model Configurations
- Model configurations at the folder level should be considered (and if applicable, applied) first.
- More specific configurations should be applied at the model level using one of these methods.
- Models within the `marts` folder should be materialized as `table` or `incremental`.
- By default, `marts` should be materialized as `table` within `dbt_project.yml`.
- If switching to `incremental`, this should be specified in the model's configuration.
- At a minimum, `unique` and `not_null` tests should be applied to the expected primary key of each model.
For more information about why we use so many CTEs, read this glossary entry.
- Where performance permits, CTEs should perform a single, logical unit of work.
- CTE names should be as verbose as needed to convey what they do.
- CTEs with confusing or noteable logic should be commented with SQL comments as you would with any complex functions and should be located above the CTE.
- CTEs duplicated across models should be pulled out and created as their own models.