Skip to main content

dbt Developer Blog

Technical tutorials from the dbt Community.

Start here

Featured Posts

· 13 min read
Benoit Perigaud

At dbt Labs, we have best practices we like to follow for the development of dbt projects. One of them, for example, is that all models should have at least unique and not_null tests on their primary key. But how can we enforce rules like this?

That question becomes difficult to answer in large dbt projects. Developers might not follow the same conventions. They might not be aware of past decisions, and reviewing pull requests in git can become more complex. When dbt projects have hundreds of models, it's hard to know which models do not have any tests defined and aren't enforcing your conventions.

· 11 min read
Callum McCann

TLDR: The Semantic Layer is made up of a combination of open-source and SaaS offerings and is going to change how your team defines and consumes metrics.

At last year's Coalesce, Drew showed us the future1 - a vision of what metrics in dbt could look like. Since then, we've been getting the infrastructure in place to make that vision a reality. We wanted to share with you where we are today and how it fits into the broader picture of where we're going.

To those who haven't followed this saga with the intensity of someone watching their investments on the crypto market, we're rolling out this new resource to help you better understand the dbt Semantic Layer and provide clarification on the following things:

  1. What is the dbt Semantic Layer?
  2. How do I use it?
  3. What is publicly available now?
  4. What is still in development?

With that, lets get into it!

· 7 min read
Jeremy Cohen
Doug Beatty

If you’ve needed to grant access to a dbt model between 2019 and today, there’s a good chance you’ve come across the "The exact grant statements we use in a dbt project" post on Discourse. It explained options for covering two complementary abilities:

  1. querying relations via the "select" privilege
  2. using the schema those relations are within via the "usage" privilege

· 11 min read
Matt Winkler

Stored procedures are widely used throughout the data warehousing world. They’re great for encapsulating complex transformations into units that can be scheduled and respond to conditional logic via parameters. However, as team’s continue building their transformation logic using the stored procedure approach, we see more data downtime, increased data warehouse costs, and incorrect / unavailable data in production. All of this leads to more stressed and unhappy developers, and consumers who have a hard time trusting their data.

If your team works heavily with stored procedures, and you ever find yourself with the following or related issues:

  • dashboards that aren’t refreshed on time
  • It feels too slow and risky to modify pipeline code based on requests from your data consumers
  • It’s hard to trace the origins of data in your production reporting

It’s worth considering if an alternative approach with dbt might help.