Skip to main content
Sanjana Sen
Analytics Engineer at dbt Labs
View all authors

What's a Primary Key and Why Do We Test Them?

· 6 min read
Sanjana Sen
Analytics Engineer at dbt Labs
Jason Ganz
Developer Experience at dbt Labs
David Krevitt
Marketing at dbt Labs

We’ve all done it: fanned out data during a join to produce duplicate records (sometimes duplicated in multiple).

That time when historical revenue numbers doubled on Monday? Classic fanout.

Could it have been avoided? Yes, very simply: by defining the uniqueness grain for a table with a primary key and enforcing it with a dbt test.

So let’s dive deep into: what primary keys are, which cloud analytics warehouses support them, and how you can test them in your warehouse to enforce uniqueness.

Generating Surrogate Keys Across Warehouses

· 7 min read
Sanjana Sen
Analytics Engineer at dbt Labs
Jason Ganz
Developer Experience at dbt Labs
David Krevitt
Marketing at dbt Labs

Why primary keys are important

We all know one of the most fundamental rules in data is that every table should have a primary key. Primary keys are critical for many reasons:

  • They ensure that you don’t have duplicate rows in your table
  • They help establish relationships to other tables
  • They allow you to quickly identify the grain of the table (ex: the customers table with a PK of customer_id has one row per customer)
  • You can test them in dbt, to ensure that your data is complete and unique