Skip to main content
Sanjana Sen
Analytics Engineer at dbt Labs
View all authors

What's a Primary Key and Why Do We Test Them?

· 6 min read
Sanjana Sen
Analytics Engineer at dbt Labs
Jason Ganz
Developer Experience at dbt Labs
David Krevitt
Marketing at dbt Labs

We’ve all done it: fanned out data during a join to produce duplicate records (sometimes duplicated in multiple).

That time when historical revenue numbers doubled on Monday? Classic fanout.

Could it have been avoided? Yes, very simply: by defining the uniqueness grainYour data's grain is the combination of columns at which records in a table are unique. Ideally, this is captured in a single column and a unique primary key. for a tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. with a primary key and enforcing it with a dbt test.

So let’s dive deep into: what primary keys are, which cloud analytics warehouses support them, and how you can test them in your warehouse to enforce uniqueness.

Generating Surrogate Keys Across Warehouses

· 7 min read
Sanjana Sen
Analytics Engineer at dbt Labs
Jason Ganz
Developer Experience at dbt Labs
David Krevitt
Marketing at dbt Labs

Why primary keys are important

We all know one of the most fundamental rules in data is that every tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. should have a primary keyA primary key is a non-null column in a database object that uniquely identifies each row.. Primary keys are critical for many reasons:

  • They ensure that you don’t have duplicate rows in your table
  • They help establish relationships to other tables
  • They allow you to quickly identify the grainYour data's grain is the combination of columns at which records in a table are unique. Ideally, this is captured in a single column and a unique primary key. of the table (ex: the customers table with a PK of customer_id has one row per customer)
  • You can test them in dbt, to ensure that your data is complete and unique