What's a Primary Key and Why Do We Test Them?
We’ve all done it: fanned out data during a join to produce duplicate records (sometimes duplicated in multiple).
That time when historical revenue numbers doubled on Monday? Classic fanout.
Could it have been avoided? Yes, very simply: by defining the uniqueness grainYour data's grain is the combination of columns at which records in a table are unique. Ideally, this is captured in a single column and a unique primary key. for a tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. with a primary key and enforcing it with a dbt test.
So let’s dive deep into: what primary keys are, which cloud analytics warehouses support them, and how you can test them in your warehouse to enforce uniqueness.