Seeds

Getting started

Seeds are CSV files in your dbt project (typically in your data directory) that dbt can load into your data warehouse using the dbt seed command.

Seeds can be referenced in downstream models the same way as other models: by using the ref function.

Because these CSV files are located in your dbt repository, they are version controlled and code reviewable. Seeds are best suited to static data which changes infrequently.

Good use-cases for seeds:

  • A list of mappings of country codes to country names
  • A list of test emails to exclude from analysis
  • A list of employee account IDs

Poor use-cases for seeds:

  • Loading raw data that has been exported to CSVs

Example

To load a seed file in your dbt project:

  1. Add the file to your data directory, with a .csv file extension, e.g. data/country_codes.csv

data/country_codes.csv
country_code,country_name
US,United States
CA,Canada
GB,United Kingdom
...
  2. Run the dbt seed command. A new table named country_codes will be created in the target schema of your warehouse.
$ dbt seed
Found 2 models, 3 tests, 0 archives, 0 analyses, 53 macros, 0 operations, 1 seed file
14:46:15 | Concurrency: 1 threads (target='dev')
14:46:15 |
14:46:15 | 1 of 1 START seed file analytics.country_codes........................... [RUN]
14:46:15 | 1 of 1 OK loaded seed file analytics.country_codes....................... [INSERT 3 in 0.01s]
14:46:16 |
14:46:16 | Finished running 1 seed in 0.14s.
Completed successfully
Done. PASS=1 ERROR=0 SKIP=0 TOTAL=1
  3. Refer to seeds in downstream models using the ref function.
models/orders.sql
-- This refers to the table created from data/country_codes.csv
select * from {{ ref('country_codes') }}

Configuring seeds

Seeds are configured in your dbt_project.yml. Check out the seed configurations docs for a full list of available configurations.
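
As a rough illustration of what such configuration can look like, the fragment below sets a custom schema for all seeds and explicit column types for the country_codes seed from the example above. The project name my_dbt_project and the schema name seed_data are hypothetical placeholders. The leading + on config keys assumes a reasonably recent dbt version; older versions write the keys without it. Treat this as a sketch and confirm the available keys against the seed configurations docs.

```yaml
# dbt_project.yml (fragment)
# Assumption: the project is named my_dbt_project; replace with your own project name.
seeds:
  my_dbt_project:
    # Hypothetical custom schema applied to every seed in the project
    +schema: seed_data
    country_codes:
      # Override the inferred column types for data/country_codes.csv
      +column_types:
        country_code: varchar(2)
        country_name: varchar(32)
```

Configuration nested under a seed name applies only to that seed, while keys placed directly under the project name apply to every seed in the project.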

FAQs

 Can I use seeds to load raw data?
 Can I store my seeds in a directory other than the `data` directory in my project?
 The columns of my seed changed, and now I get an error when running the `seed` command, what should I do?
 How do I test and document seeds?
 How do I set a datatype for a column in my seed?
 How do I run models downstream of a seed?
 How do I preserve leading zeros in a seed?
 How do I build one seed at a time?
 Do hooks run with seeds?