Build your first models
After setting up a sample project with a successful run, you can get to the fun part — building models! You will learn how to take a sample query and turn it into a model in your dbt project. A model in dbt is a select statement.
Checkout a new git branch
Click Develop from the upper left of dbt Cloud. You need to create a new branch since the main branch is now set to read-only mode.
Click Create branch, and name your branch
add-customers-model
.Checkout a new branch
Build your first model
Click Develop from the upper left of dbt Cloud.
Click the ... next to the Models directory, then select Create file.
Name the file
models/customers.sql
, then click Create.Copy the following query into the file and click Save File.
- BigQuery
- Databricks
- Redshift
- Snowflake
with customers as (
select
id as customer_id,
first_name,
last_name
from `dbt-tutorial`.jaffle_shop.customers
),
orders as (
select
id as order_id,
user_id as customer_id,
order_date,
status
from `dbt-tutorial`.jaffle_shop.orders
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from finalwith customers as (
select
id as customer_id,
first_name,
last_name
from jaffle_shop_customers
),
orders as (
select
id as order_id,
user_id as customer_id,
order_date,
status
from jaffle_shop_orders
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from finalwith customers as (
select
id as customer_id,
first_name,
last_name
from jaffle_shop.customers
),
orders as (
select
id as order_id,
user_id as customer_id,
order_date,
status
from jaffle_shop.orders
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from finalwith customers as (
select
id as customer_id,
first_name,
last_name
from raw.jaffle_shop.customers
),
orders as (
select
id as order_id,
user_id as customer_id,
order_date,
status
from raw.jaffle_shop.orders
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from finalEnter
dbt run
in the command prompt at the bottom of the screen. You should get a successful run and see three models under DETAILS.
In the future, you would connect your business intelligence (BI) tools to these views and tables so they only read cleaned up data rather than raw data in your BI tool.
FAQs
Change the way your model is materialized
One of the most powerful features of dbt is that you can change the way a model is materialized in your warehouse, simply by changing a configuration value. You can change things between tables and views by changing a keyword rather than writing the data definition language (DDL) to do this behind the scenes.
By default, everything gets created as a view. You can override that by materializing everything in jaffle_shop as a table. Everything in the example project will still be materialized as a view.
Edit your
dbt_project.yml
file.Update your project
name
to:dbt_project.ymlname: 'jaffle_shop'
Update your
models
config block to:dbt_project.ymlmodels:
jaffle_shop:
+materialized: table
example:
+materialized: viewClick Save.
Enter the
dbt run
command. Yourcustomers
model should now be built as a table!infoTo do this, dbt had to first run a
drop view
statement (or API call on BigQuery), then acreate table as
statement.Edit
models/customers.sql
to override thedbt_project.yml
for thecustomers
model only by adding the following snippet to the top, and click Save:models/customers.sql{{
config(
materialized='view'
)
}}
with customers as (
select
id as customer_id
...
)Enter the
dbt run
command. Your model,customers
should now build as a view.Enter the
dbt run --full-refresh
command for this to take effect in your warehouse.
FAQs
Delete the example models
You can now delete the files that dbt created when you initialized the project:
Delete the
models/example/
directory.Delete the
example:
key from yourdbt_project.yml
file, and any configurations that are listed under it.dbt_project.yml# before
models:
jaffle_shop:
+materialized: table
example:
+materialized: viewdbt_project.yml# after
models:
jaffle_shop:
+materialized: tableSave your changes.
FAQs
Build models on top of other models
As a best practice in SQL, you should separate logic that cleans up your data from logic that transforms your data. You have already started doing this in the existing query by using common table expressions (CTEs).
Now you can experiment by separating the logic out into separate models and using the ref function to build models on top of other models:
Create a new SQL file,
models/stg_customers.sql
, with the SQL from thecustomers
CTE in our original query.Create a second new SQL file,
models/stg_orders.sql
, with the SQL from theorders
CTE in our original query.- BigQuery
- Databricks
- Redshift
- Snowflake
models/stg_customers.sqlselect
id as customer_id,
first_name,
last_name
from `dbt-tutorial`.jaffle_shop.customersmodels/stg_orders.sqlselect
id as order_id,
user_id as customer_id,
order_date,
status
from `dbt-tutorial`.jaffle_shop.ordersmodels/stg_customers.sqlselect
id as customer_id,
first_name,
last_name
from jaffle_shop_customersmodels/stg_orders.sqlselect
id as order_id,
user_id as customer_id,
order_date,
status
from jaffle_shop_ordersmodels/stg_customers.sqlselect
id as customer_id,
first_name,
last_name
from jaffle_shop.customersmodels/stg_orders.sqlselect
id as order_id,
user_id as customer_id,
order_date,
status
from jaffle_shop.ordersmodels/stg_customers.sqlselect
id as customer_id,
first_name,
last_name
from raw.jaffle_shop.customersmodels/stg_orders.sqlselect
id as order_id,
user_id as customer_id,
order_date,
status
from raw.jaffle_shop.ordersEdit the SQL in your
models/customers.sql
file as follows:models/customers.sqlwith customers as (
select * from {{ ref('stg_customers') }}
),
orders as (
select * from {{ ref('stg_orders') }}
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from finalExecute
dbt run
This time, when you performed a dbt run
, separate views/tables were created for stg_customers
, stg_orders
and customers
. dbt inferred the order to run these models. Because customers
depends on stg_customers
and stg_orders
, dbt builds customers
last. You do not need to explicitly define these dependencies.
FAQs
Next steps
Once you build your first model, you're ready to test and document your project.
Before moving on from building your first models, make a change and see how it affects your results:
- Write some bad SQL to cause an error — can you debug the error?
- Run only a single model at a time. For more information, see Syntax overview.
- Group your models with a
stg_
prefix into astaging
subdirectory. For example,models/staging/stg_customers.sql
.- Configure your
staging
models to be views. - Run only the
staging
models.
- Configure your