Setting up
In this tutorial, we will be turning the below query into a dbt project that is tested, documented, and deployed — you can check out the generated documentation for the project we're building here.
This tutorial is geared at first-time users who want detailed instructions on how to go from zero to a deployed dbt project. You'll need a working knowledge of SQL in order to do this tutorial.
We recommend you go through this project once from beginning to end. Once you've completed it, you should go back through and read some of the FAQs to broaden your understanding of dbt.
This tutorial is based on a fictional business named jaffle_shop
, so you'll see this name used throughout the project. At this organization, the following query is used to build a customers
table.
with customers as (selectid as customer_id,first_name,last_namefrom `dbt-tutorial`.jaffle_shop.customers),orders as (selectid as order_id,user_id as customer_id,order_date,statusfrom `dbt-tutorial`.jaffle_shop.orders),customer_orders as (selectcustomer_id,min(order_date) as first_order_date,max(order_date) as most_recent_order_date,count(order_id) as number_of_ordersfrom ordersgroup by 1),final as (selectcustomers.customer_id,customers.first_name,customers.last_name,customer_orders.first_order_date,customer_orders.most_recent_order_date,coalesce(customer_orders.number_of_orders, 0) as number_of_ordersfrom customersleft join customer_orders using (customer_id))select * from final
Create a BigQuery project
For this tutorial, we've created a public dataset in BigQuery that anyone can select
from.
We're using BigQuery since anyone with a Google Account can use BigQuery, but dbt works with many data warehouses.
info
BigQuery has generous free tier. If you have an existing GCP account that has surpassed these tiers on BigQuery, running queries for this tutorial will incur a very small (less than a few USD) cost.
- Go to the BigQuery Console — if you don't have a Google Cloud Platform account you will be asked to create one.
- Create a new project for this tutorial — if you've just created a BigQuery account, you'll be prompted to create a new project straight away. If you already have an existing you can select the project drop down in the header bar, and create a new project from there.
- Head back to the the BigQuery Console, and ensure your new project is selected. Copy and paste the above query into the Query Editor to validate that you are able to run it successfully.
FAQs
Generate BigQuery credentials
In order to let dbt connect to your warehouse, you'll need generate a keyfile. This is analogous to using a database user name and password with most other data warehouses.
- Go to the BigQuery credential wizard. Ensure that your new project is selected in the header bar.
- Generate credentials with the following options:
- Which API are you using? BigQuery API
- Are you planning to use this API with App Engine or Compute Engine? No
- Service account name:
dbt-user
- Role: BigQuery User
- Key type: JSON
- Download the JSON file and save it in an easy-to-remember spot, with a clear filename (e.g.
dbt-user-creds.json
)
FAQs
Choose the way you want to develop
There’s two main ways of working with dbt:
- Edit files and run projects using the web-based Integrated Development Environment (IDE) in dbt Cloud.
- Edit files locally using a code editor, and run projects using the Command Line Interface (dbt CLI).
To use the CLI, it's important that you know some basics of your terminal. In particular, you should understand cd
, ls
and pwd
to navigate through the directory structure of your computer easily. As such, if you are new to programming, we recommend using dbt Cloud for this tutorial.
If you wish to use the CLI, please follow the installation instructions for your operating system.