Skip to main content

SQL INNER JOINS

The cleanest and easiest of SQL joins: the humble inner join. Just as its name suggests, an inner join between two database objects returns all rows that have matching join keys; any keys that don’t match are omitted from the query result.

How to create an inner join

Like all joins, you need some database objects (ie tables/views), keys to join on, and a select statement to perform an inner join:

select
<fields>
from <table_1> as t1
inner join <table_2> as t2
on t1.id = t2.id

In this example above, there’s only one field from each table being used to join the two together; if you’re joining between two database objects that require multiple fields, you can leverage AND/OR operators, and more preferably, surrogate keys. You may additionally add WHERE, GROUP BY, ORDER BY, HAVING, and other clauses after your joins to create filtering, ordering, and performing aggregations.

As with any query, you can perform as many joins as you want in a singular query. A general word of advice: try to keep data models modular by performing regular DAG audits. If you join certain tables further upstream, are those individual tables needed again further downstream? If your query involves multiple joins and complex logic and is exposed to end business users, ensure that you leverage table or incremental materializations.

SQL inner join example

Table A car_type

user_idcar_type
1van
2sedan
3truck

Table B car_color

user_idcar_color
1red
3green
4yellow
select
car_type.user_id as user_id,
car_type.car_type as type,
car_color.car_color as color
from {{ ref('car_type') }} as car_type
inner join {{ ref('car_color') }} as car_color
on car_type.user_id = car_color.user_id

This simple query will return all rows that have the same user_id in both Table A and Table B:

user_idtypecolor
1vanred
3truckgreen

Because there’s no user_id = 4 in Table A and no user_id = 2 in Table B, rows with ids 2 and 4 (from either table) are omitted from the inner join query results.

SQL inner join use cases

There are probably countless scenarios where you’d want to inner join multiple tables together—perhaps you have some really nicely structured tables with the exact same primary keys that should really just be one larger, wider table or you’re joining two tables together don’t want any null or missing column values if you used a left or right join—it’s all pretty dependent on your source data and end use cases. Where you will not (and should not) see inner joins is in staging models that are used to clean and prep raw source data for analytics uses. Any joins in your dbt projects should happen further downstream in intermediate and mart models to improve modularity and DAG cleanliness.

0