Add metrics to your DAG
The dbt Semantic Layer is undergoing significant changes that enable more complex metric definitions and more efficient querying. As part of these changes, the dbt_metrics package will be deprecated and replaced with MetricFlow. For more information, check out "The dbt Semantic Layer: what's next?" and the dbt_metrics deprecation blog posts.
About Metrics
A metric is an aggregation over a table that supports zero or more dimensions. Some examples of metrics include:
- active users
- monthly recurring revenue (mrr)
In v1.0, dbt supports metric definitions as a new node type. Like exposures, metrics appear as nodes in the directed acyclic graph (DAG) and can be expressed in YAML files. Defining metrics in dbt projects encodes crucial business logic in tested, version-controlled code. Further, you can expose these metrics definitions to downstream tooling, which drives consistency and precision in metric reporting.
Benefits of defining metrics
Use metric specifications in downstream tools
dbt's compilation context can access metrics via the `graph.metrics` variable. The manifest artifact includes metrics for downstream metadata consumption.
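As a sketch, metric nodes can be iterated from within a macro via `graph.metrics` (the `name` and `label` properties come from the metric spec; the macro name here is illustrative):

```sql
{% macro list_metric_labels() %}
    {# graph is only fully populated at execution time #}
    {% if execute %}
        {% for metric in graph.metrics.values() %}
            {{ log(metric.name ~ ": " ~ metric.label, info=true) }}
        {% endfor %}
    {% endif %}
{% endmacro %}
```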
See and select dependencies
As with exposures, you can see everything that rolls up into a metric (`dbt ls -s +metric:*`) and visualize them in dbt documentation. For more information, see "The `metric:` selection method."

Defining a metric
You can define metrics in `.yml` files nested under a `metrics:` key. Metric names must:
- contain only letters, numbers, and underscores (no spaces or special characters)
- begin with a letter
- contain no more than 250 characters
For a short human-friendly name with title casing, spaces, and special characters, use the `label` property. More examples and guidance for how to define and structure metrics can be found here.
Example definition
- You cannot define metrics on ephemeral models. To define a metric, the materialization must have a representation in the data warehouse.
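A minimal sketch of a metric definition, using the properties described in this document (the model, column, and dimension names are illustrative):

```yaml
# models/marts/metrics.yml
version: 2

metrics:
  - name: new_customers
    label: New Customers
    model: ref('dim_customers')
    description: "The count of paying customers acquired"

    calculation_method: count
    expression: customer_id

    timestamp: signup_date
    time_grains: [day, week, month, year]

    dimensions:
      - plan
      - country

    filters:
      - field: is_paying
        operator: 'is'
        value: 'true'
```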
Available properties
Metrics can have many declared properties, which define aspects of your metric. More information on properties and configs can be found here.
Available calculation methods
Calculation method | Description |
---|---|
count | This metric type will apply the count aggregation to the specified field |
count_distinct | This metric type will apply the count aggregation to the specified field, with an additional distinct statement inside the aggregation |
sum | This metric type will apply the sum aggregation to the specified field |
average | This metric type will apply the average aggregation to the specified field |
min | This metric type will apply the min aggregation to the specified field |
max | This metric type will apply the max aggregation to the specified field |
median | This metric type will apply the median aggregation to the specified field, or an alternative percentile_cont aggregation if median is not available |
Filters
Filters should be defined as a list of dictionaries that define predicates for the metric. Filters are combined using AND clauses. For more control, users can (and should) include complex logic in the model powering the metric.

All three properties (`field`, `operator`, `value`) are required for each defined filter.

Note that `value` must be defined as a string in YAML, because it will be compiled into queries as part of a string. If your filter's value needs to be surrounded by quotes inside the query (e.g. text or dates), use `"'nested'"` quotes:
```yaml
filters:
  - field: is_paying
    operator: 'is'
    value: 'true'
  - field: lifetime_value
    operator: '>='
    value: '100'
  - field: company_name
    operator: '!='
    value: "'Acme, Inc'"
  - field: signup_date
    operator: '>='
    value: "'2020-01-01'"
```
Calendar
The dbt_metrics package contains a basic calendar table that is created as part of your `dbt run`. It contains dates between 2010-01-01 and 2029-12-31.
If you want to use a custom calendar, you can replace the default with any table that meets the following requirements:
- Contains a `date_day` column.
- Contains the following columns: `date_week`, `date_month`, `date_quarter`, `date_year`, or equivalents.
- Additional date columns must be prefixed with `date_`, e.g. `date_4_5_4_month` for a 4-5-4 retail calendar date set. Dimensions can have any name (see following section).
To do this, set the value of the `dbt_metrics_calendar_model` variable in your `dbt_project.yml` file:

```yaml
# dbt_project.yml
config-version: 2
[...]
vars:
  dbt_metrics_calendar_model: my_custom_calendar
```
Dimensions from calendar tables
You may want to aggregate metrics by a dimension in your custom calendar table, for example `is_weekend`. You can include it within the list of dimensions in the macro call without needing to define it in the metric definition.

To do so, set a list variable at the project level called `custom_calendar_dimension_list`, as shown in the example below.
```yaml
# dbt_project.yml
vars:
  custom_calendar_dimension_list: ["is_weekend"]
```
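With that variable set, the calendar-table column can be passed like any other dimension in the macro call. A sketch, assuming a metric named `new_customers` exists in the project:

```sql
select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    dimensions=['plan', 'is_weekend']
) }}
```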
Querying Your Metric
You can dynamically query metrics directly in dbt and verify them before running a job in the deployment environment. To query your defined metric, you must have the dbt_metrics package installed. Information on how to install packages can be found here.
Use the following metrics package installation code in your `packages.yml` file and run `dbt deps` to install the metrics package:
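A sketch of the `packages.yml` entry for the dbt-labs/metrics package (pin the version range that matches your dbt version; the range below is illustrative):

```yaml
packages:
  - package: dbt-labs/metrics
    # adjust this range to match your dbt-core version
    version: [">=1.3.0", "<1.4.0"]
```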
Once the package has been installed with `dbt deps`, make sure to run the `dbt_metrics_default_calendar` model, as this is required for the macros used to query metrics. More information on this, and additional calendar functionality, can be found in the project README.
Querying metrics with metrics.calculate
Use the `metrics.calculate` macro along with defined metrics to generate a SQL statement that runs the metric aggregation and returns the correct metric dataset. For example:
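A minimal sketch of a `metrics.calculate` query, assuming a metric named `new_customers` with `plan` and `country` dimensions has been defined:

```sql
select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='month',
    dimensions=['plan', 'country']
) }}
```

Running this in a dbt model or an IDE preview compiles to a SQL statement against the metric's underlying model, aggregated to the requested grain and dimensions.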
Supported inputs
The example above doesn't display all the potential inputs you can provide to the macro.
You may find some pieces of functionality, like secondary calculations, complicated to use. We recommend reviewing the package README for more in-depth information about each of the inputs that are not covered in the table below.
Input | Example | Description | Required |
---|---|---|---|
metric_list | `metric('some_metric')`, [`metric('some_metric')`, `metric('some_other_metric')`] | The metric(s) to be queried by the macro. If multiple metrics are required, provide them in list format. | Required |
grain | `'day'`, `'week'`, `'month'`, `'quarter'`, `'year'` | The time grain that the metric will be aggregated to in the returned dataset | Optional |
dimensions | `['plan', 'country']` | The dimensions you want the metric to be aggregated by in the returned dataset | Optional |
secondary_calculations | `[metrics.period_over_period(comparison_strategy="ratio", interval=1, alias="pop_1wk")]` | Performs the specified secondary calculation on the metric results. Examples include period over period calculations, rolling calculations, and period to date calculations. | Optional |
start_date | `'2022-01-01'` | Limits the date range of data used in the metric calculation by not querying data before this date | Optional |
end_date | `'2022-12-31'` | Limits the date range of data used in the metric calculation by not querying data after this date | Optional |
where | `plan='paying_customer'` | A SQL statement, or series of SQL statements, that alter the final CTE in the generated SQL. Most often used to limit the data to specific values of dimensions provided | Optional |
Secondary Calculations
Secondary calculations are window functions you can add to the metric calculation and perform on the primary metric or metrics.
You can use them to compare values to an earlier period, calculate year-to-date sums, and return rolling averages. You can add custom secondary calculations into dbt projects - for more information on this, reference the package README.
The supported Secondary Calculations are:
Period over Period:
The period over period secondary calculation compares the metric(s) in question across two points in time, returning either the difference or the ratio between them. The earlier point is determined by the interval input, which counts back the given number of time grains at the grain selected in the macro.
Input | Example | Description | Required |
---|---|---|---|
comparison_strategy | `ratio` or `difference` | How to calculate the delta between the two periods | Yes |
interval | 1 | Integer - the number of time grains to look back | Yes |
alias | `week_over_week` | The column alias for the resulting calculation | No |
metric_list | `base_sum_metric` | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
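A sketch of a period over period calculation attached to a `metrics.calculate` call (the metric name is illustrative):

```sql
select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    secondary_calculations=[
        metrics.period_over_period(
            comparison_strategy="ratio",
            interval=1,
            alias="week_over_week"
        )
    ]
) }}
```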
Period to Date:
The period to date secondary calculation performs an aggregation on a defined period of time that is equal to or coarser than the grain selected. For example, when you want to display a `month_to_date` value alongside your weekly grained metric.
Input | Example | Description | Required |
---|---|---|---|
aggregate | `max`, `average` | The aggregation to use in the window function. Options vary based on the primary aggregation and are enforced in `validate_aggregate_coherence()`. | Yes |
period | `"day"`, `"week"` | The time grain to aggregate to. One of `["day", "week", "month", "quarter", "year"]`. Must be at equal or coarser (higher, more aggregated) granularity than the metric's grain. For example, with a grain of `month`, the acceptable periods would be `month`, `quarter`, or `year`. | Yes |
alias | `month_to_date` | The column alias for the resulting calculation | No |
metric_list | `base_sum_metric` | List of metrics that the secondary calculation should be applied to. Default is all metrics selected | No |
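A sketch of a period to date calculation, aggregating a weekly grained metric up to the month (the metric name is illustrative):

```sql
select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    secondary_calculations=[
        metrics.period_to_date(
            aggregate="sum",
            period="month",
            alias="month_to_date"
        )
    ]
) }}
```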
Rolling:
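The rolling secondary calculation applies an aggregate over a rolling window of the chosen number of time grains, for example a rolling four-week average. A sketch of its usage, with inputs as described in the package README (the metric name is illustrative):

```sql
select *
from {{ metrics.calculate(
    metric('new_customers'),
    grain='week',
    secondary_calculations=[
        metrics.rolling(
            aggregate="average",
            interval=4,
            alias="rolling_4wk_avg"
        )
    ]
) }}
```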
Developing metrics with metrics.develop
Multiple/Derived Metrics with metrics.develop
If you have a more complicated use case that you are interested in testing, the develop macro also supports this behavior. The only caveat is that you must wrap any provided metric YAML that contains a derived metric in `raw`/`endraw` tags. For example:
```sql
{% set my_metric_yml -%}
{% raw %}

metrics:
  - name: develop_metric
    model: ref('fact_orders')
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: average
    expression: discount_total
    dimensions:
      - had_discount
      - order_country

  - name: derived_metric
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: derived
    expression: "{{ metric('develop_metric') }} - 1 "
    dimensions:
      - had_discount
      - order_country

  - name: some_other_metric_not_using
    label: Total Discount ($)
    timestamp: order_date
    time_grains: [day, week, month]
    calculation_method: derived
    expression: "{{ metric('derived_metric') }} - 1 "
    dimensions:
      - had_discount
      - order_country

{% endraw %}
{%- endset %}

select *
from {{ metrics.develop(
    develop_yml=my_metric_yml,
    metric_list=['derived_metric'],
    grain='month'
) }}
```
The above example will return a dataset that contains the metric provided in the metric list (`derived_metric`) and the parent metric (`develop_metric`). It will not contain `some_other_metric_not_using`, as it is not designated in the metric list or a parent of the metrics included.

Important caveat - You must wrap the `expression` property for `derived` metrics in double quotes to render it. For example, `expression: "{{ metric('develop_metric') }} - 1 "`.