
Near real-time data in dbt

By design, dbt is batch-oriented with jobs having a defined start and end time. But did you know that you can also use dbt to get near real-time data by combining your data warehouse's continuous ingestion with frequent dbt transformations?

This guide covers multiple patterns for achieving near real-time data freshness with dbt:

  1. Incremental patterns — Merge strategies, Change Data Capture (CDC), and microbatch processing
  2. Warehouse-native features — When to use dynamic tables and materialized views
  3. Lambda views pattern — Combining batch and real-time data in a single view
  4. Views-only pattern — Maximum freshness for lightweight transformations
  5. Operational considerations — Challenges, risks, and cost management

Each pattern includes practical code examples, use cases, and tradeoffs to help you choose the right approach.
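As a preview of the lambda views pattern, the idea is to union a batch-built table with a view over rows that arrived after the last batch run. A minimal sketch as a dbt model (model and column names here are illustrative, not from this guide):

```sql
-- models/lambda/events_lambda.sql
-- Hypothetical lambda view: historical batch data plus rows that
-- landed since the batch table was last rebuilt.
{{ config(materialized='view') }}

select * from {{ ref('fct_events_batch') }}

union all

select *
from {{ ref('stg_events_realtime') }}
where event_timestamp > (
    select max(event_timestamp) from {{ ref('fct_events_batch') }}
)
```

Because the final object is a view, queries always see the freshest raw rows, at the cost of some query-time compute.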

Anyone can use this guide, but it's primarily for data engineers and architects who want to achieve near real-time data freshness with dbt.

Where does dbt fit?

Where dbt fits depends on how fresh your data needs to be:

  • For near real-time (5–15 minutes) — dbt excels at this and is well-suited for most operational dashboards.
  • For true real-time (sub-second) — This requires dedicated streaming databases (ClickHouse, Materialize, Rockset, and so on) in front of or alongside dbt; dbt still owns “analytic” tables and history but not the ultra‑low‑latency read path.

How dbt achieves near real-time data

To achieve near real-time data with dbt, we recommend using a two-layer architecture:

Ingestion layer

Continuous data landing using your data warehouse's streaming ingestion features.

Streaming ingestion features such as streaming tables, Snowpipe, or Storage Write API work well for this. To find streaming ingestion features for your warehouse, refer to the additional resources section.
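For example, on Snowflake a Snowpipe can continuously load files from a stage into a raw table as they arrive. A minimal sketch, assuming a JSON external stage (the stage, pipe, and table names are illustrative):

```sql
-- Hypothetical Snowflake ingestion layer: auto-ingest JSON files
-- from an external stage into a raw landing table via Snowpipe.
create pipe raw.events_pipe
  auto_ingest = true
as
  copy into raw.events
  from @raw.events_stage
  file_format = (type = 'json');
```

Equivalent continuous-ingestion features exist on other warehouses (for example, BigQuery's Storage Write API or Databricks streaming tables).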

dbt transformation layer

Run dbt every few minutes to transform the data, and use materialized views or dynamic tables for the lowest-latency reporting.
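With the dbt Snowflake adapter, for instance, a model can be materialized as a dynamic table with a short target lag, so the warehouse keeps it fresh between dbt runs. A sketch with illustrative model, column, and warehouse names:

```sql
-- models/marts/orders_latest.sql
-- Hypothetical dynamic table kept within one minute of its sources.
{{ config(
    materialized='dynamic_table',
    target_lag='1 minute',
    snowflake_warehouse='transforming'
) }}

select
    order_id,
    customer_id,
    order_status,
    updated_at
from {{ ref('stg_orders') }}
```

Shorter `target_lag` values mean fresher data but more refresh compute, so tune the lag to the business need.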

Specific transformation approaches include the patterns covered in this guide: incremental models, warehouse-native materializations, lambda views, and views-only setups.

Key recommendations

The following are some key recommendations to help you achieve near real-time data freshness with dbt:

  • Ingest data continuously: Use your warehouse's native streaming or micro-batch ingestion to land raw data as soon as it arrives.
  • Transform with dbt on a frequent schedule: Schedule dbt jobs to run as often as your business needs allow (for example, every 1–15 minutes). Balance freshness with cost and resource constraints.
  • Use materialized views and dynamic tables: For the lowest-latency reporting, use materialized views or dynamic tables. These can be refreshed as frequently as every minute.
  • Use incremental models and microbatching: Process only new or changed data with dbt's incremental models, keeping transformations efficient and scalable.
  • Decouple ingestion from transformation: Keep data acquisition and transformation flows separate. This allows you to optimize each independently.
  • Monitor and test data freshness: Implement data quality checks and freshness monitoring to ensure your near real-time pipelines deliver accurate, up-to-date results.
  • Weigh cost and complexity: Running dbt jobs more frequently drives up compute costs and operational complexity, so always weigh the business value against these trade-offs.
