Query the Discovery API
The Discovery API supports ad-hoc queries and integrations. If you are new to the API, read the Discovery API overview for an introduction.
Use the Discovery API to evaluate data pipeline health and project state across runs or at a moment in time. dbt Labs provides a GraphQL explorer for this API, enabling you to run queries and browse the schema.
Since GraphQL provides a description of the data in the API, the schema displayed in the GraphQL explorer accurately represents the graph and fields available to query.
Prerequisites
- A dbt Cloud multi-tenant or single-tenant account
- You must be on a Team or Enterprise plan
- Your projects must be on dbt version 1.0 or higher. Refer to Version migration guides to upgrade
Authorization
Currently, authorization of requests takes place using a service token. dbt Cloud admin users can generate a Metadata Only service token that is authorized to execute a specific query against the Discovery API.
Once you've created a token, you can use it in the Authorization header of requests to the dbt Cloud Discovery API. Be sure to include the Token prefix in the Authorization header, or the request will fail with a 401 Unauthorized error. Note that Bearer can be used in place of Token in the Authorization header. Both syntaxes are equivalent.
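As a minimal sketch of the header described above, the following Python helper builds the Authorization header with either prefix. The function name `build_headers` and its `scheme` parameter are hypothetical and not part of any dbt library.

```python
def build_headers(token: str, scheme: str = "Bearer") -> dict:
    """Build request headers for the Discovery API (hypothetical helper)."""
    # The text above states that "Token" and "Bearer" are equivalent prefixes.
    if scheme not in ("Token", "Bearer"):
        raise ValueError("scheme must be 'Token' or 'Bearer'")
    if not token:
        raise ValueError("a service token is required")
    return {
        "Authorization": f"{scheme} {token}",
        "Content-Type": "application/json",
    }


# Placeholder value; substitute your own Metadata Only service token.
headers = build_headers("YOUR_TOKEN")
```

Validating the prefix up front is a design choice for this sketch: it surfaces a malformed header locally instead of as a 401 Unauthorized response.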
Access the Discovery API
Create a service account token to authorize requests. dbt Cloud Admin users can generate a Metadata Only service token, which authorizes requests that execute a specific query against the Discovery API.
Find your API URL using the endpoint https://metadata.{YOUR_ACCESS_URL}/graphql.
- Replace {YOUR_ACCESS_URL} with the appropriate Access URL for your region and plan. For example, if your multi-tenant region is North America, your endpoint is https://metadata.cloud.getdbt.com/graphql. If your multi-tenant region is EMEA, your endpoint is https://metadata.emea.dbt.com/graphql.
For specific query points, refer to the schema documentation.
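The endpoint pattern above can be sketched as a small Python helper. The name `discovery_endpoint` is hypothetical, and the tolerance for a pasted scheme or trailing slash is an added convenience rather than anything the API requires.

```python
def discovery_endpoint(access_url: str) -> str:
    """Return the Discovery API endpoint for an Access URL (hypothetical helper)."""
    # Tolerate a pasted scheme or trailing slash, e.g. "https://cloud.getdbt.com/".
    host = access_url.strip()
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not host:
        raise ValueError("access_url must not be empty")
    return f"https://metadata.{host}/graphql"


# The two regional examples given in the text above.
print(discovery_endpoint("cloud.getdbt.com"))
print(discovery_endpoint("emea.dbt.com"))
```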
Run queries using HTTP requests
You can run queries by sending a POST request to the https://metadata.YOUR_ACCESS_URL/graphql endpoint, making sure to replace:
- YOUR_ACCESS_URL with the appropriate Access URL for your region and plan.
- YOUR_TOKEN in the Authorization header with your actual API token. Be sure to include the Token or Bearer prefix.
- QUERY_BODY with a GraphQL query, for example { "query": "<query text>" }
- VARIABLES with a dictionary of your GraphQL query variables, such as a job ID or a filter.
- ENDPOINT with the endpoint you're querying, such as environment.

curl 'https://metadata.YOUR_ACCESS_URL/graphql' \
  -H 'authorization: Bearer YOUR_TOKEN' \
  -H 'content-type: application/json' \
  -X POST \
  --data QUERY_BODY
Python example:
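import requests

# Here QUERY_BODY is the GraphQL query text; requests builds the JSON body.
response = requests.post('https://metadata.YOUR_ACCESS_URL/graphql',
    headers={"authorization": "Bearer " + YOUR_TOKEN, "content-type": "application/json"},
    json={"query": QUERY_BODY, "variables": VARIABLES})
metadata = response.json()['data'][ENDPOINT]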
Every query will require an environment ID or job ID. You can get the ID from a dbt Cloud URL or using the Admin API.
There are several illustrative example queries in this documentation. You can see examples in the use case guide.
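The Python example above indexes straight into the decoded response. A slightly more defensive sketch checks for a top-level `errors` list first, which is how GraphQL servers report failures under the GraphQL specification. The helper name `extract_data` is hypothetical, and the sample dictionary stands in for a real `response.json()` result.

```python
def extract_data(body: dict, endpoint: str):
    """Return body['data'][endpoint], raising if the response reports errors."""
    # GraphQL responses report failures in a top-level "errors" list.
    errors = body.get("errors")
    if errors:
        messages = "; ".join(str(e.get("message", "unknown error")) for e in errors)
        raise RuntimeError(f"GraphQL query failed: {messages}")
    data = body.get("data") or {}
    if endpoint not in data:
        raise KeyError(f"'{endpoint}' not found in response data")
    return data[endpoint]


# Usage with an already-decoded response, for example response.json().
sample = {"data": {"environment": {"applied": {}}}}
environment = extract_data(sample, "environment")
```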
Reasonable use
To maintain performance and stability, and prevent abuse, Discovery (GraphQL) API usage is subject to request rate and response size limits.
- The current request rate limit is 200 requests within a minute for a given IP address. If a user exceeds this limit, they will receive an HTTP 429 response status.
- Environment-level endpoints will be subject to response size limits in the future. The depth of the graph should not exceed three levels. A user can paginate up to 500 items per query.
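One way to stay within the request rate limit is to retry after an HTTP 429 response with an increasing delay. The sketch below is an assumption about client-side handling, not a documented dbt recommendation: `send_with_retry`, its parameters, and the exponential backoff policy are all hypothetical.

```python
import time


def send_with_retry(send, max_attempts: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute (for example, a lambda wrapping requests.post).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: 1s, 2s, 4s, ... (an assumed policy).
            sleep(base_delay * (2 ** attempt))
    # Still rate limited after the final attempt; hand the 429 back to the caller.
    return response
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.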
Retention limits
You can use the Discovery API to query data from the previous three months. For example, if today were April 1st, you could query data back to January 1st.
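The retention window can be sketched as a date calculation. This assumes "three months" means three calendar months counted back from today, which matches the April 1st example but is otherwise an assumption; `earliest_queryable` is a hypothetical helper.

```python
import calendar
from datetime import date


def earliest_queryable(today: date, months: int = 3) -> date:
    """Return the date `months` calendar months before `today`."""
    # Work in a zero-based month index so year boundaries are handled by divmod.
    index = today.year * 12 + (today.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    # Clamp the day for shorter months (for example, May 31 -> Feb 28/29).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


print(earliest_queryable(date(2024, 4, 1)))  # 2024-01-01
```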
Run queries with the GraphQL explorer
You can run ad-hoc queries directly in the GraphQL API explorer and use the document explorer on the left-hand side, where you can see all possible nodes and fields.
Refer to the Apollo explorer documentation for setup and authorization info.
- Access the GraphQL API explorer and select the fields you'd like to query.
- Go to Variables at the bottom of the explorer and replace any null fields with your unique values.
- Authenticate via Bearer auth with YOUR_TOKEN. Go to Headers at the bottom of the explorer and select +New header.
- Select Authorization in the header key drop-down list and enter your Bearer auth token in the value field. Remember to include the Token or Bearer prefix. Your header should look like this: {"Authorization": "Bearer <YOUR_TOKEN>"}.

- Run your query by pressing the blue query button in the top-right of the Operation editor (to the right of the query). You should see a successful query response on the right side of the explorer.

Fragments
Use the ... on notation to query across lineage and retrieve results from specific node types.
environment(id: $environmentId) {
  applied {
    models(first: $first, filter: {uniqueIds: "MODEL.PROJECT.MODEL_NAME"}) {
      edges {
        node {
          name
          ancestors(types: [Model, Source, Seed, Snapshot]) {
            ... on ModelAppliedStateNode {
              name
              resourceType
              materializedType
              executionInfo {
                executeCompletedAt
              }
            }
            ... on SourceAppliedStateNode {
              sourceName
              name
              resourceType
              freshness {
                maxLoadedAt
              }
            }
            ... on SnapshotAppliedStateNode {
              name
              resourceType
              executionInfo {
                executeCompletedAt
              }
            }
            ... on SeedAppliedStateNode {
              name
              resourceType
              executionInfo {
                executeCompletedAt
              }
            }
          }
        }
      }
    }
  }
}
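Because each fragment selects different fields, client code has to handle ancestors of different shapes. The sketch below shows one way to read the result of the query above in Python. The helper `latest_timestamps`, the sample node, and its timestamp strings are hypothetical; only the field names come from the query.

```python
def latest_timestamps(model_node: dict) -> dict:
    """Map each ancestor's name to its most recent timestamp, if any."""
    result = {}
    for ancestor in model_node.get("ancestors", []):
        # Sources expose freshness.maxLoadedAt; models, snapshots, and seeds
        # expose executionInfo.executeCompletedAt (as selected in the query).
        if "freshness" in ancestor:
            stamp = (ancestor.get("freshness") or {}).get("maxLoadedAt")
        else:
            stamp = (ancestor.get("executionInfo") or {}).get("executeCompletedAt")
        result[ancestor["name"]] = stamp
    return result


# Hypothetical node, shaped like one entry under models.edges[].node.
node = {
    "name": "orders",
    "ancestors": [
        {"name": "stg_orders", "executionInfo": {"executeCompletedAt": "2024-01-02T00:00:00Z"}},
        {"sourceName": "raw", "name": "orders_raw", "freshness": {"maxLoadedAt": "2024-01-01T12:00:00Z"}},
    ],
}
print(latest_timestamps(node))
```

Checking which keys are present, rather than comparing `resourceType` values, avoids depending on the exact strings the API returns for each type.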
Pagination
Querying large datasets can impact performance on multiple functions in the API pipeline. Pagination eases the burden by returning smaller data sets one page at a time. This is useful for returning a particular portion of the dataset or the entire dataset piece-by-piece to enhance performance. dbt Cloud utilizes cursor-based pagination, which makes it easy to return pages of constantly changing data.
Use the PageInfo object to return information about the page. The following fields are available:
- startCursor string type - corresponds to the first node in the edge.
- endCursor string type - corresponds to the last node in the edge.
- hasNextPage boolean type - whether there are more nodes after the returned results.
- hasPreviousPage boolean type - whether nodes exist before the returned results.
There are connection variables available when making the query:
- first integer type - will return the first 'n' nodes for each page, up to 500.
- after string type - sets the cursor to retrieve nodes after. It's best practice to set the after variable with the object ID defined in the endCursor of the previous page.
The following example returns the first 500 models after the object ID specified in the variables. The PageInfo object returns the object ID where the cursor starts, the object ID where it ends, and whether there is a next page.

Here is a code example of the PageInfo object:
pageInfo {
  startCursor
  endCursor
  hasNextPage
}
totalCount # Total number of records across all pages
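A cursor-based loop over these fields might look like the following Python sketch. It assumes that `pageInfo` is returned alongside `edges` in the same connection object, and `fetch_page(first, after)` is a hypothetical callable that runs the query with those variables and returns that connection as a dictionary. The fake two-page data stands in for real API calls.

```python
def iter_nodes(fetch_page, first: int = 500):
    """Yield every node by following endCursor until hasNextPage is false.

    `fetch_page(first, after)` is a hypothetical callable that runs the query
    and returns the connection dict with "edges" and "pageInfo" keys.
    """
    after = None  # No cursor on the first request.
    while True:
        connection = fetch_page(first, after)
        for edge in connection.get("edges", []):
            yield edge["node"]
        page_info = connection.get("pageInfo", {})
        if not page_info.get("hasNextPage"):
            break
        # Resume after the last node of the page just read.
        after = page_info["endCursor"]


# Fake two-page data source standing in for real API calls.
pages = {
    None: {"edges": [{"node": {"name": "a"}}], "pageInfo": {"hasNextPage": True, "endCursor": "c1"}},
    "c1": {"edges": [{"node": {"name": "b"}}], "pageInfo": {"hasNextPage": False, "endCursor": "c2"}},
}
names = [n["name"] for n in iter_nodes(lambda first, after: pages[after])]
print(names)  # ['a', 'b']
```

Writing the loop as a generator lets callers stop early without fetching the remaining pages.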
Filters
Filtering helps to narrow down the results of an API query. Want to query and return only models and tests that are failing? Or find models that are taking too long to run? You can fetch execution details such as executionTime, runElapsedTime, or status. This helps data teams monitor the performance of their models, identify bottlenecks, and optimize the overall data pipeline.
In the following example, we can see that we're filtering results to models that have succeeded on their lastRunStatus:

Here is a code example that filters for models that have an error on their last run and tests that have failed:
environment(id: $environmentId) {
  applied {
    models(first: $first, filter: {lastRunStatus: error}) {
      edges {
        node {
          name
          executionInfo {
            lastRunId
          }
        }
      }
    }
    tests(first: $first, filter: {status: "fail"}) {
      edges {
        node {
          name
          executionInfo {
            lastRunId
          }
        }
      }
    }
  }
}
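Once the filtered query returns, the names of the failing models and tests can be collected from the two connections. The following Python sketch assumes the response has the shape implied by the query. `failing_names` and the sample data are hypothetical.

```python
def failing_names(environment: dict) -> dict:
    """Collect node names from the models and tests connections of a response."""
    applied = environment.get("applied", {})
    return {
        kind: [edge["node"]["name"] for edge in applied.get(kind, {}).get("edges", [])]
        for kind in ("models", "tests")
    }


# Hypothetical response data, shaped like response.json()["data"]["environment"].
environment = {
    "applied": {
        "models": {"edges": [{"node": {"name": "orders", "executionInfo": {"lastRunId": 1}}}]},
        "tests": {"edges": []},
    }
}
print(failing_names(environment))  # {'models': ['orders'], 'tests': []}
```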