What are adapters? Why do we need them?
Adapters are an essential component of dbt. At their most basic level, they are how dbt Core connects with the various supported data platforms. At a higher-level, dbt Core adapters strive to give analytics engineers more transferrable skills as well as standardize how analytics projects are structured. Gone are the days where you have to learn a new language or flavor of SQL when you move to a new job that has a different data platform. That is the power of adapters in dbt Core.
Navigating and developing around the nuances of different databases can be daunting, but you are not alone. Visit #adapter-ecosystem Slack channel for additional help beyond the documentation.
All databases are not the same
There's a tremendous amount of work that goes into creating a database. Here is a high-level list of typical database layers (from the outermost layer moving inwards):
- SQL API
- Client Library / Driver
- Server Connection Manager
- Query parser
- Query optimizer
- Storage Access Layer
There's a lot more there than just SQL as a language. Databases (and data warehouses) are so popular because you can abstract away a great deal of the complexity from your brain to the database itself. This enables you to focus more on the data.
dbt allows for further abstraction and standardization of the outermost layers of a database (SQL API, client library, connection manager) into a framework that both:
- Opens database technology to less technical users (a large swath of a DBA's role has been automated, similar to how the vast majority of folks with websites today no longer have to be "webmasters").
- Enables more meaningful conversations about how data warehousing should be done.
This is where dbt adapters become critical.
What needs to be adapted?
dbt adapters are responsible for adapting dbt's standard functionality to a particular database. Our prototypical database and adapter are PostgreSQL and dbt-postgres, and most of our adapters are somewhat based on the functionality described in dbt-postgres.
Connecting dbt to a new database will require a new adapter to be built or an existing adapter to be extended.
The outermost layers of a database map roughly to the areas in which the dbt adapter framework encapsulates inter-database differences.
Even amongst ANSI-compliant databases, there are differences in the SQL grammar. Here are some categories and examples of SQL statements that can be constructed differently:
|Category||Area of differences||Examples|
|Statement syntax||The use of |
|Workflow definition & semantics||Incremental updates|
|Relation and column attributes/configuration||Database-specific materialization configs|
|Permissioning||Grant statements that can only take one grantee at a time vs those that accept lists of grantees|
Python Client Library & Connection Manager
The other big category of inter-database differences comes with how the client connects to the database and executes queries against the connection. To integrate with dbt, a data platform must have a pre-existing python client library or support ODBC, using a generic python library like pyodbc.
|Category||Area of differences||Examples|
|Credentials & authentication||Authentication|
|Connection opening/closing||Create a new connection to db|
|Inserting local data||Load seed .|
How dbt encapsulates and abstracts these differences
Differences between databases are encoded into discrete areas:
|Python Classes||Configuration (See above [Python classes](##python classes)|
|Macros||SQL API & statement syntax (for example, how to create schema or how to get table info)|
|Materializations||Table/view/snapshot/ workflow definitions|
These classes implement all the methods responsible for:
- Connecting to a database and issuing queries.
- Providing dbt with database-specific configuration information.
|AdapterClass||High-level configuration type conversion and any database-specific python methods needed|
|AdapterCredentials||Typed dictionary of possible profiles and associated methods|
|AdapterConnectionManager||All the methods responsible for connecting to a database and issuing queries|
|AdapterRelation||How relation names should be rendered, printed, and quoted. Do relation names use all three parts? |
|AdapterColumn||How names should be rendered, and database-specific properties|
A set of macros responsible for generating SQL that is compliant with the target database.
A set of materializations and their corresponding helper macros defined in dbt using jinja and SQL. They codify for dbt how model files should be persisted into the database.
Below is a diagram of how dbt-postgres, the adapter at the center of dbt-core, works.adapter architecture diagram