Adapter/Core Decoupling Migration Guide #87
Suggestion for the script that validates imports: it missed places that were full paths, that I'm going through manually - things like
Updated the above guide to reflect a decision we have reached internally to not remove dbt-core as an explicit dependency of adapters. This decision is motivated by feedback that users expect dbt-core to be automatically installed with an adapter, and breaking that experience is not appropriate for a minor version bump with minimal warning. However, we will be removing dbt-core from adapters in a later version, so adapters should still avoid depending on dbt-core in their code.
Overview
Starting in the fall of 2023 we began a project to decouple dbt-core and the base adapter and define a stable adapter interface. For context on why we have pursued this project see: 1 2.
The north star of adapter development post dbt-core v1.8.0 is the following:
Being independent means that dbt-core changes can be released without requiring any changes to adapters and vice versa. Adapters will no longer directly depend on dbt-core and as a result do not need to worry about which version of dbt-core they are using. They only need to concern themselves with which versions of the dbt-adapters interface they support. In a future release we will be removing dbt-core as an explicit dependency of adapters.
For maintainers this means that they will not be pressured to update their adapter code so their users can get the latest dbt-core features. Likewise adapters will be able to release improvements on a quicker cadence than dbt-core. We believe this will result in more iterative feature development instead of dropping a bunch of changes on maintainers once a quarter.
To accomplish this we have taken the base adapter out of dbt-core and moved it to a new repository: dbt-adapters (this is also the pypi package name). The following diagram explains the new package structure:
The rest of this document will outline what changes are required for your adapter to support this new, post dbt-core 1.8, architecture.
Note
As of publishing this, dbt-adapters is not technically stable (i.e. it’s pre 1.0). However, we do not anticipate any sizable changes to the interface before then, so consider it safe for maintainers to begin working against.
What’s changed?
Structure
To factor adapters out of dbt-core we have created two new python packages:
As the above diagram illustrates, dbt-adapters (and any implementing adapter) will never have a runtime dependency on dbt-core, but dbt-core will depend on dbt-adapters.
dbt-adapters - all adapter specific code
AdapterLogger
dbt-common (dbt_common) - shared code being used by both dbt-adapters and dbt-core
Other changes: In addition to those packages we have also moved dbt-postgres to its own repo, and dbt-tests-adapter now shares the same repo as dbt-adapters. We don’t anticipate these changes impacting other adapter maintainers.
Versioning & Releasing
Up to now adapters have been required to release a new minor version to declare compatibility with dbt-core’s minor version. Post dbt-core version 1.8, adapters will not need to do this. Instead maintainers will need to declare their compatibility with dbt-adapters’ versions.
While the exact semantic version policy for dbt-adapters is still being finalized (this document will be updated when it is), maintainers can safely assume the following:
Breaking interface changes will be communicated in a major version bump (e.g. 1.0.0 → 2.0.0). Here “breaking” means:
Significant, but non-breaking, changes such as feature launches or performance enhancements will result in a minor version bump (e.g. 1.0.0 → 1.1.0). Some examples of this kind of change could be:
“Internal” changes to the base adapter or other small changes will be handled via a patch version bump (e.g. 1.0.0 → 1.0.1). Some examples of this could be:
We recommend maintainers pin to the major version as this will make it easier for the community to get the latest improvements to adapters. We strongly discourage pinning to a patch version as this could prevent critical security patches and bug fixes from reaching users.
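To make the pinning recommendation concrete, here is a minimal sketch of which releases a major-version pin such as `>=1.0.0, <2.0.0` would accept. The helper names `parse_version` and `accepted_by_major_pin` are hypothetical and exist only for illustration; in practice pip evaluates the version specifier for you.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def accepted_by_major_pin(version: str, major: int = 1) -> bool:
    """Mimic a '>=MAJOR.0.0, <MAJOR+1.0.0' pin (illustrative helper, not a dbt API)."""
    return (major, 0, 0) <= parse_version(version) < (major + 1, 0, 0)


# Patch and minor bumps fall inside the pin; a major bump does not.
for candidate in ["1.0.0", "1.0.1", "1.1.0", "2.0.0"]:
    print(candidate, accepted_by_major_pin(candidate))
```

With a pin like this, users pick up patch and minor releases of dbt-adapters automatically, while a breaking major release requires the maintainer to opt in.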
Interfaces and Protocols
With the decoupling of dbt-core and adapters, we want to better define the semantics and separation of concerns between core and adapters. The key semantic change to understand is that adapters no longer reference or concern themselves with core “concepts” like manifests and nodes, but with adapter concepts like relations and materializations.
Practically speaking this means:
ManifestNode with RelationConfig, where RelationConfig is a protocol that contains all of the fields from ManifestNode that adapters rely on. If you are relying on a field not captured in RelationConfig, that doesn’t mean your code will break; however, it does run the risk that dbt-core deprecates or removes it at a later date. Please raise an issue or PR in dbt-adapters if you run into this case.
macro_context_generator)
profile.yml
MultipleDatabasesNotAllowedError.
No global imports from Core
Adapters no longer import core globals: all global data is in common or passed as an argument.
What do I need to do?
Based on the migrations we have completed there are four steps:
Update dependencies (update dbt-core to >=1.8 and add dbt-adapters)
Shift imports (dbt.* → [dbt.adapters.*, dbt_common.*])
Make functional and semantic code changes
Update tests
The number and complexity of changes varies with each adapter but we anticipate this being a fairly smooth migration for most. If you run into bugs or get stuck, please raise an issue in dbt-adapters detailing the problem.
Step 1: update dependencies
Update dbt-core to >=1.8 in setup.py or pyproject.toml and add a dependency on dbt-adapters. We recommend pinning to major versions for now and only pinning to minor versions if necessary. Once dbt-adapters releases its stable version this would look like: dbt-adapters>=1.0.0, <2.0.0
Maintain the test dependency on dbt-core, as we will need to install dbt-core to run your tests. As we move forward (i.e. dbt-core>=1.8.0), it should not matter which version of dbt-core you use to test since dbt-core no longer contains any adapter specific code. Eventually we will be asking all adapters to remove their dependency on dbt-core entirely.
You can also see the remaining dbt-core dependencies by running this script. To run it you will need to pass the relative path to the top level dbt/ source directory. For example, if you are in dbt-labs/dbt-snowflake, with dbt-core checked out in the same parent directory, you could run this command:
python ../dbt-core/scripts/migrate-adapters.py dbt/
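As a rough sketch of the dependency changes in this step, the relevant part of a setup.py might look like the following. The adapter name `dbt-myadapter` and the `SETUP_KWARGS` variable are hypothetical, and the dbt-adapters range assumes its stable 1.0 release is out; adjust both to your project.

```python
# Sketch of the dependency section of a setup.py for a hypothetical adapter.
PACKAGE_NAME = "dbt-myadapter"  # hypothetical adapter name

INSTALL_REQUIRES = [
    # Pin dbt-adapters to a major version (range assumes the stable release exists).
    "dbt-adapters>=1.0.0,<2.0.0",
    # dbt-core stays an explicit dependency for now so users still get it installed.
    "dbt-core>=1.8.0",
]

SETUP_KWARGS = {
    "name": PACKAGE_NAME,
    "install_requires": INSTALL_REQUIRES,
}

# In a real setup.py these would be handed to setuptools:
#   from setuptools import setup
#   setup(**SETUP_KWARGS)
print(SETUP_KWARGS["install_requires"])
```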
Step 2: Shifting Imports
For the most part this category of change is just a matter of find/replace. Note that there may be additional import changes required for unit or integration tests. Those are covered in the section on testing below.
What has moved to dbt_common.* or to dbt.adapters:
dbt.clients
  from dbt.clients import agate_helper → from dbt_common.clients import agate_helper
dbt.exceptions → dbt_common.exceptions (some exceptions have also been moved to dbt.adapters)
  DbtRuntimeError
  DbtConfigError (use in place of DbtProfileError)
  NotImplementedError
  CompilationError
dbt.dataclass_schema → dbt_common.dataclass_schema
dbt.events
  dbt.events.functions → dbt_common.events.functions
  dbt.events.contextvars → dbt_common.events.contextvars
  AdapterLogger → dbt.adapters.events.logging
dbt.contracts
  AdapterResponse → dbt.adapters.contracts.connection
  dbt.contracts.graph.nodes → dbt_common.contracts.constraints
  dbt.contracts.relation → dbt.adapters.contracts.relation
  dbt.contracts.graph.model_config → dbt_common.contracts.config.materialization
dbt.utils → dbt_common.utils
  classproperty
dbt.helper_types → dbt_common.helper_types
dbt.ui → dbt_common.ui
dbt.clients.system → dbt_common.clients.system
Once you have updated the imports we recommend verifying by running this script again.
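Since most of this step is find/replace, a small helper can do a first pass. The sketch below is not the validation script mentioned above; `MODULE_MOVES` and `rewrite_imports` are hypothetical names, and the mapping only covers module paths that move wholesale. Because some exceptions moved to dbt.adapters rather than dbt_common, the output still needs a manual review.

```python
import re

# Old module path -> new module path, copied from the list in this step.
MODULE_MOVES = {
    "dbt.clients.system": "dbt_common.clients.system",
    "dbt.exceptions": "dbt_common.exceptions",
    "dbt.dataclass_schema": "dbt_common.dataclass_schema",
    "dbt.events.functions": "dbt_common.events.functions",
    "dbt.events.contextvars": "dbt_common.events.contextvars",
    "dbt.contracts.relation": "dbt.adapters.contracts.relation",
    "dbt.helper_types": "dbt_common.helper_types",
    "dbt.ui": "dbt_common.ui",
    "dbt.utils": "dbt_common.utils",
}

# Longest paths first so a more specific module wins over a shorter prefix.
_ALTERNATIVES = "|".join(
    re.escape(old) for old in sorted(MODULE_MOVES, key=len, reverse=True)
)
# Match whole dotted names only: no word character or dot before, no word character after.
_PATTERN = re.compile(rf"(?<![\w.])({_ALTERNATIVES})(?!\w)")


def rewrite_imports(source: str) -> str:
    """Rewrite moved module paths on import lines; leave all other lines untouched."""
    rewritten = []
    for line in source.splitlines(keepends=True):
        if line.lstrip().startswith(("import ", "from ")):
            line = _PATTERN.sub(lambda match: MODULE_MOVES[match.group(1)], line)
        rewritten.append(line)
    return "".join(rewritten)


example = (
    "from dbt.exceptions import DbtRuntimeError\n"
    "import dbt.utils\n"
    "label = 'dbt.ui'\n"
)
print(rewrite_imports(example))
```

Restricting the rewrite to lines that start with `import` or `from` keeps string literals and comments untouched, at the cost of missing imports that are built dynamically.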
Step 3: Functional & Semantic Code Changes
You will need to make small to moderate code changes to replace your usage of the following classes/functions.
Manifest: adapters no longer directly reference the manifest.
In cases where you want an iterable of some database objects we now use an Iterable with a type based on what will actually be passed to the method (typically the new RelationConfig protocol).
For example in dbt-bigquery we made the following change:
def _get_catalog_schemas(self, manifest: Manifest)
became
def _get_catalog_schemas(self, relation_config: Iterable[RelationConfig])
Note: at runtime dbt-core is still passing the manifest so any jinja code that assumes something about the manifest should continue to work.
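To illustrate the shift from a manifest to an iterable of relation configs, here is a self-contained sketch. `RelationConfigLike` is a stand-in written for this example, not the real protocol from dbt-adapters, and it assumes the configs expose `database`, `schema` and `identifier`; `FakeNode` and `get_catalog_schemas` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol, Set, Tuple


class RelationConfigLike(Protocol):
    """Stand-in for the RelationConfig protocol; only the fields used below are listed."""

    database: str
    schema: str
    identifier: str


@dataclass
class FakeNode:
    """Any object with matching attributes satisfies the protocol (structural typing)."""

    database: str
    schema: str
    identifier: str


def get_catalog_schemas(
    relation_configs: Iterable[RelationConfigLike],
) -> Set[Tuple[str, str]]:
    """Collect the distinct (database, schema) pairs without touching a manifest."""
    return {(config.database, config.schema) for config in relation_configs}


nodes = [
    FakeNode("analytics", "staging", "orders"),
    FakeNode("analytics", "staging", "customers"),
    FakeNode("analytics", "marts", "revenue"),
]
print(sorted(get_catalog_schemas(nodes)))
```

Because the function only depends on the fields it reads, it keeps working whether the caller passes manifest nodes or any other object that carries those attributes.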
By convention we want to stop using dbt-core semantics in method names, which means we have changed some methods like parse_model_node to parse_relation_config. While obviously not a requirement, we want to encourage maintainers to adopt this to simplify the mental model for working on an adapter.
dbt_invocation_id: if you were using something like active_user.invocation_id, please use get_invocation_id() via from dbt_common.invocation import get_invocation_id
dbt.flags.MP_CONTEXT.Lock() → self.lock
If you’re in the ConnectionManager, dbt-core now passes this lock to adapters as an argument, so you can replace the call with self.lock. See this change in dbt-redshift for an example.
If your Adapter class inherits from BaseAdapter (as opposed to SQLAdapter) you will need to update your __init__ method to accept the mp_context arg like dbt-bigquery:
Macros
In general, there are few necessary macro changes.
Some issues we came across:
{{ schema_relation }} is now an object and may not string-ify in the way you expect. You may need to do the string composition in your macro. See this example from dbt-snowflake:
show terse objects in {{ schema_relation }} limit {{ max_results_per_iter }}
became…
show terse objects in {{ schema_relation.database }}.{{ schema_relation.schema }} limit {{ max_results_per_iter }}
If you have changed the expected type of your function from Manifest to RelationConfig you may need to update your macro to pass the model instead of the whole Manifest object. If you were previously passing config, now you simply pass config.model instead.
An example of that can be seen here:
before:
{% set _configuration_changes = existing_relation.materialized_view_config_changeset(_existing_materialized_view, new_config) %}
after:
{% set _configuration_changes = existing_relation.materialized_view_config_changeset(_existing_materialized_view, new_config.model) %}
Step 4: Testing Updates
In addition to the import changes specified above there are a few other changes and sharp edges to get tests up and running.
Note on dbt-core test dependency:
As a team we have committed to creating a new test runner that will allow adapters to execute functional tests without dbt-core this quarter. Until this is ready we are in the somewhat awkward position of needing dbt-core as a test dependency.
Unit test changes:
Since all adapters now take an additional argument, mp_context, you will need to pass one wherever your unit tests construct the adapter.
Patching AdapterLogger: per the earlier section covering imports you will need to change how this class is patched, for example:
monkeypatch.setattr(dbt.events, "AdapterLogger", Mock(return_value=log_mock))
→ monkeypatch.setattr(dbt.adapters.events.logging, "AdapterLogger", Mock(return_value=log_mock))
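For the mp_context argument mentioned at the start of this list, a unit test can build a multiprocessing context and hand it to the adapter. The sketch below uses stand-in classes (`FakeAdapter`, `FakeConnectionManager`) rather than real dbt classes, and it assumes the lock is created from the context that is passed in; check your adapter's base class for the exact signature.

```python
from multiprocessing import get_context
from multiprocessing.context import BaseContext


class FakeConnectionManager:
    """Stand-in for a connection manager that receives the context from its adapter."""

    def __init__(self, profile: dict, mp_context: BaseContext) -> None:
        self.profile = profile
        # Assumption: the lock comes from the passed-in context, not a dbt-core global.
        self.lock = mp_context.RLock()


class FakeAdapter:
    """Stand-in for an adapter whose __init__ accepts mp_context."""

    def __init__(self, config: dict, mp_context: BaseContext) -> None:
        self.config = config
        self.connections = FakeConnectionManager(config, mp_context)


# In a unit test, build the context explicitly and pass it to the adapter.
adapter = FakeAdapter({"type": "myadapter"}, get_context("spawn"))
with adapter.connections.lock:
    print("lock acquired via", type(adapter.connections.lock).__name__)
```

Passing the context in explicitly means the test controls which start method is used, instead of relying on a global flag set elsewhere.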
If your adapter impl inherits from BaseAdapter (as opposed to SQLAdapter) you will probably need to make some updates to reflect the new macro context that gets passed from dbt-core to the adapter.
Example from dbt-bigquery:
Functional/Integration test changes:
These should be pretty minimal. Other than adjusting import paths, we have had to update a few of our test fixtures.
One gotcha is that schema relations have become dicts instead of strings so:
f"{database}.{schema}"
becomes
{"database": database, "schema": schema}
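A short sketch of what that change can mean for a test fixture, assuming you still need the dotted string somewhere. Both helper names (`schema_relation`, `qualified_name`) are hypothetical.

```python
from typing import Dict


def schema_relation(database: str, schema: str) -> Dict[str, str]:
    """Build the dict form that replaces the old f"{database}.{schema}" string."""
    return {"database": database, "schema": schema}


def qualified_name(relation: Dict[str, str]) -> str:
    """Compose the dotted name explicitly when a string is still needed."""
    return f"{relation['database']}.{relation['schema']}"


relation = schema_relation("analytics", "staging")
print(relation, qualified_name(relation))
```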
Reference PRs
These are the pull requests showing how we migrated the adapters we maintain: