Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify vendoring and declare dependencies optionally #4457

Merged
merged 29 commits into from
Jul 17, 2024
Merged

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Jul 2, 2024

Summary of changes

Although this PR doesn't fully remove vendored packages, it re-writes the logic to make the vendoring layer very thin and optional, removing lots of cruft that's accumulated around it.

In particular, it adds the vendored packages to sys.path instead of proxying them through a submodule. It adds them to the back of sys.path to allow naturally-installed packages to take precedence (vendored packages are only a fallback). The dependencies are no longer pinned to a specific version except that the packages in _vendor are for a specific version.

Benefits include:

  • Gives precedence to naturally-installed packages if present (vendored packages are the fallback).
  • Setuptools itself can import modules naturally, no more extern module redirection and associated double-accounting.
  • Setuptools and pkg_resources share the same set of vendored packages (without creating a dependency).
  • The importlib_metadata hack can be removed (as only one copy of importlib_metadata will ever be imported).
  • Vendored packages no longer need to be patched.
  • De-vendoring is as simple as removing the _vendor directory (and supplying the dependencies).
  • It would be trivially easy to make the vendored packages an opt-in, such as requiring an environment variable.
  • Mypy checks now catch real issues with dependencies.

In order to avoid cyclic dependencies, the declared dependencies are for now declared in an optional core extra so they now appear in the metadata.

Ref #2825

Pull Request Checklist

@jaraco jaraco force-pushed the debt/2825-devendor branch from 4e4c01e to d2704c3 Compare July 3, 2024 01:07
@jaraco jaraco force-pushed the debt/2825-devendor branch from d2704c3 to b4b6bf7 Compare July 3, 2024 01:18
@jaraco
Copy link
Member Author

jaraco commented Jul 3, 2024

It's hard to see in GitHub, due to the changes to the vendored directories, but here's a stat summary excluding _vendor:

 setuptools debt/2825-devendor @ git diff --stat main.. -- ':(exclude)*/_vendor/*'
 .github/workflows/main.yml                |   1 -
 .gitignore                                |  23 ++++++++++
 conftest.py                               |   3 --
 mypy.ini                                  |  11 +++--
 newsfragments/2825.removal.rst            |   1 +
 pkg_resources/__init__.py                 |  33 ++++++++------
 pkg_resources/extern/__init__.py          | 104 -------------------------------------------
 pkg_resources/tests/test_resources.py     |   2 +-
 pyproject.toml                            |  14 +++++-
 setuptools/__init__.py                    |   3 ++
 setuptools/_core_metadata.py              |   8 ++--
 setuptools/_entry_points.py               |   6 +--
 setuptools/_importlib.py                  |  44 ++----------------
 setuptools/_itertools.py                  |   2 +-
 setuptools/_normalization.py              |   4 +-
 setuptools/_reqs.py                       |   4 +-
 setuptools/command/_requirestxt.py        |   4 +-
 setuptools/command/bdist_wheel.py         |  12 ++---
 setuptools/command/build_py.py            |   2 +-
 setuptools/command/easy_install.py        |   2 +-
 setuptools/command/editable_wheel.py      |   2 +-
 setuptools/command/egg_info.py            |   2 +-
 setuptools/command/test.py                |   4 +-
 setuptools/compat/py310.py                |   2 +-
 setuptools/config/_apply_pyprojecttoml.py |   4 +-
 setuptools/config/expand.py               |   4 +-
 setuptools/config/pyprojecttoml.py        |   2 +-
 setuptools/config/setupcfg.py             |   8 ++--
 setuptools/depends.py                     |   2 +-
 setuptools/dist.py                        |  10 ++---
 setuptools/extern/__init__.py             |  92 --------------------------------------
 setuptools/msvc.py                        |   3 +-
 setuptools/package_index.py               |   3 +-
 setuptools/tests/config/test_setupcfg.py  |   2 +-
 setuptools/tests/test_bdist_wheel.py      |   6 +--
 setuptools/tests/test_extern.py           |   2 +-
 setuptools/tests/test_setuptools.py       |   2 +-
 setuptools/tests/test_wheel.py            |   4 +-
 setuptools/wheel.py                       |   7 +--
 tools/vendored.py                         | 234 +++++++++++++-----------------------------------------------------------------------------------
 tox.ini                                   |   9 ++--
 41 files changed, 162 insertions(+), 525 deletions(-)

Net 363 lines removed.

@jaraco jaraco marked this pull request as ready for review July 3, 2024 04:39
@jaraco jaraco requested a review from abravalheri July 3, 2024 04:39
Copy link
Contributor

@abravalheri abravalheri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for working on this Jason.

This is a though one. In principle I have no problem with this and I think it would be fine. But it is hard to foresee the consequences of it.

The complicated part is that setuptools has no way of telling the installer/builder that the dependencies are "optional in a way", and that if they fail to install the dependencies (e.g. due to cycles) it is fine because we also have them bundled.

For example: if the user passes some flags to pip that forces pip to no use wheels for the installation, will this also mean that the build will fail with this new approach to vendoring?

I believe this is related to pypa/packaging-problems#342. Ideally we should have a standard solution for that problem.

It would be interesting to hear @pfmoore and @pradyunsg thoughts on this matter.

@pfmoore
Copy link
Member

pfmoore commented Jul 3, 2024

It would be interesting to hear @pfmoore and @pradyunsg thoughts on this matter.

My main thought is that --no-binary :all: needs to be supported as an installation process. I don't personally have a need for this, but at the time we added in-tree build backend support, one of the key drivers was that there was a fairly vocal group of users who insisted that being able to build everything from source was essential. I assume that currently they are happy with the state of the ecosystem (at least, we haven't heard much from them recently), but we should ensure that a change like this doesn't trigger a new round of debate.

Unfortunately, I don't have any specific examples of people who should be consulted on this question. Maybe opening a discussion on the packaging Discourse would be worthwhile?

@abravalheri
Copy link
Contributor

abravalheri commented Jul 3, 2024

Thank you much, @pfmoore.

This brings me to the question: if setuptools can build itself with no extra dependency listed on pyproject.toml [build-system] requires (thanks to bundling the vendored dependencies), but it does list dependencies in pyproject.toml [project] dependencies does it count as creating a cycle in the same way as PEP 517 forbids?

@pfmoore
Copy link
Member

pfmoore commented Jul 3, 2024

I'm not sure I follow your question. To build setuptools, the pyproject.toml says that there are no build requirements, the build backend path is ".", and you build using setuptools. If that works (i.e., all of the dependencies are picked up from vendored copies in the source tree) then building setuptools will work, yes.

However, if you want to install some project X that uses setuptools, the installer will set up a build environment that contains setuptools (building and then installing it as described above) and it will then install all of the dependencies listed in project.depends for setuptools. This will build each of those dependencies (likely using setuptools recursively to do so). It's possible that cached wheels will make this appear to work, but if you globally configure pip to not use a cache, you'll probably get an infinite recursion of build steps (which pip will stop after a certain limit, but I don't know if uv has considered this case, so be careful - it does crash your machine if left unchecked).

So I guess my answer is yes, this looks potentially dangerous to me.

@jaraco
Copy link
Member Author

jaraco commented Jul 3, 2024

I notice that hatchling is able to declare dependencies. Is that because none of the dependencies use hatchling as their build backend (trove-classifiers and pluggy use setuptools, tomli and packaging and pathspec use flit)? Does that mean that hatchling would break the world if pathspec or trove-classifiers were to switch to hatchling? I see hatchling does something clever - it relies on its own ouroboros backend for getting build dependencies as part of get_requires_for_build_*.

I see that flit addresses this concern by implementing the backend with no dependencies (flit-core). Maybe setuptools should do something similar, and have a separate backend with no dependencies that can build setuptools.

So I guess my answer is yes, this looks potentially dangerous to me.

Let me see if I can reproduce the danger. I'll get a Docker container that's resource-constrained and try to get the system to fall over building this branch from source.

Perhaps I should simply remove the declared dependencies for now, or move them to optional dependencies. That would separate this concern about cyclic (build) dependencies from simplifying the vendoring.

@jaraco
Copy link
Member Author

jaraco commented Jul 3, 2024

Let me see if I can reproduce the danger. I'll get a Docker container that's resource-constrained and try to get the system to fall over building this branch from source.

I tried to reproduce the danger, but have thusfar been unsuccessful. I have this dockerfile:

FROM jaraco/multipy-tox
RUN git clone https://github.com/pypa/setuptools
WORKDIR setuptools
RUN git checkout debt/2825-devendor
ENV PIP_NO_CACHE=1
ENV PIP_NO_BINARY=:all:
# disable pip-run
ENV PYTHONPATH=
CMD py -3.12 -m pip install .

It appears to be succeeding in installing setuptools and its dependencies (though it fails on autocommand due to Lucretiel/autocommand#28).

What I realized, though, is that when building a dependency and thus getting "setuptools" from source, it's getting it from PyPI, which doesn't yet have the declared dependencies. To simulate the danger, I'll need to upload an sdist of this PR somewhere and use that index.

My main thought is that --no-binary :all: needs to be supported as an installation process. I don't personally have a need for this, but at the time we added in-tree build backend support, one of the key drivers was that there was a fairly vocal group of users who insisted that being able to build everything from source was essential. I assume that currently they are happy with the state of the ecosystem (at least, we haven't heard much from them recently), but we should ensure that a change like this doesn't trigger a new round of debate.

I took a moment to brush up on pypa/packaging-problems#342, which attempts to capture the issue. I believe it's an outstanding issue and it's only been avoided by the following implicit constraint:

  • build backends can only have (build or runtime) dependencies on projects that don't rely on that build backend.

I think I've got that right. A build backend can have dependencies if all of the dependencies rely on a different build backend (hatchling, for now). Or a build backend can have no dependencies (flit_core). I wonder if we should document this constraint somewhere. Is there a place where we could document this constraint?

jaraco added 2 commits July 3, 2024 12:06
Now that the vendoring doesn't depend on importing the setuptools package, it's safe for pkg_resources to import _from_ there without actually importing setuptools.
@jaraco jaraco reopened this Jul 13, 2024
@jaraco jaraco merged commit 8b4acd2 into main Jul 17, 2024
42 of 72 checks passed
@jaraco jaraco deleted the debt/2825-devendor branch July 17, 2024 14:41
webknjaz added a commit to webknjaz/ansible that referenced this pull request Jul 18, 2024
This is now necessary since `setuptools >= 71` started preferring
externally present stdlib deps over the vendored ones.

Refs:
* pypa/setuptools#4457
* pypa/setuptools#4483
* pypa/setuptools#2825
@jaraco jaraco mentioned this pull request Jul 22, 2024
2 tasks
@@ -0,0 +1,166 @@
GNU LESSER GENERAL PUBLIC LICENSE

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to ghost post on you, but why did autocommand get added here? The rest look they were modifications, but this seems like a new _vendor package entirely. I ask because as it is LGPL, we have some licensing concerns. Thanks!

Copy link
Contributor

@Avasam Avasam Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a maintainer, but I can answer your question of "why": autocommand is a dependency of jaraco.text. Used there: https://github.com/search?q=repo%3Ajaraco%2Fjaraco.text%20autocommand&type=code AFAIK setuptools doesn't need it, it's just pulled in with dependencies. If you really need to I'm sure you can devendor it.

@@ -74,6 +74,8 @@

import _imp

sys.path.extend(((vendor_path := os.path.join(os.path.dirname(os.path.dirname(__file__)), 'setuptools', '_vendor')) not in sys.path) * [vendor_path]) # fmt: skip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hack breaks when this package is within a zip file... pypa/virtualenv#1727 cc @jaraco

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to hear that.

It's not fair to characterize this as a hack. It's the vendoring approach, which is required by the ecosystem to satisfy PEP 517's "no circular build dependencies" constraint. Setuptools is following best-available standards for supplying sophisticated behavior. There's a lot of work going on in pypa/packaging-problems#342 to solve the issue more holistically so that build backends can have and declare proper dependencies.

When I first authored this change, I considered the zipimport case and thought that this approach might work for the zipimport case but I never verified.

Since the release of this change, the vendoring support is there as a fallback for environments that cannot include the dependencies naturally. The best approach would be for the app/environment builder to ensure that setuptools[core] is installed (such that vendoring is unnecessary).

I'd be open to expanding the vendoring support to support better the zipimport case. I'll open an issue about it.

@Ryang20718
Copy link

This causes #3072 to become an issue again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants