Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split up CUDA-suffixed dependencies in dependencies.yaml #1057

Merged
merged 4 commits into from
Jul 24, 2024

Conversation

jameslamb
Copy link
Member

Description

Contributes to rapidsai/build-planning#31

In short, RAPIDS DLFW builds want to produce wheels with unsuffixed dependencies, e.g. cudf depending on rmm, not rmm-cu12.

This PR is part of a series across all of RAPIDS to try to support that type of build by setting up CUDA-suffixed and CUDA-unsuffixed dependency lists in dependencies.yaml.

For more details, see:

Notes for Reviewers

Why target 24.08?

This is targeting 24.08 because:

  1. it should be very low-risk
  2. getting these changes into 24.08 prevents the need to carry around patches for every library in DLFW builds using RAPIDS 24.08

@jameslamb jameslamb changed the title WIP: split up CUDA-suffixed dependencies in dependencies.yaml split up CUDA-suffixed dependencies in dependencies.yaml Jul 24, 2024
@jameslamb jameslamb marked this pull request as ready for review July 24, 2024 05:49
@jameslamb jameslamb requested a review from a team as a code owner July 24, 2024 05:49
@jameslamb jameslamb requested a review from msarahan July 24, 2024 05:49
@jameslamb jameslamb requested a review from a team as a code owner July 24, 2024 18:38
@jameslamb
Copy link
Member Author

jameslamb commented Jul 24, 2024

I've updated this based on the suggestions from rapidsai/cudf#16183.

Ran the following to check update-version.sh.

git fetch upstream --tags
ci/release/update-version.sh '24.10.00'

git grep -E '24\.8|24\.08|0\.39'

That revealed a few other places that needed updates in update-version.sh. Pushed those fixes here as well.

- *cudf_unsuffixed
# NOTE: cupy still has a "-cuda11x" suffix here, because it's suffixed
# in DLFW builds
- *cupy_cu11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matrix looks like it hasn't been updated to use fallback the way that the other repo's PR have, but is that intentional here because of the need to specify the cupy dep?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes exactly. More context: rapidsai/cugraph#4552 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, it might be nice to refactor this to a "depends on cudf" / "depends on cupy" structure like we have for other repos.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I mentioned that in that comment I linked above (rapidsai/cugraph#4552 (comment)).

In the interest of time and not wanting to go through another CI cycle, I'll defer that to a follow-up PR.

@jameslamb jameslamb requested a review from vyasr July 24, 2024 22:26
@@ -10,7 +10,7 @@ build:
# and therefore tested in this no-CUDA environment
- |
pip install \
-C rapidsai.matrix-entry="cuda=12.2" \
-C rapidsai.matrix-entry="cuda=12.2;cuda_suffixed=true" \
Copy link
Contributor

@bdice bdice Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this missed in an upgrade to 12.5? Feel free to slip that in here or in a follow-up PR.

Suggested change
-C rapidsai.matrix-entry="cuda=12.2;cuda_suffixed=true" \
-C rapidsai.matrix-entry="cuda=12.5;cuda_suffixed=true" \

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, yes. I'll put up a follow-up PR fixing this... against 24.10. There's not a functional difference for ucx-py, since this is just a way to matrix matrices with wildcard selectors like "12.*".

- *cudf_unsuffixed
# NOTE: cupy still has a "-cuda11x" suffix here, because it's suffixed
# in DLFW builds
- *cupy_cu11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, it might be nice to refactor this to a "depends on cudf" / "depends on cupy" structure like we have for other repos.

@jameslamb
Copy link
Member Author

/merge

@rapids-bot rapids-bot bot merged commit e2ec8ef into rapidsai:branch-0.39 Jul 24, 2024
41 checks passed
@jameslamb jameslamb deleted the suffix-split branch July 24, 2024 23:16
@jameslamb
Copy link
Member Author

I merged this in the interest of reducing the amount of in-flight stuff ahead of code freeze. @bdice I made myself a reminder to put up a follow-up PR against 24.10 implementing your suggestions.

Don't want you to think I was ignoring them... I am just being really conservative with pushing new commits given how oversubscribed our CI runners are right now.

@jakirkham
Copy link
Member

Just know that RAPIDS 24.08 already includes 12.5

So if we need this to work with RAPIDS 24.08 CUDA 12.5, we would need that fix in 24.08

@jameslamb
Copy link
Member Author

So if we need this to work with RAPIDS 24.08 CUDA 12.5, we would need that fix in 24.08

It's not necessary to address #1057 (comment) in 24.08.

That line is just a selector to choose between installing libucx-cu11 and libucx-cu12 in docs-building (an environment that doesn't have CUDA or GPUs), and all that matters there functionally is that it match a dependencies.yaml matrix "12.*".

Functionally, it could be "cuda=12.9999" and still achieve the same goal.

I agree with changing the 12.2 to 12.5 just for the general benefit of defaulting to 12.5 everywhere as a form of subtle documentation about what's supported, but there's no need to rush that change into 24.08.

@jakirkham
Copy link
Member

Ok cool. Could it just be 12. then?

Maybe that avoids some confusion and saves future churn

@jameslamb
Copy link
Member Author

@jakirkham @bdice I've put up #1059, targeting branch-0.40, with changes that I think address all of your comments here.

rapids-bot bot pushed a commit that referenced this pull request Jul 25, 2024
…12 in fallback dependency lists (#1059)

Follow-up to #1057.

Implements the following suggestions based on review comments from @jakirkham and @bdice there:

* splits `cupy` out into its own list in `dependencies.yaml`, to allow removing some `cuda_suffixed: "false"` blocks that were otherwise redundant
* passes `--matrix-entry 'cuda=12.x'` in docs builds, to make it clearer that those builds are just trying to match `cuda: "12.*"` and are not at all dependent on any particular minor version of CUDA

Also proposes defaulting to `cupy-cuda12x` in `pyproject.toml`, as part of rapidsai/build-planning#68.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1059
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants