Adapt to rmm logger changes #499

Merged (22 commits) on Nov 30, 2024

@vyasr vyasr commented Nov 26, 2024

This PR adapts to breaking changes in rmm in rapidsai/rmm#1722.

@vyasr vyasr added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Nov 26, 2024
@vyasr vyasr self-assigned this Nov 26, 2024
@vyasr vyasr requested a review from a team as a code owner November 26, 2024 19:41
@vyasr vyasr requested a review from a team as a code owner November 28, 2024 00:48
@vyasr vyasr requested a review from jameslamb November 28, 2024 00:48
@vyasr vyasr removed request for a team and jameslamb November 28, 2024 00:51
vyasr commented Nov 28, 2024

The arm wheel test is hitting https://bugzilla.redhat.com/show_bug.cgi?id=1722181. pytorch/pytorch#2575 is a good reference discussion, and dmlc/xgboost#8488 is an even better example of this issue; the torch one is closely related but slightly different. Ultimately, the problem is that some combination of the libraries we use overallocates thread-local storage (TLS) before OpenMP is loaded.

We may be able to fix this by changing the import order of packages, but we'll need an arm machine to test that, since this test only fails on SBSA (I assume architectural differences lead to different TLS availability). There is precedent for using LD_PRELOAD in cuml, so I'm going to add that to the cuvs tests as well. Unfortunately, because this is a wheel test, we have to load the library that is found inside the wheels, so I just added an import in conftest.
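For readers unfamiliar with the workaround, here is a minimal sketch of the general idea: load the OpenMP runtime in-process before anything else can consume static TLS, which is roughly what LD_PRELOAD does from outside the process. This is not the change made in this PR (the PR adds an import in conftest, and which import is not stated here). The helper name `preload_first_available` and the candidate library names are assumptions for illustration only; a wheel would normally bundle the library at a wheel-specific path.

```python
import ctypes

# Hypothetical candidate names, for illustration only. A library bundled
# inside a wheel usually lives at a wheel-specific path instead.
OPENMP_CANDIDATES = ("libgomp.so.1", "libomp.so")


def preload_first_available(candidates=OPENMP_CANDIDATES):
    """Load the first candidate that resolves; return its handle, else None."""
    for name in candidates:
        try:
            # RTLD_GLOBAL makes the library's symbols visible to libraries
            # that are loaded later in the same process.
            return ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Not found (or not loadable) on this system: try the next one.
            continue
    return None


# In a conftest.py this would run at import time, ahead of heavier imports.
_openmp_handle = preload_first_available()
```

Placing the call at the top of conftest.py matters because pytest imports that file before collecting the test modules, so the load happens before the packages under test are imported.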

@vyasr vyasr requested a review from a team as a code owner November 28, 2024 15:15
@vyasr vyasr requested a review from raydouglass November 28, 2024 15:15
@github-actions github-actions bot added the ci label Nov 28, 2024
@vyasr vyasr requested a review from a team as a code owner November 28, 2024 18:10
@github-actions github-actions bot removed the ci label Nov 28, 2024
@vyasr vyasr removed request for a team and raydouglass November 28, 2024 20:37
vyasr commented Nov 30, 2024

/merge

@rapids-bot rapids-bot bot merged commit 31c59ce into rapidsai:branch-25.02 Nov 30, 2024
55 checks passed
@vyasr vyasr deleted the feat/rmm_logger branch November 30, 2024 00:42
Labels
CMake cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change Python
3 participants