Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove assign_cpu_and_gpu_sets #4412

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Remove assign_cpu_and_gpu_sets #4412

wants to merge 4 commits into from

Conversation

epicfaace
Copy link
Member

Reasons for making this change

Remove assign_cpu_and_gpu_sets as I don't think this check is needed. This is important as it simplifies #4385, because workers won't need to do checks on their own cpu / gpusets (they can just request pods with specific GPUs). I could have just bypassed this code path for the kubernetes runtime, but it seems to be an unnecessary check we can just do away with -- because the bundle manager already handles ensuring the worker has enough CPUs / GPUs available to run a bundle. What do you think @AndrewJGaut @percyliang ?

% (request_gpus, len(gpuset), len(self.gpuset))
)

def propose_set(resource_set, request_count):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance that the user will actually get less CPUs / GPUs than they request? If so, then we might need to keep this function.

@epicfaace
Copy link
Member Author

We should discuss this in a meeting.

@@ -237,19 +237,7 @@ def mount_dependency(dependency, shared_file_system):
)
return run_state._replace(stage=RunStage.CLEANING_UP)

# Check CPU and GPU availability
try:
cpuset, gpuset = self.assign_cpu_and_gpu_sets_fn(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an if statement -- consider just not doing this with kubernetes.

@epicfaace
Copy link
Member Author

@AndrewJGaut any thoughts on this?

@AndrewJGaut
Copy link
Contributor

AndrewJGaut commented Mar 22, 2023

@AndrewJGaut any thoughts on this?

I believe you're correct @epicfaace and assign_cpu_and_gpu_sets can be removed.

Here's why: We do check if the worker has enough resources in the bundle-manager and keep a tally on the worker's cpus and gpus (e.g. see here).

Now, we could run into issues if, for instance, the bundle_manager was starting bundles in a multi-threaded fashion (and sent a start message to the same worker multiple times before decrementing the worker['cpus'] and/or worker['gpus'] in that function).However, the condition for entering the loop in which those resources are decremented will not return until the rest-server has received the message and sent it to the worker has received the message (or it returns False and no bundle is started anyway). Therefore, it will never be the case as long as bundle_manager is single-threaded that there will be a race condition related to worker cpu and gpu counts.

Since the bundle-manager will never schedule a run on the worker unless it has sufficient resources for all of its running bundles, it should never be the case that there aren't enough cpus and/or gpus on the worker to run a bundle.

Therefore, the assign_cpu_and_gpu_sets check done in the worker is not necessary and can be removed. However, it could become necessary if we wanted the worker and/or bundle-managers to be multi-threaded (with respect to transitioning bundle stages on the worker and run scheduling bundles on the bundle-manager). In that case, though, I'd recommend we keep a member variable in the worker keeping track of the number of cpus and gpus available rather than looping through the count for all run bundles as done here.

@AndrewJGaut
Copy link
Contributor

AndrewJGaut commented Mar 22, 2023

Some extra interesting tidbits:

  • The beginning of the function is the only place where we check whether or not there are enough CPUs or GPUs on the worker currently to run that bundle.
  • The scheduling algorithm on the bundle-manager makes more sense to me now. It loops over bundles it's trying to run. For each bundle, it calls a function that checks if any worker has the resources required to run that bundle. If so, it tries to start the bundle on one of those workers and then deducts resources if that fails. So, in particular, it does each bundle individually and with a single thread, which is why we can be sure that no race conditions will arise.
  • The bundle-manager only does one thing with multiple threads: make bundles
  • I can't actually find where the cpu sets and gpu sets are incremented after a run bundle finishes... but I know that it does b/c otherwise workers wouldn't be able to schedule bundles ad infinitum

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants