Question about parallel computing on GPU #46
@Johnxiaoming I have been using two GPUs very heavily recently and both are utilized. Although the latest GitHub version should work, I have made enormous optimizations in the last two weeks that I will push to GitHub soon. I currently use DataParallel, and this works fine for single servers and AWS instances; DistributedDataParallel might be better. To help debug, what versions of Omnipose and PyTorch do you have, and what is your hardware?
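For reference, here is a minimal sketch of the DataParallel pattern mentioned above, not Omnipose's actual code: the model is replicated on each visible GPU and every forward pass splits the batch across them. `TinyNet` is a hypothetical stand-in for the real segmentation network.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for the segmentation network."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyNet()
if torch.cuda.device_count() > 1:
    # DataParallel replicates the model on each GPU and splits the
    # batch dimension across them during forward().
    model = nn.DataParallel(model)
model = model.to('cuda' if torch.cuda.is_available() else 'cpu')

# A batch of 8 images; with 2 GPUs, DataParallel sends 4 to each.
x = torch.randn(8, 1, 256, 256, device=next(model.parameters()).device)
y = model(x)
```

Note that DataParallel only helps if the batch is large enough to split; with one image at a time, the second GPU has nothing to do, which matches the behavior described below.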
A tiny Intel CPU plus dual NVIDIA RTX GPU server. Thank you!
@Johnxiaoming I just realized that you said running the model, not training (that's where my head has been at...). I know for sure that training uses both GPUs, but evaluation is another story. Some of my optimizations for training should apply to evaluation. The model itself is initialized with DataParallel in both training and evaluation, so my guess is that we simply are not saturating GPU0, and so GPU1 never gets called. Can you tell me what your typical image set looks like in number and resolution? If you monitored GPU0, I am curious to know whether its VRAM was completely used up or well under maximum capacity during evaluation.

Some more explanation and planning: the behavior now is to process images in sequence. Moreover, it makes sense to run the mask reconstruction in parallel, again VRAM permitting. Doing the Euler integration in one loop for all images simultaneously instead of multiple loops in sequence is virtually guaranteed to be faster. I already figured out much of the code for this to parallelize training, and I can see exactly what we need to do for evaluation; I just need to find the time to implement it. I suspect I will have it done by the end of the month, so stay tuned!
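To make the batching idea concrete, here is a hypothetical sketch (not the planned Omnipose code) contrasting per-image evaluation with chunked, batched evaluation. `net` and `images` (a list of same-sized C×H×W tensors) are assumed placeholders, and `batch_size` would be tuned to fit the available VRAM.

```python
import torch

def eval_sequential(net, images):
    # Current behavior (roughly): one forward pass per image.
    with torch.no_grad():
        return torch.cat([net(img.unsqueeze(0)) for img in images])

def eval_batched(net, images, batch_size=8):
    # Proposed idea: stack several same-sized images into one tensor so a
    # single forward pass (and, later, a single Euler-integration loop)
    # covers many images at once. batch_size caps VRAM use per chunk.
    outputs = []
    with torch.no_grad():
        for i in range(0, len(images), batch_size):
            batch = torch.stack(images[i:i + batch_size])  # (B, C, H, W)
            outputs.append(net(batch))
    return torch.cat(outputs)
```

A larger per-pass batch is also what gives DataParallel something to split across the second GPU.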
I have two GPUs on our server, but when I run the model, the second GPU is never used. I know my jobs are not very big, so the first GPU isn't fully occupied, but if you could support parallel computing it would save a lot of time. Even just adding an option to select which GPU to use would help a lot, because then I could run one job on each GPU (something like the device-selection sketch below). Thanks.
If this is difficult, I will fork the repository and do it myself.
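For the GPU-selection part of the request, generic PyTorch/CUDA mechanisms already allow pinning a process to one card; this is a hedged sketch, not an official Omnipose option, and `my_omnipose_job.py` is a hypothetical script name.

```python
import torch

# Option 1 (shell): hide all but one physical GPU from the process, e.g.
#   CUDA_VISIBLE_DEVICES=1 python my_omnipose_job.py
# Inside that process, the second card then appears as cuda:0.

# Option 2 (code): address a specific card explicitly.
if torch.cuda.device_count() > 1:
    device = torch.device('cuda:1')      # second GPU
elif torch.cuda.is_available():
    device = torch.device('cuda:0')
else:
    device = torch.device('cpu')

x = torch.zeros(4, 1, 256, 256, device=device)  # tensors/models moved here run on that card
print(x.device)
```

Running two jobs, each pinned to a different card this way, is a common workaround until true multi-GPU evaluation is supported.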