FusedAdam requires cuda extensions #11
Comments
@mehranagh20 Are you running the code on a GPU, and do you have the appropriate CUDA drivers installed? If you want to avoid using apex, you can swap out the AdamW optimizer for PyTorch's AdamW, though you might need to adjust some of the arguments.
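A minimal sketch of the swap suggested above, assuming PyTorch is installed. The helper name `build_adamw` and the default values are hypothetical and not taken from this project's code. It tries apex's `FusedAdam` first and falls back to `torch.optim.AdamW` when apex is missing or was built without its CUDA extensions.

```python
import torch


def build_adamw(params, lr=1e-3, weight_decay=0.01):
    """Return apex FusedAdam if usable, otherwise PyTorch's AdamW.

    Hypothetical helper for illustration; the argument names are the ones
    shared by both optimizers.
    """
    try:
        # The import fails if apex is not installed at all.
        from apex.optimizers import FusedAdam

        # The constructor raises RuntimeError ("FusedAdam requires cuda
        # extensions") when apex was built without --cuda_ext.
        return FusedAdam(
            params, lr=lr, weight_decay=weight_decay, adam_w_mode=True
        )
    except (ImportError, RuntimeError):
        # Fallback: decoupled weight decay as implemented in PyTorch itself.
        return torch.optim.AdamW(params, lr=lr, weight_decay=weight_decay)


if __name__ == "__main__":
    model = torch.nn.Linear(4, 2)
    optimizer = build_adamw(model.parameters())
    print(type(optimizer).__name__)
```

Other arguments (for example `betas` and `eps`) exist on both optimizers, but their defaults should be checked against the training configuration before relying on the fallback.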
This is because apex cannot import amp_C. You can check the file "/home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/optimizers/fused_adam.py", and you can also use your Python shell to verify this:
You may get an error like:
And you can add
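The kind of import check described in this comment can be sketched as follows. This is not the commenter's original snippet; the helper `try_import` is hypothetical. Only the module name `amp_C` comes from the thread, and importing `torch` first is an assumption made so that the shared libraries the extension links against can be resolved.

```python
import importlib


def try_import(module_name):
    """Try to import a module; return (ok, error_message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # Assumption: torch is imported first so that amp_C can find the
    # PyTorch shared libraries it was compiled against.
    for name in ("torch", "apex", "amp_C"):
        ok, error = try_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED: ' + error}")
```

If `apex` imports but `amp_C` does not, apex was most likely installed as a Python-only build, which matches the "FusedAdam requires cuda extensions" message.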
I solved this problem by building with
rather than
My pip version is 22.3.1.
I tried this: pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./ but it did not solve the problem.
Try the following: pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings --global-option=--cpp_ext --config-settings --global-option=--cuda_ext ./ It worked with pip 23.2.1 on Python 3.9.
This works for me! Thanks!
I have built the apex module following the procedure explained, but when trying to train the model on CIFAR-10, I get:
I understand that this is an apex-related issue, since I get the following error when trying to run examples/simple/distributed in the apex repo:
I have tried many things to fix this issue, but no luck. I have two questions:
FusedAdam requires cuda extensions even though I build apex with --global-option="--cpp_ext" --global-option="--cuda_ext" options?