FusedAdam requires cuda extensions #11

Open
mehranagh20 opened this issue Apr 15, 2021 · 6 comments

@mehranagh20

I have built the apex module following the procedure explained, but when trying to train the model on CIFAR-10 I get:

/lustre03/project/6054857/mehranag/vdvae/data.py:147: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
  trX = np.vstack(data['data'] for data in tr_data)
Traceback (most recent call last):
  File "train.py", line 144, in <module>
    main()
  File "train.py", line 140, in main
    train_loop(H, data_train, data_valid_or_test, preprocess_fn, vae, ema_vae, logprint)
  File "train.py", line 59, in train_loop
    optimizer, scheduler, cur_eval_loss, iterate, starting_epoch = load_opt(H, vae, logprint)
  File "/lustre03/project/6054857/mehranag/vdvae/train_helpers.py", line 180, in load_opt
    optimizer = AdamW(vae.parameters(), weight_decay=H.wd, lr=H.lr, betas=(H.adam_beta1, H.adam_beta2))
  File "/home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/optimizers/fused_adam.py", line 79, in __init__
    raise RuntimeError('apex.optimizers.FusedAdam requires cuda extensions')
RuntimeError: apex.optimizers.FusedAdam requires cuda extensions

I understand that this is an apex-related issue, since I get the following warning when trying to run examples/simple/distributed in the apex repo:

Warning:  multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.  Original ImportError was: ImportError("/lib64/libm.so.6: version `GLIBC_2.29' not found (required by /home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/amp_C.cpython-36m-x86_64-linux-gnu.so)",)
final loss =  tensor(0.5392, device='cuda:0', grad_fn=<MseLossBackward>)

I have tried many things to fix this issue, but no luck. I have two questions:

  • Does anybody know why I get FusedAdam requires cuda extensions even though I built apex with the --global-option="--cpp_ext" --global-option="--cuda_ext" options?
  • How can I avoid using apex? I am only trying to test some things on CIFAR-10 and don't need the distributed training feature, especially given the weird errors I'm getting.
@rewonc
Contributor

rewonc commented May 3, 2021

@mehranagh20 -- Are you using the code on a GPU, and do you have the appropriate CUDA drivers enabled?

If you want to avoid using apex, you can swap out the AdamW optimizer for PyTorch's AdamW (torch.optim.AdamW). I think you might need to adjust some of the arguments.
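
For reference, a minimal sketch of that swap (assuming train_helpers.py imports apex's FusedAdam under the name AdamW, as the traceback suggests; the model and hyperparameter values below are placeholders, not vdvae's defaults):

# Sketch: replace apex's fused optimizer with PyTorch's built-in AdamW.
# from apex.optimizers import FusedAdam as AdamW   # old import (needs the CUDA extensions)
import torch
from torch.optim import AdamW                       # pure-PyTorch replacement

model = torch.nn.Linear(16, 16)                     # placeholder model; in vdvae this would be the VAE
optimizer = AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.999), weight_decay=0.01)

loss = model(torch.randn(4, 16)).pow(2).mean()      # dummy forward/backward to exercise the optimizer
loss.backward()
optimizer.step()

Since the call in the traceback only passes lr, betas, and weight_decay, all of which torch.optim.AdamW also accepts, the optimizer = AdamW(vae.parameters(), ...) line itself can likely stay unchanged once the import is swapped.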

@Chiang97912

Chiang97912 commented Nov 28, 2022

This is because apex cannot import amp_C. You can check the file "/home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/optimizers/fused_adam.py", or verify it from a Python shell:

import torch
import amp_C  # torch must be imported before amp_C

You may get an error like libstdc++.so.6: version 'GLIBCXX_3.4.20' not found. If so, try the following commands:

conda install libgcc
export LD_LIBRARY_PATH=/path/to/anaconda/envs/myenv/lib:$LD_LIBRARY_PATH
cd /path/to/anaconda/envs/myenv/lib
ln -s libstdc++.so.6.0.30 libstdc++.so.6

You can also add export LD_LIBRARY_PATH=/path/to/anaconda/envs/myenv/lib:$LD_LIBRARY_PATH to your ~/.bashrc file so the setting persists across shells.
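
After applying the fix, a quick sanity check (a sketch that only uses the modules already named in the traceback, amp_C and apex.optimizers.FusedAdam):

# Sanity check: if amp_C now imports and FusedAdam can be constructed without the
# "requires cuda extensions" RuntimeError, the original issue should be resolved.
import torch  # torch must be imported before amp_C

try:
    import amp_C  # built only when apex is installed with --cpp_ext --cuda_ext
    from apex.optimizers import FusedAdam
except ImportError as err:
    print("apex CUDA extensions still unavailable:", err)
else:
    if torch.cuda.is_available():
        FusedAdam([torch.nn.Parameter(torch.zeros(4, device="cuda"))])  # dummy parameter
        print("FusedAdam constructed fine; the CUDA extensions are available.")
    else:
        print("amp_C imports fine, but PyTorch cannot see a CUDA device.")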

@ShoufaChen

I solved this problem by building with

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./

rather than

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

My pip version is 22.3.1.

@AanchalChugh

I tried this: pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./, but it did not solve the problem.

@barikata1984

barikata1984 commented Sep 30, 2023

Try the command below:

pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings --global-option=--cpp_ext --config-settings --global-option=--cuda_ext ./

It worked with pip 23.2.1 on Python 3.9.

@Guodanding

The command from @barikata1984 above works for me! Thanks!
