This repository has been archived by the owner on Oct 9, 2024. It is now read-only.
I ran this command:
deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8
and it gets stuck, as shown in the picture.
Log:
(base) raihanafiandi@instance-1:~/playground/transformers-bloom-inference$ deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8
[2023-03-14 05:30:02,152] [WARNING] [runner.py:186:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2023-03-14 05:30:02,965] [INFO] [runner.py:550:main] cmd = /opt/conda/bin/python3.7 -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8
[2023-03-14 05:30:04,255] [INFO] [launch.py:142:main] WORLD INFO DICT: {'localhost': [0]}
[2023-03-14 05:30:04,255] [INFO] [launch.py:149:main] nnodes=1, num_local_procs=1, node_rank=0
[2023-03-14 05:30:04,255] [INFO] [launch.py:161:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2023-03-14 05:30:04,255] [INFO] [launch.py:162:main] dist_world_size=1
[2023-03-14 05:30:04,255] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0
[2023-03-14 05:30:05,816] [INFO] [comm.py:663:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
*** Loading the model bigscience/bloomz-7b1
Fetching 8 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 104857.60it/s]
Fetching 8 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 113359.57it/s]
Fetching 8 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 99864.38it/s]
Fetching 8 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 32832.13it/s]
[2023-03-14 05:30:13,610] [INFO] [logging.py:77:log_dist] [Rank 0] DeepSpeed info: version=0.8.2, git-hash=unknown, git-branch=unknown
[2023-03-14 05:30:13,611] [WARNING] [config_utils.py:77:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2023-03-14 05:30:13,611] [INFO] [logging.py:77:log_dist] [Rank 0] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
Installed CUDA version 11.0 does not match the version torch was compiled with 11.1 but since the APIs are compatible, accepting this combination
Using /home/raihanafiandi/.cache/torch_extensions/py37_cu111 as PyTorch extensions root...
Is it because I am running a different BLOOM model (bigscience/bloomz-7b1)? Please help me with this. Thank you.
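One thing that may be worth checking: the log stops right after the "Using ... as PyTorch extensions root..." line, which is where PyTorch JIT-builds extension kernels. The build may simply still be compiling, but PyTorch's extension loader also waits on a file named `lock` inside each extension's build directory, so a `lock` left behind by an interrupted earlier build could make the process wait indefinitely. This is only a guess, not a confirmed diagnosis. The sketch below lists any such files with their age; the helper name `find_lock_files` is made up for illustration, and the default path is copied from the log.

```python
import time
from pathlib import Path


def find_lock_files(root):
    """Return (path, age_in_seconds) for every file named 'lock' under root.

    Hypothetical helper for illustration; it only reads the directory tree
    and never deletes anything.
    """
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    now = time.time()
    found = []
    for path in root.rglob("lock"):
        if path.is_file():
            found.append((path, now - path.stat().st_mtime))
    return sorted(found)


if __name__ == "__main__":
    # Extensions root copied from the log line
    # "Using ... as PyTorch extensions root..."; adjust for your machine.
    extensions_root = "/home/raihanafiandi/.cache/torch_extensions/py37_cu111"
    for path, age in find_lock_files(extensions_root):
        print(f"{path}  (last modified {age:.0f}s ago)")
```

If a `lock` file is much older than the current run and no other build process is active, removing that extension's build directory and rerunning the command is a common thing to try. A recently modified `lock` more likely means a build is still in progress.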