Hello, author. When I execute run_crag_inference.sh, I get an error saying a single GPU does not have enough memory, but the original code does not seem to set up multi-GPU execution, right?
[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 268.00 MiB. GPU
Hi! In our code, `generator = LLM(model=args.generator_path, dtype="half")` uses a single GPU. You can add the `tensor_parallel_size` parameter to enable multi-GPU inference.
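For illustration, a minimal sketch of what that change could look like. The GPU count of 2 and the model path are placeholders (not from the repo), and the actual `LLM` call is left commented out since it requires vLLM and the GPUs to be available:

```python
# Hypothetical sketch: sharding the generator across multiple GPUs with vLLM's
# tensor parallelism. Adjust tensor_parallel_size to the number of GPUs you have.
llm_kwargs = {
    "model": "path/to/generator",   # args.generator_path in the script
    "dtype": "half",
    "tensor_parallel_size": 2,      # split the model's weights across 2 GPUs
}

# from vllm import LLM
# generator = LLM(**llm_kwargs)

print(llm_kwargs["tensor_parallel_size"])
```

Note that `tensor_parallel_size` should evenly divide the model's number of attention heads, and it must not exceed the number of visible GPUs.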