Hello, author. When I execute run_crag_inference.sh, I get an error saying a single GPU does not have enough memory, but the original code does not seem to set up multi-GPU execution, right?
[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 268.00 MiB. GPU
Hi! In our code, `generator = LLM(model=args.generator_path, dtype="half")` uses a single GPU. You can add the `tensor_parallel_size` parameter to enable multi-GPU inference.
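For illustration, a minimal sketch of what that change could look like. The GPU count of 2 and the model path are placeholders (not from the repo), and the actual `LLM` call is left commented out since it requires vLLM and the GPUs to be available:

```python
# Hypothetical sketch: sharding the generator across multiple GPUs with vLLM's
# tensor parallelism. Adjust tensor_parallel_size to the number of GPUs you have.
llm_kwargs = {
    "model": "path/to/generator",   # args.generator_path in the script
    "dtype": "half",
    "tensor_parallel_size": 2,      # split the model's weights across 2 GPUs
}

# from vllm import LLM
# generator = LLM(**llm_kwargs)

print(llm_kwargs["tensor_parallel_size"])
```

Note that `tensor_parallel_size` should evenly divide the model's number of attention heads, and it must not exceed the number of visible GPUs.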