I get an OOM error during inference but not during training.
This happens even with a batch size of 1, and even after increasing the GPU memory.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 11.25 GiB (GPU 0; 44.42 GiB total capacity; 36.96 GiB already allocated; 3.95 GiB free; 38.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I think this is not a genuine OOM issue, but rather the Trainer reserving more and more memory.
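To verify whether reserved memory really keeps growing, you can compare PyTorch's allocated and reserved counters around the inference call; a minimal sketch using the standard torch.cuda statistics:

import torch

def log_cuda_memory(tag: str) -> None:
    # "allocated" is memory handed out to live tensors; "reserved" is what
    # PyTorch's caching allocator holds from the driver. A large gap between
    # the two points at fragmentation rather than real exhaustion.
    allocated = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"[{tag}] allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")

log_cuda_memory("before inference")
# ... run the inference step here ...
log_cuda_memory("after inference")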
There could be several factors behind the OOM error:
First, techniques such as beam search in NLP models, or large batch sizes during evaluation, can inflate memory usage.
Secondly, memory fragmentation can lead to inefficient use of GPU memory.
Try using this:
import torch

# Cap this process at 90% of the GPU's memory; adjust the fraction as needed.
torch.cuda.set_per_process_memory_fraction(0.9)
# Allow TF32 matmuls: a speed optimization, not a memory saver.
torch.backends.cuda.matmul.allow_tf32 = True
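Since the error message itself suggests max_split_size_mb, you could also configure the caching allocator via PYTORCH_CUDA_ALLOC_CONF before CUDA is initialized (128 below is just an illustrative starting value to tune):

import os

# Must be set before the first CUDA allocation (ideally before importing torch).
# Limiting the split size can reduce fragmentation of cached blocks.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch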
Or you may try clearing the cache before starting inference:
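import gc
import torch

# Drop unreachable Python references first so their tensors become collectable,
# then return cached, unused blocks to the driver. Note this does not free
# memory still held by live tensors.
gc.collect()
torch.cuda.empty_cache()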