Replies: 1 comment
-
There are many possible causes of NaN loss in FP16 training. You can refer to the PyTorch AMP documentation for advice. MMEngine uses PyTorch's native AMP implementation.
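For reference, here is a minimal sketch of the standard `torch.cuda.amp` training loop with the two mitigations the PyTorch docs recommend against NaN losses: loss scaling with `GradScaler` (which also skips the optimizer step when inf/NaN gradients are detected) and gradient clipping after `unscale_`. The model, data, and hyperparameters below are placeholders for illustration, not MMEngine code.

```python
import torch

model = torch.nn.Linear(16, 4).cuda()      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler()       # scales the loss so FP16 grads don't underflow

# dummy data standing in for a real DataLoader
loader = [(torch.randn(8, 16), torch.randn(8, 4)) for _ in range(10)]

for inputs, targets in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():        # forward pass in mixed precision
        outputs = model(inputs.cuda())
        loss = torch.nn.functional.mse_loss(outputs, targets.cuda())
    scaler.scale(loss).backward()          # backward on the scaled loss
    scaler.unscale_(optimizer)             # unscale so grads can be clipped in FP32
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)                 # step is skipped if grads contain inf/NaN
    scaler.update()                        # adjusts the loss scale accordingly
```

If losses still go NaN, the PyTorch docs also suggest keeping numerically sensitive ops in FP32 by wrapping them in `torch.cuda.amp.autocast(enabled=False)`.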
-
When I use FP16 (the PyTorch AMP API), I run into a loss-NaN bug.
What does your AMP code look like? Does it avoid the NaN loss, or do you also hit loss NaN in AMP training? If so, how do you fix this problem?
Thank you for your answer.