I'm trying to understand the LayerNorm tutorial.
I've figured out the main algorithm, but I still have a question about the atomic operations in the grouped computation of dw/db in the backward pass.
Specifically, I'm asking about the operation `tl.atomic_xchg(Count, 1)` in this code:
```python
def _layer_norm_bwd_dx_fused(...):
    ...
    # Accumulate partial sums for dw/db
    partial_dw = (dy * xhat).to(w.dtype)
    partial_db = (dy).to(w.dtype)
    while tl.atomic_cas(Lock, 0, 1) == 1:
        pass
    count = tl.load(Count)
    # First store doesn't accumulate
    if count == 0:
        tl.atomic_xchg(Count, 1)
    else:
        partial_dw += tl.load(DW, mask=mask)
        partial_db += tl.load(DB, mask=mask)
    tl.store(DW, partial_dw, mask=mask)
    tl.store(DB, partial_db, mask=mask)
    # Release the lock
    tl.atomic_xchg(Lock, 0)
```
Why do we need `tl.atomic_xchg` here?
Why can't we use a plain `tl.store(Count, 1)` instead?
If I understand correctly, the program holds the lock inside `while tl.atomic_cas(Lock, 0, 1) == 1: ...`, so using `tl.store` should be correct in this case (just as `tl.store(DW, partial_dw, mask=mask)` works correctly).
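For reference, here is how I picture the lock pattern in the kernel, as a CPU-side analogue using Python threads (a hypothetical sketch, not Triton code; `lock` and `state["count"]`/`state["dw"]` stand in for the tutorial's `Lock`, `Count`, and `DW` buffers):

```python
import threading

# Hypothetical CPU-side analogue of the tutorial's spinlock pattern.
# `lock` stands in for spinning on tl.atomic_cas(Lock, 0, 1);
# `state["count"]` for the Count buffer; `state["dw"]` for DW.
lock = threading.Lock()
state = {"count": 0, "dw": 0.0}

def program(partial_dw: float) -> None:
    with lock:  # acquire: spin until Lock flips 0 -> 1
        if state["count"] == 0:
            # First store doesn't accumulate; mark that DW now holds data.
            # The tutorial does tl.atomic_xchg(Count, 1) at this point.
            state["count"] = 1
        else:
            # Accumulate the previously stored partial sum.
            partial_dw += state["dw"]
        # tl.store(DW, partial_dw, mask=mask)
        state["dw"] = partial_dw
    # Exiting the `with` block releases the lock: tl.atomic_xchg(Lock, 0)

threads = [threading.Thread(target=program, args=(float(i),)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state["dw"])  # sum of all partials: 0 + 1 + 2 + 3 = 6.0
```

In this CPU picture, the write to `count` inside the critical section looks like it needs no atomicity of its own, which is exactly why the `tl.atomic_xchg(Count, 1)` confuses me.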