[NeMo-UX] Turn on mcore performance optimizations (NVIDIA#10209)
* expose TP overlap. Signed-off-by: Jieming Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* add tp overlap recipes. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* turn on pipeline parallel overlap. Signed-off-by: Jimmy Zhang <[email protected]>
* refactor. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* Update base.py. Signed-off-by: JimmyZhang12 <[email protected]>
* Update megatron_parallel.py. Signed-off-by: JimmyZhang12 <[email protected]>
* remove env var. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* add optimization config. Signed-off-by: Jimmy Zhang <[email protected]>
* fix typo. Signed-off-by: Jimmy Zhang <[email protected]>
* refactor into megatron parallel setup. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* refactor. Signed-off-by: Jimmy Zhang <[email protected]>
* fix config ordering, add wgrad deferral. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* cleanup. Signed-off-by: Jimmy Zhang <[email protected]>
* use config. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* clean. Signed-off-by: Jimmy Zhang <[email protected]>
* enable wgrad deferral. Signed-off-by: Jimmy Zhang <[email protected]>
* add grad bucket size. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* move everything into a callback. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* cleanup. Signed-off-by: Jimmy Zhang <[email protected]>
* fix imports. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* move userbuffer init. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* cleanup. Signed-off-by: Jimmy Zhang <[email protected]>
* fix VP. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* address comments. Signed-off-by: Jimmy Zhang <[email protected]>
* add gradient accum guard. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* Update base.py. Signed-off-by: JimmyZhang12 <[email protected]>
* address comments. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>
* address comments. Signed-off-by: Jimmy Zhang <[email protected]>
* Apply isort and black reformatting. Signed-off-by: JimmyZhang12 <[email protected]>

---------

Signed-off-by: Jieming Zhang <[email protected]>
Signed-off-by: JimmyZhang12 <[email protected]>
Signed-off-by: Jimmy Zhang <[email protected]>
Signed-off-by: JimmyZhang12 <[email protected]>
Co-authored-by: Jieming Zhang <[email protected]>
Co-authored-by: JimmyZhang12 <[email protected]>