Add AMP to ImageNet classification and segmentation scripts + auto layout #1201
base: master
Conversation
assert not opt.auto_layout or opt.amp, "--auto-layout needs to be used with --amp"

if opt.amp:
    amp.init(layout_optimization=opt.auto_layout)
Referring to the definition of amp.init() here, it seems there is no argument named layout_optimization?
It's an internal feature; it will be added soon.
Thanks for the clarification.
Just curious: when setting both --amp and --dtype float16, what will happen?
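For reference, a minimal sketch of the documented public mxnet.contrib.amp float16 workflow (MXNet >= 1.6); note that amp.init() takes no layout_optimization argument in the public API, consistent with the comment above. The model, data, and hyperparameters below are illustrative only.

```python
import mxnet as mx
from mxnet import autograd, gluon
from mxnet.contrib import amp

# Patch MXNet operators so eligible ones run in float16 (targets GPU training).
amp.init()

net = gluon.nn.Dense(10)
net.initialize()

# AMP requires update_on_kvstore=False so loss scaling can control updates.
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1}, update_on_kvstore=False)
amp.init_trainer(trainer)  # enable dynamic loss scaling on this trainer

data = mx.nd.random.uniform(shape=(4, 8))
label = mx.nd.zeros((4,))
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

with autograd.record():
    out = net(data)
    loss = loss_fn(out, label)
    # Scale the loss before backward to avoid float16 gradient underflow.
    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
trainer.step(4)
```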
@@ -105,6 +106,10 @@ def parse_args():
                        help='name of training log file')
    parser.add_argument('--use-gn', action='store_true',
                        help='whether to use group norm.')
    parser.add_argument('--amp', action='store_true',
                        help='Use MXNet AMP for mixed precision training.')
    parser.add_argument('--auto-layout', action='store_true',
Could you also add an option like --target-dtype, since AMP now supports not only float16 but also bfloat16? Then we can pass target-dtype to amp.init() to enable float16/bfloat16 training on GPU and CPU respectively. Thanks.
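A sketch of what the reviewer's proposal could look like; the --target-dtype flag is a suggestion, not existing code, while target_dtype is a public parameter of amp.init() in recent MXNet versions.

```python
import argparse
from mxnet.contrib import amp

parser = argparse.ArgumentParser()
parser.add_argument('--amp', action='store_true', default=False,
                    help='Use MXNet AMP for mixed precision training.')
# Proposed flag: which low-precision dtype AMP should cast to.
parser.add_argument('--target-dtype', type=str, default='float16',
                    choices=['float16', 'bfloat16'],
                    help='dtype AMP casts to: float16 on GPU, bfloat16 on CPU.')
opt = parser.parse_args()

if opt.amp:
    amp.init(target_dtype=opt.target_dtype)
```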
if opt.resume_states is not '':
    trainer.load_states(opt.resume_states)

if opt.amp:
This may need to change to: if opt.amp and opt.target_dtype == 'float16':
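A minimal sketch of the suggested guard in the context of the hunk above, assuming the proposed --target-dtype flag exists and that the body of this branch calls amp.init_trainer():

```python
# Dynamic loss scaling (amp.init_trainer) is only needed for float16;
# bfloat16 keeps float32's exponent range, so no scaling is required.
if opt.amp and opt.target_dtype == 'float16':
    amp.init_trainer(trainer)
```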
@@ -404,8 +417,13 @@ def train(ctx):
                    p.astype('float32', copy=False)) for yhat, y, p in zip(outputs, label, teacher_prob)]
        else:
            loss = [L(yhat, y.astype(opt.dtype, copy=False)) for yhat, y in zip(outputs, label)]
        for l in loss:
            l.backward()
        if opt.amp:
This may need to change to: if opt.amp and opt.target_dtype == 'float16':
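A sketch of the suggested change in the context of the hunk above (again assuming the proposed --target-dtype flag). Loss scaling guards float16 against gradient underflow; bfloat16 and float32 can use the plain backward pass.

```python
if opt.amp and opt.target_dtype == 'float16':
    # Scale all losses jointly, then backpropagate the scaled versions.
    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
else:
    for l in loss:
        l.backward()
```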
@@ -210,7 +216,12 @@ def __init__(self, args, logger):
                v.wd_mult = 0.0

        self.optimizer = gluon.Trainer(self.net.module.collect_params(), args.optimizer,
-                                      optimizer_params, kvstore=kv)
+                                      optimizer_params, update_on_kvstore=(False if args.amp else None))
May I know why kvstore=kv was deleted? Could you add it back? Thanks.
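For reference, gluon.Trainer accepts both kvstore and update_on_kvstore, so the original argument could be restored alongside the AMP-related one; whether that is the intended behavior here is an assumption.

```python
# AMP wants update_on_kvstore=False so dynamic loss scaling can skip the
# update step when a gradient overflow is detected; kvstore=kv can coexist.
self.optimizer = gluon.Trainer(self.net.module.collect_params(), args.optimizer,
                               optimizer_params, kvstore=kv,
                               update_on_kvstore=(False if args.amp else None))
```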
@@ -95,6 +96,11 @@ def parse_args():
    # synchronized Batch Normalization
    parser.add_argument('--syncbn', action='store_true', default=False,
                        help='using Synchronized Cross-GPU BatchNorm')
    # performance related
    parser.add_argument('--amp', action='store_true',
We usually add default=False for arguments. Could you add it? Thank you.
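A sketch of the requested change; note that action='store_true' already implies default=False, so spelling it out is purely for consistency with the other arguments in this script (e.g. --syncbn above).

```python
parser.add_argument('--amp', action='store_true', default=False,
                    help='Use MXNet AMP for mixed precision training.')
```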
@Kh4L any update on this PR?
@Kh4L Any update on this? BTW, do you have numbers for the improvement?