What changed -- DataParallel removed from train.py script #310
-
Hello. Thanks for sharing this great code. In contrast to train.py, validate.py still contains args.num_gpu but not args.cpu. Best regards.
-
@jinseok-karl yes, I removed support for DataParallel in the train script. It wasn't worth maintaining as it conflicts with a number of the other useful training options and seems to be a lower priority for the PyTorch team these days. It is slower than DDP and all around not so useful. DDP is really easy to use via the shell script here for multi-gpu single machine training. DataParallel is still used for validation because it's hard to get 100% correct multi-gpu validation for ALL samples in a validation set without it (or some extra fiddly code); the default DDP data setup pads the last few samples (illustrated below). See:
See:
https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html
ht…
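To make the padding point concrete, here's a minimal sketch (not from this repo; the dataset size and process count are made up) of how torch.utils.data.distributed.DistributedSampler pads a dataset so every process sees the same number of samples, duplicating a few at the end:

```python
# Minimal sketch: DistributedSampler pads so each of 4 processes gets
# ceil(10 / 4) = 3 samples, i.e. 12 indices total for a 10-sample dataset.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(10))  # hypothetical 10-sample set

indices = []
for rank in range(4):
    # Passing num_replicas/rank explicitly avoids needing an initialized
    # process group; shuffle=False keeps the padding easy to see.
    sampler = DistributedSampler(dataset, num_replicas=4, rank=rank, shuffle=False)
    indices.extend(list(sampler))

print(sorted(indices))
# [0, 0, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9] -- samples 0 and 1 are evaluated twice
```

Because two samples are counted twice, metrics averaged over the padded set can differ slightly from the true values, which is why exact single-pass validation is simpler with DataParallel (or the extra bookkeeping code mentioned above).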