Why can't we set use_multiprocessing=True? #16
Comments
I have found that a deadlock happens when use_multiprocessing=True. Using a Dataset from PyTorch to build the data generator may be a better choice if you want to load data in parallel.
I now use keras.utils.Sequence, and it can load data in parallel, but training is still slow: about 15 s/step with batch_size=8. I think the computational complexity of DprnnBlock may be too high. Also, GPU memory utilization is very low, only 151 MB. Do you have any idea how I can speed it up?
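The thread does not show the Sequence implementation that was used. As a rough sketch of the approach, assuming the training data is already available as two aligned NumPy arrays, a keras.utils.Sequence subclass could look like the following. The class name `NoisyCleanSequence` and its arguments are hypothetical and not taken from the DPCRN_DNS3 repository; the fallback to `object` is only there so the sketch runs without TensorFlow installed.

```python
import math

import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    # Assumption: allow the sketch to run without TensorFlow installed.
    Sequence = object


class NoisyCleanSequence(Sequence):
    """Hypothetical loader that serves (noisy, clean) batches by index."""

    def __init__(self, noisy, clean, batch_size=8, shuffle=True, seed=0):
        super().__init__()
        self.noisy = noisy
        self.clean = clean
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = np.random.RandomState(seed)
        self.indices = np.arange(len(noisy))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.indices) / self.batch_size)

    def __getitem__(self, idx):
        # Each batch is addressed by index, so workers can fetch
        # different batches without sharing generator state.
        start = idx * self.batch_size
        batch = self.indices[start:start + self.batch_size]
        return self.noisy[batch], self.clean[batch]

    def on_epoch_end(self):
        # Reshuffle sample order between epochs.
        if self.shuffle:
            self.rng.shuffle(self.indices)
```

In TF 1.15 and TF 2.x versions of tf.keras, an object like this can be passed to fit_generator (or fit in TF 2.x) together with the `workers` and `use_multiprocessing` arguments; whether those arguments are accepted depends on the installed Keras version, so check the documentation for your release.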
Check the CPU and GPU usage. There may be something wrong with your TensorFlow installation so that CUDA is unavailable. Check the versions of TensorFlow and Keras.
Replacing the LSTM with CuDNNLSTM can also speed up training.
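One way to run the check suggested above is to ask TensorFlow which GPUs it can see. This is a minimal sketch, not part of the repository; the helper name `gpu_report` is hypothetical, and it returns None when TensorFlow is not importable.

```python
def gpu_report():
    """Return TensorFlow version and visible GPUs, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # tf.config.list_physical_devices exists in TF 2.1+; older releases
    # (including 1.15) expose it under tf.config.experimental.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices

    return {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in list_devices("GPU")],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty `gpus` list means TensorFlow is running on the CPU only, which would be consistent with very low GPU memory use and slow steps.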
What is your training speed, and how much GPU memory do you use?
1 s/batch, 12 GB.
Oh my god, my TensorFlow version is 1.15.0 and my CUDA is 11.4. So should I upgrade TensorFlow?
You can update TensorFlow to 2.x; the training step still works.
If you want to use TF 1.x on CUDA 11, install nvidia-tensorflow with:
pip install --upgrade pip
pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
pip install nvidia-tensorboard==1.15
It works, thanks!
@plutols @Le-Xiaohuai-speech Could you please tell me how you used keras.utils.Sequence?
self.model.fit_generator(
    data_generator.generator(batch_size=self.batch_size, validation=False),
    validation_data=data_generator.generator(batch_size=self.batch_size, validation=True),
    epochs=self.max_epochs,
    steps_per_epoch=data_generator.train_length // self.batch_size,
    validation_steps=self.batch_size,
    # use_multiprocessing=True,
    callbacks=[checkpointer, reduce_lr, csv_logger, early_stopping])
When I set use_multiprocessing=True, training cannot start, but when I set use_multiprocessing=False, training is very slow. Is there any way I can use multiprocessing?