why we can not set use_multiprocessing=True #16

Open
plutols opened this issue Mar 31, 2022 · 9 comments

Comments

@plutols

plutols commented Mar 31, 2022

self.model.fit_generator(
    data_generator.generator(batch_size=self.batch_size, validation=False),
    validation_data=data_generator.generator(batch_size=self.batch_size, validation=True),
    epochs=self.max_epochs,
    steps_per_epoch=data_generator.train_length // self.batch_size,
    validation_steps=self.batch_size,
    # use_multiprocessing=True,
    callbacks=[checkpointer, reduce_lr, csv_logger, early_stopping])
When I set use_multiprocessing=True, training does not start, but when I set use_multiprocessing=False, training is very slow. Is there any way I can use multiprocessing?

@Le-Xiaohuai-speech
Owner

I have found that a deadlock happens when use_multiprocessing=True. Using a Dataset from PyTorch to build the data generator may be a better choice if you want to load data in parallel.

@plutols
Author

plutols commented Mar 31, 2022

I now use keras.utils.Sequence, and it can load data in parallel, but training is still slow: about 15 s/step with batch_size=8. I think the computational complexity of DprnnBlock may be too high. Also, the GPU memory utilization is very low, only 151M. Do you have any idea how I can speed it up?
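For readers wondering what a keras.utils.Sequence loader looks like, here is a minimal sketch. It is not the code used in this thread: the class name BatchSequence and its arguments are hypothetical, and the fallback to `object` exists only so the sketch can be exercised without TensorFlow installed. The idea is that batch `i` is computed from the index alone, so worker processes can fetch batches independently instead of sharing one generator's state.

```python
import math

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    # Assumption: fallback so the sketch runs without TensorFlow installed.
    Sequence = object


class BatchSequence(Sequence):
    """Hypothetical index-based loader: batch i depends only on i."""

    def __init__(self, inputs, targets, batch_size):
        super().__init__()
        self.inputs = inputs
        self.targets = targets
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return math.ceil(len(self.inputs) / self.batch_size)

    def __getitem__(self, index):
        # Slice out the index-th batch; no state is carried between calls.
        start = index * self.batch_size
        stop = start + self.batch_size
        return self.inputs[start:stop], self.targets[start:stop]
```

With real data, `inputs` and `targets` would typically be NumPy arrays, and the object would be passed in place of the generator, for example `model.fit_generator(seq, workers=4, use_multiprocessing=True)` on TF 1.x, where `workers=4` is an arbitrary illustrative value.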

@Le-Xiaohuai-speech
Owner

Le-Xiaohuai-speech commented Mar 31, 2022 via email

@plutols
Author

plutols commented Mar 31, 2022

What is your training speed, and how much GPU memory does your training use?

@Le-Xiaohuai-speech
Owner

Le-Xiaohuai-speech commented Mar 31, 2022 via email

@plutols
Author

plutols commented Mar 31, 2022

Oh my god, my TensorFlow version is 1.15.0 and my CUDA is 11.4. So should I upgrade my TensorFlow?
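A quick way to test for this kind of mismatch is to ask TensorFlow which GPUs it can see. The sketch below assumes the slowdown came from TensorFlow not finding a usable GPU, which this thread does not confirm. As far as I know, the official TensorFlow 1.15 pip packages were built against CUDA 10.0, so a CUDA 11.4-only install could leave training on the CPU. The helper name visible_gpus is hypothetical.

```python
def visible_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Newer releases expose tf.config.list_physical_devices; older ones
    # such as 1.15 have it under tf.config.experimental.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return list(lister("GPU"))


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif not gpus:
    print("TensorFlow sees no GPU; training would fall back to the CPU")
else:
    print("Visible GPUs:", gpus)
```

An empty list together with very low GPU memory use would point to a CPU-only run rather than a slow model.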

@Le-Xiaohuai-speech
Owner

Le-Xiaohuai-speech commented Mar 31, 2022 via email

@plutols
Author

plutols commented Mar 31, 2022

it works, thanks!

@rohithmars

@plutols @Le-Xiaohuai-speech
I am curious how using keras.utils.Sequence helps with multiprocessing. When I tried it, training could not start. It seems to get stuck after displaying epoch 1/200.

Could you please tell me how you used keras.utils.Sequence?
