Processpool #515
Looking good. Can you please add a bit of documentation to this explaining the background behind what it solves and why (with links to background info), because it's a bit non-obvious! |
Will do, thank you. |
Ok,
Besides that, we also provide overloaded Speaking of daemonic, what is it all about? For our purposes, just 2 things seem relevant.
I don't think it would be a controversial thing to say that First, what is the reason we need 4 different types of pools? As long as we only use the
In reality, it looks like most of the features are not being used. At least in fastai code:
I'd like to propose:
if max_workers:
    with pool(...) as pl: r = pl.map(...)
else: r = map(...)
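For illustration, the fallback proposed above could look roughly like the sketch below. The name `map_maybe_parallel` is hypothetical, and `ThreadPool` from the standard library only stands in for whichever pool class ends up being used; this is not the code in the PR.

```python
from multiprocessing.pool import ThreadPool

def map_maybe_parallel(f, items, max_workers=0):
    "Apply `f` to `items`, creating a pool only when `max_workers` is truthy."
    if max_workers:
        # Assumption: ThreadPool is a stand-in for the pool class under discussion
        with ThreadPool(max_workers) as pl: return list(pl.map(f, items))
    # No workers requested: plain serial map, no pool is created at all
    return list(map(f, items))
```

With `max_workers=0` nothing is spawned, so the serial path stays easy to debug.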
Proposed interface for parallel:
#before:
parallel(f, items, *args, n_workers=defaults.cpus, total=None, progress=None, pause=0,
         method=None, threadpool=False, chunksize=1, maxtasksperchild=None, **kwargs)
#after:
parallel(f, items, *args, n_workers=defaults.cpus, total=None, progress=None, pause=0,
         chunksize=1, reuse_workers=True, **kwargs)
P.S:
At the bottom of the imports cell, we have
try:
    if sys.platform == 'darwin' and IN_NOTEBOOK: set_start_method("fork")
except: pass
It was introduced in a commit that just says "fix import" 65d703f, and it changes the behaviour of all python multiprocessing methods on macOS. Should it be there at all? |
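For background on the global-versus-local distinction raised here: the standard library's `multiprocessing.get_context` returns a context object bound to a single start method without changing the process-wide default, whereas `set_start_method` changes it for everything. A minimal sketch follows; `local_context` is a hypothetical helper name, and "spawn" is used only because it is available on every platform.

```python
import multiprocessing as mp

def local_context(method):
    # The returned context exposes the multiprocessing API (Pool, Process,
    # Manager, ...) bound to `method`; the global default is left alone.
    return mp.get_context(method)

before = mp.get_start_method(allow_none=True)  # None if no default was fixed yet
ctx = local_context("spawn")
after = mp.get_start_method(allow_none=True)   # same value as `before`
```

Whether a per-pool context is enough for the notebook use case on macOS is a separate question that this sketch does not answer.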
@jph00 , I had a crack at it. What do you think? https://github.com/xl0/fastcore/blob/minpool/nbs/03a_parallel.ipynb All fastai tests worked with --n_workers=1 and default. |
Yes, that's the only way we've found to get DL notebooks to run in practice, despite the concerns that you correctly pointed out. In your analysis above you seem AFAICT to largely be restricting your impact analysis to usages within the fastcore lib itself. Would you be able to also test the impact of these changes on running the tests for nbdev, fastai, ghapi, and execnb? (Apologies in advance if you've already done this and I missed it.) |
@jph00 , Thank you for the clarification. No, I did test with fastcore, nbdev and fastai. I just checked - execnb passes cleanly, ghapi needs to drop |
@jph00 , one thing I've noticed with There is no guarantee as to which worker will start first when using Do you think this is an issue? |
# |export
def _gen(items, pause):
    for item in items:
        time.sleep(pause)
        yield item
with ProcessPool(n_workers, context=get_context(method), reuse_workers=reuse_workers) as ex:
    lock = Manager().Lock()
    _g = partial(_call, lock, pause, n_workers, g)
    r = ex.imap(_g, items, chunksize=chunksize)
with ProcessPool(n_workers, context=get_context(method), reuse_workers=reuse_workers) as ex:
    _items = _gen(items, pause)
    r = ex.imap(g, _items, chunksize=chunksize)
|
@jph00 , please disregard the previous comment, I was wrong. Would it be an issue that the tasks can start out of order? |
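On the ordering question, one relevant fact from the standard library: `imap` yields results in input order regardless of the order in which tasks start or finish (`imap_unordered` is the variant that does not). The sketch below uses `ThreadPool` as a stand-in and a hypothetical `work` function in which later items finish sooner; it says nothing about whether any caller relies on start order.

```python
import threading, time
from multiprocessing.pool import ThreadPool

started = []              # order in which tasks actually began
lock = threading.Lock()

def work(i):
    with lock: started.append(i)
    time.sleep(0.01 * (5 - i))  # later items take less time
    return i

with ThreadPool(3) as ex:
    ordered = list(ex.imap(work, range(5)))  # results follow input order
```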