[conflict] Multiprocessing start method on macOS #4853

Open
meiyasan opened this issue Aug 16, 2024 · 5 comments

@meiyasan

Hello,

On macOS, PyCBC conflicts with many of my other imports because of this code:
https://github.com/gwastro/pycbc/blob/master/pycbc/__init__.py#L211C1-L216C49

    # MacosX after python3.7 switched to 'spawn', however, this does not
    # preserve common state information which we have relied on when using
    # multiprocessing based pools.
    import multiprocessing
    if hasattr(multiprocessing, 'set_start_method'):
        multiprocessing.set_start_method('fork')

Here is a typical error message:

  File "/Users/marcomeyer/.conda/envs/myenv/lib/python3.11/multiprocessing/context.py", line 248, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
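
For reference, the error can be reproduced without PyCBC. The following is a minimal sketch, assuming only the standard library: one caller fixes the global start method, and a second caller then tries to set it again without `force`. The function name `reproduce_conflict` is hypothetical and used only for illustration.

```python
import multiprocessing


def reproduce_conflict():
    """Return the message raised by a second, unforced set_start_method call."""
    # First caller (for example, another library) fixes the global context.
    # force=True only keeps this sketch repeatable if a method was already set.
    multiprocessing.set_start_method("spawn", force=True)
    try:
        # Second caller sets the method again without force.
        multiprocessing.set_start_method("fork")
    except RuntimeError as err:
        return str(err)
    finally:
        # Reset the global context so the sketch leaves no state behind.
        multiprocessing.set_start_method(None, force=True)
    return None


if __name__ == "__main__":
    print(reproduce_conflict())
```

The global start method can be set only once per process unless `force=True` is passed, so whichever import runs second is the one that fails.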

I assume this is done for performance, but in my case I don't use PyCBC for heavy computation.
Would it be possible to use the spawn method instead, or to disable this call through a custom variable?
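
One way such an opt-out could look is sketched below. This is an illustration under stated assumptions, not PyCBC's actual behavior: the environment variable name `PYCBC_MP_START_METHOD` and the helper `maybe_set_start_method` are hypothetical. The sketch skips the call when the user opts out and leaves the start method alone if something else has already set it.

```python
import multiprocessing
import os


def maybe_set_start_method(default="fork", env_var="PYCBC_MP_START_METHOD"):
    """Set the global start method only if nobody has, honoring an opt-out.

    env_var is a hypothetical name, not an existing PyCBC setting.
    Returns the start method in effect, or None if the user opted out.
    """
    requested = os.environ.get(env_var, default)
    if requested.lower() in ("", "none", "off"):
        return None  # user opted out: do not touch the global state
    current = multiprocessing.get_start_method(allow_none=True)
    if current is not None:
        return current  # already chosen elsewhere: leave it alone
    multiprocessing.set_start_method(requested)
    return requested
```

Checking `get_start_method(allow_none=True)` first avoids the `RuntimeError`, because `set_start_method` is only called when no method has been fixed yet.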

@ahnitz
Member

ahnitz commented Aug 20, 2024

@xkzl Does PR #4620 address the issue in your use case? If not, we are happy to accept PRs here that improve this behavior for everyone, or suggestions for how you'd like this to work.

@meiyasan
Author

meiyasan commented Sep 24, 2024

@ahnitz I see some updates in #4620; yes, it is the same issue.

I would suggest using the following code wherever a Pool is created, instead of imposing a global start method on everyone who imports a shared library such as PyCBC.

from multiprocessing import get_context
get_context("fork").Pool()

instead of:

import multiprocessing as mp
mp.set_start_method('fork')  # global: affects every importer
mp.Pool()
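
A slightly fuller sketch of the context-local approach follows, assuming a POSIX platform where "fork" is available. The names `square` and `parallel_squares` are hypothetical and only serve the example.

```python
import multiprocessing
from multiprocessing import get_context


def square(x):
    return x * x


def parallel_squares(values, processes=2):
    """Map square over values in a pool built from a local 'fork' context."""
    # get_context returns a context object with the multiprocessing API;
    # it raises ValueError on platforms where 'fork' is unavailable.
    ctx = get_context("fork")
    with ctx.Pool(processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(parallel_squares([1, 2, 3]))  # prints [1, 4, 9]
```

Because the pool comes from its own context object, the global start method is never set, so other libraries remain free to call `set_start_method` themselves.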

@ahnitz
Member

ahnitz commented Sep 24, 2024

@xkzl Thank you for that suggestion. That seems like a very straightforward change, so I've created PR #4890 to correct this.

@ahnitz ahnitz self-assigned this Sep 24, 2024
@ahnitz
Member

ahnitz commented Sep 24, 2024

@xkzl When you get the chance, let us know if this issue is now resolved. If you are satisfied, please close this issue; otherwise, an update on what further problems you experience would be helpful.

@meiyasan
Author

meiyasan commented Oct 6, 2024

An additional fix might be needed, because Apple changed the behavior of fork() in the High Sierra release for security reasons.

It would consist of checking whether the macOS version is >= 10.13 and, if so, considering "spawn". In my own use of multiprocessing, I use "spawn" on macOS >= 10.13 and "fork" otherwise.

Roughly speaking, "fork" is faster because the child process inherits the parent's program state, while "spawn" starts a fresh interpreter and reloads all imports.
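
The version check described above could be sketched as follows. This is a rough illustration, not code from PyCBC: the helper names `parse_release` and `mac_start_method` are hypothetical, and treating an unknown release as a modern one is an assumption of this sketch.

```python
import multiprocessing
import platform
import sys


def parse_release(release):
    """Turn '10.13.6' into (10, 13); return None if it cannot be parsed."""
    parts = release.split(".")[:2]
    try:
        return tuple(int(part) for part in parts)
    except ValueError:
        return None


def mac_start_method(mac_release):
    """Pick 'fork' before macOS 10.13 and 'spawn' from 10.13 onwards."""
    version = parse_release(mac_release)
    if version is not None and version < (10, 13):
        return "fork"
    # Assumption: an unknown release is treated like a modern one.
    return "spawn"


if __name__ == "__main__" and sys.platform == "darwin":
    # platform.mac_ver()[0] is the release string, e.g. '10.13.6'.
    ctx = multiprocessing.get_context(mac_start_method(platform.mac_ver()[0]))
    print(ctx.get_start_method())
```

Combining this with a context-local pool keeps the choice inside the library, without changing the global start method for other packages.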
