Duplicated exec_folder for multi-threading calls #218

Open
astrozot opened this issue Apr 6, 2024 · 0 comments

astrozot commented Apr 6, 2024

I often call Pigeons.pigeons inside a Threads.@threads for loop to take advantage of the extra computing resources I have. I am now trying to use the checkpoint feature, but I noticed that the way the exec_folder is set is not thread-safe: nothing checks that the created folder is unique, because its name is based (essentially) on the current time alone. As a result, if all the threads start at the same time, they end up with the same folder, which causes a lot of issues with the saved checkpoints.

To have thread-safe folders, one could just replace the current code with

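function next_exec_folder()
    # Timestamp prefix keeps the folders sortable by creation time.
    formatted_time = Dates.format(now(), dateformat"yyyy-mm-dd-HH-MM-SS")
    # mktempdir needs an existing parent directory.
    mkpath("results/all")
    # mktempdir appends a random suffix and creates the folder,
    # so concurrent calls get distinct directories.
    result = mktempdir("results/all"; prefix=formatted_time, cleanup=false)
    _ensure_symlinked(result)
    return result
end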

This is simpler and (it seems to me) safer than the current code, and it solves the issue. A problem remains with _ensure_symlinked(result), as that function produces just one symbolic link, but I believe this is more complicated to solve.

https://github.com/Julia-Tempering/Pigeons.jl/blob/ab190b885272c66992d84102fbdfcf5ebb97c0d9/src/utils/exec_folder.jl#L11C14-L11C50
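To illustrate the difference, here is a minimal, self-contained sketch that does not use Pigeons at all. It assumes a purely time-based naming scheme, represented by the hypothetical helper `timestamp_name` (a stand-in, not the actual Pigeons code), and compares it with a `mktempdir`-based variant, the hypothetical `unique_exec_folder`, along the lines of the proposal above. The `-` separator after the timestamp and the scratch parent directory are choices made only for this demo.

```julia
using Dates

# Hypothetical stand-in for a purely time-based folder name
# (an assumption for illustration, not Pigeons' actual code).
timestamp_name(t) = Dates.format(t, dateformat"yyyy-mm-dd-HH-MM-SS")

# Variant along the lines proposed above: mktempdir appends a random
# suffix and creates the directory itself, so each call gets a new folder.
function unique_exec_folder(parent::AbstractString, t::DateTime)
    mkpath(parent)  # mktempdir needs an existing parent directory
    return mktempdir(parent; prefix = timestamp_name(t) * "-", cleanup = false)
end

parent = mktempdir()  # scratch location for this demo, removed at process exit
t = now()             # same timestamp for every task: the worst case
n = 8

# Each iteration writes to its own slot, so no locking is needed here.
folders = Vector{String}(undef, n)
Threads.@threads for i in 1:n
    folders[i] = unique_exec_folder(parent, t)
end

# Names that depend only on the timestamp are identical for all tasks.
naive = [joinpath(parent, timestamp_name(t)) for _ in 1:n]

println("distinct time-based names:  ", length(unique(naive)))
println("distinct mktempdir folders: ", length(unique(folders)))
```

Running this with several threads (for example `julia -t 4`) exercises the concurrent case. The sketch only covers folder creation; it does not address the single symbolic link mentioned above.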
