About a best practice vignette #15

Open
pat-s opened this issue Jan 14, 2020 · 2 comments


@pat-s

pat-s commented Jan 14, 2020

@HenrikBengtsson and I were discussing the optimal use of {doRNG} within the {future} framework in futureverse/doFuture#41.

The discussion started after I posted a blog post about reproducible parallel streams in R.

We've been discussing which of the following is better to recommend and use long-term:

  • using the %dorng% operator
  • using doRNG::registerDoRNG() with %dopar%
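
For readers comparing the two options, here is a minimal sketch of both, assuming the {foreach} and {doRNG} packages are installed. The sequential backend, the seed value 123, and the variable names are illustrative choices only, not a recommendation from this thread.

```r
library(foreach)
library(doRNG)

# A sequential backend keeps the sketch free of extra dependencies; any
# other foreach backend (e.g. doParallel, doFuture) could be registered here.
registerDoSEQ()

# Option 1: %dorng%. The loop itself requests reproducible RNG streams,
# seeded from the current state of the session RNG.
set.seed(123)
a1 <- foreach(i = 1:4) %dorng% { runif(1) }
set.seed(123)
a2 <- foreach(i = 1:4) %dorng% { runif(1) }
identical(a1, a2)

# Option 2: registerDoRNG() with %dopar%. Whoever sets up the backend
# makes subsequent %dopar% loops reproducible; the loop code is unchanged.
# registerDoRNG() must be called after the actual backend is registered.
registerDoRNG(123)
b1 <- foreach(i = 1:4) %dopar% { runif(1) }
registerDoRNG(123)
b2 <- foreach(i = 1:4) %dopar% { runif(1) }
identical(b1, b2)
```

In option 1 the reproducibility is visible at the call site; in option 2 it depends on setup code that may live far away from the loop.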

Maybe you can give some insights from the dev perspective. These could potentially be added to the README/vignette of this package.

@renozao
Owner

renozao commented Feb 26, 2020 via email

@HenrikBengtsson

Thanks for your input on this. I initially agreed with you, but lately I've come to think that developers should use the explicit %dorng%. My rationale is that only with %dorng% can you, as the developer, be sure that the algorithm you develop uses a valid RNG stream. With doRNG::registerDoRNG() and %dopar%, we put that burden on the end user, and we cannot demand that they know what is needed, especially if our foreach() code is part of a package far down the dependency tree. Because of this, I'd even argue that not using %dorng% is incorrect, i.e. a bug.
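
To make the developer-side argument concrete, here is a rough sketch of a package-style function that uses %dorng% internally. The function name `boot_mean()` and its arguments are hypothetical and exist only for illustration; the sketch assumes {foreach} and {doRNG} are installed and that the caller has registered some foreach backend.

```r
library(foreach)
library(doRNG)

# Hypothetical package function: because it uses %dorng% itself, it does
# not rely on the end user having called registerDoRNG() beforehand.
boot_mean <- function(x, n_boot = 50L) {
  foreach(b = seq_len(n_boot), .combine = c) %dorng% {
    mean(sample(x, replace = TRUE))
  }
}

# End-user side: only a backend and an ordinary set.seed() are needed.
registerDoSEQ()
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)

set.seed(42)
r1 <- boot_mean(x)
set.seed(42)
r2 <- boot_mean(x)
identical(r1, r2)
```

The end user controls reproducibility with a plain set.seed() and does not need to know that the function draws random numbers inside a foreach() loop.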

If parallel RNG ends up being handled upstream by foreach() (RevolutionAnalytics/foreach#6), then I can imagine replacing the %dorng%-wrapper "hack" with a formal argument, e.g. y <- foreach(..., rng = TRUE) %dopar% { ... }. That way the developer has full control of parallel RNGs. One can imagine support for alternative parallel RNGs via this argument too, e.g. rng = "L'Ecuyer-CMRG", rng = list(<seeds>), etc., but the exact design is better discussed in RevolutionAnalytics/foreach#6.


3 participants