Zombie fix attempts #1704

Merged
merged 3 commits into from
Nov 14, 2024
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -2,8 +2,8 @@ Package: brms
Encoding: UTF-8
Type: Package
Title: Bayesian Regression Models using 'Stan'
-Version: 2.22.5
-Date: 2024-11-08
+Version: 2.22.6
+Date: 2024-11-14
Authors@R:
c(person("Paul-Christian", "Bürkner", email = "[email protected]",
role = c("aut", "cre")),
10 changes: 10 additions & 0 deletions NEWS.md
@@ -5,6 +5,16 @@
* Fit extended-support Beta models via family `xbeta`
thanks to Ioannis Kosmidis. (#1698)

+### Bug Fixes
+
+* Avoid the creation of zombie workers when executing `log_lik`
+in parallel thanks to Aki Vehtari and Noa Kallioinen.
+For now, `log_lik` will use PSOCK clusters if run
+in parallel even on Unix systems. To avoid potential speed loss for small
+models, `log_lik` will not use `option(mc.cores)` anymore.
+These changes may be reverted once the underlying causes of this
+issue have been fixed. (#1658)
+
### Other Changes

* Improve sampling efficiency of `beta_binomial` models. (#1703)
6 changes: 4 additions & 2 deletions R/brmsfit-helpers.R
@@ -845,11 +845,13 @@ arg_names <- function(method) {
}

# validate 'cores' argument for use in post-processing functions
-validate_cores_post_processing <- function(cores) {
+validate_cores_post_processing <- function(cores, use_mc_cores = FALSE) {
if (is.null(cores)) {
-if (os_is_windows()) {
+if (os_is_windows() || !use_mc_cores) {
# multi cores often leads to a slowdown on windows
# in post-processing functions as discussed in #1129
+# multi cores may also lead to zombie workers
+# on unix systems as discussed in #1658
cores <- 1L
} else {
cores <- getOption("mc.cores", 1L)
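The effect of the new `use_mc_cores` argument can be illustrated with a minimal, self-contained sketch. It is not the brms implementation: `os_is_windows()` is replaced by a stand-in, the function name `validate_cores_sketch` is hypothetical, and the final validation step is an assumption, since the hunk above is truncated before the end of the function.

```r
# Stand-in for the brms-internal os_is_windows(); an assumption for this sketch.
os_is_windows <- function() {
  identical(.Platform$OS.type, "windows")
}

# Hypothetical re-creation of the default logic shown in the hunk above.
validate_cores_sketch <- function(cores, use_mc_cores = FALSE) {
  if (is.null(cores)) {
    if (os_is_windows() || !use_mc_cores) {
      # serial by default: 'mc.cores' is ignored unless explicitly allowed
      cores <- 1L
    } else {
      cores <- getOption("mc.cores", 1L)
    }
  }
  # Assumed validation; the real function body continues beyond the hunk.
  cores <- as.integer(cores)
  stopifnot(length(cores) == 1L, !is.na(cores), cores >= 1L)
  cores
}

old <- options(mc.cores = 4L)
validate_cores_sketch(NULL)                       # 1L on every platform
validate_cores_sketch(2)                          # 2L, an explicit value wins
validate_cores_sketch(NULL, use_mc_cores = TRUE)  # 4L on Unix-alikes, 1L on Windows
options(old)
```

Under this reading, callers who want parallel post-processing have to pass `cores` explicitly; setting `options(mc.cores = ...)` alone no longer has an effect unless the calling function opts in via `use_mc_cores = TRUE`.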
2 changes: 1 addition & 1 deletion R/log_lik.R
@@ -124,7 +124,7 @@ log_lik.brmsprep <- function(object, cores = NULL, ...) {
object$dpars[[dp]] <- get_dpar(object, dpar = dp)
}
N <- choose_N(object)
-out <- plapply(seq_len(N), log_lik_fun, cores = cores, prep = object)
+out <- plapply(seq_len(N), log_lik_fun, .cores = cores, prep = object)
out <- do_call(cbind, out)
colnames(out) <- NULL
old_order <- object$old_order
17 changes: 10 additions & 7 deletions R/misc.R
@@ -421,22 +421,25 @@ cblapply <- function(X, FUN, ...) {
}

# parallel lapply sensitive to the operating system
-plapply <- function(X, FUN, cores = 1, ...) {
-if (cores == 1) {
+# args:
+# .psock: use a PSOCK cluster? Default is TRUE until
+#. the zombie worker issue #1658 has been fully resolved
+plapply <- function(X, FUN, .cores = 1, .psock = TRUE, ...) {
+if (.cores == 1) {
out <- lapply(X, FUN, ...)
} else {
-if (!os_is_windows()) {
-out <- parallel::mclapply(X = X, FUN = FUN, mc.cores = cores, ...)
+if (!os_is_windows() && !.psock) {
+out <- parallel::mclapply(X = X, FUN = FUN, mc.cores = .cores, ...)
} else {
-cl <- parallel::makePSOCKcluster(cores)
+cl <- parallel::makePSOCKcluster(.cores)
on.exit(parallel::stopCluster(cl))
out <- parallel::parLapply(cl = cl, X = X, fun = FUN, ...)
}
-# The version below hopefully prevents the spawning of zombies
+# The version below was suggested to prevent the spawning of zombies
# but it does not always succeed in that. It also seems to cause
# other issues as discussed in #1658, so commented out for now.
# cl_type <- ifelse(os_is_windows(), "PSOCK", "FORK")
-# cl <- parallel::makeCluster(cores, type = cl_type)
+# cl <- parallel::makeCluster(.cores, type = cl_type)
# # Register a cleanup for the cluster in case the function fails
# # Need to wrap in a tryCatch to avoid error if cluster is already stopped
# on.exit(tryCatch(
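The PSOCK path that `plapply` now takes by default can be tried in isolation with base R's `parallel` package. The following is a simplified sketch under stated assumptions, not the brms code: the name `plapply_sketch` and the toy function `square_plus` are hypothetical, and the `mclapply` branch and the `.psock` switch are left out.

```r
library(parallel)

# Hypothetical, reduced variant of plapply(): serial for one core,
# otherwise a PSOCK cluster that is always shut down on exit.
plapply_sketch <- function(X, FUN, .cores = 1, ...) {
  if (.cores == 1) {
    return(lapply(X, FUN, ...))
  }
  cl <- parallel::makePSOCKcluster(.cores)
  # stopCluster() runs even if FUN fails, so no workers are left behind
  on.exit(parallel::stopCluster(cl), add = TRUE)
  parallel::parLapply(cl = cl, X = X, fun = FUN, ...)
}

# Toy worker function; extra arguments are forwarded through '...'
square_plus <- function(i, offset) i^2 + offset

res <- plapply_sketch(1:4, square_plus, .cores = 2, offset = 1)
unlist(res)  # 2 5 10 17
```

PSOCK workers are fresh R sessions, so they start more slowly than forked workers and do not inherit the parent's loaded objects. That fits the NEWS entry's remark about potential speed loss for small models. The dot prefix in `.cores` and `.psock` presumably keeps these arguments apart from anything forwarded to `FUN` via `...`, although the diff itself does not state the motivation.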
2 changes: 1 addition & 1 deletion R/posterior_predict.R
@@ -136,7 +136,7 @@ posterior_predict.brmsprep <- function(object, transform = NULL, sort = FALSE,
pp_fun <- paste0("posterior_predict_", object$family$fun)
pp_fun <- get(pp_fun, asNamespace("brms"))
N <- choose_N(object)
-out <- plapply(seq_len(N), pp_fun, cores = cores, prep = object, ...)
+out <- plapply(seq_len(N), pp_fun, .cores = cores, prep = object, ...)
if (grepl("_mv$", object$family$fun)) {
out <- do_call(abind, c(out, along = 3))
out <- aperm(out, perm = c(1, 3, 2))