diff --git a/DESCRIPTION b/DESCRIPTION index 37f1e5e..7a245f8 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,9 +1,9 @@ Package: IMIFA Type: Package -Date: 2019-12-11 +Date: 2020-03-30 Title: Infinite Mixtures of Infinite Factor Analysers and Related Models Version: 2.1.2 -Authors@R: c(person("Keefe", "Murphy", email = "keefe.murphy@ucd.ie", role = c("aut", "cre")), +Authors@R: c(person("Keefe", "Murphy", email = "keefe.murphy@ucd.ie", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-7709-3159")), person("Cinzia", "Viroli", email = "cinzia.viroli@unibo.it", role = "ctb"), person("Isobel Claire", "Gormley", email = "claire.gormley@ucd.ie", role = "ctb")) Description: Provides flexible Bayesian estimation of Infinite Mixtures of Infinite Factor Analysers and related models, for nonparametrically clustering high-dimensional data, introduced by Murphy et al. (2019) <doi:10.1214/19-BA1179>. The IMIFA model conducts Bayesian nonparametric model-based clustering with factor analytic covariance structures without recourse to model selection criteria to choose the number of clusters or cluster-specific latent factors, mostly via efficient Gibbs updates. Model-specific diagnostic tools are also provided, as well as many options for plotting results, conducting posterior inference on parameters of interest, posterior predictive checking, and quantifying uncertainty. @@ -17,17 +17,16 @@ Imports: matrixStats, mclust (>= 5.1), mvnfast, - Rfast (>= 1.9.7), + Rfast (>= 1.9.8), slam, viridis Suggests: gmp, knitr, mcclust, - methods, rmarkdown, Rmpfr -RoxygenNote: 7.0.2 +RoxygenNote: 7.1.0 VignetteBuilder: knitr Collate: 'MainFunction.R' diff --git a/NAMESPACE b/NAMESPACE index be9e689..b5fc353 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -46,6 +46,7 @@ importFrom(Rfast,"Var") importFrom(Rfast,"colMaxs") importFrom(Rfast,"colMedians") importFrom(Rfast,"colTabulate") +importFrom(Rfast,"colVars") importFrom(Rfast,"is.symmetric") importFrom(Rfast,"matrnorm") importFrom(Rfast,"rowAll") @@ -53,6 +54,7 @@ importFrom(Rfast,"rowMaxs") importFrom(Rfast,"rowOrder") importFrom(Rfast,"rowSort") importFrom(Rfast,"rowTabulate") +importFrom(Rfast,"rowVars") importFrom(matrixStats,"colMeans2") importFrom(matrixStats,"colSums2") importFrom(matrixStats,"rowLogSumExps") diff --git a/R/Diagnostics.R b/R/Diagnostics.R index 810de2f..2946ec7 100644 --- a/R/Diagnostics.R +++ b/R/Diagnostics.R @@ -22,7 +22,7 @@ #' @param vari.rot Logical indicating whether the loadings matrix/matrices template(s) should be \code{\link[stats]{varimax}} rotated first, prior to the Procrustes rotation steps. Defaults to \code{FALSE}. Not necessary at all for clustering purposes, or inference on the covariance matrix, but useful if interpretable inferences on the loadings matrix/matrices are desired. Arguments to \code{\link[stats]{varimax}} can be passed via the \code{...} construct, but note that the argument \code{normalize} here defaults to \code{FALSE}. #' @param z.avgsim Logical (defaults to \code{FALSE}) indicating whether the clustering should also be summarised with a call to \code{\link{Zsimilarity}} by the clustering with minimum mean squared error to the similarity matrix obtained by averaging the stored adjacency matrices, in addition to the MAP estimate.
#' -#' Note that the MAP clustering is computed \emph{conditional} on the estimate of the number of clusters (whether that be the modal estimate or the estimate according to \code{criterion}) and other parameters are extracted conditional on this estimate of \code{G}: however, in constrast, the number of distinct clusters in the summarised labels obtained by specifying \code{z.avgsim=TRUE} may not necessarily coincide with the MAP estimate of \code{G}, but it may provide a useful alternative summary of the partitions explored during the chain, and the user is free to call \code{\link{get_IMIFA_results}} again with the new suggested \code{G} value. +#' Note that the MAP clustering is computed \emph{conditional} on the estimate of the number of clusters (whether that be the modal estimate or the estimate according to \code{criterion}) and other parameters are extracted conditional on this estimate of \code{G}: however, in contrast, the number of distinct clusters in the summarised labels obtained by specifying \code{z.avgsim=TRUE} may not necessarily coincide with the MAP estimate of \code{G}, but it may provide a useful alternative summary of the partitions explored during the chain, and the user is free to call \code{\link{get_IMIFA_results}} again with the new suggested \code{G} value. #' #' Please be warned that this feature requires loading the \code{\link[mcclust]{mcclust}} package. This is liable to take considerable time to compute, and may not even be possible if the number of observations &/or number of stored iterations is large and the resulting matrix isn't sufficiently sparse. When \code{z.avgsim=TRUE}, both the summarised clustering and the similarity matrix are stored: the latter can be visualised as part of a call to \code{\link{plot.Results_IMIFA}}. #' @param zlabels For any method that performs clustering, the true labels can be supplied if they are known in order to compute clustering performance metrics. This also has the effect of ordering the MAP labels (and thus the ordering of cluster-specific parameters) to most closely correspond to the true labels if supplied. @@ -54,13 +54,13 @@ #' \item{\code{Uniquenesses}}{Posterior summaries for the uniquenesses, after conditioning on \code{G}.} #' #' The objects \code{Means}, \code{Loadings}, \code{Scores} and \code{Uniquenesses} (if stored when calling \code{\link{mcmc_IMIFA}}!) also contain, as well as the posterior summaries, the entire chain of valid samples of each, as well as, for convenience, the last valid samples of each (after conditioning on the modal \code{G} and \code{Q} values, and accounting for label switching, and rotational invariance via Procrustes rotation). -#' @note For the "\code{IMIFA}", "\code{IMFA}", "\code{OMIFA}", and "\code{OMFA}" methods, the retained mixing proportions are renormalised after conditioning on the modal \code{G}. This is especially necessary for the compution of the \code{error.metrics}, just note that the values on which posterior inference are conducted will ever so slightly differ from the actually sampled values. +#' @note For the "\code{IMIFA}", "\code{IMFA}", "\code{OMIFA}", and "\code{OMFA}" methods, the retained mixing proportions are renormalised after conditioning on the modal \code{G}. This is especially necessary for the computation of the \code{error.metrics}; just note that the values on which posterior inference is conducted will differ ever so slightly from the actually sampled values.
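The renormalisation described in the @note above amounts to a column-wise rescaling of the retained draws; a minimal standalone sketch with toy values (not package state), mirroring the sweep(..., colSums2(...), FUN="/") call in the hunk below:

# Toy draws for G = 2 retained clusters over 3 stored iterations; columns need
# not sum to 1 once emptied clusters have been discarded
pi.prop <- matrix(c(0.50, 0.30, 0.45, 0.35, 0.40, 0.45), nrow=2, ncol=3)
# renormalise each stored draw so it again defines a valid probability vector
pi.prop <- sweep(pi.prop, 2L, colSums(pi.prop), FUN="/")
colSums(pi.prop)   # all exactly 1; values now differ slightly from the sampled ones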
#' #' Due to the way the offline label-switching correction is performed, different runs of this function may give \emph{very slightly} different results in terms of the cluster labellings (and by extension the parameters, which are permuted in the same way), but only if the chain was run for an extremely small number of iterations, well below the number required for convergence, and samples of the cluster labels match poorly across iterations (particularly if the number of clusters suggested by those sampled labels is high). #' @keywords IMIFA main #' @include MainFunction.R #' @export -#' @importFrom Rfast "colMaxs" "colTabulate" "Median" "rowAll" "rowMaxs" "rowOrder" "rowSort" "rowTabulate" "Var" +#' @importFrom Rfast "colMaxs" "colTabulate" "Median" "rowAll" "rowMaxs" "rowOrder" "rowSort" "rowTabulate" "rowVars" "Var" #' @importFrom mclust "classError" #' @importFrom matrixStats "colSums2" "rowMeans2" "rowMedians" "rowQuantiles" "rowSums2" #' @importFrom slam "as.simple_triplet_matrix" @@ -258,7 +258,7 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = Qlen <- length(Qtmp) if(length(Q) == 1) Q <- rep(Q, Qlen) if(length(Q) != Qlen) stop(paste0("'Q' must be supplied for each cluster, as a scalar or vector of length G=", G), call.=FALSE) - if(any(Q * (Qtmp - Q) < 0)) stop(paste0("'Q' can't be greater than the maximum number of factors stored in ", ifelse(method == "IFA", "", "any cluster of "), match.call()$sims), call.=FALSE) + if(any(Q * (Qtmp - Q) < 0)) stop(paste0("'Q' can't be greater than the maximum number of factors stored in ", ifelse(method == "IFA", "", "any cluster of "), deparse(match.call()$sims)), call.=FALSE) } } @@ -504,7 +504,7 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = if(sw["pi.sw"]) { pi.prop <- provideDimnames(pies[Gseq,storeG, drop=FALSE], base=list(gnames, ""), unique=FALSE) pi.prop <- if(inf.G) sweep(pi.prop, 2L, colSums2(pi.prop), FUN="/", check.margin=FALSE) else pi.prop - var.pi <- stats::setNames(.row_vars(pi.prop), gnames) + var.pi <- stats::setNames(rowVars(pi.prop), gnames) ci.pi <- rowQuantiles(pi.prop, probs=conf.levels) ci.pi <- if(G == 1) t(ci.pi) else ci.pi post.pi <- stats::setNames(rowMeans2(pi.prop), gnames) @@ -759,13 +759,13 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = # Compute posterior means and % variation explained if(sw["mu.sw"]) { post.mu <- if(clust.ind) rowMeans2(mu) else post.mu - var.mu <- if(uni) Var(mu) else .row_vars(mu) + var.mu <- if(uni) Var(mu) else rowVars(mu) ci.tmp <- rowQuantiles(mu, probs=conf.levels) ci.mu <- if(uni) t(ci.tmp) else ci.tmp } if(sw["psi.sw"]) { post.psi <- if(clust.ind) rowMeans2(psi) else post.psi - var.psi <- if(uni) Var(psi) else .row_vars(psi) + var.psi <- if(uni) Var(psi) else rowVars(psi) ci.tmp <- rowQuantiles(psi, probs=conf.levels) ci.psi <- if(uni) t(ci.tmp) else ci.tmp } @@ -774,14 +774,14 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = post.load <- rowMeans(lmat, dims=2) var.load <- apply(lmat, c(1L, 2L), Var) ci.load <- apply(lmat, c(1L, 2L), stats::quantile, conf.levels) - var.exp <- sum(post.load * post.load)/n.var + var.exp <- rowSums2(post.load * post.load) class(post.load) <- "loadings" } else if(sw["psi.sw"]) { if(sizes[g] > 1) { dat.gg <- dat[z.ind[[g]],, drop=FALSE] cov.gg <- stats::cov(dat.gg) } - var.exp <- ifelse(sizes[g] <= 1, 0L, max(0L, (sum(diag(cov.gg)) - sum(post.psi))/n.var)) + var.exp <- ifelse(sizes[g] <= 1, 0L, pmax(0L, diag(cov.gg) - post.psi)) } else { 
var.exp <- NULL } @@ -1006,9 +1006,10 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = } } if(all(sw["psi.sw"] || !clust.ind, any(sw["l.sw"], Q0X))) { - var.exps <- vapply(lapply(result, "[[", "var.exp"), function(x) ifelse(is.null(x), NA, x), numeric(1L)) - var.exps <- if(sum(is.na(var.exps)) == G) NULL else var.exps - Err <- list(Var.Exps = var.exps, Exp.Var = ifelse(clust.ind, sum(var.exps * post.pi), unname(var.exps))) + var.exps <- lapply(lapply(result, "[[", "var.exp"), function(x) if(is.null(x)) NA else x) + clust.exps <- if(sum(is.na(var.exps)) == G) NULL else vapply(var.exps, sum, na.rm=TRUE, numeric(1L))/n.var + var.exps <- if(sum(is.na(var.exps)) == G) NULL else if(clust.ind) stats::setNames(rowSums2(sweep(do.call(cbind, var.exps), 2L, FUN="*", post.pi, check.margin=FALSE), na.rm=TRUE), varnames) else stats::setNames(var.exps[[1L]], varnames) + Err <- list(Var.Exps = var.exps, Clust.Exps = clust.exps, Exp.Var = ifelse(clust.ind, sum(clust.exps * post.pi), unname(clust.exps))) if(all(error.metrics, sw["mu.sw"] || !clust.ind)) { if(cov.met) { cov.est <- as.matrix(cov.est) @@ -1032,7 +1033,7 @@ get_IMIFA_results.IMIFA <- function(sims = NULL, burnin = 0L, thinning = last.met <- c(MSE = mse[e.store], MEDSE = medse[e.store], MAE = mae[e.store], MEDAE = medae[e.store], RMSE = rmse[e.store], NRMSE = nrmse[e.store]) last.met <- if(frobenius) c(last.met, PPRE = Fro[e.store]) else last.met last.met <- last.met[!is.na(last.met)] - Err <- if(cov.met) c(list(Median = med.met, Avg = mean.met, CIs = metricCI), Err[seq_len(3L)], if(all(cov.met, sw.pi)) c(list(Last.Cov = Sigma)), c(list(Final = last.met)), Err[4L:5L]) else c(list(Avg = mean.met, CIs = metricCI), Err[seq_len(2L)], list(Final = last.met)) + Err <- if(cov.met) c(list(Median = med.met, Avg = mean.met, CIs = metricCI), Err[seq_len(4L)], if(all(cov.met, sw.pi)) c(list(Last.Cov = Sigma)), c(list(Final = last.met)), Err[5L:6L]) else c(list(Avg = mean.met, CIs = metricCI), Err[seq_len(3L)], list(Final = last.met)) if(frobenius) { rcounts <- stats::setNames(lapply(Pseq, function(p) t(rowQuantiles(sapply(rcounts, "[[", p), probs=c(conf.levels[1L], 0.5, conf.levels[2L])))), varnames) class(dcounts) <- "listof" diff --git a/R/FullConditionals.R b/R/FullConditionals.R index d58766f..cc11ba9 100644 --- a/R/FullConditionals.R +++ b/R/FullConditionals.R @@ -64,7 +64,7 @@ # Local Shrinkage .sim_phi <- function(Q, P, nu1, nu2, tau, load.2, sigma = 1L) { - base::matrix(stats::rgamma(P * Q, shape=nu1 + 0.5, rate=nu2 + (sigma * sweep(load.2, 2L, tau, FUN="*", check.margin=FALSE))/2), nrow=P, ncol=Q) + matrix(stats::rgamma(P * Q, shape=nu1 + 0.5, rate=nu2 + (sigma * sweep(load.2, 2L, tau, FUN="*", check.margin=FALSE))/2), nrow=P, ncol=Q) } # Column Shrinkage @@ -106,7 +106,8 @@ #' (prior <- rDirichlet(G=5, alpha=1)) #' (posterior <- rDirichlet(G=5, alpha=1, nn=c(20, 41, 32, 8, 12))) rDirichlet <- function(G, alpha, nn = 0L) { - tmp <- stats::rgamma(G, shape=alpha + nn, rate=1L) + shape <- alpha + nn + tmp <- if(all(shape == 1)) stats::rexp(G, 1L) else stats::rgamma(G, shape=shape, rate=1L) tmp/sum(tmp) } @@ -126,7 +127,7 @@ # Cluster Labels #' Simulate Cluster Labels from Unnormalised Log-Probabilities using the Gumbel-Max Trick #' -#' Samples cluster labels for N observations from G clusters efficiently using log-probabilities and the so-called Gumbel-Max trick, without requiring that the log-probabilities be normalised; thus redunant computation can be avoided. 
+#' Samples cluster labels for N observations from G clusters efficiently using log-probabilities and the so-called Gumbel-Max trick, without requiring that the log-probabilities be normalised; thus redundant computation can be avoided. #' @param probs An N x G matrix of unnormalised probabilities on the log scale, where N is the number of observations that require labels to be sampled and G is the number of active clusters s.t. sampled labels can take values in \code{1:G}. Typically \code{N > G}. #' @param slice A logical indicating whether or not the indicator correction for slice sampling has been applied to \code{probs}. Defaults to \code{FALSE} but is \code{TRUE} for the "\code{IMIFA}" and "\code{IMFA}" methods under \code{\link{mcmc_IMIFA}}. Details of this correction are given in Murphy et al. (2019). When set to \code{TRUE}, this results in a speed-improvement when \code{probs} contains non-finite values (e.g. \code{-Inf}, corresponding to zero on the probability scale). #' @return A vector of N sampled cluster labels, with the largest label no greater than G. @@ -142,7 +143,7 @@ #' #' @references Murphy, K., Viroli, C., and Gormley, I. C. (2019) Infinite mixtures of infinite factor analysers, \emph{Bayesian Analysis}, 1-27. <\href{https://projecteuclid.org/euclid.ba/1570586978}{doi:10.1214/19-BA1179}>. #' -#' Yellot, J. I. Jr. (1977) The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution, \emph{Journal of Mathematical Psychology}, 15: 109-144. +#' Yellott, J. I. Jr. (1977) The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution, \emph{Journal of Mathematical Psychology}, 15: 109-144. #' @export #' #' @author Keefe Murphy - <\email{keefe.murphy@@ucd.ie}> @@ -188,7 +189,7 @@ shape2 <- shape + G - 1L rate2 <- rate - log(stats::rbeta(1, alpha + 1, N)) weight <- shape2/(shape2 + N * rate2) - weight * stats::rgamma(1, shape=shape2 + 1, rate=rate2) + (1 - weight) * stats::rgamma(1, shape=shape2, rate=rate2) + weight * stats::rgamma(1L, shape=shape2 + 1, rate=rate2) + (1 - weight) * stats::rgamma(1L, shape=shape2, rate=rate2) } .log_palpha <- function(alpha, discount, alpha.shape, alpha.rate, N, G) { @@ -269,6 +270,7 @@ } # Scores + #' @importFrom Rfast "matrnorm" .sim_eta_p <- function(Q, N) { matrix(matrnorm(N, Q), nrow=N, ncol=Q) } @@ -417,7 +419,7 @@ #' Moment Matching Parameters of Shifted Gamma Distributions #' #' This function takes shape and rate parameters of a Gamma distribution and modifies them to achieve the same expected value and variance when the left extent of the support of the distribution is shifted up or down. -#' @param shape,rate Shape and rate parameters a and b, respectibely, of a Gamma(a, b) distribution. Both must be strictly positive. +#' @param shape,rate Shape and rate parameters a and b, respectively, of a Gamma(a, b) distribution. Both must be strictly positive. #' @param shift Modifier, such that the Gamma distribution has support on (\code{shift}, \eqn{\infty}). Can be positive or negative, though typically negative and small. #' @param param Switch controlling whether the supplied \code{rate} parameter is indeed a rate, or actually a scale parameter. Also governs whether the output is given in terms of rate or scale. Defaults to "\code{rate}". #' @@ -584,13 +586,13 @@ #' @param Q The number of latent factors (which can be 0, corresponding to a model with diagonal covariance). This argument is vectorised.
#' @param P The number of variables. Must be a single strictly positive integer. #' @param G The number of clusters. This defaults to 1. Must be a single strictly positive integer. -#' @param method By default, calculation assumes the \code{UUU} model with unconstrained loadings and unconstrained diagonal uniquesseses (i.e. the factor analysis model). The seven other models detailed in McNicholas and Murphy (2008) are given too (of which currently the first four are accomodated within \code{\link{mcmc_IMIFA}}). The first letter denotes whether loadings are constrained/unconstrained across clusters; the second letter denotes the same for the uniquenesses; the final letter denotes whether uniquenesses are in turn constrained to be isotropic. Finally, the 4 extra 4-letter models from the EPGMM family (McNicholas and Murphy, 2010), are also included. +#' @param method By default, calculation assumes the \code{UUU} model with unconstrained loadings and unconstrained diagonal uniquenesses (i.e. the factor analysis model). The seven other models detailed in McNicholas and Murphy (2008) are given too (of which currently the first four are accommodated within \code{\link{mcmc_IMIFA}}). The first letter denotes whether loadings are constrained/unconstrained across clusters; the second letter denotes the same for the uniquenesses; the final letter denotes whether uniquenesses are in turn constrained to be isotropic. Finally, the 4 extra 4-letter models from the EPGMM family (McNicholas and Murphy, 2010), are also included. #' @param equal.pro Logical variable indicating whether or not the mixing proportions are equal across clusters in the model (default = \code{FALSE}). #' #' @return A vector of length \code{length(Q)} giving the total number of parameters, including means and mixing proportions, and not only covariance parameters. Set \code{equal.pro} to \code{FALSE} and subtract \code{G * P} from the result to determine the number of covariance parameters only. #' @keywords utility #' -#' @note This function is used to calculate the penalty terms for the \code{aic.mcmc} and \code{bic.mcmc} model selection criteria implemented in \code{\link{get_IMIFA_results}} for \emph{finite} factor models (though \code{\link{mcmc_IMIFA}} currently only implements the \code{UUU}, \code{UUC}, \code{UCU}, and \code{UCC} covariance structures). The function is vectorized with respect to the argument \code{Q}. +#' @note This function is used to calculate the penalty terms for the \code{aic.mcmc} and \code{bic.mcmc} model selection criteria implemented in \code{\link{get_IMIFA_results}} for \emph{finite} factor models (though \code{\link{mcmc_IMIFA}} currently only implements the \code{UUU}, \code{UUC}, \code{UCU}, and \code{UCC} covariance structures). The function is vectorised with respect to the argument \code{Q}. #' #' Though the function is available for standalone use, note that no checks take place, in order to speed up repeated calls to the function inside \code{\link{mcmc_IMIFA}}. #' @export @@ -682,7 +684,7 @@ #' A heatmap of \code{z.sim} may provide a useful visualisation, if appropriately ordered. The user is also invited to perform hierarchical clustering using \code{\link[stats]{hclust}} after first converting this similarity matrix to a distance matrix - "complete" linkage is recommended. Alternatively, \code{\link[mclust]{hc}} could be used.
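A minimal sketch of that hierarchical-clustering suggestion, using a toy average similarity matrix built from simulated label draws in place of the sparse z.sim the function actually returns (all names and values illustrative):

set.seed(1)
labs <- replicate(100, sample(1:3, size=10, replace=TRUE))    # 100 stored draws for N = 10
zsim <- Reduce("+", lapply(1:100, function(m) outer(labs[, m], labs[, m], "==") * 1))/100
hc   <- hclust(as.dist(1 - zsim), method="complete")          # similarity -> distance
cutree(hc, k=3)                                               # candidate summary clustering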
#' @return A list containing three elements: #' \item{z.avg}{The 'average' clustering, with minimum squared distance to \code{z.sim}.} -#' \item{z.sim}{The N x N similary matrix, in a sparse format (see \code{\link[slam]{simple_triplet_matrix}}).} +#' \item{z.sim}{The N x N similarity matrix, in a sparse format (see \code{\link[slam]{simple_triplet_matrix}}).} #' \item{MSE.z}{A vector of length M recording the MSEs between each clustering and the 'average' clustering.} #' @export #' @keywords utility @@ -827,7 +829,7 @@ } # Positive-(Semi)Definite Checker -#' Check Postive-(Semi)definiteness of a matrix +#' Check Positive-(Semi)definiteness of a matrix #' #' Tests whether all eigenvalues of a symmetric matrix are positive (or strictly non-negative) to check for positive-definiteness and positive-semidefiniteness, respectively. If the supplied matrix doesn't satisfy the test, the nearest matrix which does can optionally be returned. #' @param x A matrix, assumed to be real and symmetric. @@ -835,7 +837,7 @@ #' #' (default: tol = \code{max(dim(x))*max(E)*.Machine$double.eps}, where \code{E} is the vector of absolute eigenvalues). #' @param semi Logical switch to test for positive-semidefiniteness when \code{TRUE} or positive-definiteness when \code{FALSE} (the default). -#' @param make Logical switch to return the nearest matrix which satisifies the test - if the test has been passed, this is of course just \code{x} itself, otherwise the nearest positive-(semi)definite matrix. Note that for reasons due to finite precision arithmetic, finding the nearest positive-definite and nearest positive-semidefinite matrices are effectively equivalent tasks. +#' @param make Logical switch to return the nearest matrix which satisfies the test - if the test has been passed, this is of course just \code{x} itself, otherwise the nearest positive-(semi)definite matrix. Note that, due to finite precision arithmetic, finding the nearest positive-definite and nearest positive-semidefinite matrices are effectively equivalent tasks. #' #' @return If \code{isTRUE(make)}, a list with two components: #' \item{\code{check}}{A logical value indicating whether the matrix satisfies the test.} @@ -886,7 +888,7 @@ #' Ledermann Bound #' #' Returns the maximum number of latent factors in a factor analysis model for data of dimension \code{P} which actually achieves dimension reduction in terms of the number of covariance parameters. This Ledermann bound is given by the largest integer smaller than or equal to the solution \eqn{k}{k} of \eqn{(P - k)^2 \geq P + k}{(P - k)^2 >= P + k}. -#' @param P Integer number of variables in data set. This argument is vectorized. +#' @param P Integer number of variables in data set. This argument is vectorised. #' @param isotropic Logical indicating whether uniquenesses are constrained to be isotropic, in which case the bound is simply \eqn{P-1}{P-1}. Defaults to \code{FALSE}. #' #' @return The Ledermann bound, a non-negative integer, or a vector of \code{length(P)} such bounds.
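The bound documented above has a simple closed form, obtained by solving the inequality at equality; a sketch (the helper name led is illustrative) which should agree with Ledermann() in the default non-isotropic case:

# largest integer k with (P - k)^2 >= P + k, via the root of the quadratic
led <- function(P) as.integer(floor(P + (1 - sqrt(8 * P + 1))/2))
led(c(25, 50, 100))   # compare with Ledermann(c(25, 50, 100))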
@@ -898,7 +900,10 @@ #' #' @examples #' Ledermann(c(25, 50, 100)) - Ledermann <- function(P, isotropic = FALSE) { +#' +#' data(olive) +#' Ledermann(ncol(olive[,-c(1,2)])) + Ledermann <- function(P, isotropic = FALSE) { # heteroscedastic factors if(!is.numeric(P) || any(P <= 0, floor(P) != P)) stop("'P' must be a strictly positive integer", call.=FALSE) if(length(isotropic) > 1 || @@ -974,9 +979,10 @@ if((N <- nrow(X)) != nrow(Xstar)) stop("X and Xstar do not have the same number of rows", call.=FALSE) if((P <- ncol(X)) != (P2 <- ncol(Xstar))) { if(P < P2) { warning("X padded out to same number of columns as Xstar\n", call.=FALSE, immediate.=TRUE) - X <- cbind(X, base::matrix(0L, nrow=N, ncol=P2 - P)) + X <- cbind(X, matrix(0L, nrow=N, ncol=P2 - P)) } else stop("X cannot have more columns than Xstar", call.=FALSE) } + if(P2 == 0) stop("Xstar must contain at least one column", call.=FALSE) if(anyNA(Xstar) || anyNA(X)) stop("X and Xstar are not allowed to contain missing values", call.=FALSE) J <- if(translate) diag(N) - 1/N else diag(N) C <- if(translate) crossprod(Xstar, J) %*% X else crossprod(Xstar, X) @@ -1037,6 +1043,7 @@ #' @param N The sample size. #' @param alpha The concentration parameter. Must be specified and must be strictly greater than \code{-discount}. The case \code{alpha=0} is accommodated. When \code{discount} is negative \code{alpha} must be a positive integer multiple of \code{abs(discount)}. #' @param discount The discount parameter for the Pitman-Yor process. Must be less than 1, but typically lies in the interval [0, 1). Defaults to 0 (i.e. the Dirichlet process). When \code{discount} is negative \code{alpha} must be a positive integer multiple of \code{abs(discount)}. +#' @param MPFR Logical indicating whether the high-precision libraries \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} are invoked, at the expense of run-time. Defaults to \code{TRUE} and \strong{must} be \code{TRUE} for \code{\link{G_expected}} when \code{alpha=0} and \code{\link{G_variance}} when \code{discount} is non-zero. See \strong{\code{Note}}. #' #' @details All arguments are vectorised. Users can also consult \code{\link{G_priorDensity}} in order to solicit sensible priors. #' @@ -1046,7 +1053,7 @@ #' @name G_moments #' @rdname G_moments #' -#' @note \code{G_variance} requires use of the \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} libraries for non-zero \code{discount} values. \code{G_expected} requires these libraries only for the \code{alpha=0} case. Despite the high precision arithmetic used, the functions can be unstable for small values of \code{discount}. +#' @note \code{G_variance} requires use of the \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} libraries for non-zero \code{discount} values. \code{G_expected} requires these libraries only for the \code{alpha=0} case. Despite the high precision arithmetic used, the functions can still be unstable for small values of \code{discount}. See the argument \code{MPFR}. #' #' @seealso \code{\link{G_priorDensity}}, \code{\link[Rmpfr]{Rmpfr}} #' @references De Blasi, P., Favaro, S., Lijoi, A., Mena, R. H., Prunster, I., and Ruggiero, M. (2015) Are Gibbs-type priors the most natural generalization of the Dirichlet process?, \emph{IEEE Transactions on Pattern Analysis and Machine Intelligence}, 37(2): 212-229. 
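For the discount = 0 (Dirichlet process) case that G_expected handles below without high-precision arithmetic, the quantity being computed reduces to the standard identity E[G] = sum_{i=1}^{N} alpha/(alpha + i - 1); a one-line cross-check (the helper name dp_EG is illustrative):

dp_EG <- function(N, alpha) sum(alpha/(alpha + seq_len(N) - 1))   # E[G] under a DP(alpha) prior
dp_EG(50, 19.23356)   # should agree with G_expected(N=50, alpha=19.23356, MPFR=FALSE)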
@@ -1055,23 +1062,26 @@ #' @usage #' G_expected(N, #' alpha, -#' discount = 0) +#' discount = 0, +#' MPFR = TRUE) #' @examples -#' G_expected(N=50, alpha=19.23356) -#' G_variance(N=50, alpha=19.23356) +#' G_expected(N=50, alpha=19.23356, MPFR=FALSE) +#' G_variance(N=50, alpha=19.23356, MPFR=FALSE) #' +#' G_expected(N=50, alpha=c(19.23356, 12.21619, 1), +#' discount=c(0, 0.25, 0.7300045), MPFR=FALSE) #' # require("Rmpfr") -#' # G_expected(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045)) -#' # G_variance(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045)) +#' # G_variance(N=50, alpha=c(19.23356, 12.21619, 1), +#' # discount=c(0, 0.25, 0.7300045), MPFR=c(FALSE, TRUE, TRUE)) #' #' # Examine the growth rate of the DP -#' DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i)) +#' DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i, MPFR=FALSE)) #' matplot(DP, type="l", xlab="N", ylab="G") #' #' # Examine the growth rate of the PYP #' # PY <- sapply(c(0.1, 0.25, 0.5), function(i) G_expected(1:200, alpha=1, discount=i)) #' # matplot(PY, type="l", xlab="N", ylab="G") - G_expected <- Vectorize(function(N, alpha, discount = 0) { + G_expected <- Vectorize(function(N, alpha, discount = 0, MPFR = TRUE) { if(!all(is.numeric(N), is.numeric(discount), is.numeric(alpha))) stop("All inputs must be numeric", call.=FALSE) if(discount >= 1) stop("'discount' must be less than 1", call.=FALSE) @@ -1080,12 +1090,16 @@ if(discount < 0 && (alpha %% discount) != 0) stop("'alpha' must be a positive integer multiple of 'abs(discount)' when 'discount' is negative", call.=FALSE) if(alpha == 0 && discount <= 0) stop("'discount' must be strictly positive when 'alpha'=0", call.=FALSE) - if(suppressMessages(requireNamespace("Rmpfr", quietly=TRUE))) { - mpfrind <- TRUE - on.exit(.detach_pkg("Rmpfr")) - on.exit(.detach_pkg("gmp"), add=TRUE) + if(alpha == 0 && isFALSE(MPFR)) stop("'MPFR' must be TRUE when 'alpha' == 0", call.=FALSE) + igmp <- isNamespaceLoaded("Rmpfr") + if(mpfrind <- (isTRUE(MPFR) && suppressMessages(requireNamespace("Rmpfr", quietly=TRUE)))) { + if(isFALSE(igmp)) { + on.exit(.detach_pkg("Rmpfr")) + on.exit(.detach_pkg("gmp"), add=TRUE) + } alpha <- Rmpfr::mpfr(alpha, precBits=256) } + if(alpha == 0) { if(mpfrind) { tmp <- sum(log(alpha + 1L + 0L:(N - 2L))) @@ -1123,9 +1137,10 @@ #' @usage #' G_variance(N, #' alpha, -#' discount = 0) +#' discount = 0, +#' MPFR = TRUE) #' @export - G_variance <- Vectorize(function(N, alpha, discount = 0) { + G_variance <- Vectorize(function(N, alpha, discount = 0, MPFR = TRUE) { if(!all(is.numeric(N), is.numeric(discount), is.numeric(alpha))) stop("All inputs must be numeric", call.=FALSE) if(discount >= 1) stop("'discount' must be less than 1", call.=FALSE) @@ -1134,10 +1149,13 @@ if(discount < 0 && alpha %% discount != 0) stop("'alpha' must be a positive integer multiple of 'abs(discount)' when 'discount' is negative", call.=FALSE) #if(alpha == 0) { warning("'alpha'=0 case not yet implemented", call.=FALSE, immediate.=TRUE); return(Inf) } - if(suppressMessages(requireNamespace("Rmpfr", quietly=TRUE))) { - mpfrind <- TRUE - on.exit(.detach_pkg(Rmpfr)) - on.exit(.detach_pkg(gmp), add=TRUE) + if(discount != 0 && isFALSE(MPFR)) stop("'MPFR' must be TRUE when 'discount' is non-zero", call.=FALSE) + igmp <- isNamespaceLoaded("Rmpfr") + if(mpfrind <- (isTRUE(MPFR) && suppressMessages(requireNamespace("Rmpfr", quietly=TRUE)))) { + if(isFALSE(igmp)) { + on.exit(.detach_pkg(Rmpfr)) + on.exit(.detach_pkg(gmp), add=TRUE) + }
alpha <- Rmpfr::mpfr(alpha, precBits=256) } else if(discount != 0) stop("'Rmpfr' package not installed", call.=FALSE) @@ -1151,7 +1169,7 @@ var + alpha2 * (trigamma(alpha + N) - trigamma(alpha)) } } else { - alpha <- ifelse(alpha == 0, .Machine$double.eps, alpha) + alpha <- if(alpha == 0) .Machine$double.eps else alpha sum.ad <- alpha + discount poch.a <- Rmpfr::pochMpfr(alpha, N) poch.ad <- Rmpfr::pochMpfr(sum.ad, N) @@ -1440,19 +1458,19 @@ #' @param alpha.hyper #' \describe{ #' \item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{A vector of length 2 giving hyperparameters for the prior on the Pitman-Yor / Dirichlet process concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(\code{2}, \code{4}). Choosing a larger rate is particularly important, as it encourages clustering. The prior is shifted to have support on (\code{-discount}, \code{Inf}) when non-zero \code{discount} is supplied and remains fixed (i.e. \code{learn.d=FALSE}) or when \code{learn.d=TRUE}.} -#' \item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{A vector of length 2 giving hyperparameters a and b for the prior on the Dirichlet concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(2, 4). Note that the suplied rate will be multiplied by \code{range.G}, to encourage clustering, such that the form of the prior is Ga(a, b * G).} +#' \item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{A vector of length 2 giving hyperparameters a and b for the prior on the Dirichlet concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(2, 4). Note that the supplied rate will be multiplied by \code{range.G}, to encourage clustering, such that the form of the prior is Ga(a, b * G).} #' } #' @param discount The discount parameter used when generalising the Dirichlet process to the Pitman-Yor process. Defaults to 0, but typically must lie in the interval [0, 1). If greater than zero, \code{alpha} can be supplied greater than \code{-discount}. By default, Metropolis-Hastings steps are invoked for updating this parameter via \code{learn.d}. The special case of \code{discount < 0} is allowed, in which case \code{learn.d=FALSE} is forced and \code{alpha} must be a positive integer multiple of \code{abs(discount)}. #' @param learn.d Logical indicating whether the \code{discount} parameter is to be updated via Metropolis-Hastings (defaults to \code{TRUE}). #' @param d.hyper Hyperparameters for the Beta(a,b) prior on the \code{discount} parameter. Defaults to Beta(1,1), i.e. Uniform(0,1). -#' @param ind.slice Logical indicitating whether the independent slice-efficient sampler is to be employed (defaults to \code{TRUE}). If \code{FALSE} the dependent slice-efficient sampler is employed, whereby the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the decreasingly ordered mixing proportions. +#' @param ind.slice Logical indicating whether the independent slice-efficient sampler is to be employed (defaults to \code{TRUE}). If \code{FALSE} the dependent slice-efficient sampler is employed, whereby the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the decreasingly ordered mixing proportions. #' @param rho Parameter controlling the rate of geometric decay for the independent slice-efficient sampler, s.t.
\eqn{\xi=(1-\rho)\rho^{g-1}}{xi = (1 - rho)rho^(g-1)}. Must lie in the interval [0, 1). Higher values are associated with better mixing but longer run times. Defaults to 0.75, but 0.5 is an interesting special case which guarantees that the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the \emph{expectation} of the decreasingly ordered mixing proportions. Only relevant when \code{ind.slice} is \code{TRUE}. #' @param trunc.G The maximum number of allowable and storable clusters under the "\code{IMIFA}" and "\code{IMFA}" models. The number of active clusters to be sampled at each iteration is adaptively truncated, with \code{trunc.G} as an upper limit for storage reasons. Defaults to \code{max(min(N-1, 50), range.G)} and must satisfy \code{range.G <= trunc.G < N}. Note that large values of \code{trunc.G} may lead to memory capacity issues. #' @param kappa The spike-and-slab prior distribution on the \code{discount} hyperparameter is assumed to be a mixture with point-mass at zero and a continuous Beta(a,b) distribution. \code{kappa} gives the weight of the point mass at zero (the 'spike'). Must lie in the interval [0,1]. Defaults to 0.5. Only relevant when \code{isTRUE(learn.d)}. A value of 0 ensures non-zero discount values (i.e. Pitman-Yor) at all times, and \emph{vice versa}. Note that \code{kappa} will default to exactly 0 if \code{alpha<=0} and \code{learn.alpha=FALSE}. -#' @param IM.lab.sw Logial indicating whether the two forced label switching moves are to be implemented (defaults to \code{TRUE}) when running one of the infinite mixture models. +#' @param IM.lab.sw Logical indicating whether the two forced label switching moves are to be implemented (defaults to \code{TRUE}) when running one of the infinite mixture models. #' @param zeta #' \describe{ -#' \item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{Tuning parameter controlling the acceptance rate of the random-walk proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}, where \code{2 * zeta} gives the full width of the uniform proposal distribution. These steps are only invoked when either \code{discount} is non-zero and fixed or \code{learn.d=TRUE}, otherwise \code{alpha} is learned by Gibbs updates. Must be strictly positive (if invoked). Defauts to \code{2}.} +#' \item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{Tuning parameter controlling the acceptance rate of the random-walk proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}, where \code{2 * zeta} gives the full width of the uniform proposal distribution. These steps are only invoked when either \code{discount} is non-zero and fixed or \code{learn.d=TRUE}, otherwise \code{alpha} is learned by Gibbs updates. Must be strictly positive (if invoked). Defaults to \code{2}.} #' \item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{Tuning parameter controlling the standard deviation of the log-normal proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}. Must be strictly positive (if invoked).
Defaults to \code{0.75}.} #' } #' @param tune.zeta A list with the following named arguments, used for tuning \code{zeta} (which is either the width of the uniform proposal for the "\code{IMFA}" or "\code{IMIFA}" methods or the standard deviation of the log-normal proposal for the "\code{OMFA}" or "\code{OMIFA}" methods) for \code{alpha}, via diminishing Robbins-Monro type adaptation, when the \code{alpha} parameter is learned via Metropolis-Hastings steps: @@ -1487,7 +1505,7 @@ #' @usage #' bnpControl(learn.alpha = TRUE, #' alpha.hyper = c(2L, 4L), -#' discount = NULL, +#' discount = 0, #' learn.d = TRUE, #' d.hyper = c(1L, 1L), #' ind.slice = TRUE, @@ -1507,7 +1525,7 @@ #' # Alternatively specify these arguments directly #' # sim <- mcmc_IMIFA(olive, "IMIFA", n.iters=5000, learn.d=FALSE, #' # ind.slice=FALSE, alpha.hyper=c(3, 3)) - bnpControl <- function(learn.alpha = TRUE, alpha.hyper = c(2L, 4L), discount = NULL, learn.d = TRUE, d.hyper = c(1L, 1L), + bnpControl <- function(learn.alpha = TRUE, alpha.hyper = c(2L, 4L), discount = 0, learn.d = TRUE, d.hyper = c(1L, 1L), ind.slice = TRUE, rho = 0.75, trunc.G = NULL, kappa = 0.5, IM.lab.sw = TRUE, zeta = NULL, tune.zeta = list(...), ...) { miss.args <- list(discount = missing(discount), IM.lab.sw = missing(IM.lab.sw), kappa = missing(kappa), trunc.G = missing(trunc.G), zeta = missing(zeta), learn.d = missing(learn.d)) @@ -1535,7 +1553,7 @@ if(any(!is.numeric(kappa), length(kappa) != 1)) stop("'kappa' must be a single number", call.=FALSE) if(kappa < 0 || kappa > 1) stop("'kappa' must lie in the interval [0, 1]", call.=FALSE) - discount <- ifelse(missing(discount), ifelse(learn.d, ifelse(kappa != 0 && stats::runif(1) <= kappa, 0L, pmin(stats::rbeta(1, d.hyper[1L], d.hyper[2L]), 1 - .Machine$double.eps)), 0), discount) + discount <- ifelse(missing(discount), ifelse(learn.d, ifelse(kappa != 0 && stats::runif(1L) <= kappa, 0L, pmin(stats::rbeta(1L, d.hyper[1L], d.hyper[2L]), 1 - .Machine$double.eps)), 0L), discount) if(any(!is.numeric(discount), length(discount) != 1)) stop("'discount' must be a single number", call.=FALSE) if(discount >= 1) stop("'discount' must be less than 1", call.=FALSE) @@ -1613,7 +1631,7 @@ #' @param phi.hyper A vector of length 2 giving the shape and rate hyperparameters for the gamma prior on the local shrinkage parameters. Passed to \code{\link{MGP_check}} to ensure validity. Defaults to \code{c(3, 2)}. It is suggested that the rate be <= shape minus 1 to induce local shrinkage, though the cumulative shrinkage property is unaffected by these hyperparameters. Excessively small values may lead to critical numerical issues and should thus be avoided; indeed it is \emph{suggested} that the shape be >=1. #' @param sigma.hyper A vector of length 2 giving the shape and rate hyperparameters for the gamma prior on the cluster shrinkage parameters. Passed to \code{\link{MGP_check}} to ensure validity. Defaults to \code{c(3, 2)}. Again, it is \emph{suggested} that the shape be >= 1. Only relevant for the "\code{IMIFA}", "\code{OMIFA}", and "\code{MIFA}" methods when \code{isTRUE(cluster.shrink)}. #' @param prop Proportion of loadings elements within the neighbourhood \code{eps} of zero necessary to consider a loadings column redundant. Defaults to \code{floor(0.7 * P)/P}, where \code{P} is the number of variables in the data set. However, if the data set is univariate or bivariate, the default is \code{0.5} (see Note). 
-#' @param eps Neighbourhood epsilon of zero within which a loadings entry is considered negligible according to \code{prop}. Defaults to \code{0.1}. +#' @param eps Neighbourhood epsilon of zero within which a loadings entry is considered negligible according to \code{prop}. Defaults to \code{0.1}. Must be positive. #' @param adapt A logical value indicating whether adaptation of the number of cluster-specific factors is to take place when the MGP prior is employed. Defaults to \code{TRUE}. Specifying \code{FALSE} and supplying \code{range.Q} within \code{\link{mcmc_IMIFA}} provides a means to either approximate the infinite factor model with a fixed high truncation level, or to use the MGP prior in a finite factor context; however, this is NOT recommended for the "\code{OMIFA}" and "\code{IMIFA}" methods. #' @param forceQg A logical indicating whether the upper limit on the number of cluster-specific factors \code{Q} is also cluster-specific. Defaults to \code{FALSE}: when \code{TRUE}, the number of factors in each cluster is kept below the number of observations in each cluster, in addition to the bound defined by \code{range.Q}. Only relevant for the "\code{IMIFA}", "\code{OMIFA}", and "\code{MIFA}" methods, and only invoked when \code{adapt} is \code{TRUE}. May be useful for low-dimensional data sets for which identifiable solutions are desired. #' @param cluster.shrink A logical value indicating whether to place the prior specified by \code{sigma.hyper} on the cluster shrinkage parameters. Defaults to \code{TRUE}. Specifying \code{FALSE} is equivalent to fixing all cluster shrinkage parameters to 1. Only relevant for the "\code{IMIFA}", "\code{OMIFA}", and "\code{MIFA}" methods. If invoked, the posterior mean cluster shrinkage factors will be reported. @@ -1676,11 +1694,11 @@ !is.numeric(alpha.d2), c(alpha.d1, alpha.d2) < 1)) stop("All column shrinkage shape hyperparameter values must be numeric and at least 1", call.=FALSE) if(prop > 1 || - prop <= 0) stop("'prop' must be lie in the interval (0, 1]", call.=FALSE) - if(eps <= 0 || - eps >= 1) stop("'eps' must be lie in the interval (0, 1)", call.=FALSE) + prop <= 0) stop("'prop' must lie in the interval (0, 1]", call.=FALSE) + if(eps <= 0) stop("'eps' must be greater than 0", call.=FALSE) + if(eps >= 1) warning("'eps' should typically be smaller than 1, particularly for scaled data\n", call.=FALSE, immediate.=TRUE) if(any(length(adapt) != 1, - !is.logical(adapt))) stop("'adapt' must be a single logical indicator", call.=FALSE) + !is.logical(adapt))) stop("'adapt' must be a single logical indicator", call.=FALSE) if(any(length(forceQg) != 1, !is.logical(forceQg))) stop("'forceQg' must be a single logical indicator", call.=FALSE) forceQg <- forceQg && adapt @@ -1736,7 +1754,7 @@ #' #' @details \code{\link{storeControl}} is provided for assigning values for IMIFA models within \code{\link{mcmc_IMIFA}}. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burnin and thinning). #' -#' @note Posterior inference and plotting won't be posssible for parameters not stored. +#' @note Posterior inference and plotting won't be possible for parameters not stored. #' #' Non-storage of parameters will almost surely prohibit the computation of posterior predictive checking error metrics within \code{\link{get_IMIFA_results}} also.
In particular, if such error metrics are desired, \code{mu.switch} and \code{psi.switch} must be \code{TRUE} for all but the "\code{FA}" and "\code{IFA}" models, \code{load.switch} must be \code{TRUE} for all but the entirely zero-factor models, and \code{pi.switch} must be \code{TRUE} for models with clustering structure and unequal mixing proportions for all but the PPRE metric. \code{score.switch=TRUE} is not required for any posterior predictive checking. #' @@ -1853,12 +1871,13 @@ #' Pareto Scaling #' -#' Pareto scaling of a numeric matrix, with or without centering. Obserations are scaled by the square-root of their column-wise standard deviations. +#' Pareto scaling of a numeric matrix, with or without centering. Observations are scaled by the square-root of their column-wise standard deviations. #' @param x A numeric matrix-like object. #' @param centering A logical vector indicating whether centering is to be applied (default=\code{TRUE}). #' #' @return The Pareto scaled version of the matrix \code{x}. #' @importFrom matrixStats "colMeans2" +#' @importFrom Rfast "colVars" #' @export #' @author Keefe Murphy - <\email{keefe.murphy@@ucd.ie}> #' @references van den Berg, R.A., Hoefsloot, H.C.J, Westerhuis, J.A. and Smilde, A.K. and van der Werf, M.J. (2006) Centering, scaling, and transformations: improving the biological information content of metabolomics data. \emph{BMC Genomics}, 7, 1, 142. @@ -1873,7 +1892,7 @@ if(length(centering) != 1 || !is.logical(centering)) stop("'centering' must be a single logical indicator", call.=FALSE) x <- as.matrix(x) - .scale2(x, centering, sqrt(.col_vars(x, std=TRUE))) + .scale2(x, centering, sqrt(colVars(x, std=TRUE))) } # Other Hidden Functions @@ -1959,7 +1978,7 @@ } #' @importFrom matrixStats "colMeans2" - .col_vars <- function(x, std = FALSE, avg = NULL) { + .col_vars <- function(x, std = FALSE, avg = NULL) { # formerly replaced Rfast::colVars if(length(std) > 1 || !is.logical(std)) stop("'std' must be a single logical indicator") if(!is.matrix(x)) stop("'x' must be a matrix") @@ -2122,18 +2141,18 @@ } .permutations <- function(n) { - if(n == 1) { return(matrix(1)) - } else if(n < 2) stop("n must be a positive integer", call.=FALSE) + if(n == 1) { return(matrix(1L)) + } else if(n < 2) stop("n must be a positive integer", call.=FALSE) z <- matrix(1L) - for(i in 2:n) { - x <- cbind(z, i) - a <- c(1L:i, 1L:(i - 1L)) - nr <- nrow(x) - z <- matrix(0L, ncol=ncol(x), nrow=i * nr) - z[1:nr,] <- x - for(j in 2:i - 1) { - z[j * nr + 1L:nr,] <- x[,a[1L:i + j]] - } + for(i in 2L:n) { + x <- cbind(z, i) + a <- c(1L:i, 1L:(i - 1L)) + nr <- nrow(x) + z <- matrix(0L, ncol=ncol(x), nrow=i * nr) + z[1L:nr,] <- x + for(j in (2L:i) - 1L) { + z[j * nr + 1L:nr,] <- x[,a[(1L:i) + j]] + } } dimnames(z) <- NULL z @@ -2216,7 +2235,8 @@ tmp } - .row_vars <- function(x, std = FALSE, suma = NULL) { # replaces Rfast::rowVars + #' @importFrom matrixStats "rowSums2" + .row_vars <- function(x, std = FALSE, suma = NULL) { # formerly replaced Rfast::rowVars if(length(std) > 1 || !is.logical(std)) stop("'std' must be a single logical indicator") if(!is.matrix(x)) stop("'x' must be a matrix") @@ -2228,24 +2248,25 @@ #' @importFrom matrixStats "colMeans2" "rowSums2" .scale2 <- function(x, center = TRUE, scale = TRUE) { # replaces Rfast::standardise - cmeans <- if(isTRUE(center)) colMeans2(x) else center - center <- if(is.logical(center)) center else is.numeric(center) - scaling <- if(is.logical(scale)) scale else is.numeric(scale) + cmeans <- if(isTRUE(center)) colMeans2(x) else 
center + center <- if(is.logical(center)) center else is.numeric(center) + scaling <- if(is.logical(scale)) scale else is.numeric(scale) if(center && scaling) { y <- t(x) - cmeans if(isTRUE(scale)) t(y/sqrt(rowSums2(y * y)) * sqrt(nrow(x) - 1L)) else t(y/scale) } else if(center) { t(t(x) - cmeans) } else if(scaling) { - t(t(x)/if(isTRUE(scale)) .col_vars(x, std=TRUE) else scale) + t(t(x)/if(isTRUE(scale)) colVars(x, std=TRUE) else scale) } else x } + #' @importFrom Rfast "colVars" .tune_beta0 <- function(beta0, dat, type=c("diag", "mse")) { # unused dat <- as.matrix(dat) N <- nrow(dat) P <- ncol(dat) - inv.cov <- (beta0 + N/2L) * base::solve(diag(beta0, P) + 0.5 * crossprod(.scale2(dat, TRUE, TRUE))) * tcrossprod(1/.col_vars(dat, std=TRUE)) + inv.cov <- (beta0 + N/2L) * base::solve(diag(beta0, P) + 0.5 * crossprod(.scale2(dat, TRUE, TRUE))) * tcrossprod(1/colVars(dat, std=TRUE)) error <- diag(P) - (stats::cov(dat) %*% inv.cov) switch(EXPR=match.arg(type), diag=sum(diag(error)^2)/P, mean(error * error)) } diff --git a/R/Gibbs_FA.R b/R/Gibbs_FA.R index c7663f1..08d8ae1 100644 --- a/R/Gibbs_FA.R +++ b/R/Gibbs_FA.R @@ -49,7 +49,7 @@ eta <- .sim_eta_p(Q=Q, N=N) lmat <- matrix(.sim_load_p(Q=Q, P=P, sigma.l=sigma.l), nrow=P, ncol=Q) psi.inv <- .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta) - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data, avg=mu), max(.col_vars(data, avg=mu))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) max.p <- (psi.alpha - 1)/psi.beta inf.ind <- psi.inv > max(max.p) psi.inv[inf.ind] <- switch(EXPR=uni.type, constrained=max.p, rep(max.p, P))[inf.ind] diff --git a/R/Gibbs_IFA.R b/R/Gibbs_IFA.R index b4a1bf7..902fed7 100644 --- a/R/Gibbs_IFA.R +++ b/R/Gibbs_IFA.R @@ -53,7 +53,7 @@ tau <- cumprod(delta) lmat <- matrix(vapply(Pseq, function(j) .sim_load_ps(Q=Q, phi=phi[j,], tau=tau), numeric(Q)), nrow=P, byrow=TRUE) psi.inv <- .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta) - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data, avg=mu), max(.col_vars(data, avg=mu))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) max.p <- (psi.alpha - 1)/psi.beta inf.ind <- psi.inv > max(max.p) psi.inv[inf.ind] <- switch(EXPR=uni.type, constrained=max.p, rep(max.p, P))[inf.ind] diff --git a/R/Gibbs_IMFA.R b/R/Gibbs_IMFA.R index 3c531c3..3363ffa 100644 --- a/R/Gibbs_IMFA.R +++ b/R/Gibbs_IMFA.R @@ -102,7 +102,7 @@ } else psi.inv <- replicate(trunc.G, .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta), simplify="array") psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) @@ -175,7 +175,7 @@ ksi <- pi.prop log.ksi <- log(ksi) } - u.slice <- stats::runif(N, 0, ksi[z]) + u.slice <- stats::runif(N, 0L, ksi[z]) min.u <- min(u.slice) G.old <- G if(ind.slice) { diff --git a/R/Gibbs_IMIFA.R b/R/Gibbs_IMIFA.R index f4407f6..e8ea516 100644 --- a/R/Gibbs_IMIFA.R +++ b/R/Gibbs_IMIFA.R @@ -117,7 +117,7 @@ } else psi.inv <- replicate(trunc.G, .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta), simplify="array") psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { 
- psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) @@ -167,15 +167,16 @@ dat.g <- lapply(Gs, function(g) data[z == g,, drop=FALSE]) c.data <- lapply(Gs, function(g) sweep(dat.g[[g]], 2L, mu[,g], FUN="-", check.margin=FALSE)) n.eta <- nn + n0q0 <- nn0 & Q0 if(!any(Q0)) { eta <- .empty_mat(nr=N) eta.tmp <- lapply(Gs, function(g) eta[z == g,, drop=FALSE]) lmat[Gs] <- replicate(G, .empty_mat(nr=P)) } else { - eta.tmp <- lapply(Gs, function(g) if(all(nn0[g], Q0[g])) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else matrix(0L, nrow=ifelse(Q0[g], 0, nn[g]), ncol=Qs[g])) - EtE <- lapply(Gs, function(g) if(nn0[g]) crossprod(eta.tmp[[g]])) - lmat[Gs] <- lapply(Gs, function(g) matrix(if(all(nn0[g], Q0[g])) vapply(Ps, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], - EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else + eta.tmp <- lapply(Gs, function(g) if(n0q0[g]) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else base::matrix(0L, nrow=ifelse(Q0[g], 0L, nn[g]), ncol=Qs[g])) + EtE <- lapply(Gs, function(g) if(n0q0[g]) crossprod(eta.tmp[[g]])) + lmat[Gs] <- lapply(Gs, function(g) matrix(if(n0q0[g]) vapply(Ps, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], + EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else vapply(Ps, function(j) .sim_load_ps(Q=Qs[g], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])), nrow=P, byrow=TRUE)) } @@ -196,33 +197,34 @@ sum.data=sum.data[,g], lmat=lmat[[g]], mu.zero=mu.zero) else .sim_mu_p(P=P, sig.mu.sqrt=sig.mu.sqrt, mu.zero=mu.zero), numeric(P)) # Shrinkage - if(all(Q0)) { + if(any(Q0)) { load.2 <- lapply(lmat[Gs], .power2) - phi[Gs] <- lapply(Gs, function(g) if(nn0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], + phi[Gs] <- lapply(Gs, function(g) if(n0q0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], load.2=load.2[[g]], sigma=MGPsig[g]) else .sim_phi_p(Q=Qs[g], P=P, nu1=nu1, nu2=nu2)) - sum.terms <- lapply(Gs, function(g) if(nn0[g]) colSums2(phi[[g]] * load.2[[g]])) + sum.terms <- lapply(Gs, function(g) if(n0q0[g]) colSums2(phi[[g]] * load.2[[g]])) for(g in Gs) { Qg <- Qs[g] Q1g <- Q1[g] - if(nn0[g]) { + if(n0q0[g]) { for(k in seq_len(Qg)) { delta[[g]][k] <- if(k > 1) .sim_deltak(alpha.d2=alpha.d2, beta.d2=beta.d2, delta.k=delta[[g]][k], tau.kq=tau[[g]][k:Qg], P=P, Q=Qg, k=k, sum.term.kq=sum.terms[[g]][k:Qg], sigma=MGPsig[g]) else .sim_delta1(Q=Qg, P=P, tau=tau[[g]], sum.term=sum.terms[[g]], alpha.d1=ifelse(Q1g, alpha.d2, alpha.d1), beta.d1=ifelse(Q1g, beta.d2, beta.d1), delta.1=delta[[g]][1L], sigma=MGPsig[g]) tau[[g]] <- cumprod(delta[[g]]) - } + } } else { - for(k in seq_len(Qg)) { - delta[[g]][k] <- if(k > 1) .sim_delta_p(alpha=alpha.d2, beta=beta.d2) else .sim_delta_p(alpha=ifelse(Q1g, alpha.d2, alpha.d1), beta=ifelse(Q1g, beta.d2, beta.d1)) + if(Q0[g]) { + delta[[g]] <- c(.sim_delta_p(alpha=ifelse(Q1g, alpha.d2, alpha.d1), beta=ifelse(Q1g, beta.d2, beta.d1)), .sim_delta_p(Q=Qg, alpha=alpha.d2, 
beta=beta.d2)) tau[[g]] <- cumprod(delta[[g]]) } } } if(cluster.shrink) { - nnX <- nn0[Gs] - nnG0 <- which(nnX) - MGPsig[nnG0] <- .sim_sigma(G=G.non, P=P, Qs=Qs[nnG0], rho1=rho1, rho2=rho2, sum.terms=sum.terms[nnG0], tau=tau[nnG0]) - MGPsig[which(!nnX)] <- .sim_sigma_p(G=G - G.non, rho1=rho1, rho2=rho2) + nnX <- n0q0[Gs] + n0Gq <- which(nnX) + nGq0 <- length(n0Gq) + MGPsig[n0Gq] <- .sim_sigma(G=nGq0, P=P, Qs=Qs[n0Gq], rho1=rho1, rho2=rho2, sum.terms=sum.terms[n0Gq], tau=tau[n0Gq]) + MGPsig[which(!nnX)] <- .sim_sigma_p(G=G - nGq0, rho1=rho1, rho2=rho2) } } @@ -231,7 +233,7 @@ ksi <- pi.prop log.ksi <- log(ksi) } - u.slice <- stats::runif(N, 0, ksi[z]) + u.slice <- stats::runif(N, 0L, ksi[z]) min.u <- min(u.slice) G.old <- G if(ind.slice) { @@ -329,7 +331,7 @@ phi[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(phi[[g]][,seq_len(Qs.old[h])], .rgamma0(n=P, shape=nu1, rate=nu2)) else phi[[g]][,nonred[[h]], drop=FALSE]) delta[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) c(delta[[g]][seq_len(Qs.old[h])], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) else delta[[g]][nonred[[h]]]) tau[nn0] <- lapply(delta[nn.ind], cumprod) - lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) + lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]] * MGPsig[g])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) Qemp <- Qs[!nn0] Qpop <- Qs[nn0] Qmax <- max(Qpop) @@ -350,10 +352,10 @@ delta[[t]] <- c(delta[[t]], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) tau[[t]] <- cumprod(delta[[t]]) if(store.eta && t %in% Gs) { - eta.tmp[[t]] <- cbind(eta.tmp[[t]], .empty_mat(nc=1)) + eta.tmp[[t]] <- cbind(eta.tmp[[t]], base::matrix(0L, nrow=n.eta[t], ncol=1L)) } Qt <- Qt + 1L - lmat[[t]] <- cbind(lmat[[t]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[t]][,Qt] * tau[[t]][Qt])))) + lmat[[t]] <- cbind(lmat[[t]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[t]][,Qt] * tau[[t]][Qt] * MGPsig[t])))) } } } @@ -476,8 +478,8 @@ if(all(sw["s.sw"], any(Q0))) { max.Q <- max(Qs) - eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gs, function(g) cbind(eta.tmp[[g]], matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp - q0ng <- (!Q0 | Q1)[seq_along(n.eta)] & n.eta > 0 + eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gs, function(g) cbind(eta.tmp[[g]], base::matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp + q0ng <- (!Q0 | Q1) & n.eta > 0 if(any(q0ng)) { eta.tmp[q0ng] <- lapply(Gs[q0ng], function(g, x=eta.tmp[[g]]) { row.names(x) <- row.names(dat.g[[g]]); x }) } diff --git a/R/Gibbs_MFA.R b/R/Gibbs_MFA.R index dae137e..f91a723 100644 --- a/R/Gibbs_MFA.R +++ b/R/Gibbs_MFA.R @@ -84,7 +84,7 @@ } else psi.inv <- vapply(Gseq, function(g) .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta[,g]), numeric(P)) psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) @@ -164,12 
+164,12 @@ psi.inv[,left] <- psi.inv[,right, drop=FALSE] pi.prop[left] <- pi.prop[right] nn[left] <- nn[right] - if(mu0g) { - mu.zero[,left] <- mu.zero[,right, drop=FALSE] - } - if(psi0g) { - psi.beta[,left] <- psi.beta[,right, drop=FALSE] - } + if(mu0g) { + mu.zero[,left] <- mu.zero[,right, drop=FALSE] + } + if(psi0g) { + psi.beta[,left] <- psi.beta[,right, drop=FALSE] + } } } diff --git a/R/Gibbs_MIFA.R b/R/Gibbs_MIFA.R index 4dc3491..078cbc1 100644 --- a/R/Gibbs_MIFA.R +++ b/R/Gibbs_MIFA.R @@ -105,7 +105,7 @@ } else psi.inv <- vapply(Gseq, function(g) .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta[,g]), numeric(P)) psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) @@ -130,15 +130,16 @@ dat.g <- lapply(Gseq, function(g) data[z == g,, drop=FALSE]) c.data <- lapply(Gseq, function(g) sweep(dat.g[[g]], 2L, mu[,g], FUN="-", check.margin=FALSE)) n.eta <- nn + n0q0 <- nn0 & Q0 if(!any(Q0)) { eta <- .empty_mat(nr=N) eta.tmp <- lapply(Gseq, function(g) eta[z == g,, drop=FALSE]) lmat <- replicate(G, .empty_mat(nr=P)) } else { - eta.tmp <- lapply(Gseq, function(g) if(all(nn0[g], Q0[g])) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else matrix(0L, nrow=ifelse(Q0[g], 0, nn[g]), ncol=Qs[g])) - EtE <- lapply(Gseq, function(g) if(nn0[g]) crossprod(eta.tmp[[g]])) - lmat <- lapply(Gseq, function(g) matrix(if(all(nn0[g], Q0[g])) vapply(Pseq, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], - EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else + eta.tmp <- lapply(Gseq, function(g) if(n0q0[g]) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else base::matrix(0L, nrow=ifelse(Q0[g], 0L, nn[g]), ncol=Qs[g])) + EtE <- lapply(Gseq, function(g) if(n0q0[g]) crossprod(eta.tmp[[g]])) + lmat <- lapply(Gseq, function(g) matrix(if(n0q0[g]) vapply(Pseq, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], + EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else vapply(Pseq, function(j) .sim_load_ps(Q=Qs[g], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])), nrow=P, byrow=TRUE)) } @@ -159,15 +160,15 @@ sum.data=sum.data[,g], lmat=lmat[[g]], N=nn[g], P=P) else .sim_mu_p(P=P, sig.mu.sqrt=sig.mu.sqrt, mu.zero=mu.zero[,g]), numeric(P)) # Shrinkage - if(all(Q0)) { + if(any(Q0)) { load.2 <- lapply(lmat, .power2) - phi <- lapply(Gseq, function(g) if(nn0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], + phi <- lapply(Gseq, function(g) if(n0q0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], load.2=load.2[[g]], sigma=MGPsig[g]) else .sim_phi_p(Q=Qs[g], P=P, nu1=nu1, nu2=nu2)) - sum.terms <- lapply(Gseq, function(g) if(nn0[g]) colSums2(phi[[g]] * load.2[[g]])) + sum.terms <- lapply(Gseq, function(g) if(n0q0[g]) colSums2(phi[[g]] * load.2[[g]])) for(g in Gseq) { Qg <- Qs[g] Q1g <- Q1[g] - if(nn0[g]) { + if(n0q0[g]) { for(k in seq_len(Qg)) { delta[[g]][k] <- if(k > 1) .sim_deltak(alpha.d2=alpha.d2[g], 
beta.d2=beta.d2, delta.k=delta[[g]][k], tau.kq=tau[[g]][k:Qg], P=P, Q=Qg, k=k, sum.term.kq=sum.terms[[g]][k:Qg], sigma=MGPsig[g]) else .sim_delta1(Q=Qg, P=P, tau=tau[[g]], sum.term=sum.terms[[g]], @@ -175,16 +176,16 @@ tau[[g]] <- cumprod(delta[[g]]) } } else { - for(k in seq_len(Qg)) { - delta[[g]][k] <- if(k > 1) .sim_delta_p(alpha=alpha.d2[g], beta=beta.d2) else .sim_delta_p(alpha=ifelse(Q1g, alpha.d2[g], alpha.d1[g]), beta=ifelse(Q1g, beta.d2, beta.d1)) + if(Q0[g]) { + delta[[g]] <- c(.sim_delta_p(alpha=ifelse(Q1g, alpha.d2[g], alpha.d1[g]), beta=ifelse(Q1g, beta.d2, beta.d1)), .sim_delta_p(Q=Qg, alpha=alpha.d2[g], beta=beta.d2)) tau[[g]] <- cumprod(delta[[g]]) } } } if(cluster.shrink) { - G.non <- length(nn.ind) - MGPsig[nn0] <- .sim_sigma(G=G.non, P=P, Qs=Qs[nn0], rho1=rho1, rho2=rho2, sum.terms=sum.terms[nn0], tau=tau[nn0]) - MGPsig[!nn0] <- .sim_sigma_p(G=G - G.non, rho1=rho1, rho2=rho2) + nGq0 <- sum(n0q0) + MGPsig[n0q0] <- .sim_sigma(G=nGq0, P=P, Qs=Qs[n0q0], rho1=rho1, rho2=rho2, sum.terms=sum.terms[n0q0], tau=tau[n0q0]) + MGPsig[!n0q0] <- .sim_sigma_p(G=G - nGq0, rho1=rho1, rho2=rho2) } } @@ -229,7 +230,7 @@ phi[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(phi[[g]][,seq_len(Qs.old[h])], .rgamma0(n=P, shape=nu1, rate=nu2)) else phi[[g]][,nonred[[h]], drop=FALSE]) delta[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) c(delta[[g]][seq_len(Qs.old[h])], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) else delta[[g]][nonred[[h]]]) tau[nn0] <- lapply(delta[nn.ind], cumprod) - lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) + lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]] * MGPsig[g])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) Qemp <- Qs[!nn0] Qpop <- Qs[nn0] Qmax <- max(Qpop) @@ -250,10 +251,10 @@ delta[[g]] <- c(delta[[g]], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) tau[[g]] <- cumprod(delta[[g]]) if(store.eta) { - eta.tmp[[g]] <- cbind(eta.tmp[[g]], .empty_mat(nc=1)) + eta.tmp[[g]] <- cbind(eta.tmp[[g]], base::matrix(0L, nrow=n.eta[g], ncol=1L)) } Qg <- Qg + 1L - lmat[[g]] <- cbind(lmat[[g]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qg] * tau[[g]][Qg])))) + lmat[[g]] <- cbind(lmat[[g]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qg] * tau[[g]][Qg] * MGPsig[g])))) } } } @@ -295,21 +296,21 @@ n.eta[left] <- n.eta[right] } if(mu0g) { - mu.zero[,left] <- mu.zero[,right, drop=FALSE] + mu.zero[,left] <- mu.zero[,right, drop=FALSE] } if(psi0g) { - psi.beta[,left] <- psi.beta[,right, drop=FALSE] + psi.beta[,left] <- psi.beta[,right, drop=FALSE] } if(all(delta0g, !ad1.x)) { - alpha.d1[left] <- alpha.d1[right] + alpha.d1[left] <- alpha.d1[right] } if(all(delta0g, !ad2.x)) { - alpha.d2[left] <- alpha.d2[right] + alpha.d2[left] <- alpha.d2[right] } if(cluster.shrink) { - MGPsig[left] <- MGPsig[right] + MGPsig[left] <- MGPsig[right] } } } @@ -320,15 +321,15 @@ if(z.err && !err.z) { cat("\n"); warning("\nAlgorithm may slow due to corrections for Choleski decompositions of non-positive-definite covariance matrices\n", call.=FALSE) err.z <- TRUE } - if(storage) { + if(storage) { if(verbose) utils::setTxtProgressBar(pb, iter) new.it <- which(iters == iter) if(sw["mu.sw"]) mu.store[,,new.it] <- mu 
if(all(sw["s.sw"], any(Q0))) { max.Q <- max(Qs) - eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gseq, function(g) cbind(eta.tmp[[g]], matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp - q0ng <- (!Q0 | Q1) & n.eta > 0 + eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gseq, function(g) cbind(eta.tmp[[g]], base::matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp + q0ng <- (!Q0 | Q1) & n.eta > 0 if(any(q0ng)) { eta.tmp[q0ng] <- lapply(Gseq[q0ng], function(g, x=eta.tmp[[g]]) { row.names(x) <- row.names(dat.g[[g]]); x }) } diff --git a/R/Gibbs_OMFA.R b/R/Gibbs_OMFA.R index 6572aaa..3f323b5 100644 --- a/R/Gibbs_OMFA.R +++ b/R/Gibbs_OMFA.R @@ -83,7 +83,7 @@ } else psi.inv <- replicate(G, .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta), simplify="array") psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) diff --git a/R/Gibbs_OMIFA.R b/R/Gibbs_OMIFA.R index 11320a3..9e418ad 100644 --- a/R/Gibbs_OMIFA.R +++ b/R/Gibbs_OMIFA.R @@ -98,7 +98,7 @@ } else psi.inv <- replicate(G, .sim_psi_ip(P=P, psi.alpha=psi.alpha, psi.beta=psi.beta), simplify="array") psi.inv <- if(uni) t(psi.inv) else psi.inv if(isTRUE(one.uni)) { - psi.inv[] <- 1/switch(EXPR=uni.type, constrained=.col_vars(data), max(.col_vars(data))) + psi.inv[] <- 1/switch(EXPR=uni.type, constrained=colVars(data), max(colVars(data))) } else { tmp.psi <- (nn[nn0] - 1L)/pmax(rowsum(data^2, z) - rowsum(data, z)^2/nn[nn0], 0L) tmp.psi <- switch(EXPR=uni.type, unconstrained=t(tmp.psi), matrix(Rfast::rowMaxs(tmp.psi, value=TRUE), nrow=P, ncol=G, byrow=TRUE)) @@ -139,15 +139,16 @@ dat.g <- lapply(Gseq, function(g) data[z == g,, drop=FALSE]) c.data <- lapply(Gseq, function(g) sweep(dat.g[[g]], 2L, mu[,g], FUN="-", check.margin=FALSE)) n.eta <- nn + n0q0 <- nn0 & Q0 if(!any(Q0)) { eta <- .empty_mat(nr=N) eta.tmp <- lapply(Gseq, function(g) eta[z == g,, drop=FALSE]) lmat <- replicate(G, .empty_mat(nr=P)) } else { - eta.tmp <- lapply(Gseq, function(g) if(all(nn0[g], Q0[g])) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else matrix(0L, nrow=ifelse(Q0[g], 0, nn[g]), ncol=Qs[g])) - EtE <- lapply(Gseq, function(g) if(nn0[g]) crossprod(eta.tmp[[g]])) - lmat <- lapply(Gseq, function(g) matrix(if(all(nn0[g], Q0[g])) vapply(Pseq, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], - EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else + eta.tmp <- lapply(Gseq, function(g) if(n0q0[g]) .sim_score(N=nn[g], lmat=lmat[[g]], Q=Qs[g], Q1=Q1[g], c.data=c.data[[g]], psi.inv=psi.inv[,g]) else base::matrix(0L, nrow=ifelse(Q0[g], 0L, nn[g]), ncol=Qs[g])) + EtE <- lapply(Gseq, function(g) if(n0q0[g]) crossprod(eta.tmp[[g]])) + lmat <- lapply(Gseq, function(g) matrix(if(n0q0[g]) vapply(Pseq, function(j) .sim_load_s(Q=Qs[g], c.data=c.data[[g]][,j], Q1=Q1[g], + EtE=EtE[[g]], eta=eta.tmp[[g]], psi.inv=psi.inv[,g][j], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])) else vapply(Pseq, function(j) .sim_load_ps(Q=Qs[g], phi=phi[[g]][j,], tau=tau[[g]], sigma=MGPsig[g]), numeric(Qs[g])), nrow=P, 
byrow=TRUE)) } @@ -168,15 +169,15 @@ sum.data=sum.data[,g], lmat=lmat[[g]], mu.zero=mu.zero) else .sim_mu_p(P=P, sig.mu.sqrt=sig.mu.sqrt, mu.zero=mu.zero), numeric(P)) # Shrinkage - if(all(Q0)) { + if(any(Q0)) { load.2 <- lapply(lmat, .power2) - phi <- lapply(Gseq, function(g) if(nn0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], + phi <- lapply(Gseq, function(g) if(n0q0[g]) .sim_phi(Q=Qs[g], P=P, nu1=nu1, nu2=nu2, tau=tau[[g]], load.2=load.2[[g]], sigma=MGPsig[g]) else .sim_phi_p(Q=Qs[g], P=P, nu1=nu1, nu2=nu2)) - sum.terms <- lapply(Gseq, function(g) if(nn0[g]) colSums2(phi[[g]] * load.2[[g]])) + sum.terms <- lapply(Gseq, function(g) if(n0q0[g]) colSums2(phi[[g]] * load.2[[g]])) for(g in Gseq) { Qg <- Qs[g] Q1g <- Q1[g] - if(nn0[g]) { + if(n0q0[g]) { for(k in seq_len(Qg)) { delta[[g]][k] <- if(k > 1) .sim_deltak(alpha.d2=alpha.d2, beta.d2=beta.d2, delta.k=delta[[g]][k], tau.kq=tau[[g]][k:Qg], P=P, Q=Qg, k=k, sum.term.kq=sum.terms[[g]][k:Qg], sigma=MGPsig[g]) else .sim_delta1(Q=Qg, P=P, tau=tau[[g]], sum.term=sum.terms[[g]], @@ -184,15 +185,16 @@ tau[[g]] <- cumprod(delta[[g]]) } } else { - for(k in seq_len(Qg)) { - delta[[g]][k] <- if(k > 1) .sim_delta_p(alpha=alpha.d2, beta=beta.d2) else .sim_delta_p(alpha=ifelse(Q1g, alpha.d2, alpha.d1), beta=ifelse(Q1g, beta.d2, beta.d1)) + if(Q0[g]) { + delta[[g]] <- c(.sim_delta_p(alpha=ifelse(Q1g, alpha.d2, alpha.d1), beta=ifelse(Q1g, beta.d2, beta.d1)), .sim_delta_p(Q=Qg, alpha=alpha.d2, beta=beta.d2)) tau[[g]] <- cumprod(delta[[g]]) } } } if(cluster.shrink) { - MGPsig[nn0] <- .sim_sigma(G=G.non, P=P, Qs=Qs[nn0], rho1=rho1, rho2=rho2, sum.terms=sum.terms[nn0], tau=tau[nn0]) - MGPsig[!nn0] <- .sim_sigma_p(G=G - G.non, rho1=rho1, rho2=rho2) + nGq0 <- sum(n0q0) + MGPsig[n0q0] <- .sim_sigma(G=nGq0, P=P, Qs=Qs[n0q0], rho1=rho1, rho2=rho2, sum.terms=sum.terms[n0q0], tau=tau[n0q0]) + MGPsig[!n0q0] <- .sim_sigma_p(G=G - nGq0, rho1=rho1, rho2=rho2) } } @@ -238,7 +240,7 @@ phi[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(phi[[g]][,seq_len(Qs.old[h])], .rgamma0(n=P, shape=nu1, rate=nu2)) else phi[[g]][,nonred[[h]], drop=FALSE]) delta[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) c(delta[[g]][seq_len(Qs.old[h])], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) else delta[[g]][nonred[[h]]]) tau[nn0] <- lapply(delta[nn.ind], cumprod) - lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) + lmat[nn0] <- lapply(nn.ind, function(g, h=which(nn.ind == g)) if(notred[h]) cbind(lmat[[g]][,seq_len(Qs.old[h])], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qs[g]] * tau[[g]][Qs[g]] * MGPsig[g])))) else lmat[[g]][,nonred[[h]], drop=FALSE]) Qemp <- Qs[!nn0] Qpop <- Qs[nn0] Qmax <- max(Qpop) @@ -259,10 +261,10 @@ delta[[g]] <- c(delta[[g]], stats::rgamma(n=1, shape=alpha.d2, rate=beta.d2)) tau[[g]] <- cumprod(delta[[g]]) if(store.eta) { - eta.tmp[[g]] <- cbind(eta.tmp[[g]], .empty_mat(nc=1)) + eta.tmp[[g]] <- cbind(eta.tmp[[g]], base::matrix(0L, nrow=n.eta[g], ncol=1L)) } Qg <- Qg + 1L - lmat[[g]] <- cbind(lmat[[g]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qg] * tau[[g]][Qg])))) + lmat[[g]] <- cbind(lmat[[g]], stats::rnorm(n=P, mean=0, sd=sqrt(1/(phi[[g]][,Qg] * tau[[g]][Qg] * MGPsig[g])))) } } } @@ -303,8 +305,8 @@ if(all(sw["s.sw"], any(Q0))) { max.Q <- max(Qs) - eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gseq, 
function(g) cbind(eta.tmp[[g]], matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp - q0ng <- (!Q0 | Q1) & n.eta > 0 + eta.tmp <- if(length(unique(Qs)) != 1) lapply(Gseq, function(g) cbind(eta.tmp[[g]], base::matrix(0L, nrow=n.eta[g], ncol=max.Q - Qs[g]))) else eta.tmp + q0ng <- (!Q0 | Q1) & n.eta > 0 if(any(q0ng)) { eta.tmp[q0ng] <- lapply(Gseq[q0ng], function(g, x=eta.tmp[[g]]) { row.names(x) <- row.names(dat.g[[g]]); x }) } diff --git a/R/IMIFA.R b/R/IMIFA.R index 16ffe16..f4466c1 100644 --- a/R/IMIFA.R +++ b/R/IMIFA.R @@ -6,8 +6,8 @@ #' \itemize{ #' \item{Type: }{Package} #' \item{Package: }{IMIFA} -#' \item{Date: }{2019-12-11 (this version), 2017-02-02 (original release)} #' \item{Version: }{2.1.2} +#' \item{Date: }{2020-03-30 (this version), 2017-02-02 (original release)} #' \item{Licence: }{GPL (>=2)} #' } #' diff --git a/R/MainFunction.R b/R/MainFunction.R index 7b44b69..5ffc401 100644 --- a/R/MainFunction.R +++ b/R/MainFunction.R @@ -15,7 +15,7 @@ #' \item{"\code{IMIFA}"}{Infinite Mixtures of Infinite Factor Analysers} #' } #' In principle, of course, one could overfit the "\code{MFA}" or "\code{MIFA}" models, but it is recommended to use the corresponding model options which begin with `O' instead. Note that the "\code{classify}" method is not yet implemented. -#' @param range.G Depending on the method employed, either the range of values for the number of clusters, or the conseratively high starting value for the number of clusters. Defaults to (and must be!) \code{1} for the "\code{FA}" and "\code{IFA}" methods. For the "\code{MFA}" and "\code{MIFA}" models this is to be given as a range of candidate models to explore. For the "\code{OMFA}", "\code{OMIFA}", "\code{IMFA}", and "\code{IMIFA}" models, this is the conservatively high number of clusters with which the chain is to be initialised (default = \code{max(25, ceiling(3 * log(N)))} for large N, or \code{min(N-1, ceiling(3 * log(N)))} for small N<=50). +#' @param range.G Depending on the method employed, either the range of values for the number of clusters, or the conservatively high starting value for the number of clusters. Defaults to (and must be!) \code{1} for the "\code{FA}" and "\code{IFA}" methods. For the "\code{MFA}" and "\code{MIFA}" models this is to be given as a range of candidate models to explore. For the "\code{OMFA}", "\code{OMIFA}", "\code{IMFA}", and "\code{IMIFA}" models, this is the conservatively high number of clusters with which the chain is to be initialised (default = \code{max(25, ceiling(3 * log(N)))} for large N, or \code{min(N-1, ceiling(3 * log(N)))} for small N<=50). #' #' For the "\code{OMFA}", and "\code{OMIFA}" models this upper limit remains fixed for the entire length of the chain; the upper limit for the "\code{IMFA}" and "\code{IMIFA}" models can be specified via \code{trunc.G} (see \code{\link{bnpControl}}), which must satisfy \code{range.G <= trunc.G < N}. #' @@ -26,7 +26,7 @@ #' #' If \code{length(range.G) * length(range.Q)} is large, consider not storing unnecessary parameters (via \code{\link{storeControl}}), or breaking up the range of models to be explored into chunks and sending each chunk to \code{\link{get_IMIFA_results}}. #' -#' See \code{\link{Ledermann}} for bounds on \code{range.Q}; this is useful in both the finite factor and infinite factor settings, as one may wish to ensure te fixed number of factors, or upper limits on the number of factors, respectively, respects this bound to yield indentifiable solutions, particularly in low-dimensional settings.
+#' See \code{\link{Ledermann}} for bounds on \code{range.Q}; this is useful in both the finite factor and infinite factor settings, as one may wish to ensure the fixed number of factors, or upper limits on the number of factors, respectively, respects this bound to yield identifiable solutions, particularly in low-dimensional settings. #' @param MGP A list of arguments pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models "\code{IFA}", "\code{MIFA}", "\code{OMIFA}", and "\code{IMIFA}" only. Defaults are set by a call to \code{\link{mgpControl}}, with further checking of validity by \code{\link{MGP_check}} (though arguments can also be supplied here directly). #' @param BNP A list of arguments pertaining to the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors, for use with the infinite mixture models "\code{IMFA}" and "\code{IMIFA}", or select arguments related to the Dirichlet concentration parameter for the overfitted mixtures "\code{OMFA}" and \code{"OMIFA"}. Defaults are set by a call to \code{\link{bnpControl}} (though arguments can also be supplied here directly). #' @param mixFA A list of arguments pertaining to \emph{all other} aspects of model fitting, e.g. MCMC settings, cluster initialisation, and hyperparameters common to every \code{method} in the \code{IMIFA} family. Defaults are set by a call to \code{\link{mixfaControl}} (though arguments can also be supplied here directly). @@ -52,7 +52,7 @@ #' \item{\strong{\code{\link{storeControl}}} - }{Supply logical indicators governing storage of parameters of interest for all models in the IMIFA family. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burnin and thinning).} #' } #' Note however that the named arguments of these functions can also be supplied directly. Parameter starting values are obtained by simulation from the relevant prior distribution specified in these control functions, though initial means and mixing proportions are computed empirically. -#' @return A list of lists of lists of class "\code{IMIFA}" to be passed to \code{\link{get_IMIFA_results}}. If the returned object is \code{x}, candidate models are accesible via subsetting, where \code{x} is of the following form: +#' @return A list of lists of lists of class "\code{IMIFA}" to be passed to \code{\link{get_IMIFA_results}}. If the returned object is \code{x}, candidate models are accessible via subsetting, where \code{x} is of the following form: #' #' \code{x[[1:length(range.G)]][[1:length(range.Q)]]}.
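To make the subsetting convention above concrete, the following minimal sketch shows how individual candidate models would be extracted (the data set, the `range.G`/`range.Q` grid, and the `n.iters` value are illustrative assumptions only, not taken from the package's own examples):

```r
library(IMIFA)
data(olive, package="IMIFA")

# Hypothetical grid of candidate MFA models: G in 2:3 clusters, Q in 0:2 factors
sim <- mcmc_IMIFA(olive, method="MFA", range.G=2:3, range.Q=0:2, n.iters=500)

# sim[[g]][[q]] holds the chain for the g-th value of range.G and the
# q-th value of range.Q, e.g. the G=3, Q=1 model here:
mod <- sim[[2]][[2]]

# The full object is what gets passed to get_IMIFA_results()
res <- get_IMIFA_results(sim)
```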
#' @@ -60,7 +60,7 @@ #' @keywords IMIFA main #' @export #' @importFrom matrixStats "colMeans2" "colSums2" "rowLogSumExps" "rowMeans2" "rowSums2" -#' @importFrom Rfast "matrnorm" +#' @importFrom Rfast "colVars" "matrnorm" #' @importFrom mvnfast "dmvn" #' @importFrom slam "as.simple_sparse_array" "as.simple_triplet_matrix" #' @importFrom mclust "emControl" "Mclust" "mclustBIC" "mclustICL" "hc" "hclass" "hcE" "hcEEE" "hcEII" "hcV" "hcVII" "hcVVV" @@ -167,7 +167,7 @@ mcmc_IMIFA <- function(dat, method = c("IMIFA", "IMFA", "OMIFA", "OMFA", "MIFA" raw.dat <- raw.dat[,num.check, drop=FALSE] } glo.mean <- colMeans2(data.matrix(raw.dat)) - glo.scal <- .col_vars(data.matrix(raw.dat), std=TRUE) + glo.scal <- colVars(data.matrix(raw.dat), std=TRUE) if(isTRUE(mixFA$drop0sd)) { sdx <- glo.scal sd0ind <- sdx <= 0 @@ -648,12 +648,7 @@ mcmc_IMIFA <- function(dat, method = c("IMIFA", "IMFA", "OMIFA", "OMFA", "MIFA" } clust[[g]] <- list(z = zi[[g]], pi.alpha = alpha, pi.prop = pi.prop[[g]]) if(is.element(method, c("MFA", "MIFA"))) { - sw0g.tmp <- sw0gs - if(all(g > 9, any(sw0gs))) { - sw0g.tmp <- stats::setNames(vector("logical", 4L), names(sw0gs)) - warning(paste0(names(which(sw0gs)), " set to FALSE where G > 9, as 'exact' label-switching is not possible in this case\n"), call.=FALSE) - } - clust[[g]] <- append(clust[[g]], list(l.switch = sw0g.tmp)) + clust[[g]] <- append(clust[[g]], list(l.switch = sw0gs)) } if(is.element(method, c("classify", "MIFA"))) { clust[[g]] <- append(clust[[g]], list(alpha.d1 = alpha.d1[[g]], alpha.d2 = alpha.d2[[g]])) @@ -787,7 +782,7 @@ mcmc_IMIFA <- function(dat, method = c("IMIFA", "IMFA", "OMIFA", "OMFA", "MIFA" if(centered) warning("Data supplied is globally centered, are you sure?\n", call.=FALSE) for(g in seq_len(range.G)) { tmp.dat <- raw.dat[zlabels == levels(zlabels)[g],] - scal <- switch(EXPR=scaling, none=FALSE, .col_vars(as.matrix(tmp.dat), std=TRUE)) + scal <- switch(EXPR=scaling, none=FALSE, colVars(as.matrix(tmp.dat), std=TRUE)) scal <- switch(EXPR=scaling, pareto=sqrt(scal), scal) tmp.dat <- .scale2(as.matrix(tmp.dat), center=centering, scale=scal) if(sigmu.miss) { diff --git a/R/PlottingFunctions.R b/R/PlottingFunctions.R index c43d159..1568bf0 100644 --- a/R/PlottingFunctions.R +++ b/R/PlottingFunctions.R @@ -1200,8 +1200,8 @@ plot.Results_IMIFA <- function(x, plot.meth = c("all", "correlation", "density" graphics::title(main="Posterior Confusion Matrix") graphics::mtext(side=1, at=Gseq, Gseq, line=1) graphics::mtext(side=2, at=Gseq, rev(Gseq), line=1, las=1) - graphics::mtext(side=1, "Allocation", line=2) - graphics::mtext(side=2, "Cluster", line=2) + graphics::mtext(side=1, "Cluster", line=2) + graphics::mtext(side=2, "Allocation", line=2) suppressWarnings(heat_legend(plot.x, cols=i.cols, cex.lab=0.8, ...)) } graphics::box(lwd=2) @@ -1437,7 +1437,7 @@ plot.Results_IMIFA <- function(x, plot.meth = c("all", "correlation", "density" if(is.element(param, c("alpha", "discount"))) { plot.x <- switch(EXPR=param, alpha=clust$Alpha$alpha, discount=as.vector(clust$Discount$discount)) if(switch(EXPR=param, alpha=clust$Alpha$alpha.rate, discount=clust$Discount$disc.rate) == 0 || - (attr(x, "Discount") >= 0 && length(unique(round(plot.x, min(.ndeci(plot.x))))) == 1)) { + ((is.null(attr(x, "Discount")) || attr(x, "Discount") >= 0) && length(unique(round(plot.x, min(.ndeci(plot.x))))) == 1)) { warning(paste0(switch(EXPR=param, alpha="Acceptance", discount=ifelse(attr(x, "Kappa0"), "Acceptance", "Mutation")), " rate too low: can't plot ", ifelse(all.ind, ifelse(partial, 
"partial-", "auto-"), ""), "correlation function", ifelse(all.ind, "\n", "s\n")), call.=FALSE) next } @@ -1704,11 +1704,13 @@ plot.Results_IMIFA <- function(x, plot.meth = c("all", "correlation", "density" #' # Now plot them to examine tail behaviour as discount increases #' # (PY <- G_priorDensity(N=50, alpha=c(19.23356, 6.47006, 1), discount=c(0, 0.47002, 0.7300045))) G_priorDensity <- function(N, alpha, discount = 0, show.plot = TRUE, type = "h") { + igmp <- isNamespaceLoaded("Rmpfr") firstex <- suppressMessages(requireNamespace("Rmpfr", quietly=TRUE)) - if(isTRUE(firstex)) { + if(isFALSE(firstex)) { stop("'Rmpfr' package not installed", call.=FALSE) + } else if(isFALSE(igmp)) { on.exit(.detach_pkg("Rmpfr")) on.exit(.detach_pkg("gmp"), add=TRUE) - } else stop("'Rmpfr' package not installed", call.=FALSE) + } oldpal <- grDevices::palette() on.exit(grDevices::palette(oldpal), add=isFALSE(firstex)) defpar <- suppressWarnings(graphics::par(no.readonly=TRUE)) @@ -1771,7 +1773,7 @@ plot.Results_IMIFA <- function(x, plot.meth = c("all", "correlation", "density" log(Rmpfr::pochMpfr(alphi + 1, N - 1L)) - c(seqN, mn) * log(abs(disci)), rep(-Inf, N - mn)) } lnkd <- lapply(Nseq, function(g) Rmpfr::sumBinomMpfr(g, f=function(k) Rmpfr::pochMpfr(-k * disci, N))) - rx[,i] <- gmp::asNumeric(exp(vnk - lfactorial(Nsq2)) * abs(methods::new("mpfr", unlist(lnkd)))) + rx[,i] <- gmp::asNumeric(exp(vnk - lfactorial(Nsq2)) * abs(Rmpfr::mpfr2array(unlist(lnkd), dim=N))) } } @@ -1952,13 +1954,13 @@ plot.Results_IMIFA <- function(x, plot.meth = c("all", "correlation", "density" #' dat <- train[,ind] #' #' \donttest{# Fit an IMIFA model (warning: quite slow!) -#' # sim <- mcmc_IMIFA(dat, n.iters=100, prec.mu=1e-03, z.init="kmeans", +#' # sim <- mcmc_IMIFA(dat, n.iters=1000, prec.mu=1e-03, z.init="kmeans", #' # centering=FALSE, scaling="none") #' # res <- get_IMIFA_results(sim, zlabels=ylab) #' #' # Examine the posterior mean image of the first two clusters -#' show_IMIFA_digit(res, dat=train, ind=ind) -#' show_IMIFA_digit(res, dat=train, ind=ind, G=2)} +#' # show_IMIFA_digit(res, dat=train, ind=ind) +#' # show_IMIFA_digit(res, dat=train, ind=ind, G=2)} show_IMIFA_digit <- function(res, G = 1L, what = c("mean", "last"), dat = NULL, ind = NULL, ...) 
{ UseMethod("show_IMIFA_digit") } diff --git a/R/SimulateData.R b/R/SimulateData.R index 0d088d1..a499adb 100644 --- a/R/SimulateData.R +++ b/R/SimulateData.R @@ -103,7 +103,7 @@ sim_IMIFA_data <- function(N = 300L, G = 3L, P = 50L, Q = rep(floor(log(P)), G), anyNA(nn) || any(length(nn) != G, sum(nn) != N)) stop(paste0("'nn' must be an integer vector of length G=", G, " which sums to N=", N, " without missing values"), call.=FALSE) - if(any(nn <= 0)) warning("All 'nn' values should be strictly positive; are you sure you want to simulate empty clusters?\n", call.=FALSE, immediate.=FALSE) + if(any(nn <= 0)) warning("All 'nn' values should be strictly positive; are you sure you want to simulate empty clusters?\n", call.=FALSE, immediate.=TRUE) } else { if(!is.numeric(pis) || anyNA(pis) || @@ -119,7 +119,7 @@ sim_IMIFA_data <- function(N = 300L, G = 3L, P = 50L, Q = rep(floor(log(P)), G), iter <- iter + 1L nn0 <- any(nn <= 0) } - if(nn0) warning("Empty clusters generated; are you sure?\n Try supplying 'nn' directly instead of small 'pis\n'", call.=FALSE, immediate.=FALSE) + if(nn0) warning("Empty clusters generated; are you sure?\nTry supplying 'nn' directly instead of small 'pis'\n", call.=FALSE, immediate.=TRUE) } if(!(mumiss <- missing(mu))) { @@ -194,7 +194,7 @@ sim_IMIFA_data <- function(N = 300L, G = 3L, P = 50L, Q = rep(floor(log(P)), G), nztest > P) || nztest != floor(nztest)) stop(paste0("Every entry of 'non.zero' must be a strictly positive integer, less than or equal to P=", P), call.=FALSE) } - simdata <- base::matrix(0L, nrow=0L, ncol=P) + simdata <- .empty_mat(nc=P) true.mu <- true.l <- true.psi <- true.cov <- stats::setNames(vector("list", G), paste0("Cluster", Gseq)) sq_mat <- if(P > 50) function(x) diag(sqrt(diag(x))) else sqrt @@ -228,7 +228,7 @@ if(!all(is.symmetric(covmat), is.numeric(covmat))) stop("Invalid covariance matrix!", call.=FALSE) sigma <- if(all(Q.g > 0, P > 1, method == "marginal")) .chol(is.posi_def(covmat, make=TRUE)$X.new) else sq_mat(covmat) - means <- matrix(mu.true, nrow=N.g, ncol=P, byrow=TRUE) + switch(EXPR=method, conditional=tcrossprod(eta.true[true.zlab == g, seq_len(Q.g), drop=FALSE], l.true), 0L) + means <- base::matrix(mu.true, nrow=N.g, ncol=P, byrow=TRUE) + switch(EXPR=method, conditional=tcrossprod(eta.true[true.zlab == g, seq_len(Q.g), drop=FALSE], l.true), 0L) simdata <- rbind(simdata, means + matrnorm(N.g, P) %*% sigma) dimnames(l.true) <- list(vnames, if(Q.g > 0) paste0("Factor ", seq_len(Q.g))) true.mu[[g]] <- mu.true diff --git a/R/data.R b/R/data.R index 3cc2566..ab54008 100644 --- a/R/data.R +++ b/R/data.R @@ -16,6 +16,8 @@ #' Data on the percentage composition of eight fatty acids found by lipid fraction of 572 Italian olive oils. The data come from three areas; within each area there are a number of constituent regions, of which there are 9 in total. #' @format A data frame with 572 observations and 10 columns. The first column gives the area (one of Southern Italy, Sardinia, and Northern Italy), the second gives the region, and the remaining 8 columns give the variables. Southern Italy comprises the North Apulia, Calabria, South Apulia, and Sicily regions, Sardinia is divided into Inland Sardinia and Coastal Sardinia and Northern Italy comprises the Umbria, East Liguria, and West Liguria regions. #' @references Forina, M., Armanino, C., Lanteri, S. and Tiscornia, E. (1983). Classification of olive oils from their fatty acid composition, In Martens, H.
and Russrum Jr., H. (Eds.), \emph{Food Research and Data Analysis}, Applied Science Publishers, London, pp. 189-214. +#' +#' Forina, M. and Tiscornia, E. (1982). Pattern recognition methods in the prediction of Italian olive oil origin by their fatty acid content, \emph{Annali di Chimica}, 72:143-155. #' @examples #' data(olive, package="IMIFA") #' pairs(olive[,-(1:2)], col=olive$area) diff --git a/inst/NEWS.md b/inst/NEWS.md index 2c0ba9c..817f241 100644 --- a/inst/NEWS.md +++ b/inst/NEWS.md @@ -1,8 +1,20 @@ __Infinite Mixtures of Infinite Factor Analysers__ ================================================== +## IMIFA v2.1.2 - (_9th release [patch update]: 2020-03-30_) ### Bug Fixes -* Ensured compatibility with latest version of `Rfast` package. +* Fixes and speed-ups to MGP updates and adaptive Gibbs sampler for IMIFA/OMIFA/MIFA models: + * Fixes and speed-ups to MGP parameter updates when _some_ clusters have zero factors. + * Additional speed-ups to simulation of column-shrinkage parameters when _some_ clusters are empty. + * Properly accounted for the cluster-shrinkage parameters when the number of factors increases. + * Minor bug fixes for padding scores when the maximum number of factors increases. +* Variable-specific communalities (`x$Error$Var.Exps`) now returned by `get_IMIFA_results` in addition to +proportion of explained variance per cluster (`x$Error$Clust.Exps`; previously `x$Error$Var.Exps`). +* `G_expected` & `G_variance` gain the arg. `MPFR` to control use of suggested packages. +* Minor speed-up to `rDirichlet` for the symmetric uniform case. +* Ensured compatibility with latest version of `Rfast` package (w/ minor speed-ups). +* Removed `methods` package from `Suggests:`. +* Spell-checking of documentation and fixes to `donttest` examples. ## IMIFA v2.1.1 - (_8th release [patch update]: 2019-12-11_) ### Improvements @@ -205,7 +217,7 @@ __Infinite Mixtures of Infinite Factor Analysers__ * Optimised compression of `olive`, `coffee` and vignette data and used `LazyData: true`. * Added `call.=FALSE` to `stop()` messages and `immediate.=TRUE` to certain `warning()` calls. * Removed dependency on`adrop`, `e1071`, `graphics`, `grDevices`, `plotrix`, `stats` & `utils` libraries. -* Reduced dependency on `Rfast` w/ own versions of `colVars`, `rowVars`, & `standardise`. +* Reduced dependency on `Rfast` w/ own version of `standardise`. * Added utility function `IMIFA_news` for accessing this `NEWS` file. * Added `CITATION` file. * Extensively improved package documentation: diff --git a/man/G_moments.Rd b/man/G_moments.Rd index 5f51716..a04f629 100644 --- a/man/G_moments.Rd +++ b/man/G_moments.Rd @@ -8,11 +8,13 @@ \usage{ G_expected(N, alpha, - discount = 0) + discount = 0, + MPFR = TRUE) G_variance(N, alpha, - discount = 0) + discount = 0, + MPFR = TRUE) } \arguments{ \item{N}{The sample size.} @@ -20,6 +22,8 @@ G_variance(N, \item{alpha}{The concentration parameter. Must be specified and must be strictly greater than \code{-discount}. The case \code{alpha=0} is accommodated. When \code{discount} is negative \code{alpha} must be a positive integer multiple of \code{abs(discount)}.} \item{discount}{The discount parameter for the Pitman-Yor process. Must be less than 1, but typically lies in the interval [0, 1). Defaults to 0 (i.e. the Dirichlet process). 
When \code{discount} is negative \code{alpha} must be a positive integer multiple of \code{abs(discount)}.} + +\item{MPFR}{Logical indicating whether the high-precision libraries \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} are invoked, at the expense of run-time. Defaults to \code{TRUE} and \strong{must} be \code{TRUE} for \code{\link{G_expected}} when \code{alpha=0} and \code{\link{G_variance}} when \code{discount} is non-zero. See \strong{\code{Note}}.} } \value{ The expected number of clusters under the specified prior conditions, or the variance of the number of clusters. @@ -31,18 +35,20 @@ Calculates the \emph{a priori} expected number of clusters or the variance of th All arguments are vectorised. Users can also consult \code{\link{G_priorDensity}} in order to solicit sensible priors. } \note{ -\code{G_variance} requires use of the \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} libraries for non-zero \code{discount} values. \code{G_expected} requires these libraries only for the \code{alpha=0} case. Despite the high precision arithmetic used, the functions can be unstable for small values of \code{discount}. +\code{G_variance} requires use of the \code{\link[Rmpfr]{Rmpfr}} and \code{gmp} libraries for non-zero \code{discount} values. \code{G_expected} requires these libraries only for the \code{alpha=0} case. Despite the high precision arithmetic used, the functions can still be unstable for small values of \code{discount}. See the argument \code{MPFR}. } \examples{ -G_expected(N=50, alpha=19.23356) -G_variance(N=50, alpha=19.23356) +G_expected(N=50, alpha=19.23356, MPFR=FALSE) +G_variance(N=50, alpha=19.23356, MPFR=FALSE) +G_expected(N=50, alpha=c(19.23356, 12.21619, 1), + discount=c(0, 0.25, 0.7300045), MPFR=FALSE) # require("Rmpfr") -# G_expected(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045)) -# G_variance(N=50, alpha=c(19.23356, 12.21619, 1), discount=c(0, 0.25, 0.7300045)) +# G_variance(N=50, alpha=c(19.23356, 12.21619, 1), +# discount=c(0, 0.25, 0.7300045), MPFR=c(FALSE, TRUE, TRUE)) # Examine the growth rate of the DP -DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i)) +DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i, MPFR=FALSE)) matplot(DP, type="l", xlab="N", ylab="G") # Examine the growth rate of the PYP diff --git a/man/IMIFA-package.Rd b/man/IMIFA-package.Rd index f418e23..91a0192 100644 --- a/man/IMIFA-package.Rd +++ b/man/IMIFA-package.Rd @@ -13,8 +13,8 @@ A package for Bayesian nonparameteric clustering of high-dimensional datasets, p \itemize{ \item{Type: }{Package} \item{Package: }{IMIFA} -\item{Date: }{2019-12-11 (this version), 2017-02-02 (original release)} \item{Version: }{2.1.2} +\item{Date: }{2020-03-30 (this version), 2017-02-02 (original release)} \item{Licence: }{GPL (>=2)} } } diff --git a/man/Ledermann.Rd b/man/Ledermann.Rd index 559c993..ebf9557 100644 --- a/man/Ledermann.Rd +++ b/man/Ledermann.Rd @@ -8,7 +8,7 @@ Ledermann(P, isotropic = FALSE) } \arguments{ -\item{P}{Integer number of variables in data set. This argument is vectorized.} +\item{P}{Integer number of variables in data set. This argument is vectorised.} \item{isotropic}{Logical indicating whether uniquenesses are constrained to be isotropic, in which case the bound is simply \eqn{P-1}{P-1}. 
Defaults to \code{FALSE}.} } @@ -20,5 +20,8 @@ Returns the maximum number of latent factors in a factor analysis model for data } \examples{ Ledermann(c(25, 50, 100)) + +data(olive) +Ledermann(ncol(olive[,-c(1,2)])) } \keyword{utility} diff --git a/man/PGMM_dfree.Rd b/man/PGMM_dfree.Rd index d788350..030f4e3 100644 --- a/man/PGMM_dfree.Rd +++ b/man/PGMM_dfree.Rd @@ -18,7 +18,7 @@ PGMM_dfree(Q, \item{G}{The number of clusters. This defaults to 1. Must be a single strictly positive integer.} -\item{method}{By default, calculation assumes the \code{UUU} model with unconstrained loadings and unconstrained diagonal uniquesseses (i.e. the factor analysis model). The seven other models detailed in McNicholas and Murphy (2008) are given too (of which currently the first four are accomodated within \code{\link{mcmc_IMIFA}}). The first letter denotes whether loadings are constrained/unconstrained across clusters; the second letter denotes the same for the uniquenesses; the final letter denotes whether uniquenesses are in turn constrained to be isotropic. Finally, the 4 extra 4-letter models from the EPGMM family (McNicholas and Murphy, 2010), are also included.} +\item{method}{By default, calculation assumes the \code{UUU} model with unconstrained loadings and unconstrained diagonal uniquenesses (i.e. the factor analysis model). The seven other models detailed in McNicholas and Murphy (2008) are given too (of which currently the first four are accommodated within \code{\link{mcmc_IMIFA}}). The first letter denotes whether loadings are constrained/unconstrained across clusters; the second letter denotes the same for the uniquenesses; the final letter denotes whether uniquenesses are in turn constrained to be isotropic. Finally, the 4 extra 4-letter models from the EPGMM family (McNicholas and Murphy, 2010), are also included.} \item{equal.pro}{Logical variable indicating whether or not the mixing proportions are equal across clusters in the model (default = \code{FALSE}).} } @@ -29,7 +29,7 @@ A vector of length \code{length(Q)} giving the total number of parameters, inclu Estimates the dimension of the 'free' parameters in fully finite factor analytic mixture models, otherwise known as Parsimonious Gaussian Mixture Models (PGMM), typically necessary for the penalty term of various model selection criteria. } \note{ -This function is used to calculate the penalty terms for the \code{aic.mcmc} and \code{bic.mcmc} model selection criteria implemented in \code{\link{get_IMIFA_results}} for \emph{finite} factor models (though \code{\link{mcmc_IMIFA}} currently only implements the \code{UUU}, \code{UUC}, \code{UCU}, and \code{UCC} covariance structures). The function is vectorized with respect to the argument \code{Q}. +This function is used to calculate the penalty terms for the \code{aic.mcmc} and \code{bic.mcmc} model selection criteria implemented in \code{\link{get_IMIFA_results}} for \emph{finite} factor models (though \code{\link{mcmc_IMIFA}} currently only implements the \code{UUU}, \code{UUC}, \code{UCU}, and \code{UCC} covariance structures). The function is vectorised with respect to the argument \code{Q}. Though the function is available for standalone use, note that no checks take place, in order to speed up repeated calls to the function inside \code{\link{mcmc_IMIFA}}.
} diff --git a/man/USPSdigits.Rd b/man/USPSdigits.Rd index de16290..13ccd4a 100644 --- a/man/USPSdigits.Rd +++ b/man/USPSdigits.Rd @@ -4,7 +4,8 @@ \name{USPSdigits} \alias{USPSdigits} \title{USPS handwritten digits} -\format{A list of length 2 with the following elements, each one a \code{data.frame}: +\format{ +A list of length 2 with the following elements, each one a \code{data.frame}: \describe{ \item{\code{train}}{The training set of 7,291 digits.} \item{\code{test}}{The test set of 2,007 digits.}} @@ -12,7 +13,8 @@ Each \code{data.frame} contains the known digit labels in its first column. The remaining 256 columns give the concatenation of the 16x16 grid. -Pixels are scaled such that [-1,1] corresponds to [white,black].} +Pixels are scaled such that [-1,1] corresponds to [white,black]. +} \usage{ data(USPSdigits) } diff --git a/man/Zsimilarity.Rd b/man/Zsimilarity.Rd index 50568aa..b3fbae6 100644 --- a/man/Zsimilarity.Rd +++ b/man/Zsimilarity.Rd @@ -12,7 +12,7 @@ Zsimilarity(zs) \value{ A list containing three elements: \item{z.avg}{The 'average' clustering, with minimum squared distance to \code{z.sim}.} -\item{z.sim}{The N x N similary matrix, in a sparse format (see \code{\link[slam]{simple_triplet_matrix}}).} +\item{z.sim}{The N x N similarity matrix, in a sparse format (see \code{\link[slam]{simple_triplet_matrix}}).} \item{MSE.z}{A vector of length M recording the MSEs between each clustering and the 'average' clustering.} } \description{ diff --git a/man/bnpControl.Rd b/man/bnpControl.Rd index d5298e1..96076f6 100644 --- a/man/bnpControl.Rd +++ b/man/bnpControl.Rd @@ -6,7 +6,7 @@ \usage{ bnpControl(learn.alpha = TRUE, alpha.hyper = c(2L, 4L), - discount = NULL, + discount = 0, learn.d = TRUE, d.hyper = c(1L, 1L), ind.slice = TRUE, @@ -28,7 +28,7 @@ In the special case of \code{discount < 0}, \code{alpha} must be a positive inte \item{alpha.hyper}{\describe{ \item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{A vector of length 2 giving hyperparameters for the prior on the Pitman-Yor / Dirichlet process concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(\code{2}, \code{4}). Choosing a larger rate is particularly important, as it encourages clustering. The prior is shifted to have support on (\code{-discount}, \code{Inf}) when non-zero \code{discount} is supplied and remains fixed (i.e. \code{learn.d=FALSE}) or when \code{learn.d=TRUE}.} -\item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{A vector of length 2 giving hyperparameters a and b for the prior on the Dirichlet concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(2, 4). Note that the suplied rate will be multiplied by \code{range.G}, to encourage clustering, such that the form of the prior is Ga(a, b * G).} +\item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{A vector of length 2 giving hyperparameters a and b for the prior on the Dirichlet concentration parameter \code{alpha}. If \code{isTRUE(learn.alpha)}, these are shape and rate parameters of a Gamma distribution. Defaults to Ga(2, 4). Note that the supplied rate will be multiplied by \code{range.G}, to encourage clustering, such that the form of the prior is Ga(a, b * G).} }} \item{discount}{The discount parameter used when generalising the Dirichlet process to the Pitman-Yor process. Defaults to 0, but typically must lie in the interval [0, 1). 
If greater than zero, \code{alpha} can be supplied greater than \code{-discount}. By default, Metropolis-Hastings steps are invoked for updating this parameter via \code{learn.d}. The special case of \code{discount < 0} is allowed, in which case \code{learn.d=FALSE} is forced and \code{alpha} must be a positive integer multiple of \code{abs(discount)}.} @@ -37,7 +37,7 @@ In the special case of \code{discount < 0}, \code{alpha} must be a positive inte \item{d.hyper}{Hyperparameters for the Beta(a,b) prior on the \code{discount} parameter. Defaults to Beta(1,1), i.e. Uniform(0,1).} -\item{ind.slice}{Logical indicitating whether the independent slice-efficient sampler is to be employed (defaults to \code{TRUE}). If \code{FALSE} the dependent slice-efficient sampler is employed, whereby the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the decreasingly ordered mixing proportions.} +\item{ind.slice}{Logical indicating whether the independent slice-efficient sampler is to be employed (defaults to \code{TRUE}). If \code{FALSE} the dependent slice-efficient sampler is employed, whereby the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the decreasingly ordered mixing proportions.} \item{rho}{Parameter controlling the rate of geometric decay for the independent slice-efficient sampler, s.t. \eqn{\xi=(1-\rho)\rho^{g-1}}{xi = (1 - rho)rho^(g-1)}. Must lie in the interval [0, 1). Higher values are associated with better mixing but longer run times. Defaults to 0.75, but 0.5 is an interesting special case which guarantees that the slice sequence \eqn{\xi_1,\ldots,\xi_g}{xi_1,...,xi_g} is equal to the \emph{expectation} of the decreasingly ordered mixing proportions. Only relevant when \code{ind.slice} is \code{TRUE}.} @@ -45,10 +45,10 @@ In the special case of \code{discount < 0}, \code{alpha} must be a positive inte \item{kappa}{The spike-and-slab prior distribution on the \code{discount} hyperparameter is assumed to be a mixture with point-mass at zero and a continuous Beta(a,b) distribution. \code{kappa} gives the weight of the point mass at zero (the 'spike'). Must lie in the interval [0,1]. Defaults to 0.5. Only relevant when \code{isTRUE(learn.d)}. A value of 0 ensures non-zero discount values (i.e. Pitman-Yor) at all times, and \emph{vice versa}. Note that \code{kappa} will default to exactly 0 if \code{alpha<=0} and \code{learn.alpha=FALSE}.} -\item{IM.lab.sw}{Logial indicating whether the two forced label switching moves are to be implemented (defaults to \code{TRUE}) when running one of the infinite mixture models.} +\item{IM.lab.sw}{Logical indicating whether the two forced label switching moves are to be implemented (defaults to \code{TRUE}) when running one of the infinite mixture models.} \item{zeta}{\describe{ -\item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{Tuning parameter controlling the acceptance rate of the random-walk proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}, where \code{2 * zeta} gives the full width of the uniform proposal distribution. These steps are only invoked when either \code{discount} is non-zero and fixed or \code{learn.d=TRUE}, otherwise \code{alpha} is learned by Gibbs updates. Must be strictly positive (if invoked). 
Defauts to \code{2}.} +\item{For the "\code{IMFA}" and "\code{IMIFA}" methods:}{Tuning parameter controlling the acceptance rate of the random-walk proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}, where \code{2 * zeta} gives the full width of the uniform proposal distribution. These steps are only invoked when either \code{discount} is non-zero and fixed or \code{learn.d=TRUE}, otherwise \code{alpha} is learned by Gibbs updates. Must be strictly positive (if invoked). Defaults to \code{2}.} \item{For the "\code{OMFA}" and "\code{OMIFA}" methods:}{Tuning parameter controlling the standard deviation of the log-normal proposal for the Metropolis-Hastings steps when \code{learn.alpha=TRUE}. Must be strictly positive (if invoked). Defaults to \code{0.75}.} }} diff --git a/man/coffee.Rd b/man/coffee.Rd index 2541b5f..e978e9e 100644 --- a/man/coffee.Rd +++ b/man/coffee.Rd @@ -4,7 +4,9 @@ \name{coffee} \alias{coffee} \title{Chemical composition of Arabica and Robusta coffee samples} -\format{A data frame with 43 observations and 14 columns. The first two columns contain Variety (either Arabica or Robusta) and Country, respectively, while the remaining 12 columns contain the chemical properties.} +\format{ +A data frame with 43 observations and 14 columns. The first two columns contain Variety (either Arabica or Robusta) and Country, respectively, while the remaining 12 columns contain the chemical properties. +} \usage{ data(coffee) } diff --git a/man/get_IMIFA_results.Rd b/man/get_IMIFA_results.Rd index 9707bce..8631835 100644 --- a/man/get_IMIFA_results.Rd +++ b/man/get_IMIFA_results.Rd @@ -61,7 +61,7 @@ The Frobenius norm is used in the computation of the PPRE, by default, but the \ \item{z.avgsim}{Logical (defaults to \code{FALSE}) indicating whether the clustering should also be summarised with a call to \code{\link{Zsimilarity}} by the clustering with minimum mean squared error to the similarity matrix obtained by averaging the stored adjacency matrices, in addition to the MAP estimate. -Note that the MAP clustering is computed \emph{conditional} on the estimate of the number of clusters (whether that be the modal estimate or the estimate according to \code{criterion}) and other parameters are extracted conditional on this estimate of \code{G}: however, in constrast, the number of distinct clusters in the summarised labels obtained by specifying \code{z.avgsim=TRUE} may not necessarily coincide with the MAP estimate of \code{G}, but it may provide a useful alternative summary of the partitions explored during the chain, and the user is free to call \code{\link{get_IMIFA_results}} again with the new suggested \code{G} value. +Note that the MAP clustering is computed \emph{conditional} on the estimate of the number of clusters (whether that be the modal estimate or the estimate according to \code{criterion}) and other parameters are extracted conditional on this estimate of \code{G}: however, in contrast, the number of distinct clusters in the summarised labels obtained by specifying \code{z.avgsim=TRUE} may not necessarily coincide with the MAP estimate of \code{G}, but it may provide a useful alternative summary of the partitions explored during the chain, and the user is free to call \code{\link{get_IMIFA_results}} again with the new suggested \code{G} value. Please be warned that this feature requires loading the \code{\link[mcclust]{mcclust}} package. 
This is liable to take considerable time to compute, and may not even be possible if the number of observations &/or number of stored iterations is large and the resulting matrix isn't sufficiently sparse. When \code{z.avgsim=TRUE}, both the summarised clustering and the similarity matrix are stored: the latter can be visualised as part of a call to \code{\link{plot.Results_IMIFA}}.} @@ -104,7 +104,7 @@ For the infinite factor methods, iterations where the maximum number of factors In all cases, only iterations with \code{G} non-empty components are retained. } \note{ -For the "\code{IMIFA}", "\code{IMFA}", "\code{OMIFA}", and "\code{OMFA}" methods, the retained mixing proportions are renormalised after conditioning on the modal \code{G}. This is especially necessary for the compution of the \code{error.metrics}, just note that the values on which posterior inference are conducted will ever so slightly differ from the actually sampled values. +For the "\code{IMIFA}", "\code{IMFA}", "\code{OMIFA}", and "\code{OMFA}" methods, the retained mixing proportions are renormalised after conditioning on the modal \code{G}. This is especially necessary for the computation of the \code{error.metrics}, just note that the values on which posterior inference are conducted will ever so slightly differ from the actually sampled values. Due to the way the offline label-switching correction is performed, different runs of this function may give \emph{very slightly} different results in terms of the cluster labellings (and by extension the parameters, which are permuted in the same way), but only if the chain was run for an extremely small number of iterations, well below the number required for convergence, and samples of the cluster labels match poorly across iterations (particularly if the number of clusters suggested by those sampled labels is high). } diff --git a/man/gumbel_max.Rd b/man/gumbel_max.Rd index d132e9a..be25a04 100644 --- a/man/gumbel_max.Rd +++ b/man/gumbel_max.Rd @@ -16,7 +16,7 @@ gumbel_max(probs, A vector of N sampled cluster labels, with the largest label no greater than G. } \description{ -Samples cluster labels for N observations from G clusters efficiently using log-probabilities and the so-called Gumbel-Max trick, without requiring that the log-probabilities be normalised; thus redunant computation can be avoided. +Samples cluster labels for N observations from G clusters efficiently using log-probabilities and the so-called Gumbel-Max trick, without requiring that the log-probabilities be normalised; thus redundant computation can be avoided. } \details{ Computation takes place on the log scale for stability/underflow reasons (to ensure negligible probabilities won't cause computational difficulties); in any case, many functions for calculating multivariate normal densities already output on the log scale. @@ -50,7 +50,7 @@ If the normalising constant is required for another reason, e.g. to compute the \references{ Murphy, K., Viroli, C., and Gormley, I. C. (2019) Infinite mixtures of infinite factor analysers, \emph{Bayesian Analysis}, 1-27. <\href{https://projecteuclid.org/euclid.ba/1570586978}{doi:10.1214/19-BA1179}>. -Yellot, J. I. Jr. (1977) The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution, \emph{Journal of Mathematical Psychology}, 15: 109-144. +Yellott, J. I. Jr. 
(1977) The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution, \emph{Journal of Mathematical Psychology}, 15: 109-144. } \seealso{ \code{\link{mcmc_IMIFA}}, \code{\link[matrixStats]{rowLogSumExps}} diff --git a/man/is.posi_def.Rd b/man/is.posi_def.Rd index 13b1d8c..47cd6dd 100644 --- a/man/is.posi_def.Rd +++ b/man/is.posi_def.Rd @@ -2,7 +2,7 @@ % Please edit documentation in R/FullConditionals.R \name{is.posi_def} \alias{is.posi_def} -\title{Check Postive-(Semi)definiteness of a matrix} +\title{Check Positive-(Semi)definiteness of a matrix} \usage{ is.posi_def(x, tol = NULL, @@ -18,7 +18,7 @@ is.posi_def(x, \item{semi}{Logical switch to test for positive-semidefiniteness when \code{TRUE} or positive-definiteness when \code{FALSE} (the default).} -\item{make}{Logical switch to return the nearest matrix which satisifies the test - if the test has been passed, this is of course just \code{x} itself, otherwise the nearest positive-(semi)definite matrix. Note that for reasons due to finite precision arithmetic, finding the nearest positive-definite and nearest positive-semidefinite matrices are effectively equivalent tasks.} +\item{make}{Logical switch to return the nearest matrix which satisfies the test - if the test has been passed, this is of course just \code{x} itself, otherwise the nearest positive-(semi)definite matrix. Note that for reasons due to finite precision arithmetic, finding the nearest positive-definite and nearest positive-semidefinite matrices are effectively equivalent tasks.} } \value{ If \code{isTRUE(make)}, a list with two components: diff --git a/man/mcmc_IMIFA.Rd b/man/mcmc_IMIFA.Rd index f8fdfe3..03068cc 100644 --- a/man/mcmc_IMIFA.Rd +++ b/man/mcmc_IMIFA.Rd @@ -43,7 +43,7 @@ mcmc_IMIFA(dat, } In principle, of course, one could overfit the "\code{MFA}" or "\code{MIFA}" models, but it is recommended to use the corresponding model options which begin with `O' instead. Note that the "\code{classify}" method is not yet implemented.} -\item{range.G}{Depending on the method employed, either the range of values for the number of clusters, or the conseratively high starting value for the number of clusters. Defaults to (and must be!) \code{1} for the "\code{FA}" and "\code{IFA}" methods. For the "\code{MFA}" and "\code{MIFA}" models this is to be given as a range of candidate models to explore. For the "\code{OMFA}", "\code{OMIFA}", "\code{IMFA}", and "\code{IMIFA}" models, this is the conservatively high number of clusters with which the chain is to be initialised (default = \code{max(25, ceiling(3 * log(N)))} for large N, or \code{min(N-1, ceiling(3 * log(N)))} for small N<=50). +\item{range.G}{Depending on the method employed, either the range of values for the number of clusters, or the conservatively high starting value for the number of clusters. Defaults to (and must be!) \code{1} for the "\code{FA}" and "\code{IFA}" methods. For the "\code{MFA}" and "\code{MIFA}" models this is to be given as a range of candidate models to explore. For the "\code{OMFA}", "\code{OMIFA}", "\code{IMFA}", and "\code{IMIFA}" models, this is the conservatively high number of clusters with which the chain is to be initialised (default = \code{max(25, ceiling(3 * log(N)))} for large N, or \code{min(N-1, ceiling(3 * log(N)))} for small N<=50).
For the "\code{OMFA}", and "\code{OMIFA}" models this upper limit remains fixed for the entire length of the chain; the upper limit for the for the "\code{IMFA}" and "\code{IMIFA}" models can be specified via \code{trunc.G} (see \code{\link{bnpControl}}), which must satisfy \code{range.G <= trunc.G < N}. @@ -55,7 +55,7 @@ For methods ending in IFA, different clusters can be modelled using different nu If \code{length(range.G) * length(range.Q)} is large, consider not storing unnecessary parameters (via \code{\link{storeControl}}), or breaking up the range of models to be explored into chunks and sending each chunk to \code{\link{get_IMIFA_results}}. -See \code{\link{Ledermann}} for bounds on \code{range.Q}; this is useful in both the finite factor and infinite factor settings, as one may wish to ensure te fixed number of factors, or upper limits on the number of factors, respectively, respects this bound to yield indentifiable solutions, particularly in low-dimensional settings.} +See \code{\link{Ledermann}} for bounds on \code{range.Q}; this is useful in both the finite factor and infinite factor settings, as one may wish to ensure the fixed number of factors, or upper limits on the number of factors, respectively, respects this bound to yield indentifiable solutions, particularly in low-dimensional settings.} \item{MGP}{A list of arguments pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models "\code{IFA}", "\code{MIFA}", "\code{OMIFA}", and "\code{IMIFA}" only. Defaults are set by a call to \code{\link{mgpControl}}, with further checking of validity by \code{\link{MGP_check}} (though arguments can also be supplied here directly).} @@ -79,7 +79,7 @@ See \code{\link{bnpControl}} for further details of specifying \code{alpha} or s \item{x, object}{Object of class \code{"IMIFA"}, for the \code{print.IMIFA} and \code{summary.IMIFA} functions, respectively.} } \value{ -A list of lists of lists of class "\code{IMIFA}" to be passed to \code{\link{get_IMIFA_results}}. If the returned object is \code{x}, candidate models are accesible via subsetting, where \code{x} is of the following form: +A list of lists of lists of class "\code{IMIFA}" to be passed to \code{\link{get_IMIFA_results}}. If the returned object is \code{x}, candidate models are accessible via subsetting, where \code{x} is of the following form: \code{x[[1:length(range.G)]][[1:length(range.Q)]]}. diff --git a/man/mgpControl.Rd b/man/mgpControl.Rd index 73835ad..f801f48 100644 --- a/man/mgpControl.Rd +++ b/man/mgpControl.Rd @@ -33,7 +33,7 @@ mgpControl(alpha.d1 = 2.1, \item{prop}{Proportion of loadings elements within the neighbourhood \code{eps} of zero necessary to consider a loadings column redundant. Defaults to \code{floor(0.7 * P)/P}, where \code{P} is the number of variables in the data set. However, if the data set is univariate or bivariate, the default is \code{0.5} (see Note).} -\item{eps}{Neighbourhood epsilon of zero within which a loadings entry is considered negligible according to \code{prop}. Defaults to \code{0.1}.} +\item{eps}{Neighbourhood epsilon of zero within which a loadings entry is considered negligible according to \code{prop}. Defaults to \code{0.1}. Must be positive.} \item{adapt}{A logical value indicating whether adaptation of the number of cluster-specific factors is to take place when the MGP prior is employed. Defaults to \code{TRUE}. 
Specifying \code{FALSE} and supplying \code{range.Q} within \code{\link{mcmc_IMIFA}} provides a means to either approximate the infinite factor model with a fixed high truncation level, or to use the MGP prior in a finite factor context; however, this is NOT recommended for the "\code{OMIFA}" and "\code{IMIFA}" methods.} diff --git a/man/olive.Rd b/man/olive.Rd index 7f0da3b..c0d7090 100644 --- a/man/olive.Rd +++ b/man/olive.Rd @@ -4,7 +4,9 @@ \name{olive} \alias{olive} \title{Fatty acid composition of Italian olive oils} -\format{A data frame with 572 observations and 10 columns. The first columns gives the area (one of Southern Italy, Sardinia, and Northern Italy), the second gives the region, and the remaining 8 columns give the variables. Southern Italy comprises the North Apulia, Calabria, South Apulia, and Sicily regions, Sardinia is divided into Inland Sardinia and Coastal Sardinia and Northern Italy comprises the Umbria, East Liguria, and West Liguria regions.} +\format{ +A data frame with 572 observations and 10 columns. The first column gives the area (one of Southern Italy, Sardinia, and Northern Italy), the second gives the region, and the remaining 8 columns give the variables. Southern Italy comprises the North Apulia, Calabria, South Apulia, and Sicily regions; Sardinia is divided into Inland Sardinia and Coastal Sardinia; and Northern Italy comprises the Umbria, East Liguria, and West Liguria regions. +} \usage{ data(olive) } @@ -20,5 +22,7 @@ pairs(olive[,-(1:2)], } \references{ Forina, M., Armanino, C., Lanteri, S. and Tiscornia, E. (1983). Classification of olive oils from their fatty acid composition, In Martens, H. and Russwurm Jr., H. (Eds.), \emph{Food Research and Data Analysis}, Applied Science Publishers, London, pp. 189-214. + +Forina, M. and Tiscornia, E. (1982). Pattern recognition methods in the prediction of Italian olive oil origin by their fatty acid content, \emph{Annali di Chimica}, 72:143-155. } \keyword{datasets} diff --git a/man/pareto_scale.Rd b/man/pareto_scale.Rd index 45580c1..55513be 100644 --- a/man/pareto_scale.Rd +++ b/man/pareto_scale.Rd @@ -16,7 +16,7 @@ pareto_scale(x, The Pareto scaled version of the matrix \code{x}. } \description{ -Pareto scaling of a numeric matrix, with or without centering. Obserations are scaled by the square-root of their column-wise standard deviations. +Pareto scaling of a numeric matrix, with or without centering. Observations are scaled by the square-root of their column-wise standard deviations. } \examples{ dat <- pareto_scale(olive[,-c(1,2)]) diff --git a/man/shift_GA.Rd b/man/shift_GA.Rd index cc7cd00..97b9002 100644 --- a/man/shift_GA.Rd +++ b/man/shift_GA.Rd @@ -10,7 +10,7 @@ shift_GA(shape, param = c("rate", "scale")) } \arguments{ -\item{shape, rate}{Shape and rate parameters a and b, respectibely, of a Gamma(a, b) distribution. Both must be strictly positive.} +\item{shape, rate}{Shape and rate parameters a and b, respectively, of a Gamma(a, b) distribution. Both must be strictly positive.} \item{shift}{Modifier, such that the Gamma distribution has support on (\code{shift}, \eqn{\infty}). Can be positive or negative, though typically negative and small.} diff --git a/man/show_IMIFA_digit.Rd b/man/show_IMIFA_digit.Rd index de20231..2b0bd54 100644 --- a/man/show_IMIFA_digit.Rd +++ b/man/show_IMIFA_digit.Rd @@ -45,13 +45,13 @@ ind <- apply(train, 2, sd) > 0.7 dat <- train[,ind] \donttest{# Fit an IMIFA model (warning: quite slow!)
-# sim <- mcmc_IMIFA(dat, n.iters=100, prec.mu=1e-03, z.init="kmeans", +# sim <- mcmc_IMIFA(dat, n.iters=1000, prec.mu=1e-03, z.init="kmeans", # centering=FALSE, scaling="none") # res <- get_IMIFA_results(sim, zlabels=ylab) # Examine the posterior mean image of the first two clusters -show_IMIFA_digit(res, dat=train, ind=ind) -show_IMIFA_digit(res, dat=train, ind=ind, G=2)} +# show_IMIFA_digit(res, dat=train, ind=ind) +# show_IMIFA_digit(res, dat=train, ind=ind, G=2)} } \seealso{ \code{\link{USPSdigits}}, \code{\link{show_digit}}, \code{\link{get_IMIFA_results}}, \code{\link{mcmc_IMIFA}}, \code{\link{mat2cols}}, \code{\link{plot_cols}} diff --git a/man/storeControl.Rd b/man/storeControl.Rd index 08689c2..c8ecfc5 100644 --- a/man/storeControl.Rd +++ b/man/storeControl.Rd @@ -38,7 +38,7 @@ Supplies a list of values for logical switches indicating whether parameters of \code{\link{storeControl}} is provided for assigning values for IMIFA models within \code{\link{mcmc_IMIFA}}. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burnin and thinning). } \note{ -Posterior inference and plotting won't be posssible for parameters not stored. +Posterior inference and plotting won't be possible for parameters not stored. Non-storage of parameters will almost surely prohibit the computation of posterior predictive checking error metrics within \code{\link{get_IMIFA_results}} also. In particular, if such error metrics are desired, \code{mu.switch} and \code{psi.switch} must be \code{TRUE} for all but the "\code{FA}" and "\code{IFA}" models, \code{load.switch} must be \code{TRUE} for all but the entirely zero-factor models, and \code{pi.switch} must be \code{TRUE} for models with clustering structure and unequal mixing proportions for all but the PPRE metric. \code{score.switch=TRUE} is not required for any posterior predictive checking. diff --git a/vignettes/IMIFA.Rmd b/vignettes/IMIFA.Rmd index 16aa648..b0cd1bc 100644 --- a/vignettes/IMIFA.Rmd +++ b/vignettes/IMIFA.Rmd @@ -97,7 +97,7 @@ There are many, many options for specifying hyperparameters, specifying running and `?mixfaControl`, `?mgpControl`, `?bnpControl` etc. as needed. Arguments to these control functions can actually be supplied, provided they are named, directly to `mcmc_IMIFA`, and this convention is adopted throughout this document. -Be warned that the `mcmc_IMIFA` function calls in this section may take quite some time to run. Let's begin by fitting a Mixture of Factor Analysers model (MFA) to the unit-scaled `olive` data. For this, we must specify sequences of values for `range.G`, the number of clusters, and `range.Q`, the number of latent factors. Let's assume that uniquenesses are `isotropic` rather than `unconstrained`. This isotropic constraint provides the link between factor analysis and the probabilistic principal component analysis model (PPCA): note that we could also constrain uniqueness across clusters (but still be diagonal within each cluster) by specifiying `uni.type="constrained"` or constrain uniquenesses to a single value (i.e. equal across all clusters and all variables) by specifying `uni.type="single"`. Let's elect not to store the latent factor scores, as this can be a huge drain on memory, with the caveat that posterior inference on the scores won't be possible. 
Let's also allow diagonal covariance as a special case where `range.Q` is `0`, and accept all other defaults (for instance, cluster labels will be initialised by `mclust`). +Be warned that the `mcmc_IMIFA` function calls in this section may take quite some time to run. Let's begin by fitting a Mixture of Factor Analysers model (MFA) to the unit-scaled `olive` data. For this, we must specify sequences of values for `range.G`, the number of clusters, and `range.Q`, the number of latent factors. Let's assume that uniquenesses are `isotropic` rather than `unconstrained`. This isotropic constraint provides the link between factor analysis and the probabilistic principal component analysis model (PPCA): note that we could also constrain uniquenesses across clusters (while still being diagonal within each cluster) by specifying `uni.type="constrained"` or constrain uniquenesses to a single value (i.e. equal across all clusters and all variables) by specifying `uni.type="single"`. Let's elect not to store the latent factor scores, as this can be a huge drain on memory, with the caveat that posterior inference on the scores won't be possible. Let's also allow diagonal covariance as a special case where `range.Q` is `0`, and accept all other defaults (for instance, cluster labels will be initialised by `mclust`). ```{r, eval=FALSE} simMFA <- mcmc_IMIFA(olive, method="MFA", n.iters=10000, range.G=3:6, range.Q=0:3, centering=FALSE, @@ -115,7 +115,7 @@ MIFA doesn't entirely solve the issue of model choice, as you can see; `range.G` ```{r, eval=FALSE} simOMIFA <- mcmc_IMIFA(olive, method="OMIFA", n.iters=10000, range.G=10, alpha=0.8, - alpha.d1=3.5, nu=3, alpha.d2=7, prop=0.6, epsilon=0.12) + alpha.d1=3.5, nu=3, alpha.d2=7, prop=0.8, epsilon=0.01) ``` Finally, let's run the flagship IMIFA model, on which all subsequent demonstrations and results will be based, for a greater number of iterations, accepting the defaults for most arguments. Note that the `verbose` argument, which defaults to `TRUE`, will ordinarily print a progress bar to the console window. The default implementation uses the independent rather than dependent slice-efficient sampler; we could override the default for the parameter governing the rate of geometric decay by specifying `rho`. @@ -124,7 +124,7 @@ Finally, let's run the flagship IMIFA model, on which all subsequent demonstrati simIMIFA <- mcmc_IMIFA(olive, method="IMIFA", n.iters=50000, verbose=FALSE) ``` -## Postprocessing and extracing optimum results +## Postprocessing and extracting optimum results In order to extract results, conduct posterior inference, and compute performance metrics for MCMC samples of models from the __IMIFA__ family, we can pass the output of `mcmc_IMIFA` to the function `get_IMIFA_results()`. If, for instance, `simMFA` above were supplied, this function would find the pair of $G$ and $Q$ values which optimises a model selection criterion of our choosing and prepare results from that model only. If `simIMIFA` is supplied, this function finds the _modal_ estimates of $G$ and each $q_g$ (the cluster-specific number of latent factors), and likewise prepares results accordingly. This function can be re-run at little computational cost in order to extract different models explored by the sampler used by `mcmc_IMIFA`, without having to re-run the model itself.
New results objects using different numbers of clusters and different numbers of factors (if visited by the model in question), or using different model selection criteria (if necessary), can be generated with ease. The function also performs post-hoc corrections for label switching, as well as post-hoc Procrustes rotation to ensure sensible posterior parameter estimates, computes error metrics, and constructs credible intervals, the average similarity matrix, and the posterior confusion matrix. Please see the function's help manual by typing `?get_IMIFA_results` for further assistance with the various function arguments.
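To make this fit-once, extract-repeatedly workflow concrete, here is a minimal sketch, assuming the `simIMIFA` object fitted above; per `?olive`, the first column of `olive` holds the known area labels, and the `G=4` re-extraction is purely illustrative (it is only valid if the sampler actually visited four clusters).

```{r, eval=FALSE}
# Minimal sketch: extract results once, then revisit alternatives cheaply
resIMIFA <- get_IMIFA_results(simIMIFA, zlabels=olive[,1])
summary(resIMIFA)

# Re-extraction is cheap relative to refitting: e.g. condition on a
# different number of clusters, provided the sampler visited it
resIMIFA_alt <- get_IMIFA_results(simIMIFA, G=4)  # G=4 is illustrative only
```

Since all of this happens post hoc, alternative conditioning values and model selection criteria can be compared without ever re-running the chain itself.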
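Stepping back to the utility functions documented above, `is.posi_def` both tests a matrix and, with `make=TRUE`, returns the nearest matrix passing the test. A small sketch follows; `str()` is used deliberately so as not to assume the names of the two returned components.

```{r, eval=FALSE}
# Deliberately break positive-definiteness, then test and repair
S <- cov(matrix(rnorm(100), nrow=20, ncol=5))
S[1, 2] <- S[2, 1] <- 100        # invalidates a 2x2 principal minor
is.posi_def(S)                   # expected FALSE
str(is.posi_def(S, make=TRUE))   # inspect test result and repaired matrix
```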
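Likewise, the `pareto_scale` transformation described above amounts to dividing each (optionally centered) column by the square root of its standard deviation. In the sketch below, the assumption that centering is enabled by default is inferred from the "with or without centering" wording and should be checked against `?pareto_scale`.

```{r, eval=FALSE}
# Pareto scaling by hand versus the helper
X      <- as.matrix(olive[,-c(1,2)])
byhand <- scale(X, center=TRUE, scale=sqrt(apply(X, 2, sd)))
all.equal(as.numeric(byhand),
          as.numeric(pareto_scale(X)))  # assumes centering on by default
```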