Release mice 3.17.0 · amices/mice

Major changes

Imputing categorical data by predictive mean matching. Predictive mean matching (PMM) is the default method of mice() for imputing numerical variables, and it has long been possible to use it for factors as well. This enhancement improves the support for categorical variables in PMM. The former system translated factors into integers by ynum <- as.integer(f). However, the order of integers in ynum may have no sensible interpretation for an unordered factor. The new system quantifies ynum and could yield better results because of a higher $R^2$. The method calculates the canonical correlation between y (as a dummy matrix) and a linear combination of the imputation model predictors x. The algorithm then replaces each category of y by a single number taken from the first canonical variate. After this step, the imputation model is fitted, and the predicted values from that model are extracted to function as the similarity measure for the matching step.
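The quantification step can be illustrated with base R alone. This is a minimal sketch, not the code used inside mice: it assumes stats::cancor() as a stand-in for the canonical correlation step, and the helper name quantify_sketch and the toy data are hypothetical.

```r
# Hypothetical helper: replace each category of factor f by a single number
# taken from the first canonical variate of its dummy matrix.
quantify_sketch <- function(f, x) {
  # Dummy-code f and drop the intercept column so the matrix has full rank
  ydum <- model.matrix(~ f)[, -1, drop = FALSE]
  cca <- stats::cancor(x, ydum)
  # Score every row on the first canonical variate of y
  centered <- sweep(ydum, 2, cca$ycenter)
  drop(centered %*% cca$ycoef[, 1])
}

# Toy data: the categories follow x[, 1], but the alphabetical level order
# (a, b, c) does not, so as.integer(f) would be a poor numeric coding.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
z <- x[, 1] + rnorm(100, sd = 0.5)
f <- cut(z, breaks = c(-Inf, -0.5, 0.5, Inf), labels = c("b", "c", "a"))
f <- factor(f, levels = c("a", "b", "c"))

ynum <- quantify_sketch(f, x)
tapply(ynum, f, mean)  # one quantification per category
```

Every row in the same category receives the same score, so ynum can take the place of the integer codes when the imputation model is fitted.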
The method works for both ordered and unordered factors. No special precautions are taken to ensure monotonicity between the category numbers and the quantifications, so the method should be able to preserve quadratic and other non-monotone relations of the predicted metric. It may be beneficial to remove very sparsely filled categories, for which there is a new trim argument. To use the new technique, all you have to do is specify mice(..., method = "pmm", ...). Both numerical and categorical variables will then be imputed by PMM.
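As a minimal usage sketch, assuming mice 3.17.0 or later is installed, the call below imputes the nhanes2 example data that ships with mice. The data set, the seed and the small values of m and maxit are illustrative choices only.

```r
library(mice)

# nhanes2: age and hyp are factors, bmi and chl are numeric
imp <- mice(nhanes2, method = "pmm", m = 2, maxit = 2, seed = 1, print = FALSE)

completed <- complete(imp, 1)
str(completed)  # hyp is still a factor after imputation by PMM
```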
Potential advantages are:
- Simpler and faster than fitting a generalised linear model, e.g., logistic regression or the proportional odds model;
- Should be insensitive to the order of categories;
- No need to solve problems with perfect prediction;
- Should inherit the good statistical properties of predictive mean matching.
Note that we still lack solid evidence for these claims. (#576). Contributed @stefvanbuuren
New system-independent method for pooling: This version introduces a new function pool.table() that takes a tidy table of parameter estimates stemming from m repeated analyses. The input data must consist of three columns (parameter name, estimate, standard error) and a specification of the degrees of freedom of the model fitted to the complete data. The pool.table() function outputs 14 pooled statistics in a tidy form. The primary use of pool.table() is to support parameter pooling for techniques that have no tidy() or glance() methods, either within R or outside R. The pool.table() function also allows for novel workflows that 1) break apart the traditional pool() function into a data-wrangling part and a parameter-reducing part, and 2) do not necessarily depend on classed R objects. (#574). Contributed @stefvanbuuren
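A short sketch of this workflow, assuming mice 3.17.0 or later. The tidy table is taken from summary() on the repeated analyses, and the column names term, estimate, std.error and df.residual are assumed to follow the broom convention; a table produced outside R could be passed in the same shape.

```r
library(mice)

imp <- mice(nhanes2, m = 2, maxit = 2, seed = 1, print = FALSE)
fit <- with(imp, lm(chl ~ age + bmi + hyp))

# Data-wrangling part: one row per term per imputation, as a plain table
tbl <- summary(fit)[, c("term", "estimate", "std.error", "df.residual")]

# Parameter-reducing part: pool the table, no classed model objects needed
pooled <- pool.table(tbl)
pooled
```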
literanger: Adds support for the literanger package for rf imputation, which is about twice as fast as ranger (#648). Thanks @stephematician for the contribution.

Breaking changes

The complete(..., action = "long", ...) command puts the columns named ".imp" and ".id" in the last two positions of the long data (instead of first two positions). In this way, the columns of the imputed data will have the same positions as in the original data, which is more user-friendly and easier to work with. Note that any existing code that assumes that variables ".imp" and ".id" are in columns 1 and 2 will need to be modified. The advice is to modify the code using the variable names ".imp" and ".id". If you want the old behaviour, specify the argument order = "first". (#569). Contributed @stefvanbuuren
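The following sketch, assuming mice 3.17.0 or later, shows code that does not depend on the column positions, together with the order = "first" fallback. The nhanes data and the object names are illustrative.

```r
library(mice)

imp <- mice(nhanes, m = 2, maxit = 1, seed = 1, print = FALSE)

long <- complete(imp, action = "long")
names(long)  # inspect where .imp and .id ended up

# Robust to column order: address the bookkeeping columns by name
first_imputation <- long[long$.imp == 1, ]

# Old layout, with .imp and .id in front
long_first <- complete(imp, action = "long", order = "first")
names(long_first)
```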
Drops support for S4 and converts S4-related code to S3. The syntax as(df, "mids") is deprecated; use as.mids(df) instead.
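A minimal sketch of the replacement call, assuming mice 3.17.0 or later. The long data are created with include = TRUE so that they contain the original data next to the imputed copies; the object names are illustrative.

```r
library(mice)

imp <- mice(nhanes, m = 2, maxit = 1, seed = 1, print = FALSE)

# Long format including the original, incomplete data (.imp == 0)
long <- complete(imp, action = "long", include = TRUE)

# Convert back to a mids object with the S3 function
imp2 <- as.mids(long)
class(imp2)
```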
Adopts the broom-convention for naming lower and upper bounds of the confidence interval as "conf.low" and "conf.high". Do not use non-syntactic names anymore, like "2.5 %".
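As an illustration, assuming mice 3.17.0 or later, the confidence bounds of a pooled model can be selected by their syntactic names. The model and data are arbitrary examples.

```r
library(mice)

imp <- mice(nhanes, m = 5, maxit = 2, seed = 1, print = FALSE)
fit <- with(imp, lm(bmi ~ age))

est <- summary(pool(fit), conf.int = TRUE)

# Select the bounds by name instead of est[["2.5 %"]] and est[["97.5 %"]]
est[, c("term", "estimate", "conf.low", "conf.high")]
```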

Minor changes

Adds support for the dots argument to ranger::ranger(...) in mice.impute.rf() (#563). Contributed @edbonneville
Prepares for the deprecation of the blocks argument at various places
Removes the need for blocks in initialize_chain()
In rbind(), when formulas are concatenated and duplicate names are found, also rename the duplicated variables in formulas by their new name
Solves problem with the package documentation link
Simplifies NEWS.md formatting to get correct version sequence on CRAN and in-package NEWS
Initializes single-variable blocks in make.method() in a more efficient way (resolves #672)
Prevents as.mids() from filling the imp object for complete variables
Defines S3 class constructors for mids, mads, mira and mipo objects

Bug fixes

Fixes the "large logo" problem. (#574). Contributed @hanneoberman
Patches a bug in complete() that auto-repeated imputed values into cells that should NOT be imputed (occurred as a special case of rbind(), where the first set of rows was imputed and the second was not).
Replaces the internal variable type by the more informative pred (currently active row of predictorMatrix)
Fixes a bug in filter.mids() that incorrectly removed empty components in the imp object
Fixes a bug in ibind() that incorrectly used length(blocks) as the first dimension of the chainMean and chainVar objects
Corrects the description of the visitSequence, chainMean and chainVar components of the mids object
Fixes problems with zero predictors (#588)
Fixes a problem with the minpuc argument in quickpred() (#634)
Fixes coef() not being available on S4 objects when used with lavaan (#615, #616)
Adds .github/dependabot.yml configuration to automate the daily check (#598)
Updates documentation tags to roxygen2 7.3.1 requirements
Repairs lost braces in the documentation
Fixes an installation problem when Rprofile prints to stdout on Fedora, R version 4.1.3 (#646, #647). Thanks @brookslogan for the fix.
Fixes a bug during initialization of factor values
Removes methods and rlang from Depends
Removes export of non-user facing ampute() helpers
Clears \link statements that do not pass CRAN checks
