Reproducibility Notes For Monocle3 Analysis

Summary

A few functions in the Monocle3 package use a random number generator, which can cause the results of an analysis to vary each time a script is run. This document identifies those commands and gives work-arounds that make their outputs consistent between runs.

This document will be updated as necessary if patches are introduced to the Monocle3 package. For more information, and to check whether there are new patches, please visit the Monocle3 GitHub page.

reduce_dimension()

The following options need to be added to the reduce_dimension() command:

  • umap.fast_sgd = FALSE
  • cores=1
  • n_sgd_threads=1

Here is an example:

cds <- reduce_dimension(cds, umap.fast_sgd = FALSE, cores = 1, n_sgd_threads = 1)
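One way to confirm that settings like these have the intended effect is to run the step more than once and compare the outputs. The sketch below is a minimal base-R helper for that; `is_reproducible`, `seeded_step`, and `unseeded_step` are hypothetical names introduced here for illustration, and the Monocle3 usage in the trailing comment assumes that `cds` has already been preprocessed.

```r
# Run a zero-argument function `step` n times and report whether
# every run returned exactly the same object as the first run.
is_reproducible <- function(step, n = 2) {
  runs <- lapply(seq_len(n), function(i) step())
  all(vapply(runs[-1], function(r) identical(r, runs[[1]]), logical(1)))
}

# Base-R stand-ins for a stochastic analysis step
seeded_step   <- function() { set.seed(17); runif(5) }
unseeded_step <- function() runif(5)

is_reproducible(seeded_step)    # TRUE: the seed is reset before every run
is_reproducible(unseeded_step)  # FALSE: each run draws fresh random numbers

# Hypothetical Monocle3 usage (not run here):
# is_reproducible(function() {
#   out <- reduce_dimension(cds, umap.fast_sgd = FALSE, cores = 1, n_sgd_threads = 1)
#   SingleCellExperiment::reducedDims(out)[["UMAP"]]
# })
```

Comparing with identical() is deliberately strict: any difference in the returned coordinates, however small, is reported as a failure.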

cluster_cells()

The following command and option need to be added when running cluster_cells():

  • set.seed(<integer>) needs to be called immediately before cluster_cells()
  • the option random_seed = <integer> needs to be included in the cluster_cells() call itself

Here is an example:

set.seed(17)

cds = cluster_cells(cds, random_seed = 17)

find_gene_modules()

The following command and option need to be added when running find_gene_modules():

  • set.seed(<integer>) needs to be called immediately before find_gene_modules()
  • the option random_seed = <integer> needs to be included in the find_gene_modules() call itself

Here is an example:

set.seed(17)

gene_module_df <- find_gene_modules(cds[pseu_pr_deg_ids,], random_seed = 17)
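Because cluster_cells() and find_gene_modules() take the same pair of settings, the two steps can be bundled into a small wrapper so that the set.seed() call cannot drift away from the command it belongs to. This is only a sketch: `run_seeded` and `mock_step` are hypothetical names, not part of Monocle3, and the mock function stands in for any command that accepts a random_seed argument.

```r
# Hypothetical helper: seed R's random number generator, then immediately
# call `fun`, passing the same value as its random_seed argument.
run_seeded <- function(fun, ..., seed = 17) {
  set.seed(seed)
  fun(..., random_seed = seed)
}

# Stand-in for a stochastic command that accepts random_seed
mock_step <- function(n, random_seed = NULL) sample.int(100, n)

first  <- run_seeded(mock_step, 5)
second <- run_seeded(mock_step, 5)
identical(first, second)  # TRUE

# With Monocle3 loaded, the calls above would become (not run here):
# cds <- run_seeded(cluster_cells, cds)
# gene_module_df <- run_seeded(find_gene_modules, cds[pseu_pr_deg_ids, ])
```

Using a single `seed` argument for both set.seed() and random_seed follows the recommendation to keep the same integer throughout a script.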

Additional Notes

The initial basis for these work-arounds is described in this issue page on the Monocle3 GitHub repository.

For the reduce_dimension() command, setting cores and n_sgd_threads to 1 can increase the time it takes to run on very large datasets.

The set.seed() command needs to come immediately before each of the commands listed above. Calling it once at the start of a script is not sufficient. The integers do not all have to be the same value throughout the script, but using one value is recommended for simplicity.
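The placement requirement follows from how R's random number generator behaves in general: every random draw advances the generator's state, so any stochastic call that runs between set.seed() and the command of interest changes what that command receives. The base-R sketch below illustrates this. It does not use Monocle3, and the variable names are arbitrary.

```r
# Seed set immediately before the draw we care about
set.seed(17)
reference <- runif(3)

# Same seed, but another random call runs first and advances the state
set.seed(17)
invisible(rnorm(1))
shifted <- runif(3)

# Seed set immediately before the draw again
set.seed(17)
repeated <- runif(3)

identical(reference, repeated)  # TRUE
identical(reference, shifted)   # FALSE
```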