- Julia (recommended), including JSON.jl, ArgParse.jl
- Python (recommended)
- CmdStan, or any other Stan interface
The folder stan contains the following models:
- tg_zi_single: Telegraph model with zero inflation, single allele
- tg_zi_vol_single: Telegraph model with zero inflation and cell volume, single allele
- tg_zi_vol_pooled: Telegraph model with zero inflation and cell volume, two alleles (pooled)
These models can be run using CmdStan.
The input to Stan is provided as a JSON file containing the following fields:
- int ncells: Number of cells
- int ngenes: Number of genes
- int N: Number of observations
- int cells[N]: Cell corresponding to each observation
- int genes[N]: Gene corresponding to each observation
- int counts[N]: Measured mRNA number for each observation
The script convert_data.jl can be used to convert a count matrix (stored as a CSV file) into this format, skipping missing values.
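As a rough illustration of the long format described above, the following Python sketch flattens a small count matrix into the listed JSON fields. It is not the code of convert_data.jl. The function name count_matrix_to_stan_json and the MISSING set are hypothetical, and the sketch assumes a header row, 1-based cell and gene indices (Stan arrays are 1-based), and missing values written as empty cells, "NA" or "missing".

```python
import csv
import io
import json

# Assumption: these strings mark a missing value in the CSV.
MISSING = {"", "NA", "missing"}


def count_matrix_to_stan_json(csv_text):
    """Flatten a cells-by-genes count matrix into long-format fields."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, body = rows[0], rows[1:]  # assumption: first row is a header
    cells, genes, counts = [], [], []
    for i, row in enumerate(body, start=1):           # 1-based cell index
        # The first column is ignored; the rest are genes.
        for j, value in enumerate(row[1:], start=1):  # 1-based gene index
            value = value.strip()
            if value in MISSING:
                continue  # skip missing observations
            cells.append(i)
            genes.append(j)
            counts.append(int(float(value)))
    return {
        "ncells": len(body),
        "ngenes": len(header) - 1,
        "N": len(counts),
        "cells": cells,
        "genes": genes,
        "counts": counts,
    }


example = "cell,geneA,geneB\nc1,3,\nc2,0,7\n"
data = count_matrix_to_stan_json(example)
print(json.dumps(data))
```

In this example the empty entry for geneB in the first cell is dropped, so N is 3 even though the matrix has 4 entries.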
The sampler computes the posterior over the following quantities:
- mu: Mean mRNA number
- b: Mean burst size
- dur: Mean burst duration
- p0: Amount of zero inflation
- alpha: Volume dependence (if present)
The following values are computed for convenience:
- rho: Transcription rate
- sigma_on: On switching rate
- sigma_off: Off switching rate
- vol: Volume scaling factor (if present)
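For orientation, the sketch below shows how such rates could follow from mu, b and dur under one common telegraph-model convention. This is an assumption, not necessarily the parametrisation used in the Stan models: time is measured in units of the mRNA lifetime (degradation rate 1), dur = 1/sigma_off, b = rho/sigma_off, and mu = rho * sigma_on / (sigma_on + sigma_off). Volume dependence is ignored, and the function name telegraph_rates is hypothetical.

```python
def telegraph_rates(mu, b, dur):
    """Convert (mu, b, dur) into (rho, sigma_on, sigma_off).

    Assumed convention (degradation rate normalised to 1):
      dur = 1 / sigma_off
      b   = rho / sigma_off
      mu  = rho * sigma_on / (sigma_on + sigma_off)
    """
    sigma_off = 1.0 / dur
    rho = b / dur
    if rho <= mu:
        # The mean cannot reach the rate of a gene that is always on.
        raise ValueError("need b / dur > mu for a positive sigma_on")
    # Solve mu = rho * sigma_on / (sigma_on + sigma_off) for sigma_on.
    sigma_on = mu * sigma_off / (rho - mu)
    return rho, sigma_on, sigma_off


rho, sigma_on, sigma_off = telegraph_rates(mu=2.0, b=5.0, dur=0.5)
print(rho, sigma_on, sigma_off)
```

Under these assumptions the example values are self-consistent: rho * sigma_on / (sigma_on + sigma_off) recovers the input mu.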
Note: The output also contains the latent variables raw_betas, one per observation, which take up a lot of memory or disk space for large batches. These and other internal variables can be removed using the script postprocess_csv.py.
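The kind of trimming described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the code of postprocess_csv.py. It assumes the usual Stan CSV layout (comment lines starting with "#", one header row, unquoted comma-separated values) and that the latent columns are named raw_betas.1, raw_betas.2 and so on. The function name drop_columns is hypothetical.

```python
def drop_columns(lines, prefixes=("raw_betas",)):
    """Yield Stan CSV lines without columns whose name starts with a prefix."""
    keep = None  # indices of the columns to keep, set from the header row
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # comment and blank lines pass through unchanged
            continue
        fields = line.rstrip("\n").split(",")
        if keep is None:
            # The first non-comment line is the header row.
            keep = [i for i, name in enumerate(fields)
                    if not name.startswith(tuple(prefixes))]
        yield ",".join(fields[i] for i in keep) + "\n"


raw = [
    "# model = tg_zi_single_model\n",
    "lp__,mu.1,raw_betas.1,raw_betas.2,b.1\n",
    "-1.0,2.5,0.1,0.2,3.0\n",
]
trimmed = list(drop_columns(raw))
print("".join(trimmed), end="")
```

Because the function works line by line, it can be applied to a stream, for example with sys.stdout.writelines(drop_columns(sys.stdin)), without loading a large sample file into memory.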
- Convert mRNA count matrix to JSON:
julia scripts/convert_data.jl countmat.csv countmat.json
The input must be a CSV file containing mRNA counts. Each row represents a cell, and each column a gene (except for the first column, which is ignored). Missing values are allowed.
- Run Stan:
stan/tg_zi_vol_single sample data file=countmat.json output file=raw_samples.csv
See the CmdStan user guide for more information on the available sampler options.
- (Optional) Trim output files:
python scripts/postprocess_csv.py <raw_samples_1.csv >samples_1.csv
...