changebio/scDenorm


scDenorm

scDenorm is an algorithm that reverts normalised single-cell omics data to raw counts, preserving the integrity of the original measurements and ensuring consistent data processing during integration.
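
To make the idea concrete, here is a minimal sketch of the inversion for a single cell. It is not scDenorm's actual implementation. It assumes the data were produced by scaling each cell's counts by one per-cell factor and then applying a log(1 + x) transform (as `sc.pp.normalize_total` followed by `sc.pp.log1p` does), and that the smallest non-zero count in the cell is 1. The function name `denormalise_cell` is hypothetical.

```python
import math

def denormalise_cell(values, base=math.e):
    """Recover integer counts from one cell's log-normalised values.

    Assumes values[i] = log_base(1 + counts[i] * scale) with one unknown
    per-cell scale, and that the smallest non-zero count is 1.
    """
    # Undo the log transform: base**v - 1 gives counts * scale.
    if base == math.e:
        scaled = [math.expm1(v) for v in values]
    else:
        scaled = [base ** v - 1.0 for v in values]
    nonzero = [s for s in scaled if s > 0]
    if not nonzero:
        return [0] * len(values)
    # Under the assumption above, the smallest non-zero value is 1 * scale.
    scale = min(nonzero)
    return [round(s / scale) for s in scaled]

# Forward-normalise a toy cell, then invert it.
counts = [0, 1, 2, 0, 3, 1]
target_sum = 1e4
scale = target_sum / sum(counts)
normalised = [math.log1p(c * scale) for c in counts]
recovered = denormalise_cell(normalised)
print(recovered)
```

If a cell contains no gene with a count of exactly 1, this sketch would return counts divided by the smallest observed count, so the assumption matters.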

Install

pip install scDenorm

#or

conda install -c changebio scdenorm

Dependencies

numpy
pandas
matplotlib
scanpy
anndata
scipy
tqdm
pathlib
fastcore
colorlog

How to use

Using pbmc3k as an example dataset

import scanpy as sc
from scipy.io import mmwrite
from scDenorm.denorm import *
DEBUG:my_logger:This is a debug message
INFO:my_logger:This is an info message
WARNING:my_logger:This is a warning message
ERROR:my_logger:This is an error message
CRITICAL:my_logger:This is a critical message
ad=sc.datasets.pbmc3k()
ad.layers['count']=ad.X.copy()
ad
AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'
    layers: 'count'
sc.pp.normalize_total(ad, target_sum=1e4)
sc.pp.log1p(ad)
smtx = ad.X.tocsr().asfptype()
smtx.data
array([1.6352079, 1.6352079, 2.2258174, ..., 1.7980369, 1.7980369,
       2.779648 ], dtype=float32)
ad.write_h5ad('data/pbmc3k_norm.h5ad')

Write a few rows out as a sparse matrix in Matrix Market format:

mmwrite('data/scaled.mtx', smtx[1:10,])

In Jupyter

Input AnnData

scdenorm('data/pbmc3k_norm.h5ad',fout='data/pbmc3k_denorm.h5ad',verbose=1)
INFO:my_logger:Reading input file: data/pbmc3k_norm.h5ad
/home/huang_yin/anaconda3/envs/sc/lib/python3.9/site-packages/anndata/__init__.py:51: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
  warnings.warn(
INFO:my_logger:The dimensions of this data are (2700, 32738).
INFO:my_logger:Selecting base
INFO:my_logger:Denormlizing ...the base is 2.718281828459045

b is 2.718281828459045

100%|██████████| 2700/2700 [00:02<00:00, 1071.27it/s]
INFO:my_logger:Writing output file: data/pbmc3k_denorm.h5ad
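
The log above reports a "Selecting base" step that settles on e. One plausible way to choose a base is sketched below. This is an assumption for illustration, not a description of scDenorm's internals. For each candidate base, the values are inverted and divided by the smallest result. The base that leaves the ratios closest to whole numbers wins. The names `integer_error` and `select_base` and the candidate list are hypothetical.

```python
import math

def integer_error(values, base):
    """Mean distance from the nearest integer after inverting with `base`."""
    scaled = [base ** v - 1.0 for v in values if v > 0]
    if not scaled:
        return 0.0
    scale = min(scaled)  # assumed to correspond to a count of 1
    ratios = [s / scale for s in scaled]
    return sum(abs(r - round(r)) for r in ratios) / len(ratios)

def select_base(values, candidates=(math.e, 2.0, 10.0)):
    """Pick the candidate base whose inversion looks most like integer counts."""
    return min(candidates, key=lambda b: integer_error(values, b))

# Toy cell normalised with two different log bases.
counts = [1, 2, 5, 3, 1, 8]
scale = 1e4 / sum(counts)
natural = [math.log1p(c * scale) for c in counts]
log2 = [math.log2(1 + c * scale) for c in counts]

print(select_base(natural) == math.e)
print(select_base(log2) == 2.0)
```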

If no output path is given, scdenorm returns a new AnnData object.

new_ad=scdenorm('data/pbmc3k_norm.h5ad')
new_ad
View of AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'
    uns: 'log1p'
ad.layers['count'].data
array([1., 1., 2., ..., 1., 1., 3.], dtype=float32)
new_ad.X.data
array([1.       , 1.       , 2.0000002, ..., 1.       , 1.       ,
       3.       ], dtype=float32)
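
Deviations such as 2.0000002 are consistent with float32 rounding error. Rounding the recovered values gives back the integer counts. The sketch below checks this on the six values printed above, with plain Python lists standing in for `ad.layers['count'].data` and `new_ad.X.data`. The elided middle of the arrays is not reproduced.

```python
# Values copied from the printed output above (first three and last three).
original = [1.0, 1.0, 2.0, 1.0, 1.0, 3.0]
recovered = [1.0, 1.0, 2.0000002, 1.0, 1.0, 3.0]

# Largest absolute deviation between recovered and original values.
max_dev = max(abs(r - o) for r, o in zip(recovered, original))

# Round to the nearest integer and compare with the original counts.
rounded = [round(r) for r in recovered]
matches = rounded == [int(o) for o in original]

print(max_dev < 1e-6, matches)
```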

Input a cell-by-gene sparse matrix

By default the matrix is read as cells by genes. If it is stored as genes by cells, set gxc=True.

scdenorm('data/scaled.mtx',fout='data/scd_scaled.h5ad')
100%|██████████| 9/9 [00:00<00:00, 2883.12it/s]

On the command line

Input AnnData

!scdenorm data/pbmc3k_norm.h5ad --fout data/pbmc3k_denorm.h5ad
/home/huang_yin/anaconda3/envs/sc/lib/python3.9/site-packages/anndata/__init__.py:51: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
  warnings.warn(
b is 2.718281828459045
100%|█████████████████████████████████████| 2700/2700 [00:02<00:00, 1090.85it/s]

Input a cell-by-gene sparse matrix

!scdenorm data/scaled.mtx --fout data/scd_scaled_c.h5ad
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1333.31it/s]

Or write the output in mtx format:

!scdenorm data/scaled.mtx --fout data/scd_scaled_c.mtx
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1290.78it/s]

Citation

Yin Huang, Anna Vathrakokili Pournara, Ying Ao, Lirong Yang, Hui Zhang, Yongjian Zhang, Sheng Liu, Alvis Brazma, Irene Papatheodorou, Xinlu Yang, Ming Shi, Zhichao Miao. "scDenorm: a denormalisation tool for integrating single-cell transcriptomics data" (under review).
