scPropensity

In single-cell sequencing datasets, cells are often annotated with metadata (e.g., batch, sample ID, condition). scPropensity is a global measure of the relationship between metadata assignment and molecular similarity across cells.

scPropensity was applied to the analysis of cancer clones (see [Nadalin et al.]): we asked whether two clones have a more or less similar transcriptional profile than expected by chance. Using scPropensity we computed a clone-clone transcriptional similarity, which we used to classify clones into distinct transcriptional groups (lineages) and evaluate their molecular heterogeneity.

Description

scPropensity is inspired by a concept from structural bioinformatics called statistical potential, which is used to evaluate the likelihood of a protein complex model via a pseudo-energy function computed from a database of experimental protein structures.

In particular, pair propensity scores are derived from amino acid pairings at the protein-protein interface and are defined as p(x,y) = F(x,y)/G(x,y), where F(x,y) is the observed frequency of pair (x,y) and G(x,y) is the expected frequency of pair (x,y). Depending on the value of p(x,y), x and y are more (> 1), less (< 1) or equally (= 1) likely to be in contact with each other than expected by chance.
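As a small worked example of the ratio above, the following R sketch computes p(x,y) from observed and expected frequencies. The counts and the pair names (AA, AB, BB) are hypothetical and chosen only to show the three possible outcomes.

```r
# Hypothetical frequencies for three pairs (illustration only).
observed <- c(AA = 30, AB = 10, BB = 20)  # F(x,y)
expected <- c(AA = 20, AB = 20, BB = 20)  # G(x,y)

# Pair propensity p(x,y) = F(x,y) / G(x,y)
propensity <- observed / expected

# Interpret each score relative to 1
interpretation <- ifelse(propensity > 1, "more than expected",
                  ifelse(propensity < 1, "less than expected", "as expected"))

print(data.frame(propensity, interpretation))
```

Here AA gets a score of 1.5 (enriched), AB gets 0.5 (depleted) and BB gets 1 (as expected by chance).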

Here, x and y are cell labels. A cell-cell similarity measure is derived from the assay (gene expression, chromatin accessibility state, etc.) and is used to build a k-nn graph, where nodes are cells and a directed edge connects cell i to cell j if and only if j is one of the k closest cells to i according to this measure.
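The directed k-nn graph described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the toy coordinates, the use of Euclidean distance and the variable names are all assumptions made for the example.

```r
# Toy data: 6 cells x 2 features (hypothetical; stands in for e.g. a
# low-dimensional embedding of the assay).
coords <- rbind(c(0, 0), c(0, 1), c(1, 0),
                c(10, 10), c(10, 11), c(11, 10))
rownames(coords) <- paste0("cell", 1:6)
k <- 2

# Pairwise Euclidean distances (assumed similarity measure)
d <- as.matrix(dist(coords))
diag(d) <- Inf  # a cell is never its own neighbour

# Row i of nn holds the indices of the k cells closest to cell i
nn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Directed edge list: one edge i -> j for each neighbour j of i
edges <- data.frame(from = rep(seq_len(nrow(nn)), times = k),
                    to   = as.vector(nn))
print(edges)
```

Every cell has exactly k outgoing edges, so the graph contains k times the number of cells edges in total. The relation is not symmetric in general: j can be among the k nearest neighbours of i without i being among those of j.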

F(x,y) is defined as the number of edges (i,j) in the k-nn graph such that i is labelled with x and j is labelled with y; G(x,y) is the expected number of edges labelled with (x,y) given the neighbourhood size k and the number of cells labelled with x and y, respectively (see [Nadalin et al.] for details). Therefore, p(x,y) tells whether cells labelled with x tend to be more (> 1), less (< 1) or equally (= 1) similar to the cells labelled with y than expected by chance.
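The counting step can be sketched in R as below. F(x,y) follows the definition above directly. For G(x,y), the exact definition is given in [Nadalin et al.] and is not reproduced here; the sketch instead assumes a simple null model in which each of the k outgoing edges of a cell points to a uniformly chosen other cell. The labels, the edge list and all variable names are hypothetical.

```r
# Hypothetical labels for 6 cells and a directed k-nn edge list (k = 2)
labels <- c("A", "A", "A", "B", "B", "B")
k <- 2
edges <- data.frame(from = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6),
                    to   = c(2, 1, 1, 5, 4, 4, 3, 3, 2, 6, 6, 5))
lev <- sort(unique(labels))

# F(x,y): number of edges (i,j) with i labelled x and j labelled y
F_obs <- unclass(table(from = factor(labels[edges$from], levels = lev),
                       to   = factor(labels[edges$to],   levels = lev)))

# G(x,y) under the ASSUMED null model (not necessarily the paper's):
# each edge leaving a cell labelled x hits a cell labelled y with
# probability (n_y - [x == y]) / (N - 1).
n <- as.vector(table(factor(labels, levels = lev)))
names(n) <- lev
N <- length(labels)
G_exp <- k * (n %o% n - diag(n, nrow = length(n))) / (N - 1)

# Pair propensity
p <- F_obs / G_exp
print(p)
```

In this toy graph all edges stay within a label, so p is 2.5 for (A,A) and (B,B) and 0 for the mixed pairs. Under the assumed null model the entries of G sum to the total number of edges, which is a quick sanity check.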

Requirements

R v4.0.3
Seurat v4.0.5

Instructions

scPropensity is implemented in R. It takes as input a Seurat object and a metadata field ID. To compute the pair propensity scores on object.Rds with respect to sample.name, run:

scPropensity(object.file = "object.Rds", slot = "sample.name", outdir = "dir")

The above function builds a k-nn graph, computes the pair propensities of the labels in object@meta.data$sample.name and creates a folder dir containing the output files. The output includes an n x n matrix M, where n is the number of distinct values in sample.name and M[x,y] is the log pair propensity of (x,y), as well as the L2-normalised version of M.
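To make the two output matrices concrete, here is a small R sketch of a log transform followed by L2 normalisation. It is an illustration under stated assumptions, not the package's code: the propensity values are hypothetical, the natural logarithm is assumed, and the normalisation is assumed to be applied row by row.

```r
# Hypothetical pair propensities for two labels
p <- matrix(c(2.5, 0.4,
              0.4, 2.0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("A", "B")))

# Log pair propensity: positive = enriched, negative = depleted
# (natural log assumed)
M <- log(p)

# L2 normalisation, ASSUMED row-wise: each row is divided by its
# Euclidean norm so that every row has unit length
row_norm <- sqrt(rowSums(M^2))
M_l2 <- M / row_norm

print(M)
print(M_l2)
```

On the log scale the neutral value moves from 1 to 0, so the sign of M[x,y] directly indicates enrichment or depletion. A pair with zero observed edges has a propensity of 0 and therefore a log value of -Inf in this sketch.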

Citing

If you find this software useful, please cite:

Nadalin et al. Multi-omic lineage tracing predicts the transcriptional, epigenetic and genetic determinants of cancer evolution. Nature Communications. https://doi.org/10.1101/2023.06.28.546923
