CLUEY

CLUEY is an R package for estimating the number of clusters in unimodal and multimodal single-cell data. CLUEY uses cell-type identity markers to guide the clustering process and performs recursive clustering to ensure that sub-populations are captured.

Dependencies

CLUEY requires both keras and tensorflow, so please install both before using the package. You can follow the instructions provided at this link.
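
Before installing CLUEY, it can help to confirm that both dependencies are present. The snippet below is a minimal sketch in base R; `missing_deps` is a hypothetical helper and not part of CLUEY, and the commented install commands are one common route rather than the steps from the linked instructions. It only checks the R packages, not the Python backend they rely on.

```r
# Report which of the given packages are not installed.
# `missing_deps` is a hypothetical helper, not part of CLUEY.
missing_deps <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

needed <- missing_deps(c("keras", "tensorflow"))

if (length(needed) > 0) {
  message("Missing packages: ", paste(needed, collapse = ", "))
  # One common route (an assumption; prefer the linked instructions if they differ):
  # install.packages(needed)
  # keras::install_keras()  # sets up the Python backend, including TensorFlow
} else {
  message("keras and tensorflow are both installed.")
}
```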

Installation

CLUEY can be installed using the following command:

library(devtools)
install_github("SydneyBioX/CLUEY")

Generating knowledge base

You can generate your own knowledge base using the generateKnowledgeBase function, as shown below:

knowledgeBase <- generateKnowledgeBase(exprsMat=logcounts(sce), celltypes=sce$cellType)
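
Since `logcounts(sce)` stores genes in rows and cells in columns, and `sce$cellType` holds one label per cell, the two inputs should line up. The following is a rough sketch of a sanity check on toy data; `check_kb_inputs` is a hypothetical helper, not a CLUEY function, and the values are made up for illustration.

```r
# Toy expression matrix: 3 genes (rows) x 4 cells (columns).
exprs <- matrix(
  c(1.2, 0.0, 3.1, 0.5, 2.2, 0.0, 0.0, 1.8, 2.9, 4.0, 0.3, 1.1),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:4))
)
celltypes <- c("B cell", "B cell", "T cell", "T cell")

# Hypothetical helper: TRUE when there is exactly one non-missing label per cell.
check_kb_inputs <- function(exprsMat, celltypes) {
  is.matrix(exprsMat) &&
    ncol(exprsMat) == length(celltypes) &&
    !anyNA(celltypes)
}

inputs_ok <- check_kb_inputs(exprs, celltypes)
print(inputs_ok)
```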

Cluster data

In this example, we load an example knowledge base generated from the Mouse Cell Atlas (FACS) and use the runCLUEY function to cluster an example query dataset subsampled from Zilionis et al.

library(CLUEY)
library(scater)
library(ggplot2)
library(gridExtra)

set.seed(3435)

# Load example knowledge base
data(mcaFACS)

# Load example query data
data(exampleData)

# Run CLUEY
# If your logcounts matrix is in dgCMatrix format, then you'll need to convert it to a matrix using `as.matrix()`
clustering_results <- runCLUEY(exprsMatRNA=as.matrix(logcounts(exampleData)), knowledgeBase=mcaFACS, kLimit=10)
#> 25/25 - 0s - 148ms/epoch - 6ms/step
#> 13/13 - 0s - 94ms/epoch - 7ms/step
#> 25/25 - 0s - 108ms/epoch - 4ms/step
#> 13/13 - 0s - 77ms/epoch - 6ms/step
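
The comment above about `dgCMatrix` input can be illustrated on a toy matrix. This is a minimal sketch using the Matrix package; `to_dense` is a hypothetical helper, not part of CLUEY, and the toy values stand in for `logcounts(exampleData)`.

```r
library(Matrix)

# Toy sparse matrix standing in for a logcounts matrix: 3 genes x 2 cells.
sparse_counts <- sparseMatrix(
  i = c(1, 2, 3), j = c(1, 2, 2), x = c(1.5, 2, 3),
  dims = c(3, 2),
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# Hypothetical helper: convert only when the input is a dgCMatrix.
to_dense <- function(x) {
  if (inherits(x, "dgCMatrix")) as.matrix(x) else x
}

dense_counts <- to_dense(sparse_counts)
print(class(sparse_counts))
print(is.matrix(dense_counts))
```

Note that converting a large sparse matrix to a dense one can use considerably more memory, so this is best done only on the matrix that is actually passed to `runCLUEY`.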

Viewing results

We can now view the results of the clustering performed by CLUEY. For this dataset, CLUEY predicts 5 clusters.

set.seed(3435)

# View the optimal number of clusters predicted by CLUEY
clustering_results$optimal_K
#> [1] 5

# We can store the results in the metadata of our SingleCellExperiment object. 
colData(exampleData) <- cbind(colData(exampleData), clustering_results$predictions)

# Run UMAP to visualise clusters
exampleData <- runPCA(exampleData)
exampleData <- runUMAP(exampleData)
umap <- data.frame(reducedDim(exampleData, "UMAP"))
umap$cluster <- as.factor(exampleData$cluster)
umap$correlation <- exampleData$correlation

ggplot(umap, aes(x=UMAP1, y=UMAP2, color=cluster)) + geom_point() + theme_classic()
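
Alongside the UMAP plot, a quick tabular summary of the predictions can be useful. The sketch below assumes that `clustering_results$predictions` carries `cluster` and `correlation` columns, as the code above suggests; the data frame here is a toy stand-in with made-up values so that the snippet runs on its own.

```r
# Toy stand-in for clustering_results$predictions (hypothetical values).
predictions <- data.frame(
  cluster = c("1", "1", "2", "2", "2", "3"),
  correlation = c(0.82, 0.78, 0.65, 0.71, 0.69, 0.90)
)

# Number of cells assigned to each cluster.
cluster_sizes <- table(predictions$cluster)

# Mean correlation value within each cluster.
mean_correlation <- tapply(predictions$correlation, predictions$cluster, mean)

print(cluster_sizes)
print(round(mean_correlation, 2))
```

With real output, replacing the toy data frame with `clustering_results$predictions` should give the same kind of summary, provided the column names match.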
