CLUEY

This is an R package for estimating the number of clusters in unimodal and multimodal single-cell data. CLUEY uses cell-type identity markers to guide the clustering process and performs recursive clustering to ensure that sub-populations are captured.

Dependencies

CLUEY requires both keras and tensorflow, so please make sure both are installed before using the package. You can follow the instructions provided at this link.

Installation

CLUEY can be installed using the following command:

library(devtools)
install_github("SydneyBioX/CLUEY")

Generating knowledge base

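You can generate your own knowledge base using the generateKnowledgeBase function, as shown below:

# `sce` is assumed to be a SingleCellExperiment with a logcounts assay and cell-type labels in sce$cellType
knowledgeBase <- generateKnowledgeBase(exprsMat=logcounts(sce), celltypes=sce$cellType)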
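
Before building a knowledge base, it can help to check that the expression matrix and the labels line up. The following is a minimal sketch in base R, assuming (as the call above suggests) that columns of the expression matrix correspond to cells and that there is one label per cell. The toy matrix and labels are hypothetical stand-ins for `logcounts(sce)` and `sce$cellType`, and the check is not part of CLUEY itself.

```r
# Toy stand-ins (hypothetical values) for logcounts(sce) and sce$cellType
exprs <- matrix(seq_len(12) / 4, nrow = 3,
                dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
celltypes <- c("T cell", "T cell", "B cell", "Monocyte")

# One label per column (cell), and no missing labels
stopifnot(ncol(exprs) == length(celltypes), !anyNA(celltypes))

# Number of cells available per cell type
cells_per_type <- table(celltypes)
print(cells_per_type)
```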

Cluster data

In this example, we load an example knowledge base generated from the Mouse Cell Atlas (FACS) and use the runCLUEY function to cluster an example query dataset that was subsampled from Zilionis et al.

library(CLUEY)
library(scater)
library(ggplot2)
library(gridExtra)

set.seed(3435)

# Load example knowledge base
data(mcaFACS)

# Load example query data
data(exampleData)

# Run CLUEY
# If your logcounts matrix is in dgCMatrix format, then you'll need to convert it to a matrix using `as.matrix()`
clustering_results <- runCLUEY(exprsMatRNA=as.matrix(logcounts(exampleData)), knowledgeBase=mcaFACS, kLimit=10)
#> 25/25 - 0s - 148ms/epoch - 6ms/step
#> 13/13 - 0s - 94ms/epoch - 7ms/step
#> 25/25 - 0s - 108ms/epoch - 4ms/step
#> 13/13 - 0s - 77ms/epoch - 6ms/step
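
The sparse-to-dense conversion mentioned in the comment above can be sketched on its own. This is a small illustration using the Matrix package, with a toy sparse matrix standing in for `logcounts(exampleData)`. The helper name `to_dense` is hypothetical and is not part of CLUEY.

```r
library(Matrix)
library(methods)

# Toy dgCMatrix (hypothetical values) standing in for a sparse logcounts matrix
sparse_mat <- sparseMatrix(
  i = c(2, 3, 1), j = c(1, 2, 3), x = c(1.2, 3.4, 2.1),
  dims = c(3, 4),
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Hypothetical helper: convert sparse input to a base R matrix, leave dense input alone
to_dense <- function(mat) {
  if (is(mat, "sparseMatrix")) as.matrix(mat) else mat
}

dense_mat <- to_dense(sparse_mat)
class(sparse_mat)  # sparse class before conversion
class(dense_mat)   # base matrix after conversion
```

Note that a dense copy of a large sparse matrix can use considerably more memory, so it may be worth checking the matrix dimensions first.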

Viewing results

We can now view the results of the clustering performed by CLUEY. CLUEY predicts that there are 5 clusters in the data.

set.seed(3435)

# View the optimal number of clusters predicted by CLUEY
clustering_results$optimal_K
#> [1] 5

# We can store the results in the metadata of our SingleCellExperiment object. 
colData(exampleData) <- cbind(colData(exampleData), clustering_results$predictions)

# Run UMAP to visualise clusters
exampleData <- runPCA(exampleData)
exampleData <- runUMAP(exampleData)
umap <- data.frame(reducedDim(exampleData, "UMAP"))
umap$cluster <- as.factor(exampleData$cluster)
umap$correlation <- exampleData$correlation

ggplot(umap, aes(x=UMAP1, y=UMAP2, color=cluster)) + geom_point() + theme_classic()
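
Alongside the UMAP, a quick numeric summary per cluster can be useful. The sketch below uses base R on a toy data frame with `cluster` and `correlation` columns, mirroring the column names used above. The values are hypothetical and are not CLUEY output.

```r
# Toy predictions (hypothetical values) with the same column names as above
predictions <- data.frame(
  cluster = c("1", "1", "2", "2", "2", "3"),
  correlation = c(0.8, 0.6, 0.9, 0.7, 0.8, 0.5)
)

# Number of cells assigned to each cluster
cluster_sizes <- table(predictions$cluster)

# Mean of the correlation column within each cluster
mean_correlation <- tapply(predictions$correlation, predictions$cluster, mean)

print(cluster_sizes)
print(mean_correlation)
```

With real results, the same two calls could be applied to `exampleData$cluster` and `exampleData$correlation`.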
