-
Notifications
You must be signed in to change notification settings - Fork 34
[SCAVENGE] Preparing your scATAC seq data
FLYu edited this page Dec 1, 2022
·
5 revisions
How to generate lsimat50
,SE_gvar
and SE_gvar_bg
, which can be seamlessly fed into the SCAVENGE analysis.
If you use the ArchR pipeline to process your scATAC-seq data, you may follow the following code to produce these objects.
proj <- loadArchRProject(path = "./pbmc5k") # load your project
getAvailableMatrices(ArchRProj = projHeme5)
# [1] "GeneScoreMatrix" "PeakMatrix" "TileMatrix"
# embeding coordinates
umapdf=projHeme5@embeddings@listData[["UMAP"]]@listData[["df"] # if you performed batch effect correction, this might be a little different
# reduced dimension matrix; cell x LSI (30)
lsimat50=projHeme5@reducedDims@listData[["IterativeLSI"]]@listData[["matSVD"]] # if you performed batch effect correction, this might be a little different
# make sure that all the cells are in the same order
# colnames(peakbycellmat) %>% head
# rownames(umapdf) %>% head
save(lsimat50, file="pbmc5000_10x_lsimat50.rda")
# make the SummarizedExperiment files for SCAVENGE input
SE_gvar <- SummarizedExperiment(assays = list(counts = peakbycellmat),
rowRanges = rowRanges(proj_PeakMatrix),
colData = DataFrame(names = colnames(peakbycellmat)))
assayNames(SE_gvar) <- "counts"
SE_gvar <- addGCBias(SE_gvar, genome = BSgenome.Hsapiens.UCSC.hg19)
SE_gvar_bg <- getBackgroundPeaks(SE_gvar, niterations=200)
save(SE_gvar, SE_gvar_bg, file="pbmc5000_10x_SE_gvar.rda")
For details, please explore the sections on the right-hand side!