Skip to content

xinhua-lab/sc-hESCC

Repository files navigation

sc-hESCC

Example scripts to process and analyze data for the ESCC scRNA-seq article. Detailed information will be available from authors upon reasonable request.

Key software and algorithms: 10x Genomics Cell Ranger 3.0.1 version R 3.6

R package: Seurat (v3) cowplot limma Matrix umap dplyr 'scran' Rtsne RColorBrewer monocle scTHI SCENIC

Basic information of preprocessing

quality control, removing the batch effect

The 10x Genomics Cell Ranger pipeline was used to demultiplex raw files into FASTQ files, extract barcodes and UMI, filter, and map reads to the GRCh38 reference genome, and generate a matrix containing normalized gene counts versus cells per sample. Counts data was then imported into the Seurat object for quality control. Low-quality cells (<400 genes/cell and >10% mitochondrial genes) were excluded. To remove the batch effect, the datasets collected from different samples were integrated with default parameters.

Dimensionality reduction and clustering

The Seurat function ‘FindVariableFeatures’ was applied to identify the highly variable genes (HVG). The top 2000 HVGs were used for data integration. The data were scaled using ‘ScaleData’ and the first 20 principle components were adopted for auto-clustering analyses using ‘FindNeighbors’ and ‘FindClusters’ functions. For all cells, we identified clusters setting the resolution parameter as 1.5, and the clustering results were visualized with the UMAP scatter plot. The marker genes of each cell cluster were identified using the ROC analysis function provided by the Seurat ‘FindAllMarkers’ function for the top genes with the largest AUC (area under curve).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages