How to use cellxgene for manual annotations
Welcome to our hands-on introduction to the CELLxGENE tool, a key component of our upcoming STSM. This powerful tool is designed to facilitate your exploration of single-cell datasets, providing an array of features to enhance your research experience.
To get started, follow the steps outlined here to familiarize yourself with the tool. We also encourage you to dive in and experiment with CELLxGENE independently. Feel free to sample cells and explore its functionalities at your own pace. Enjoy!
CELLxGENE is a toolkit for scientists exploring single-cell datasets and atlases. Its visualization features include comparing the embeddings produced by different integration methods, coloring clusters by metadata annotations, and viewing the expression of individual genes or of user-defined gene sets. The user-friendly interface makes single-cell analysis more accessible and supports in-depth, collaborative exploration of complex biological data.
For our scRAFIKI (Single-cell RNA-seq Atlas Framework for Integration and Key Insights) project, we have curated two atlases that we invite you to explore using the CELLxGENE tool. You can access and analyze these atlases here:
The CELLxGENE interface offers a comprehensive overview of atlas data, with each cell represented as a point in the embedding plot at the center. We have implemented three distinct integration methods (harmony, scvi, scanvi), which you can select using the button located below the embedding plot in the left corner. CELLxGENE has a toolbar enabling:
- Setting populations to find marker genes
- Finding marker genes
- Subset cells based on a current selection
- Reset to all data
- Lasso selection tooltip
- Number coloring and move canvas tooltip
- Display categorical labels (when coloring by a category)
- Clip numerical values
- Undoing actions
- Redoing actions
On the left side, you'll find all categorical and numerical metadata of the dataset, starting with the categorical fields. You can explore all values of a category by expanding it (>) and color the embedding plot by that category by selecting the "drop" icon next to it.
Scrolling down reveals statistical tests and histograms for further analysis of specific qualities. On the right side, you have options for selecting genes and generating gene sets, which will be explained in more detail in the following steps.
- Color the embedding plot by the "cell_type" categorical metadata, which contains the cell type of interest (e.g., Myeloid).
- Open all values by clicking the ">" icon next to the category.
- Select only the "Myeloid" cell type. The selected "Myeloid" cells are now visually emphasized in the embedding plot.
- Click the "Subset cells based on a current selection" button (3) to create the subset.
Now, you can focus on and analyze genes and gene sets specifically within the subset of Myeloid cells, streamlining your exploration.
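For readers who want to reproduce the same subsetting outside the browser, the idea can be sketched in a few lines of plain Python. This is only an illustration: the cell IDs and cell types below are made-up toy values, and `subset_by_category` is a hypothetical helper, not part of CELLxGENE.

```python
# Toy per-cell metadata (hypothetical values); in a real atlas this would be
# the "cell_type" column of the dataset's cell metadata.
cells = [
    {"cell_id": "c1", "cell_type": "Myeloid"},
    {"cell_id": "c2", "cell_type": "T cell"},
    {"cell_id": "c3", "cell_type": "Myeloid"},
    {"cell_id": "c4", "cell_type": "B cell"},
]


def subset_by_category(cells, category, value):
    """Return only the cells whose metadata field `category` equals `value`."""
    return [cell for cell in cells if cell[category] == value]


myeloid = subset_by_category(cells, "cell_type", "Myeloid")
print([cell["cell_id"] for cell in myeloid])  # ['c1', 'c3']
```

As in the interface, the original list is left untouched, so "resetting to all data" simply means going back to `cells`.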
- Search for genes of interest, such as known marker genes.
- Expand the gene entry to visualize the distribution via a histogram.
- Color your plot based on the expression of the selected gene.
You can visualize the distribution of the selected genes and observe how they relate to specific cell populations. Additionally, explore bivariate plots to assess potential correlations between the expression of two different genes by assigning one gene to “x” and the other to “y”.
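The bivariate plot gives a visual impression of how two genes relate; a Pearson correlation coefficient is one way to put a number on it. The sketch below uses plain Python and made-up expression values for two placeholder genes (`GENE_A`, `GENE_B`); it is not something CELLxGENE computes for you.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of expression values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant expression")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-cell expression of two genes (same cell order in both lists).
gene_a = [0.0, 1.0, 2.0, 3.0]
gene_b = [0.5, 1.5, 2.5, 3.5]

r = pearson(gene_a, gene_b)
print(round(r, 3))
```

Values close to 1 or -1 indicate a strong linear relationship; many zero counts, which are common in single-cell data, can distort the coefficient, so the scatter plot remains worth inspecting.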
- Create personalized gene sets by adding a name, description, and a list of genes.
- Analyze the average distribution of values via the histogram.
- Select genes for plotting and coloring to investigate co-regulation patterns and expression values.
This functionality empowers you to curate and explore sets of genes, allowing for a more comprehensive analysis of co-regulation patterns and the combined expression dynamics within the datasets.
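To make the idea of a per-cell gene set value concrete, here is a minimal sketch that assumes the value of a gene set in a cell is the mean expression of its genes in that cell. The gene names and numbers are placeholders, and `gene_set_score` is a hypothetical helper written for this example.

```python
# Hypothetical expression table: gene -> expression per cell (same cell order).
expression = {
    "GENE_A": [2.0, 0.0, 1.0],
    "GENE_B": [4.0, 0.0, 3.0],
    "GENE_C": [0.0, 5.0, 0.0],
}


def gene_set_score(expression, gene_set):
    """Mean expression of the genes in `gene_set`, computed for every cell."""
    if not gene_set:
        raise ValueError("gene set is empty")
    missing = [gene for gene in gene_set if gene not in expression]
    if missing:
        raise KeyError(f"genes not found in the dataset: {missing}")
    n_cells = len(expression[gene_set[0]])
    return [
        sum(expression[gene][i] for gene in gene_set) / len(gene_set)
        for i in range(n_cells)
    ]


scores = gene_set_score(expression, ["GENE_A", "GENE_B"])
print(scores)  # [3.0, 0.0, 2.0]
```

A histogram of `scores` would then correspond to the distribution of the gene set across cells.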
- Utilize the toolbar to set populations for marker gene identification (as described above).
- You can also select them by using the "lasso selection tooltip" (4).
- Click on the "Set populations to find marker genes" button (1).
- Click on the "Find marker genes" button (2).
This feature identifies marker genes specific to the selected populations. The results are returned as gene sets containing the marker genes and their expression values, which can then be analyzed in more detail.
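The intuition behind comparing two populations can be sketched as follows. Note the assumption: this toy example ranks genes by the plain difference in mean expression between the two populations, which is a deliberate simplification and not necessarily the statistic CELLxGENE applies. Gene names, values, and the `rank_markers` helper are all hypothetical.

```python
def rank_markers(expression, pop1, pop2, top_n=2):
    """Rank genes by mean expression in pop1 minus mean expression in pop2.

    expression: gene -> list of per-cell values
    pop1, pop2: lists of cell indices defining the two populations
    """
    if not pop1 or not pop2:
        raise ValueError("both populations must contain at least one cell")

    def mean(values):
        return sum(values) / len(values)

    diffs = {
        gene: mean([values[i] for i in pop1]) - mean([values[i] for i in pop2])
        for gene, values in expression.items()
    }
    ranked = sorted(diffs, key=diffs.get, reverse=True)
    return [(gene, diffs[gene]) for gene in ranked[:top_n]]


# Hypothetical data: four cells, the first two form population 1.
expression = {
    "GENE_A": [5.0, 7.0, 1.0, 1.0],
    "GENE_B": [2.0, 2.0, 2.0, 2.0],
    "GENE_C": [0.0, 1.0, 4.0, 5.0],
}

markers = rank_markers(expression, pop1=[0, 1], pop2=[2, 3])
print(markers)  # [('GENE_A', 5.0), ('GENE_B', 0.0)]
```

Swapping `pop1` and `pop2` gives the markers of the second population, here `GENE_C`.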
- create a new category with the button in the top left corner and name it
- add all labels inside the category without assigning any clusters: they should contain 0 cells
- select the cells corresponding to the labels individually
- assign the selected cells
- create the new category
- if you want to change an already existing category assignment, you can duplicate the labels and assignments from existing categories:
- create all labels according to the new distributions:
- color by the category you want to annotate
- select all clusters for one label:
- assign them to the corresponding label:
Here we assigned coarse cell type annotations based on the clustering at resolution 2.0, which was generated by the integration method scANVI.
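Conceptually, the cluster-based annotation above is a lookup from cluster ID to label. The following sketch shows that mapping in plain Python; the cluster IDs, labels, and the `annotate` helper are illustrative assumptions and do not reflect the actual atlas annotations.

```python
# Hypothetical mapping from cluster ID to coarse cell type label.
cluster_to_label = {
    "0": "Myeloid",
    "1": "Myeloid",
    "2": "T cell",
}

# Hypothetical cluster assignment of four cells.
cell_clusters = ["0", "2", "1", "3"]


def annotate(cell_clusters, cluster_to_label, unassigned="unassigned"):
    """Translate each cell's cluster into its label; unknown clusters stay unassigned."""
    return [cluster_to_label.get(cluster, unassigned) for cluster in cell_clusters]


labels = annotate(cell_clusters, cluster_to_label)
print(labels)  # ['Myeloid', 'T cell', 'Myeloid', 'unassigned']
```

Cells in clusters without an entry keep the `unassigned` label, which mirrors cells that have not yet been assigned to any label in the new category.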
- create all labels according to the corresponding genes or gene sets:
- search for the gene/gene sets and clip the values to assign only cells containing them:
- assign the cells containing the gene(s) to the corresponding label:
Here we selected genes of our choice for a better visualisation of the distribution.
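The gene-based assignment can be thought of as thresholding: cells expressing a gene above some cutoff receive the label, all others keep their current one. The sketch below assumes a fixed expression cutoff chosen by the user, which is a simplification of selecting cells interactively in the interface; the values and the `label_by_expression` helper are hypothetical.

```python
def label_by_expression(values, threshold, label, current_labels):
    """Assign `label` to cells whose expression exceeds `threshold`.

    Cells at or below the threshold keep their current label.
    """
    if len(values) != len(current_labels):
        raise ValueError("values and current_labels must have the same length")
    return [
        label if value > threshold else old
        for value, old in zip(values, current_labels)
    ]


# Hypothetical expression of one marker gene across four cells.
marker_expression = [0.0, 2.5, 0.1, 3.0]
current = ["unassigned"] * 4

updated = label_by_expression(marker_expression, 1.0, "Myeloid", current)
print(updated)  # ['unassigned', 'Myeloid', 'unassigned', 'Myeloid']
```

Applying the function once per label, each with its own gene and cutoff, builds up the annotation step by step; later calls overwrite earlier labels for cells that pass more than one cutoff.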
If these functionalities have sparked your interest, we recommend exploring the official CELLxGENE website and engaging with the tutorials. This will provide you with a comprehensive understanding of the tool's capabilities for your single-cell analysis.
If anything remains unclear, feel free to open a GitHub issue.