How to annotate cells in the STdeconvolve #19

ysq1770368148 opened this issue Aug 10, 2022 · 6 comments

Comments

@ysq1770368148

Hi, @bmill3r
Sorry to trouble you. I'm new to analyzing spatial transcriptomics data, and I am trying to use your tool for deconvolution. I have two questions. First, I read the tutorial at "https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/visium_10x.md", but I don't quite understand how you annotate spots. After running the following code, I got a plot that splits the spots into several topics. Do you annotate the topics based on the genes that are highly expressed in each one? How do you present the annotated results for each cell, and which object should the annotations be added to? Can you give an example?

optLDA <- optimalModel(models = ldas, opt = 15)
results <- getBetaTheta(optLDA, perc.filt = 0.05, betaScale = 1000)
deconProp <- results$theta
deconGexp <- results$beta
plt <- vizAllTopics(theta = deconProp,
                    pos = pos,
                    r = 45,
                    lwd = 0,
                    showLegend = TRUE,
                    plotTitle = NA) +
  ggplot2::guides(fill = ggplot2::guide_legend(ncol = 2)) +
  # outer border
  ggplot2::geom_rect(data = data.frame(pos),
                     ggplot2::aes(xmin = min(x) - 90, xmax = max(x) + 90,
                                  ymin = min(y) - 90, ymax = max(y) + 90),
                     fill = NA, color = "black", linetype = "solid", size = 0.5) +
  ggplot2::theme(
    plot.background = ggplot2::element_blank()
  ) +
  # remove the pixel "groups", which is the color aesthetic for the pixel borders
  ggplot2::guides(colour = "none")

[Image attachment: 1660136151594]

Second, in the tutorial "https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/visium_10x.md#compare-to-transcriptional-clustering", I don't quite understand the relationship between the topics in the plot above and the clusters in this tutorial. What if the number of topics and the number of clusters don't match? Can I determine which topic belongs to which cluster based on the best match (highest correlation) between clusters (coms) and topics (decon)?
[Image attachment: 1660137443147]

@bmill3r
Collaborator

bmill3r commented Aug 11, 2022

Hi @ysq1770368148 ,

Thank you so much for using STdeconvolve and for your questions! Happy to try and answer them the best I can.

If I understand your question correctly, you are asking how do we annotate the deconvolved topics, i.e. cell types returned by STdeconvolve?

In addition to deconvolving cell type proportions in each spot, STdeconvolve also returns deconvolved gene expression profiles of the cell types. For reference, this is the deconGexp object in the code you ran. We can use the gene expression profiles of the cell types as a means to predict their annotations. There are many possible ways to do this, and two of them we highlight in the tutorial Annotating deconvolved cell-types, which I encourage you to check out.

With respect to your second point, I think it is important to note that deconvolution and transcriptional clustering are doing two different things. Remember that the spots here are assumed to have more than one cell, and thus potentially more than one cell type. This means that the transcriptional profiles of the spots of the 10X Visium dataset are mixtures of transcriptional profiles of different cell types. STdeconvolve attempts to deconvolve, or recover, the constituent components of these mixtures. In contrast, transcriptional clustering effectively places transcriptionally similar spots into groups (i.e. clusters, or "communities") based on their overall transcriptional profiles. In the Analysis of 10X Visium data, the section Compare to transcriptional clustering demonstrates that while some clusters are highly enriched in one cell type, other clusters contain multiple cell types, and some cell types are present in more than one cluster. We can see this in the heatmap you generated, which shows the correlation between cell type spot proportions and spot transcriptional cluster assignments. (A spot can only have one cluster assignment, but can contain multiple cell types.) Because the spots are assumed to be multi-cellular, I would not expect each deconvolved cell type to correspond exactly to a single transcriptional cluster, unless each spot were located in a region of distinct, transcriptionally identical cells.
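To make the distinction concrete, here is a minimal base R sketch on made-up data (the `theta` values and the cluster labels `coms` are hypothetical, not output from STdeconvolve). It summarizes the average cell type proportions within each cluster and then correlates cluster membership with the proportions, which is roughly the kind of comparison the heatmap shows.

```r
# Toy data: 6 spots, 3 deconvolved cell types (each row of theta sums to 1).
theta <- matrix(
  c(0.9, 0.1, 0.0,
    0.8, 0.2, 0.0,
    0.5, 0.5, 0.0,
    0.4, 0.6, 0.0,
    0.0, 0.1, 0.9,
    0.0, 0.2, 0.8),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("spot", 1:6), paste0("Topic.", 1:3))
)

# Hypothetical transcriptional cluster labels: exactly one per spot.
coms <- factor(c("A", "A", "B", "B", "C", "C"))

# Mean cell type proportion within each cluster (clusters x cell types).
clusterComposition <- rowsum(theta, group = coms) / as.vector(table(coms))
print(round(clusterComposition, 2))

# Correlation between cluster membership (0/1 indicator columns) and proportions.
comIndicator <- model.matrix(~ 0 + coms)
corMtx <- cor(comIndicator, theta)
print(round(corMtx, 2))
```

In this toy case cluster B is roughly an even mixture of Topic.1 and Topic.2, so no single topic "belongs" to it, even though the correlation matrix still has a highest entry in every row.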

Let me know if this doesn't make sense or if you have any more questions.
Brendan

@ysq1770368148
Author

Hi, @bmill3r
Thank you for such a prompt reply! I've seen the tutorial Annotating deconvolved cell-types, but I'm not very clear on how to assign a cell type to each spot. For example, if spot 1 is a T cell and spot 2 is a B cell, how do I add a cell type to each spot with code?
As in the example you wrote, if I already know how to annotate every topic, how do I get the "annot" object?

library(STdeconvolve)

# load built-in data

data(mOB)
pos <- mOB$pos
cd <- mOB$counts
annot <- mOB$annot

If I'm not mistaken, transcriptional clustering treats each spot as a single cell. In reality, a spot may contain multiple cell types, so we obtain the topic composition of each spot by deconvolution (a topic refers to a cell type). Ultimately, the highly expressed genes of each topic are used to determine what type of cell the topic is.
But the final annotation seems to annotate each spot (as shown in the image below), and the color of the outermost circle represents the Pixel.Groups. To determine which cell type a spot is, do you assign the spot to the cell type that makes up the largest proportion of it?

[Image attachment: 1660274360999]

@bmill3r
Collaborator

bmill3r commented Aug 15, 2022

Hi @ysq1770368148,

You are absolutely correct that in transcriptional clustering, each spot is treated as a single cell but in reality a spot may contain multiple cells and cell types. Further, the gene expression profiles of the deconvolved topics can be used to determine their cell type identity. In the function vizAllTopics(), spots are pie charts where the slices are shaded in based on the proportions of the deconvolved cell types in each spot. These are annotated in the legend under "Topics". We can also assign spots overall to different groups, which can be indicated by coloring the outer circle of each spot. These groups are annotated in the legend under "Pixel.Groups" (note that this is an optional argument). In the example, these are the tissue layers that each spot physically overlaps with in the mouse olfactory bulb. So based on this visualization, we can observe things like how Topic 5 is enriched in the center of the Granular Cell Layer. Pixel.Groups can also be the transcriptional clusters each spot was assigned to so one could see how different topics are represented in different transcriptional clusters. Perhaps one cluster is actually composed of a mixture of 2 different cell types.

In the example, annot is a factor that represents the tissue layer each spot is located in and was determined previously (available in the mOB object). For STdeconvolve, the theta matrix is a spot x topic proportion matrix. One could change the column names (deconvolved topics) to the annotated cell type labels. In this way, the vizAllTopics() "Topics" legend would show the cell type annotations instead of "Topic.1", "Topic.2", etc. and may be more interpretable.
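As a minimal sketch of that relabeling, assuming you have already decided on one label per topic, the base R code below renames the columns of a toy `theta` matrix. The proportions and the labels in `topicAnnot` are hypothetical. It also derives a per-spot factor holding the cell type with the largest proportion, which is one possible way to build something shaped like `annot`; note that this throws away the mixture information that deconvolution recovers.

```r
# Toy spot x topic proportion matrix (stand-in for results$theta).
theta <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.6, 0.3,
    0.2, 0.3, 0.5),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("spot1", "spot2", "spot3"), c("1", "2", "3"))
)

# Hypothetical annotations, given in the same order as the columns of theta.
topicAnnot <- c("T cell", "B cell", "Macrophage")
stopifnot(length(topicAnnot) == ncol(theta))
colnames(theta) <- topicAnnot

# Optional: label each spot by its most abundant cell type.
dominantType <- factor(
  colnames(theta)[max.col(theta, ties.method = "first")],
  levels = topicAnnot
)
names(dominantType) <- rownames(theta)
print(dominantType)
```

The renamed `theta` can then be passed to vizAllTopics() as before. A factor like `dominantType` could plausibly be supplied as the optional spot groups, in the same way the tutorial uses `annot`, but check the function documentation for the exact argument names.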

Let me know if you have more questions,
Brendan

@ysq1770368148
Author

Thank you so much for such a detailed answer, it helped me a lot. Thank you for the great tool!

@bmill3r bmill3r added good first issue Good for newcomers useful info for new users and removed good first issue Good for newcomers labels Feb 8, 2023
@cathalgking

Hi @bmill3r

Thank you for your work with this package. I have similar questions to this discussion.

In the annotation tutorial it says:
"For demonstration purposes, let’s use the 5 annotated tissue layer labels (i.e. “Granular Cell Layer”, “Mitral Cell Layer”, etc) assigned to each pixel"
Where exactly did those labels come from? Were they part of the SPE object when it was read in?

Can I use a public database for cell type annotation with this package after spot deconvolution? I would like to use a public reference that contains certain cell types of interest and is stored as an SPE object. If this can be done, can you show a simple reproducible example?

Thanks

@bmill3r
Collaborator

bmill3r commented May 29, 2023

Hi @cathalgking

Thanks for your questions and for using STdeconvolve!

The 5 annotated tissue layer labels you refer to are part of the mOB dataset which we have included as a test dataset with this package.

In terms of using a publicly available database to help annotate the deconvolved cell types, you can certainly do that. In the Annotating deconvolved cell-types vignette, for demonstration purposes, we use the 5 annotated cell layers of the mOB, but you’ll notice that to match deconvolved cell types to known cell types based on transcriptional profiles, you will need a beta matrix for the known cell types. This is essentially a cell type x gene expression matrix. You can find the expression data in the assays slot of a SpatialExperiment object. You can use our provided function getCorrMtx() to see which deconvolved cell types are highly correlated with known cell types from your chosen database.
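The matching idea can be sketched in base R on toy matrices. This uses plain `cor()` as a stand-in rather than getCorrMtx(), and every value, gene, and cell type label below is made up for illustration. Both matrices are cell type x gene, and only the genes they share are compared.

```r
# Toy deconvolved profiles (stand-in for results$beta): topics x genes.
deconGexp <- matrix(
  c(50,  2,  3, 100,
     1, 60,  2,  90,
     2,  3, 70, 110),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("Topic.", 1:3), c("Cd3e", "Cd79a", "Cd68", "Actb"))
)

# Hypothetical reference profiles: known cell types x genes
# (includes one gene that is absent from the deconvolved matrix).
refGexp <- matrix(
  c(40,  1,  1,  80, 30,
     2, 55,  3,  95, 28,
     1,  2, 65, 100, 33),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("T cell", "B cell", "Macrophage"),
                  c("Cd3e", "Cd79a", "Cd68", "Actb", "Gapdh"))
)

# Restrict both matrices to shared genes, then correlate profiles.
shared <- intersect(colnames(deconGexp), colnames(refGexp))
corMtx <- cor(t(deconGexp[, shared]), t(refGexp[, shared]))
print(round(corMtx, 2))

# Best-matching reference cell type for each deconvolved topic.
bestMatch <- colnames(corMtx)[max.col(corMtx, ties.method = "first")]
names(bestMatch) <- rownames(corMtx)
print(bestMatch)
```

A highest correlation always exists, so it is worth checking how strong the best match is before adopting it as an annotation.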

Alternatively, in Annotating deconvolved cell-types, we also perform GSEA as another strategy to identify deconvolved cell types. For GSEA, you need a list of chosen “marker” genes for each of your known cell types. For example:

gset <- list(
    Macrophages = c("PTPRC", "CD163", "MSR1", "CD14", "CD68", "AIF1", "CSF1R", "CD69", "APOC1"),
    M1_Macrophage_General = c("IFNG", "IL1B", "IL6", "IL12", "IL15", "IL18", "IL23", "TNF", "NOS2", "IDO1", "FAM26F", "CXCL9", "CXCL10", "CXCL11"),
    M2_Macrophage_metastasis = c("EGF", "CSF1", "CCL18", "TGFB", "MMP2", "MMP3", "MMP7", "MMP9", "CD163", "MSR1", "MRC1", "TGM2", "FABP4", "CCL24", "CCL26"),
    M2_TumorGrowth = c("IL6", "CXCL8", "IL10", "IL17", "IL23"),
    M2_Angiogenesis = c("VEGF", "CXCL8", "PDGF", "MMP2", "MMP3", "MMP7", "MMP9", "ANG2", "CCL5", "CCL2"),
    M2_Immunosuppression = c("ARG1", "MRC1", "ARG2", "IL10", "TGFB", "CCL17", "CCL18", "CCL22"),
    Monocyte_M2 = c("CSF1", "CCL2", "IL4", "IL13", "IL10", "TGFB"),
    M2_polarization = c("IL4", "JAK1", "STAT6", "PI3K", "PIK3CA", "AKT1", "MYC", "PPARG")
    )
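To illustrate how a marker list of this shape can be scored against deconvolved profiles, here is a rough base R sketch. It is not the GSEA procedure from the vignette: it swaps in a simple one-sided Wilcoxon rank-sum test on log2 fold changes. The gene names, the two gene sets, and the `beta` values are placeholders constructed so that each topic is clearly enriched for one set.

```r
# Toy topics x genes matrix with placeholder gene names.
genes <- paste0("G", 1:20)
beta <- rbind(
  Topic.1 = 10 + (1:20) / 10,
  Topic.2 = 10 + (20:1) / 10
)
colnames(beta) <- genes

# Placeholder marker sets, in the same list-of-character-vectors shape as gset.
markerSets <- list(TypeA = genes[1:4], TypeB = genes[5:8])

# Make TypeA markers high in Topic.1 and TypeB markers high in Topic.2.
beta["Topic.1", markerSets$TypeA] <- beta["Topic.1", markerSets$TypeA] * 8
beta["Topic.2", markerSets$TypeB] <- beta["Topic.2", markerSets$TypeB] * 8

# For each topic and marker set: are the markers' log2 fold changes
# (topic vs. mean of the other topics) higher than those of the other genes?
markerPvals <- sapply(markerSets, function(markers) {
  sapply(rownames(beta), function(topic) {
    others <- setdiff(rownames(beta), topic)
    lfc <- log2(beta[topic, ] / colMeans(beta[others, , drop = FALSE]))
    inSet <- names(lfc) %in% markers
    wilcox.test(lfc[inSet], lfc[!inSet], alternative = "greater")$p.value
  })
})
print(signif(markerPvals, 2))
```

Small p-values in a row suggest which marker set best describes that topic. With real data you would also want multiple testing correction, and marker genes missing from the beta matrix are silently ignored by this sketch.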

Check out another issue here which goes into this in a little more detail: #35 (comment)

Hope this helps,
Brendan
