-
Notifications
You must be signed in to change notification settings - Fork 0
/
04a-cell_localisation.qmd
300 lines (219 loc) · 13.5 KB
/
04a-cell_localisation.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
# Cell localisation between pairs of cell types
So far, we have segmented our cells, performed normalisation to mitigate batch effects, and annotated our cells. This completes our pre-processing stage, and we can move on to analysing our data. One of the primary motivations behind pursuing spatial technology (as opposed to space-agnostic technologies such as scRNAseq) is that it allows us to tease out whether changes are occurring spatially, i.e. are two cell types closer together in a disease state vs a non-disease state. Whilst these changes are often visually obvious, more advanced statistical modelling is required to quantify localisation and dispersion relationships. In this section, we demonstrate the use of two packages: `spicyR` and `Statial` for quantifying cell type localisation.
```{r 04a-loadLibraries, echo=FALSE, results="hide", warning=FALSE}
suppressPackageStartupMessages({
library(spicyR)
library(Statial)
library(ggplot2)
library(SpatialExperiment)
library(SpatialDatasets)
library(imcRtools)
library(dplyr)
library(survival)
library(tibble)
library(treekoR)
library(ggsurvfit)
})
```
```{r, eval = FALSE}
# load required libraries
library(spicyR)
library(Statial)
library(ggplot2)
library(SpatialExperiment)
library(SpatialDatasets)
library(imcRtools)
library(dplyr)
library(survival)
library(tibble)
library(treekoR)
library(ggsurvfit)
```
```{r, setParam}
# set parameters
set.seed(51773)
use_mc <- TRUE
if (use_mc) {
nCores <- max(parallel::detectCores()/2, 1)
} else {
nCores <- 1
}
BPPARAM <- simpleSeg:::generateBPParam(nCores)
theme_set(theme_classic())
```
## Quantifying cell type co-localisation with spicyR
[spicyR](https://www.bioconductor.org/packages/release/bioc/html/spicyR.html) provides a metric to quantify the degree of localisation or dispersion between two cell types. It then tests for changes in the co-localisation metric across different disease states or groups.
<img src="images/spicyR_fig1.jpeg" align="center" style="height: 300px; border: 0px"/>
Here, we will use the Keren 2018 dataset to demonstrate the use of `spicyR`. The data is stored as a `SpatialExperiment` object within the `SpatialDatasets` package and contains single-cell spatial data from 41 images for three types of breast cancer tumours (cold, compartmentalised, and mixed).
```{r 04a-loadKeren, warning=FALSE, message=FALSE}
kerenSPE <- SpatialDatasets::spe_Keren_2018()
# remove any missing data in our outcome columns
kerenSPE = kerenSPE[, complete.cases(colData(kerenSPE)[, c("Censored", "Survival_days_capped*",
"tumour_type")])]
```
The cell types in this dataset includes 11 immune cell types (double negative CD3 T cells, CD4 T cells, B cells, monocytes, macrophages, CD8 T cells, neutrophils, natural killer cells, dendritic cells, regulatory T cells), 2 structural cell types (endothelial, mesenchymal), 2 tumour cell types (keratin+ tumour, tumour) and one unidentified category.
### Linear modelling
We use the L-function to measure the degree of co-localisation between two cell types. The L-function is a variance-stabilised version of the K-function given by the equation
$$
\widehat{L_{ij}} (r) = \sqrt{\frac{\widehat{K_{ij}}(r)}{\pi}}
$$
with $\widehat{K_{ij}}$ defined as
$$
\widehat{K_{ij}} (r) = \frac{|W|}{n_i n_j} \sum_{n_i} \sum_{n_j} 1 \{d_{ij} \leq r \} e_{ij} (r)
$$
where $\widehat{K_{ij}}$ summarises the degree of co-localisation of cell type $j$ with cell type $i$, $n_i$ and $n_j$ are the number of cells of type $i$ and $j$, $|W|$ is the image area, $d_{ij}$ is the distance between two cells and $e_{ij} (r)$ is an edge correcting factor.
Specifically, the mean difference between the experimental function and the theoretical function is used as a measure for the level of localisation, defined as
$$
u = \sum_{r' = r_{\text{min}}}^{r_{\text{max}}} \widehat L_{ij, \text{Experimental}} (r') - \widehat L_{ij, \text{Poisson}} (r')
$$
where $u$ is the sum is taken over a discrete range of $r$ between $r_{\text{min}}$ and $r_{\text{max}}$. Differences of the statistic $u$ between two conditions is modelled using a weighted linear model.
### Test for changes in localisation for a specific pair of cells
Firstly, we can test whether one cell type tends to be more localised with another cell type in one condition compared to the other. This can be done using the `spicy()` function, where we specify the `condition` parameter.
In this example, we want to see whether or not neutrophils (`to`) tend to be found around CD8 T cells (`from`) in compartmentalised tumours compared to cold tumours. Given that there are 3 conditions, we can specify the desired conditions by setting the order of our `condition` factor. `spicy()` will choose the first level of the factor as the base condition and the second level as the comparison condition. `spicy()` will also naturally coerce the `condition` column into a factor if it is not already a factor. The radius over which to calculate the L-funcion can be specified using the `Rs` argument. Small radii will examine local spatial relationships, whereas larger radii will examine global spatial relationships. By default, `spicy()` calculates the L-function over a range of radii.
The column containing cell type annotations and image IDs can be specified using the `cellType` and `imageID` arguments respectively. By default, `spicy` uses the columns named `cellType` and `imageID`.
We obtain a `spicy` object which details the results of the modelling performed. The `topPairs()` function can be used to obtain the associated coefficients and p-value.
```{r 04a-spicyTestPair}
spicyTestPair <- spicy(
kerenSPE,
condition = "tumour_type",
from = "CD8_T_cell",
to = "Neutrophils",
BPPARAM = BPPARAM
)
topPairs(spicyTestPair)
```
As the `coefficient` in `spicyTestPair` is positive, we find that neutrophils are significantly more likely to be found near CD8 T cells in the compartmentalised tumours group compared to the cold tumour group.
### Test for changes in localisation for all pairwise cell combinations
We can perform what we did above for all pairwise combinations of cell types by excluding the `from` and `to` parameters in `spicy()`. Additional covariates can be added using the `covariates` argument.
```{r 04a-spicyTest}
spicyTest <- spicy(
kerenSPE,
condition = "tumour_type",
BPPARAM = BPPARAM
)
topPairs(spicyTest)
```
Again, we obtain a `spicy` object which outlines the result of the linear models performed for each pairwise combination of cell types.
We can also examine the L-function metrics of individual images by using the convenient `bind()` function on our `spicyTest` results object.
```{r 04a-bind}
bind(spicyTest)[1:5, 1:5]
```
The results can be represented as a bubble plot using the `signifPlot()` function.
```{r 04a-signifPlot}
signifPlot(
spicyTest,
breaks = c(-3, 3, 1),
marksToPlot = c("Macrophages", "DC_or_Mono", "dn_T_CD3", "Neutrophils",
"CD8_T_cell", "Keratin_Tumour")
)
```
Here, we can observe that the most significant relationships occur between macrophages and double negative CD3 T cells, suggesting that the two cell types are far more dispersed in compartmentalised tumours compared to cold tumours.
To examine a specific cell type-cell type relationship in more detail, we can use `spicyBoxplot()` and specify either `from = "Macrophages"` and `to = "dn_T_CD3"` or `rank = 1`.
```{r 04a-spicyBoxPlot}
spicyBoxPlot(results = spicyTest,
# from = "Macrophages",
# to = "dn_T_CD3"
rank = 1)
```
### Linear modelling for custom metrics
`spicyR` can also be applied to custom distance or abundance metrics. A kNN interactions graph can be generated with the function `buildSpatialGraph` from the `imcRtools` package. This generates a `colPairs` object inside of the `SpatialExperiment` object.
`spicyR` provides the function `convPairs` for converting a `colPairs` object into an abundance matrix by calculating the average number of nearby cells types for every cell type for a given `k`. For example, if there exists on average 5 neutrophils for every macrophage in image 1, the column `Neutrophil__Macrophage` would have a value of 5 for image 1.
```{r 04a-kNN}
kerenSPE <- imcRtools::buildSpatialGraph(kerenSPE,
img_id = "imageID",
type = "knn", k = 20,
coords = c("x", "y"))
pairAbundances <- convPairs(kerenSPE,
colPair = "knn_interaction_graph")
head(pairAbundances["B_cell__B_cell"])
```
The custom distance or abundance metrics can then be included in the analysis with the `alternateResult` parameter.
```{r 04a-spicyTestkNN}
spicyTestColPairs <- spicy(
kerenSPE,
condition = "tumour_type",
alternateResult = pairAbundances,
weights = FALSE,
BPPARAM = BPPARAM
)
topPairs(spicyTestColPairs)
```
```{r 04a-signifPlotkNN}
signifPlot(
spicyTestColPairs,
marksToPlot = c("Macrophages", "dn_T_CD3", "CD4_T_cell",
"B_cell", "DC_or_Mono", "Neutrophils", "CD8_T_cell")
)
```
### Mixed effects modelling
`spicyR` supports mixed effects modelling when multiple images are obtained for each subject. In this case, `subject` is treated as a random effect and `condition` is treated as a fixed effect. To perform mixed effects modelling, we can specify the `subject` parameter in the `spicy()` function.
To demonstrate spicyR's functionality with mixed effects models, we will use the Damond 2019 dataset.
```{r 04a-mixedEffectModel}
# load in data
data("diabetesData")
# mixed effects modelling with spicy
spicyMixedTest <- spicy(
diabetesData,
condition = "stage",
subject = "case",
BPPARAM = BPPARAM
)
```
As before, we generate a `spicy` results object, and we can use `topPairs` to identify the most significant cell type pairs.
```{r}
topPairs(spicyMixedTest)
```
We can use `signifPlot` to visualise the results.
```{r}
signifPlot(spicyMixedTest,
marksToPlot = c("beta", "delta", "B", "Th", "otherimmune",
"naiveTc", "macrophage", "Tc", "stromal"))
```
The graph shows a significant decrease in co-localisation between delta and beta cells in the pancreas within the onset diabetes group compared to the non-diabetes group. Additionally, there is a significant increase in co-localisation among certain immune cell groups, including B cells and Th cells, as well as naive Tc cells and other immune cells. These findings align with the results reported in the original study.
### Performing survival analysis
`spicy` can also be used to perform survival analysis to asses whether changes in co-localisation between cell types are associated with survival probability. `spicy` requires the `SingleCellExperiment` object being used to contain a column called `survival` as a `Surv` object.
```{r}
kerenSPE$event = 1 - kerenSPE$Censored
kerenSPE$survival = Surv(kerenSPE$`Survival_days_capped*`, kerenSPE$event)
```
We can then perform survival analysis using the `spicy` function by specifying `condition = "survival"`. We can then access the corresponding coefficients and p-values by accessing the `survivalResults` slot in the `spicy` results object.
```{r 04a-spicySurvival}
# Running survival analysis
spicySurvival = spicy(kerenSPE,
condition = "survival",
BPPARAM = BPPARAM)
# top 10 significant pairs
head(spicySurvival$survivalResults, 10)
```
### Accounting for tissue inhomogeneity
The `spicy` function can also account for tissue inhomogeneity to avoid false positives or negatives. This can be done by setting the `sigma =` parameter within the spicy function. By default, `sigma` is set to `NULL`, and `spicy` assumes a homogeneous tissue structure.
In the example below, we examine the degree of co-localisation between `Keratin_Tumour__Neutrophils` in one image using the `getPairwise` function, which returns the L-function values for each cell type pair. We set the radius over which the L-function should be calculated (`Rs = 100`) and specify `sigma = NULL`. The calculated L-function is positive, indicating attraction between the two cell types.
```{r 04a-sigmanull}
# filter SPE object to obtain image 24 data
kerenSubset = kerenSPE[, colData(kerenSPE)$imageID == "1"]
pairwiseAssoc = getPairwise(kerenSubset,
sigma = NULL,
Rs = 100) |>
as.data.frame()
pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
```
When we specify `sigma = 20` and re-calculate the L-function, it indicates that there is no relationship between `Keratin_Tumour` and `Neutrophils`, i.e., there is no major attraction or dispersion, as it now takes into account tissue inhomogeneity.
```{r 04a-sigma20}
pairwiseAssoc = getPairwise(kerenSubset,
sigma = 20,
Rs = 100) |> as.data.frame()
pairwiseAssoc[["Keratin_Tumour__Neutrophils"]]
```
To understand why this might be happening, we can take a closer look at the relationship between `Keratin_Tumour` and `Neutrophils`. The `plotImage` function allows us to plot any two cell types for a specific image. Below, we plot image 24 for the `Keratin_Tumour__Neutrophils` relationship by specifying `from = Keratin_Tumour` and `to` = `Neutrophils`.
```{r include = FALSE}
# needs to be removed
devtools::load_all("../spicyR")
```
```{r 04a-plotCellTypes}
plotImage(kerenSPE, imageToPlot = "24", from = "Keratin_Tumour", to = "Neutrophils")
```
Plotting image 24 shows that the supposed co-localisation occurs due to the dense cluster of cells near the bottom of the image.
## sessionInfo
```{r}
sessionInfo()
```