From a85d2d4fbddaeaba94028cac0193d1c0686b8b14 Mon Sep 17 00:00:00 2001
From: EllisPatrick
Date: Fri, 1 Dec 2023 10:46:08 +1100
Subject: [PATCH] work on intro + segmenation

---
 vignettes/workshop_material.Rmd | 150 +++++++++++++++++++++++++-------
 1 file changed, 118 insertions(+), 32 deletions(-)

diff --git a/vignettes/workshop_material.Rmd b/vignettes/workshop_material.Rmd
index eff5554..5101e2e 100644
--- a/vignettes/workshop_material.Rmd
+++ b/vignettes/workshop_material.Rmd
@@ -54,34 +54,32 @@ cellular heterogeneity in a tissue environment.
 
 ### Description
 
-In this tutorial we will introduce an analytical framework for analysing
-data from high dimensional spatial omics technologies such as, CODEX,
-CycIF, IMC and High Definition Spatial Transcriptomics. This framework
-makes use of functionality from our Bioconductor packages simpleSeg,
-FuseSOM, scClassify, scHot, spicyR, listClust, statial, scFeatures and
-ClassifyR. By the end of this tutorial attendees will be able to
-implement and assess some of the key steps of a spatial analysis
-pipeline including cell segmentation, feature normalisation, cell type
-identification, microenvironment and cell-state characterisation,
-spatial hypothesis testing and patient classification. Understanding
-these key steps will provide attendees with the core skills needed to
-interrogate the comprehensive spatial information generated by these
-exciting new technologies.
+In this workshop we will introduce some of the key analytical concepts
+needed to analyse data from high dimensional spatial omics technologies
+such as, PhenoCycler, IMC, Xenium and MERFISH. We will show how
+functionality from our Bioconductor packages simpleSeg, FuseSOM,
+scClassify, scHot, spicyR, listClust, statial, scFeatures and ClassifyR
+can be used to address various biological hypotheses. By the end of this
+workshop attendees will be able to implement and assess some of the key
+steps of a spatial analysis pipeline including cell segmentation,
+feature normalisation, cell type identification, microenvironment and
+cell-state characterisation, spatial hypothesis testing and patient
+classification. Understanding these key steps will provide attendees
+with the core skills needed to interrogate the comprehensive spatial
+information generated by these exciting new technologies.
 
 ### Pre-requisites
 
 It is expected that students will have:
 
 - basic knowledge of R syntax,
-- familiarity with SingleCellExperiment and/or SpatialExperiment
-  objects, and
 - this workshop will not provide an in-depth description of
   cell-resolution spatial omics technologies.
 
 ### *R* / *Bioconductor* packages used
 
 Several single cell R packages will be used from the scdney package, for
-more information visit:
+more information visit: .
 
 ### Time outline
 
@@ -163,34 +161,43 @@ options("restore_SingleCellExperiment_show" = TRUE)
 
 ## The data
 
-In this workshop, we will be working through two datasets to explore how
+In this workshop, we will be working with two datasets to explore how
 biological phenotypes, cellular interactions, and patterns of gene
-expression are correlated with disease.
+expression are correlated with disease. Both of these datasets will be
+used in different contexts, hopefully these contexts are representative
+of scenarios you will encounter in your own datasets.
 
 We will use two motivating datasets:
 
 - [Keren et al,
   2018](https://www.cell.com/fulltext/S0092-8674(18)31100-0): A
   multiplexed ion beam imaging by time-of-flight (MIBI-TOF) dataset
-  profilining tissue from triple-negative breast cancer patients. Can
-  we predict risk of cancer recurrence and overall survival time based
-  on imaging data?
+  profilining tissue from triple-negative breast cancer patients. The
+  primary question we will address with this dataset is if we can
+  predict risk of cancer recurrence and overall survival time based on
+  imaging data?
 - [Lohoff et al,
   2022](https://www.nature.com/articles/s41587-021-01006-2): A seqFISH
   study of early mouse organogenesis. We will use a subset of data
-  that is made available from the STExampleData package. Can we find
-  key transcriptomic drivers of the developing brain?
+  that is made available from the STExampleData package. The primary
+  question we will address with this dataset is if we can identify key
+  transcriptomic drivers of the developing brain?
 
 ## Data visualisation and exploration
 
-Here we will download the datasets, examine the structure, visualise the
-data and perform some exploratory analyses.
+The purpose of the this section is primarily to introduce the
+`SpatialExperiment` class which is used to store information from the
+imaging experiments in R. The goal will be to get comfortable enough
+manipulating and exploring these objects so that you can progress
+through the remainder of the workshop comfortably. Here we will download
+a dataset stored in the `STexampleData` R package , examine the
+structure, visualise the data and perform some exploratory analyses.
 
 ### SeqFISH mouse embryo
 
 Here we download the seqFISH mouse embryo data. This comes in the format
-of a `SpatialExperiment` object, where all the data from an IMC dataset
-can be compiled and accessed with relative ease.
+of a `SpatialExperiment` object, where summarized information from an
+imaging dataset can be compiled and accessed with relative ease.
 
 ```{r seqFISHData}
 spe <- STexampleData::seqFISH_mouseEmbryo()
@@ -325,7 +332,7 @@ Try starting off your exploration by answering the below questions.
 
 ```{r kerenQ1}
 # try to answer the above question using the imc object.
-# you may want to check the SingleCellExperiment vignette.
+# you may want to check the SpatialExperiment vignette.
 # https://www.bioconductor.org/packages/release/bioc/vignettes/SpatialExperiment/inst/doc/SpatialExperiment.html
 ```
 
@@ -359,8 +366,11 @@ To load in our images we use the `loadImages` function from
 example.
 
 ```{r loadImage5}
+
+imageLocation <- system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
 image5 = cytomapper::loadImages(
-  x = system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
+  x = imageLocation,
+  as.is = TRUE #Needed as 8-bit image
 )
 
 mcols(image5) = data.frame(list("imageID" = "kerenPatient5"))
@@ -371,7 +381,83 @@ channelNames(image5) = c("Au", "Background", "Beta catenin", "Ca", "CD11b", "CD1
 
 ```
 
-### How do I perform segmentation in R?
+::: question
+**Questions**
+
+1. What class is image5? Hint: class()
+2. How many images and markers are in image5?
+3. Challenge: What is the dimension of the image5 image?
+:::
+
+### Visualise an image
+
+We can visualise this image to see what we have read in. Lets highlight
+4 markers.\
+
+```{r plotImage}
+# Visualise segmentation performance another way.
+cytomapper::plotPixels(
+  image = image5[1],
+  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
+  colour = list(
+    CD45 = c("black", "blue"),
+    `Pan-Keratin` = c("black", "yellow"),
+    SMA = c("black", "green"),
+    dsDNA = c("black", "red")
+  )
+)
+```
+
+We can manipulate the brightness, contrast and gamma levels as follows.
+See if you can do a better job.
+
+```{r plotImage2}
+# Visualise segmentation performance another way.
+cytomapper::plotPixels(
+  image = image5[1],
+  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
+  display = "single",
+  colour = list(
+    CD45 = c("black", "red"),
+    `Pan-Keratin` = c("black", "yellow"),
+    SMA = c("black", "green"),
+    dsDNA = c("black", "blue")
+  )
+  ,
+  # Adjust the brightness, contrast and gamma of each channel.
+  bcg = list(
+    CD45 = c(0, 4, 1),
+    `Pan-Keratin` = c(0, 3, 1),
+    SMA = c(0, 2, 1),
+    dsDNA = c(0, 2, 1)
+  ),
+  legend = NULL
+)
+```
+
+### Can we identify the cells in the image?
+
+The `EBImage` package on Bioconductor provides a lot of useful functions
+for manipulating imaging data in R. This includes functionality for
+finding cells, process called cell segmentation. Lets work through an
+example from their vignette. This will use some functionality that
+complements that which you've already learnt.
+
+We start by loading the images of nuclei and cell bodies. To visualize
+the cells we overlay these images as the green and the blue channel of a
+false-color image. Notice, that with display you can zoom!
+
+```{r readEBImage}
+nuc = readImage(system.file('images', 'nuclei.tif', package='EBImage'))
+cel = readImage(system.file('images', 'cells.tif', package='EBImage'))
+cells = rgbImage(green=1.5*cel, blue=nuc)
+display(cells, all = TRUE)
+```
+
+We will next create a nuclei mask. The `nuc` channel contains
+fluorescent intensities of a protein expressed in the nuclei of cells.
+The nuclei mask will threshold this channel to separate signal from
+noise and then clean this with som
 
 Images stored in a `list` or `CytoImageList` can be segmented using
 `simpleSeg`. Below `simpleSeg` will identify the nuclei in the image
@@ -1827,7 +1913,8 @@ kerenCV_recurrence = crossValidate(
 )
 ```
 
-Again, using `performancePlot`, this time for recurrence, we found better performance in select spatial metrics.
+Again, using `performancePlot`, this time for recurrence, we found
+better performance in select spatial metrics.
 
 ```{r perfPlot-recurrence}
 performancePlot(kerenCV_recurrence,
@@ -1842,4 +1929,3 @@ performancePlot(kerenCV_recurrence,
 ```{r sessionInfo}
 sessionInfo()
 ```
-