06 explore infercnv for Wilms tumor -06 #828

Merged
merged 28 commits into from
Oct 29, 2024
28 commits
991ff2a
Start exploring infercnv results
maud-p Oct 16, 2024
f8fff21
update notebook README.md file
maud-p Oct 16, 2024
07973d3
Update README.md
maud-p Oct 16, 2024
ff057bb
changes to PR#828
maud-p Oct 18, 2024
f06b6a9
PR#828 Add corrected html notebooks
maud-p Oct 19, 2024
617d822
corrections PR#828
maud-p Oct 19, 2024
84286eb
improve PR#828 and try inter-patient normal reference
maud-p Oct 21, 2024
52cf2aa
add ggdist dependencie
maud-p Oct 28, 2024
eced971
Merge pull request #9 from maud-p/origin/06_explore_infercnv
maud-p Oct 28, 2024
e4cff23
few changes to PR828
maud-p Oct 28, 2024
2e6fd9a
few changes to PR828
maud-p Oct 28, 2024
d25e463
Merge pull request #10 from maud-p/origin/06_explore_infercnv
maud-p Oct 28, 2024
176d5ff
update notebook
maud-p Oct 28, 2024
a40ca8c
Merge pull request #11 from maud-p/origin/06_explore_infercnv
maud-p Oct 28, 2024
5f28e8b
add comment on 06_infercnv.R on saved output
maud-p Oct 29, 2024
7c1a5b4
Merge pull request #12 from maud-p/origin/06_explore_infercnv
maud-p Oct 29, 2024
06088ef
Update the main README.md file
maud-p Oct 29, 2024
d7f4cd8
Merge branch 'main' into 06_explore_infercnv
sjspielman Oct 29, 2024
a7d2db2
Update analyses/cell-type-wilms-tumor-06/scripts/06_infercnv.R
maud-p Oct 29, 2024
775e8f4
Update analyses/cell-type-wilms-tumor-06/scripts/06_infercnv.R
maud-p Oct 29, 2024
6c4755f
Update analyses/cell-type-wilms-tumor-06/scripts/06_infercnv.R
maud-p Oct 29, 2024
e4635f2
Update analyses/cell-type-wilms-tumor-06/scripts/06b_build-normal_ref…
maud-p Oct 29, 2024
19829a9
Update analyses/cell-type-wilms-tumor-06/scripts/06_infercnv.R
maud-p Oct 29, 2024
a7927dc
Update analyses/cell-type-wilms-tumor-06/scripts/06b_build-normal_ref…
maud-p Oct 29, 2024
98839eb
Update analyses/cell-type-wilms-tumor-06/notebook_template/06_cnv_inf…
maud-p Oct 29, 2024
e248253
update html reports
maud-p Oct 29, 2024
b1d6b64
Merge pull request #13 from maud-p/06_updated_branch2
maud-p Oct 29, 2024
46d91df
Merge branch 'main' into 06_explore_infercnv
sjspielman Oct 29, 2024
30 changes: 27 additions & 3 deletions analyses/cell-type-wilms-tumor-06/README.md
@@ -20,9 +20,9 @@ The analysis is/will be divided as the following:
- [x] Script: clustering of cells across a set of parameters for few samples
- [x] Script: label transfer from the fetal kidney atlas reference using runAzimuth
- [x] Script: run copykat and inferCNV
- [ ] Notebook: explore results from steps 2 to 4 for about 5 to 10 samples
- [x] Notebook: explore results from steps 2 to 4 for about 5 to 10 samples
- [ ] Script: compile scripts 2 to 4 in an RMarkdown file with required adjustments and render it across all samples
- [ ] Notebook: explore results from step 6, integrate all samples together and annotate the dataset using (i) metadatafile, (ii) CNV information, (iii) label transfer information
- [x] Notebook: explore results from step 6, integrate all samples together and annotate the dataset using (i) metadatafile, (ii) CNV information, (iii) label transfer information

## Usage
From RStudio, run the Rmd reports or render the R scripts (see the RStudio session setup below).
@@ -158,8 +158,25 @@ The `00_run_workflow.R` contains the following steps:
- `Azimuth` label transfer from the fetal kidney reference (Stewart et al.): `02b_label-transfer_fetal_kidney_reference_Stewart.Rmd` in `notebook_template`

- Exploration of clustering, label transfers, marker genes and pathways: `03_clustering_exploration.Rmd` in `notebook_template`

- CNV inference using [`infercnv`](https://github.com/broadinstitute/inferCNV/wiki) with endothelial and immune cells as reference from either the same patient or a pool of upfront resection Wilms tumor samples: `06_infercnv.R` in `script`


Although the main workflow uses only the `infercnv` method, with endothelium and immune cells as the normal reference, across samples, our analysis also explores CNV inference with `copykat` and `infercnv` on a subset of samples:
the script `explore-cnv-methods.R` calls the independent scripts `05_copyKAT.R` and `06_infercnv.R` for the samples
- "SCPCS000179",
- "SCPCS000184",
- "SCPCS000194",
- "SCPCS000205",
- "SCPCS000208".
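
As a rough illustration of how a driver script such as `explore-cnv-methods.R` could loop over these samples, the sketch below builds one `Rscript` command per sample and script. The `--sample_id` flag is an assumption made for illustration; the actual arguments of `05_copyKAT.R` and `06_infercnv.R` are not shown here.

```r
# Samples selected for the exploratory CNV analysis (from the list above)
sample_ids <- c(
  "SCPCS000179", "SCPCS000184", "SCPCS000194", "SCPCS000205", "SCPCS000208"
)

# Independent scripts called for each sample
cnv_scripts <- c("05_copyKAT.R", "06_infercnv.R")

# One row per (sample, script) combination
runs <- expand.grid(
  sample_id = sample_ids,
  script = cnv_scripts,
  stringsAsFactors = FALSE
)

# Build the command lines; `--sample_id` is a hypothetical flag
runs$command <- sprintf(
  "Rscript %s --sample_id %s",
  file.path("scripts", runs$script),
  runs$sample_id
)

# Print the commands rather than running them;
# replace message(cmd) with system(cmd) to execute
for (cmd in runs$command) message(cmd)
```

Printing the commands first makes it easy to check the sample and script combinations before launching long-running CNV inference jobs.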

In addition, we explored the results for all samples in one notebook twice during the analysis:

- the notebook `04_annotation_Across_Samples_exploration.Rmd` explored the annotations obtained by label transfer in all samples

- the notebook `07_annotation_Across_Samples_exploration.Rmd` explored the potential of combining label transfer and CNV inference to finalize the annotation of the Wilms tumor dataset.


For each sample and each step, an HTML report is generated and saved in the `notebook` directory.

### Justification
@@ -192,12 +209,17 @@ Here we will use `Azimuth` to transfer labels from the reference.
We start with the `_process.Rds` data to run `01_seurat-processing.Rmd`.
The output of `01_seurat-processing.Rmd` is saved in `results` in a subfolder for each sample and is the input of the second step `02a_label-transfer_fetal_full_reference_Cao.Rmd`.
The output of `02a_label-transfer_fetal_full_reference_Cao.Rmd` is then the input of `02b_label-transfer_fetal_kidney_reference_Stewart.Rmd`.
Following the same approach, the output of `02b_label-transfer_fetal_kidney_reference_Stewart.Rmd` is the input of `03_clustering_exploration.Rmd`.
Following the same approach, the output of `02b_label-transfer_fetal_kidney_reference_Stewart.Rmd` is the input of `03_clustering_exploration.Rmd` and `06_infercnv.R`.
The output of `06_infercnv.R`, `06_infercnv_HMM-i3_{sample_id}_{reference-type}.rds`, is finally the input of `07_annotation_Across_Samples_exploration.Rmd`.

All inputs/outputs generated and used in the main workflow are saved in the `results/{sample_id}` folder.
Results in subfolders such as `results/{sample_id}/05_copyKAT` or `results/{sample_id}/06_infercnv` have been obtained for a subselection of samples in the exploratory analysis, and are thus kept separated from the results of the main workflow.
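
The file naming described above can be sketched with a small helper that builds the expected `infercnv` output path for one sample. This is only an illustration: the function name, the `results_dir` default and the `"same-patient"` reference type are hypothetical, since the actual values used for `{reference-type}` are not listed here.

```r
# Build the expected path of the infercnv output for one sample,
# following the pattern 06_infercnv_HMM-i3_{sample_id}_{reference-type}.rds
# saved in results/{sample_id}.
infercnv_output_path <- function(sample_id, reference_type,
                                 results_dir = "results") {
  file_name <- sprintf(
    "06_infercnv_HMM-i3_%s_%s.rds", sample_id, reference_type
  )
  file.path(results_dir, sample_id, file_name)
}

# "same-patient" is a hypothetical reference-type label
path <- infercnv_output_path("SCPCS000179", "same-patient")

# Only read the object if the workflow has already produced it
if (file.exists(path)) {
  infercnv_result <- readRDS(path)
}
```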

At the end of the workflow, we have a `Seurat` object that contains:
- normalization and clustering, dimensional reductions
- label transfer from the fetal full reference
- label transfer from the fetal kidney reference
- CNV predictions using `infercnv`

## Software requirements

@@ -210,6 +232,8 @@ The main packages used are:
- DT for table visualization
- DElegate for differential expression analysis

The required dependencies are listed in `components/dependencies.R`.

### Docker

To build the Docker image, run the following from this directory:
@@ -7,4 +7,5 @@ library(Azimuth) # remotes::install_github("satijalab/azimuth")
library(SCpubr)
library(ggplotify)
library(edgeR)
library(fetusref.SeuratData)
library(ggdist)
library(fetusref.SeuratData)
18 changes: 12 additions & 6 deletions analyses/cell-type-wilms-tumor-06/notebook/README.md
@@ -15,25 +15,25 @@ As part of the `00b_characterize_fetal_kidney_reference_Stewart.Rmd` notebook te
Each sample of the Wilms tumor dataset SCPCP000006 has been pre-processed and characterized as follows.
Reports for each of the steps are found in the `notebook/{sample_id}` directory:

- `01_seurat_processing_{sample-id}.html` is the output of the [`01_seurat-processing.Rmd`](../notebook_template/01_seurat-processing.Rmd) notebook template.
- [x] `01_seurat_processing_{sample-id}.html` is the output of the [`01_seurat-processing.Rmd`](../notebook_template/01_seurat-processing.Rmd) notebook template.
In brief, the `_processed.rds` `sce object` is converted to `Seurat` and normalized using `SCTransform`.
Dimensionality reduction (`RunPCA` and `RunUMAP`) and clustering (`FindNeighbors` and `FindClusters`) are performed before saving the `Seurat` object.

- `02a_fetal_full_label-transfer_{sample-id}.html` is the output of the [`02a_label-transfer_fetal_full_reference_Cao.Rmd`](../notebook_template/02a_label-transfer_fetal_full_reference_Cao.Rmd) notebook template.
- [x] `02a_fetal_full_label-transfer_{sample-id}.html` is the output of the [`02a_label-transfer_fetal_full_reference_Cao.Rmd`](../notebook_template/02a_label-transfer_fetal_full_reference_Cao.Rmd) notebook template.
In brief, we used `Azimuth` to transfer labels from the `Azimuth` fetal full reference (Cao et al.)

- `02b_fetal_kidney_label-transfer_{sample-id}.html` is the output of the [`02b_label-transfer_fetal_kidney_reference_Stewart.Rmd`](../notebook_template/02b_label-transfer_fetal_kidney_reference_Stewart.Rmd) notebook template.
- [x] `02b_fetal_kidney_label-transfer_{sample-id}.html` is the output of the [`02b_label-transfer_fetal_kidney_reference_Stewart.Rmd`](../notebook_template/02b_label-transfer_fetal_kidney_reference_Stewart.Rmd) notebook template.
In brief, we used `Azimuth` to transfer labels from the fetal kidney reference (Stewart et al.)

- `03_clustering_exploration_{sample-id}.html` is the output of the [`03_clustering_exploration.Rmd`](../notebook_template/03_clustering_exploration.Rmd) notebook template.
- [x] `03_clustering_exploration_{sample-id}.html` is the output of the [`03_clustering_exploration.Rmd`](../notebook_template/03_clustering_exploration.Rmd) notebook template.
In brief, we explore the clustering results, we look into some marker genes, pathways enrichment and label transfer.
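
The processing steps named for `01_seurat-processing.Rmd` above can be sketched as a single function. This is a minimal sketch rather than the notebook's actual code: the function name and the `n_dims` and `resolution` defaults are assumptions, and the input is assumed to already be a `Seurat` object converted from the `sce` object.

```r
# Minimal sketch of the normalization, dimensionality reduction and
# clustering steps. Parameter defaults are illustrative assumptions.
process_seurat <- function(seurat_obj, n_dims = 30, resolution = 0.8) {
  # Normalize and find variable features with SCTransform
  seurat_obj <- Seurat::SCTransform(seurat_obj, verbose = FALSE)

  # Dimensionality reduction
  seurat_obj <- Seurat::RunPCA(seurat_obj, npcs = n_dims, verbose = FALSE)
  seurat_obj <- Seurat::RunUMAP(
    seurat_obj, dims = seq_len(n_dims), verbose = FALSE
  )

  # Graph-based clustering on the same principal components
  seurat_obj <- Seurat::FindNeighbors(
    seurat_obj, dims = seq_len(n_dims), verbose = FALSE
  )
  seurat_obj <- Seurat::FindClusters(
    seurat_obj, resolution = resolution, verbose = FALSE
  )

  seurat_obj
}
```

Calling `saveRDS(process_seurat(seurat_obj), file = ...)` afterwards would mirror the "save the `Seurat` object" step; the output file name is left out because it is not given here.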


## Global analysis

The next step in analysis is to identify tumor vs. normal cells.

- `04_annotation_Across_Samples_exploration.html` is the output of the [`04_annotation_Across_Samples_exploration.Rmd`](../notebook/04_annotation_Across_Samples_exploration.Rmd) notebook.
- [x] `04_annotation_Across_Samples_exploration.html` is the output of the [`04_annotation_Across_Samples_exploration.Rmd`](../notebook/04_annotation_Across_Samples_exploration.Rmd) notebook.
In brief, we explored the label transfer results across all samples in the Wilms tumor dataset SCPCP000006 in order to identify a few samples that we can begin next analysis steps with.

One way to evaluate the label transfer is to look at the `predicted.score` for each transferred label, which roughly corresponds to the certainty that a label transfer is _TRUE_. More information on the cell-level metric `predicted.score` can be found in the [mapping QC](https://azimuth.hubmapconsortium.org/#Mapping%20QC) section of the `Azimuth` documentation.
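
As a small worked example of using `predicted.score`, the sketch below computes the fraction of cells per sample whose score passes a threshold. The data frame is toy data, and the `0.75` threshold and the sample names are assumptions for illustration, not values taken from the analysis.

```r
# Toy cell-level metadata standing in for the label transfer output
cell_metadata <- data.frame(
  sample_id = c("A", "A", "A", "B", "B"),
  predicted.score = c(0.95, 0.40, 0.80, 0.99, 0.60)
)

# Hypothetical threshold above which a transferred label is trusted
score_threshold <- 0.75

# Flag confident cells, then average the flag within each sample
cell_metadata$confident <- cell_metadata$predicted.score >= score_threshold
fraction_confident <- tapply(
  cell_metadata$confident, cell_metadata$sample_id, mean
)

print(fraction_confident)
```

With this toy input, two of the three cells in sample `A` and one of the two cells in sample `B` pass the threshold.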
@@ -59,8 +59,14 @@ We selected in [`04_annotation_Across_Samples_exploration.Rmd`](../notebook/04_a
- sample SCPCS000205 has > 89 % of cells predicted as kidney and 92 + 76 endothelium and immune cells.
- sample SCPCS000208 has > 95 % of cells predicted as kidney and 18 + 35 endothelium and immune cells.

We wanted to test `copykat` results obtained with or without normal cells as reference, using either an euclidean or statistical (spearman) method for CNV heatmap clustering.
- [x] `05_copykat_exploration_{sample_id}.html` is the output of the [`05_copykat_exploration.Rmd`](../notebook_template/05_copykat_exploration.Rmd) notebook template.

In brief, we wanted to test `copykat` results obtained with or without normal cells as reference, using either a Euclidean or a statistical (Spearman) method for CNV heatmap clustering.
This affects whether `copykat` calls each cell aneuploid or diploid, so it is crucial to compare the results obtained with the different methods.
For each of the selected samples, we explore the results in the template `notebook` [`05_copykat_exploration.Rmd`](../notebook_template/05_copykat_exploration.Rmd), which creates a notebook `05_cnv_copykat_exploration_{sample_id}.html` for each sample.
These `notebooks` are inspired by the plots written for the Ewing Sarcoma analysis in [`03-copykat.Rmd`](https://github.com/AlexsLemonade/OpenScPCA-analysis/blob/main/analyses/cell-type-ewings/exploratory_analysis/03-copykat.Rmd).
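
The four `copykat` conditions described above (with or without a normal reference, Euclidean or Spearman distance) can be laid out as a grid. The sketch below is an assumption about how such a comparison could be organized, not the code of `05_copyKAT.R`; the wrapper name and its arguments are hypothetical, while `distance`, `norm.cell.names`, `sam.name` and `n.cores` are `copykat::copykat()` parameters.

```r
# One row per tested condition: reference usage x clustering distance
conditions <- expand.grid(
  use_reference = c(TRUE, FALSE),
  distance = c("euclidean", "spearman"),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper running copykat for one condition.
# `counts` is a raw gene-by-cell count matrix and `normal_cells`
# a character vector of barcodes used as the normal reference.
run_copykat_condition <- function(counts, sample_id, normal_cells,
                                  use_reference, distance) {
  copykat::copykat(
    rawmat = counts,
    sam.name = sample_id,
    distance = distance,
    # An empty string means copykat runs without a normal reference
    norm.cell.names = if (use_reference) normal_cells else "",
    n.cores = 1
  )
}

print(conditions)
```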

- [x] `06_infercnv_exploration_{sample_id}.html` is the output of the [`06_infercnv_exploration.Rmd`](../notebook_template/06_infercnv_exploration.Rmd) notebook template.

In brief, we wanted to test `infercnv` results obtained with or without endothelium and/or immune cells as reference.
We also explore the potential of using an HMM to assign CNV scores to each cell and discriminate normal from cancer cells.
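
To make the reference options concrete, the sketch below maps each tested reference choice to the `ref_group_names` argument of `infercnv` and runs the i3 HMM. It is a sketch under stated assumptions rather than the content of `06_infercnv.R`: the helper names, the `"endothelium"` and `"immune"` group labels, and the `cutoff`, `cluster_by_groups` and `denoise` settings are illustrative; only the use of an i3 HMM is suggested by the `06_infercnv_HMM-i3_*` output name.

```r
# Map a reference choice to the groups passed as normal reference.
# The group labels are hypothetical and must match the annotations file.
reference_groups <- function(reference = c("none", "endothelium",
                                           "immune", "both")) {
  reference <- match.arg(reference)
  switch(reference,
    none = NULL, # no normal reference
    endothelium = "endothelium",
    immune = "immune",
    both = c("endothelium", "immune")
  )
}

# Hypothetical wrapper: create the infercnv object and run the i3 HMM
run_infercnv_hmm <- function(counts, annotations_file, gene_order_file,
                             reference, out_dir) {
  infercnv_obj <- infercnv::CreateInfercnvObject(
    raw_counts_matrix = counts,
    annotations_file = annotations_file,
    delim = "\t",
    gene_order_file = gene_order_file,
    ref_group_names = reference_groups(reference)
  )
  infercnv::run(
    infercnv_obj,
    cutoff = 0.1, # value commonly suggested for 10x data
    out_dir = out_dir,
    cluster_by_groups = TRUE,
    denoise = TRUE,
    HMM = TRUE,
    HMM_type = "i3"
  )
}
```

Keeping the reference choice in one helper means the same wrapper can be reused for every "with or without reference" condition by changing a single argument.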

Large diffs are not rendered by default.
