06_explore cnv results using copykat and infercnv for a subselection of Wilms tumor from SCPCP000006 #802

Closed
wants to merge 5 commits into from


maud-p
Contributor

@maud-p maud-p commented Oct 8, 2024

Purpose/implementation Section

This PR follows up on the discussion in PR#776.
It explores the results generated in PR#801.

Please link to the GitHub issue that this pull request addresses.

#790

What is the goal of this pull request?

We wanted to test copykat and infercnv results generated in PR#801.

copykat results were obtained with or without normal cells as reference, using either a Euclidean or a statistical (Spearman) distance for CNV heatmap clustering.
This choice affects the final call that copykat makes for each cell (aneuploid or diploid), so it is crucial to explore the results obtained with each method.
For each of the selected samples, we explore the results in the notebooks 05_cnv_copykat_{distance_parameter}_exploration_{sample_id}.html.
These notebooks are inspired by the plots written for the Ewing Sarcoma analysis in 03-copykat.Rmd.
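
One way to quantify how much the distance setting changes the calls is to cross-tabulate the per-cell aneuploid/diploid labels from the two runs. The following is a minimal Python sketch of that idea (the notebooks themselves are R Markdown). The barcodes and calls are made-up placeholders, and `cross_tabulate` and `agreement` are hypothetical helpers, not part of copykat.

```python
from collections import Counter


def cross_tabulate(calls_a, calls_b):
    """Count cells for each pair of calls, using only cells present in both runs."""
    shared = calls_a.keys() & calls_b.keys()
    return Counter((calls_a[cell], calls_b[cell]) for cell in shared)


def agreement(table):
    """Fraction of shared cells that received the same call in both runs."""
    total = sum(table.values())
    if total == 0:
        return 0.0
    same = sum(n for (a, b), n in table.items() if a == b)
    return same / total


# Placeholder calls (assumption): real calls would come from the copykat prediction tables.
euclidean_calls = {
    "cell_1": "aneuploid",
    "cell_2": "aneuploid",
    "cell_3": "diploid",
    "cell_4": "diploid",
}
spearman_calls = {
    "cell_1": "aneuploid",
    "cell_2": "diploid",
    "cell_3": "diploid",
    "cell_4": "aneuploid",
}

table = cross_tabulate(euclidean_calls, spearman_calls)
for (euclidean_call, spearman_call), n_cells in sorted(table.items()):
    print(euclidean_call, spearman_call, n_cells)
print("agreement:", agreement(table))
```

A low agreement between the two settings would support looking at the heatmaps for both before trusting either set of calls.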

We also tested infercnv results obtained with or without normal cells as reference.
Because we are not sure how exhaustive the list of normal reference cells has to be, we tested the sensitivity of infercnv to the definition of normal cells, using only immune cells, only endothelial cells, or both as the healthy reference.
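
The reference definitions compared here can be sketched as follows. This is an illustrative Python example, not the code used in the module: the cell type labels (`"immune"`, `"endothelium"`), the barcodes, and the helper `build_reference_sets` are all assumptions made for the sketch.

```python
def build_reference_sets(cell_types, immune_label="immune", endothelial_label="endothelium"):
    """Return the normal-cell barcodes to use as reference under each tested condition."""
    immune = sorted(c for c, t in cell_types.items() if t == immune_label)
    endothelial = sorted(c for c, t in cell_types.items() if t == endothelial_label)
    return {
        "none": [],  # run without any normal reference
        "immune": immune,
        "endothelium": endothelial,
        "both": sorted(immune + endothelial),
    }


# Placeholder annotations (assumption): real labels would come from the cell type annotation step.
cell_types = {
    "cell_1": "immune",
    "cell_2": "endothelium",
    "cell_3": "cancer",
    "cell_4": "immune",
}

reference_sets = build_reference_sets(cell_types)
for name, cells in reference_sets.items():
    print(name, len(cells))
```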

For each sample, we compare the heatmaps of inferred CNV in the notebook 06_cnv_exploration_{sample_id}.html.

Briefly describe the general approach you took to achieve this goal.

I took inspiration from the Ewing Sarcoma analysis in 03-copykat.Rmd to quickly check the copykat results.

If known, do you anticipate filing additional pull requests to complete this analysis module?

Yes. After selecting the best method to infer CNV and aneuploidy, we should run it over the entire wilms-06 dataset and explore the results.

What is your summary of the results?

Looking at the copykat heatmaps, I cannot really see how the function annotated a cell as aneuploid or diploid when using Spearman as the clustering distance. The results seem more meaningful when using a Euclidean distance (i.e. we mostly see CNV in cells predicted as aneuploid and not in cells predicted as diploid).
For that reason, I think the Spearman clustering distance should be avoided.

The CNV heatmap output from infercnv seems easier to interpret and corresponds to some extent to the output of copykat. We identified CNV that were not initially described as Wilms tumor specific CNV (in my README.md file). However, I looked into the literature and found a useful table of segmental copy number changes. There we see, for example, that chr18 gain is reported in Wilms tumor with a prevalence of 10%. Loss in chr4 and gains in chr7 and chr6 are also reported, which fits with our results.

[Image: table of segmental copy number changes reported in Wilms tumor]

Provide directions for reviewers

I think we will need to explore these results a bit more, as was done in 03-copykat.Rmd, but I am wondering if we could already reduce the number of conditions to explore based on these first observations.

I would suggest only using the Euclidean distance for copykat, with a healthy reference when possible, and infercnv also with a healthy reference when possible.
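
To illustrate what this reduction would mean in practice, here is a small Python sketch. The condition names and the helper `reduced_conditions` are hypothetical, and falling back to a run without reference when a sample has no normal cells is my reading of "when possible".

```python
from itertools import product

# Conditions explored in this PR, as described above (names are illustrative).
copykat_grid = list(product(("with_reference", "no_reference"), ("euclidean", "spearman")))
infercnv_grid = ["no_reference", "immune", "endothelium", "both"]


def reduced_conditions(has_normal_cells):
    """Pick one condition per tool: Euclidean distance, with a reference when one is available."""
    reference = "with_reference" if has_normal_cells else "no_reference"
    return {"copykat": (reference, "euclidean"), "infercnv": reference}


print(len(copykat_grid) + len(infercnv_grid))
print(reduced_conditions(has_normal_cells=True))
print(reduced_conditions(has_normal_cells=False))
```

Which normal cell types should make up the infercnv reference is left open here, since that choice is part of what reviewers are asked to weigh in on.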

Author checklists

Check all those that apply.
Note that you may find it easier to check off these items after the pull request is actually filed.

Analysis module and review

Reproducibility checklist

  • Code in this pull request has been added to the GitHub Action workflow that runs this module.
  • The dependencies required to run the code in this pull request have been added to the analysis module Dockerfile.
  • If applicable, the dependencies required to run the code in this pull request have been added to the analysis module conda environment.yml file.
  • If applicable, R package dependencies required to run the code in this pull request have been added to the analysis module renv.lock file.
