-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCPCP000006_05_explore_COPYKAT #813
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm convinced - let's not use copyKAT 😄
I'm also really glad that you were able to adapt code from a different analysis module, too - it's a great way to work within the OpenScPCA project!
I left a few very small wording comments, but otherwise this seems good to go! Once you make the small changes I suggested, I will approve this and we can move onto looking at inferCNV.
analyses/cell-type-wilms-tumor-06/notebook_template/05_copykat_exploration.Rmd
Outdated
Show resolved
Hide resolved
analyses/cell-type-wilms-tumor-06/notebook_template/05_copykat_exploration.Rmd
Outdated
Show resolved
Hide resolved
|
||
From the heatmap of CNV and the mean CNV detection plots, it is difficult to see any CNV pattern that drive the identification of aneuploid cells. | ||
The assignment of the aneuploidy/diploidy value might relies on very few CNV and/or an arbitrary threshold. | ||
This might be why the assignement of aneuploidy/diploidy values differs between condition (and between runs!!). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow - does it really differ between runs? I'm surprised to hear this (but I believe you!!), because the copyKAT
code itself actually hardcodes a random seed; I actually filed an issue about this last week: navinlabcode/copykat#121. So, I'd expect it to be very reproducible. This means that it must be calling something else that is somehow un-setting the seed.
Nothing we can do about it, it's just interesting to note...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I figured this out when running multiple times to save the heatmap not in jpeg but png format. I can send you examples tomorrow (I struggle with the vpn connexion now 😞, hopefully better tomorrow! )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjspielman some example to illustrate.
For the sample -208, I show you the heatmap of CNV for copykat
ran with an euclidean distance, no reference, two days apart (to then save the heatmap in png instead of jpeg:
This is not completly different but not identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
analyses/cell-type-wilms-tumor-06/notebook_template/05_copykat_exploration.Rmd
Outdated
Show resolved
Hide resolved
analyses/cell-type-wilms-tumor-06/notebook_template/05_copykat_exploration.Rmd
Outdated
Show resolved
Hide resolved
Co-authored-by: Stephanie Spielman <[email protected]>
…_exploration.Rmd Co-authored-by: Stephanie Spielman <[email protected]>
…_exploration.Rmd Co-authored-by: Stephanie Spielman <[email protected]>
…_exploration.Rmd Co-authored-by: Stephanie Spielman <[email protected]>
…_exploration.Rmd Co-authored-by: Stephanie Spielman <[email protected]>
@maud-p can you also regenerate the html notebooks since the Rmd has changed? Thanks! |
Sure, it is running! Sorry for missing this part! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is ready to go! The GitHub Action failure is not something to worry about - or at least, it wasn't caused by your code, so it doesn't prevent this from moving forward.
Hm actually, we do need to try and get that check to pass, which is a matter of "re-triggering" the github action. Can you make a commit with a small "meaningless" change, like something very small (even just a space) somewhere in the README file? That will hopefully get the action to work. |
@jashapiro can you have a look at what's up with this pre-commit action (I don't have permissions)?This PR is otherwise approved. |
We seem to be timing out on cloning the repo. This could be a more significant problem than just this PR: There is a time limit of 30s for cloning and we seem to be taking ~25 seconds in other PRs, meaning we are probably quite close to the repo size limit for the service. |
@maud-p We're going to have to look into this action failure on our end a bit more. Until then, you can feel free to file the next PR looking at infer CNV. |
Wow I was out of office few hours and problem already solved! Thank you so much @sjspielman @jashapiro 🚀 |
Purpose/implementation Section
This PR is following the discussion from the PR#776.
It explores the results generated in PR#801.
Please link to the GitHub issue that this pull request addresses.
#790
What is the goal of this pull request?
We wanted to explore the
copykat
results generated in PR#801.copykat
results have been obtained with or without normal cells as reference, using either an euclidean or statistical (spearman) method for CNV heatmap clustering.This impact the final decision made by
copykat
for each cell to be either aneuploid or diploid, and it is thus crucial to explore the results using the different methods.For each of theselected samples, we explore the results in the
notebooks
05_cnv_copykat_{distance_parameter}_exploration_{sample_id}.html
.These
notebooks
are inspired by the plots written for the Ewing Sarcoma analysis in03-copykat.Rmd
.Briefly describe the general approach you took to achieve this goal.
I get inspiration from the Ewing Sarcoma analysis in
03-copykat.Rmd
to quickly check the results ofcopykat
.If known, do you anticipate filing additional pull requests to complete this analysis module?
Yes, same kind of notebook for exploration of
infercnv
results.What is your summary of the results?
Looking at
copykat
heatmaps, I can really see how the function annotated a cell as aneuploid or diploid when usingspearman
as a clustering distance. The results seems to be more meaningfull when using aneuclidean
distance (i.e. we see CNV in aneuploid predicted cells and not diploid, mostly).For that reason, I think that using
spearman
clustering distance should be avoided.Provide directions for reviewers
I am not confident about continuing with
copykat
.I'll do the same kind of exploration of the
infercnv
results in another PR, maybe adding some additional umap of predicted_cell.types to have everything in the same notebook.Author checklists
Analysis module and review
README.md
has been updated to reflect code changes in this pull request.Reproducibility checklist
Dockerfile
.environment.yml
file.renv.lock
file.