This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Discussion: possible improvements for copy number processing / analyses #486

Open
jaclyn-taroni opened this issue Jan 29, 2020 · 0 comments

We at the CCDL have been thinking quite a bit about the copy number data this sprint (#479 #485 #480 #482 #463 #467 #476). Here I want to document some potential improvements that have come up during various discussions. I don't know if anything here will rise to the level of "must-have," and it may be more appropriate to split individual points into their own, more fleshed-out tickets.

  • From a discussion with @jashapiro (please tell me if I've got this wrong 🙂 ): any segment that goes through copy_number_consensus_call with a CNVkit copy number of 2 is essentially treated as copy-neutral, because CNVkit doesn't take ploidy into account (see "A Note on Ploidy"). So in samples with ploidy greater than 2, we may be underreporting losses (or gains, if there are haploid samples, but we don't think there are any). Here's the distribution of tumor_ploidy from v13 of pbta-histologies.tsv for reference:
> histologies_df %>% 
+   filter(experimental_strategy == "WGS", 
+          sample_type == "Tumor") %>% 
+   group_by(tumor_ploidy) %>% 
+   tally()
# A tibble: 3 x 2
  tumor_ploidy     n
         <dbl> <int>
1            2   690
2            3   134
3            4   116
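To make the concern concrete: in the tally above, 250 of 940 WGS tumor samples (134 + 116) have a reported ploidy above 2. Here is a minimal sketch of how a ploidy-aware status call could differ from a diploid-only one. The `segments_df` data frame, its `sample_id` and `copy_number` columns, and the toy values are all hypothetical and used only for illustration; this is not how copy_number_consensus_call or CNVkit currently assign status.

```r
library(dplyr)

# Hypothetical toy segments. Only tumor_ploidy mirrors a real column
# (from pbta-histologies.tsv); everything else is assumed for illustration.
segments_df <- tibble(
  sample_id    = c("S1", "S1", "S2", "S2", "S3"),
  copy_number  = c(2, 3, 2, 4, 4),
  tumor_ploidy = c(2, 2, 3, 3, 4)
)

status_df <- segments_df %>%
  mutate(
    # Status when every sample is assumed to be diploid
    status_diploid = case_when(
      copy_number > 2 ~ "gain",
      copy_number < 2 ~ "loss",
      TRUE ~ "neutral"
    ),
    # Status relative to each sample's reported ploidy
    status_ploidy = case_when(
      copy_number > tumor_ploidy ~ "gain",
      copy_number < tumor_ploidy ~ "loss",
      TRUE ~ "neutral"
    )
  )

# Segments where the two approaches disagree
status_df %>%
  filter(status_diploid != status_ploidy)
```

In this toy data, the copy number 2 segment in the triploid sample S2 is "neutral" under the diploid assumption but "loss" relative to ploidy, which is the underreporting case described above. The copy number 4 segment in the tetraploid sample S3 goes the other way: "gain" under the diploid assumption, "neutral" relative to ploidy.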

I'll note that #387 is also related to all of this!
