Question about "AMLmut_923cells.RData" #5

Open
Zifeng-L opened this issue Jul 25, 2021 · 6 comments

@Zifeng-L

Hi there,
Thanks for your great work on AML. I want to use these methods on my own data, so I downloaded the dem matrix from your GEO database and tried to repeat your analysis. How can I get "AMLmut_923cells.RData" from GEO? I checked the data but could not find it. Thanks!

@modalaigh

Hi @Zifeng-L, did you manage to find out which cells are in the "AMLmut_923cells.RData" file? I currently have the same issue that you did. I downloaded the 35 anno.txt files from GEO for the AML patients and saw that there was a column labelled "MutTranscripts". I initially thought that the cells with entries in this column would be the ones in "AMLmut_923cells.RData". However, when I counted all the cells with entries, I ended up with 939 cells, which is a few more than I was expecting.

@petervangalen
Collaborator

If you want to use the mutation data, I suggest loading the .anno.txt files. The MutTranscripts and WtTranscripts columns contain the mutation calls and the number of supporting reads (separated by /).
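
A minimal sketch of splitting such an entry into the call and its read count, assuming each non-empty entry holds a single call followed by `/` and an integer. The example strings below are hypothetical placeholders, not values taken from the GEO files, and real entries may be formatted differently.

```r
# Hypothetical entries; NA stands for a cell with no entry in the column.
entries <- c("GENE1.X100Y/2", NA, "GENE2.A50fs/1")

# Split on the last "/" so that the call itself may contain other characters.
calls <- sub("/[^/]*$", "", entries)            # text before the last "/"
reads <- as.integer(sub("^.*/", "", entries))   # text after the last "/"

parsed <- data.frame(call = calls, reads = reads, stringsAsFactors = FALSE)
print(parsed)
```

Cells without an entry stay `NA` in both columns, so they can be filtered out with `!is.na(parsed$call)`.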

@yi6kim

yi6kim commented Sep 21, 2024

I had the same observation, @modalaigh.
When I constructed the dataset myself using the 35 anno.txt files from GEO (as the .RData file itself is not provided), I ended up with 939 cells, not 923.

@Yousuk-Song

@yi6kim @modalaigh

Hi, I'm working on the same process, and I'm struggling to build AMLmut_923cells.RData too. Could you tell me how you ended up with 939?

I see this information in the *.anno.txt.gz files:
MutTranscripts WtTranscripts
normal malignant
normal malignant
normal malignant
malignant normal

Counting the cell_ids where MutTranscripts == malignant does not give a result anywhere close to 900.

I would really appreciate your reply.

@yi6kim

yi6kim commented Oct 18, 2024

@Yousuk-Song Are you sure you added up all 35 annotation files?

When I iterate through the files:

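# List the 35 per-sample annotation files (backslashes must be doubled inside an R string)
AML.anno.filenames <- list.files("my_directory", full.names = TRUE, pattern = "AML.*\\.anno\\.txt$")[1:35]
mut_counts <- integer(35)
for (i in 1:35) {
  # Empty fields are read as NA, so table() counts only cells with a MutTranscripts entry
  mut_counts[i] <- sum(table(read.delim(AML.anno.filenames[i], header = TRUE, na.strings = "")$MutTranscripts))
}
print(mut_counts)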

15 45 0 0 3 1 3 27 92 0 0 6 0 146 6 0 0 6 1 367 8 2 37 0 3 0 0 1 0 0 0 21 143 6 0

print(sum(mut_counts))

939
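
To see which cells make up that total rather than only how many there are, the same filter can return the rows themselves. This is a rough sketch and not the authors' procedure for building AMLmut_923cells.RData: `collect_mut_cells` and its `id_col` argument are hypothetical names, and the assumption that the cell identifier column is called `Cell` should be checked against the actual files. The demo runs on a small made-up file instead of the GEO data.

```r
# Collect the cells that have a non-empty MutTranscripts entry.
# Assumes tab-delimited files with a header row; id_col is an assumed column name.
collect_mut_cells <- function(paths, id_col = "Cell") {
  per_file <- lapply(paths, function(p) {
    anno <- read.delim(p, header = TRUE, na.strings = "", stringsAsFactors = FALSE)
    keep <- !is.na(anno$MutTranscripts)
    data.frame(file = rep(basename(p), sum(keep)),
               cell = anno[[id_col]][keep],
               MutTranscripts = anno$MutTranscripts[keep],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_file)
}

# Made-up example file: only cell_1 has a MutTranscripts entry.
tmp <- tempfile(fileext = ".anno.txt")
writeLines(c("Cell\tMutTranscripts\tWtTranscripts",
             "cell_1\tGENE1.X100Y/2\t",
             "cell_2\t\tGENE1.X100/3",
             "cell_3\t\t"), tmp)

mut_cells <- collect_mut_cells(tmp)
print(mut_cells[, c("cell", "MutTranscripts")])
```

Passing the full vector of annotation file paths instead of `tmp` would give one row per counted cell, which makes it easier to compare against any other cell list.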

@Yousuk-Song

Yousuk-Song commented Oct 21, 2024

Thank you so much! I followed your method, kept the cells whose MutTranscripts entry is not "normal", and got the same result: 939.

I don't understand why the article gives no explanation or mention of the number 923.
