RAB40B has the opposite phenotypes of PIK3R3/INSYN1: exploration for MorphMap paper (ORF) #4
Comments
Email correspondence so far...

Hello,

To summarize a lot of work, we knocked down 8,000 genes one by one in U2OS human cells, and then clustered the genes based on having similar morphological impact (using the Cell Painting microscopy assay that labels major organelles). We found a tight cluster of RAB40B and XLOC_I2_008134 which have the opposite morphological effects of a cluster of PIK3R3 and INSYN1. I wonder if all of these connections are well-known already? If not, we would be delighted to work together if you'd like to design an experiment to follow up/confirm the connection and add to a paper we are beginning to write up about the large dataset. It always helps in such papers to make a new discovery that can be confirmed (even if it's not a dramatic finding).

All the best, please let me know if you would like to talk further!

Cheers,
Anne E. Carpenter, Ph.D. (she/her)
phone: (617) 714-7750

Anne Carpenter [email protected]

P.S. Other interactions include a connection between INSYN1 and RNF41, STYK1 and HOOK2, just in case those ring any bells or seem worth pursuing!

Scott Soderling, Ph.D.

Hi Anne,

Sounds like a REALLY cool project! We actually don’t have any active projects right now on Insyn1 in the lab. These look like totally new links. Our work on Insyn1 was related to its function in neuronal inhibitory synapses and I don’t recall these coming out in the proximity proteomics from neurons. It looks like STYK1 and PIK3R3 may both be related to the regulation of the PI3K pathway, so that is interesting. We have some reagents in the lab for Insyn1 and would be happy to send you any of them. We could potentially perform BioID in U2OS cells for InSyn1 if that was helpful for you. We have a new method for doing proximity proteomics using CRISPR engineering of the native gene/protein. Or we would also be happy to send you constructs for InSyn1 BioID.

You may already be thinking about this, but we also have a fantastic new faculty member here from MIT CSAIL who has developed new computational tools for PPI prediction using language models and protein structure (Rohit Singh). Might be interesting to run your clusters at large scale for PPIs. He would probably be interested in collaborating with you.

Best,
Scott

Anne Carpenter [email protected]

Ah, too bad it's no longer a focus for you. My lab is computational, so sending reagents here won't help :D Ideally, we'd test some functional relationships among these proteins - I'm not sure we would expect a direct protein-protein interaction (at the least we ought to check in PPI databases for existing data to see if that is already known!). Will put that on my list. Do you know other labs who still focus on InSyn1 function? |
STATUS: waiting to be sure that these results are actually in the freshest version of the data before proceeding. |
info provided here now #8 |
Unfortunately,
Some additional facts: |
I pinged Soderling today to find another lab who studies this, because it sounds like you are confirming the relationship DOES exist in JUMP ORF data. |
Yes, the relationship does exist in JUMP ORF. JUMP ORF:
JUMP CRISPR:
|
If we can find someone to do followup experiments, we should check:
|
@holgerhennig Can your team answer that last Q - is there KG support for these interactions? That will help us know if we should write this up as already-known validation or increase our efforts to pursue followup experiments if it is novel. |
@AnneCarpenter We'll look into it, whether there's KG support for these interactions, and get back to you asap |
Got a response from Rytis in the meantime: No, we did not know about a potential connection between Rab40b and INSYN1 or PIK3R3. Would be happy to chat with you to see whether we can do a couple of experiments to confirm the connections. Let me know what times work for you next week and I can set up a Zoom link. How about other Rab40 isoforms (Rab40b and Rab40c)? Do they cluster as well? From our experience, there is a substantial amount of functional redundancy between Rab40a, Rab40b and Rab40c. We will schedule a meeting and, depending on whether Alan or Niranj can attend, I can ask one of them to look up the other Rab40's. |
I will meet with Rytis Jan 24th to brainstorm followup experiments. @tjetkaARD can you please edit your comment above? I think it's a typo because two lines say "RAB40B vs. INSYN1". Given we are only looking at ORFs here I think my previous query to Niranj about XLOC_I2_008134 is now irrelevant but if you can confirm XLOC_I2_008134 is not in the ORF data that would be helpful! He also wanted to see if we have data for Rab40b and Rab40c (ORF or CRISPR?) It would be great to show him a heatmap of all of these genes (while marking which ones pass our threshold for "has a phenotype"). Probably for ORF data only but if it's easy to make CRISPR we can do that just to be complete, in case Rab40b or c is interesting here. |
@afermg Given this finding is in ORF data only, it would be great to use your web tool to look at images and most-similar/anti-similar genes. Please LMK when it's available. You're welcome to join the Jan 24 meeting as well if you like, similar to that FOXO one we had recently. |
@AnneCarpenter @holgerhennig For the three genes PIK3R3, INSYN1, RAB40B we do not find "unsupervised" explanations in KG for any pair of these genes, so any connection between them is new from KG perspective. Of note, INSYN1 is not in the list of 4850 genes having a signal in ORF. As for XLOC_I2_008134, I can not find any trace of this gene (or pseudogene?) identity, can you point to its NCBI Gene ID? |
I copied that one from an early heatmap (that likely is 'wrong' / outdated) so I guess let's abandon it. I believe someone else tried to find it in our data and the reagent doesn't exist (making it possibly a typo). But anyway if we are not seeing it as a top-similar gene in ORF or CRISPR we can abandon it. Thank you for checking! |
@AnneCarpenter Answering in order:
Hence, showing correlation plot for ORFs (all replicable): and correlation plot for CRISPRs (only RAB40B p-value replicable):
Unless Niranj points out that the above file is the incorrect one, all the genes in the ORF heatmap are replicable. |
@AnneCarpenter The ORF data interface is now available at broad.io/orf. The easiest way to search is to enter a gene in the search box, but you can also run queries by editing the gene name in the following URL: https://lite.datasette.io/?install=datasette-json-html&parquet=https://zenodo.org/api/records/10542737/files/orf.parquet/content#/data/content?Gene%2FCompound__exact=RAB40B&_sort=Distance Just replace "RAB40B" with other genes of interest. |
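For anyone who wants to generate these links for several genes at once, here is a minimal sketch. It only reuses the URL given in the comment above; the helper name `orf_query_url` is made up for illustration and is not part of the web tool.

```python
from urllib.parse import quote

# Base URL and query layout copied from the link in the comment above.
BASE = (
    "https://lite.datasette.io/?install=datasette-json-html"
    "&parquet=https://zenodo.org/api/records/10542737/files/orf.parquet/content"
    "#/data/content"
)


def orf_query_url(gene: str) -> str:
    """Hypothetical helper: build the query link for one gene, sorted by distance."""
    # quote() percent-encodes any character that is unsafe in a URL.
    return f"{BASE}?Gene%2FCompound__exact={quote(gene, safe='')}&_sort=Distance"


for gene in ["RAB40B", "PIK3R3", "INSYN1"]:
    print(orf_query_url(gene))
```

Each printed link differs from the original only in the gene name after `Gene%2FCompound__exact=`.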
AMAZING! I wish GitHub allowed gifs to express Kermit-the-frog level excitement https://giphy.com/clips/buzzfeed-buzzfeed-celeb-the-muppets-find-out-which-muppet-they-really-are-zTF0aDwhF239JQzIXw We talked about shifting distance back to the -1 to +1 similarity metric that we've always used... but now I see values of 1300 for RAB40B so I'm wondering if what you're showing is not just our correlation metric * 1000? |
Yes, actually I started using the metric Alex uses (cosine distance), ranging from 0 to 2, to be consistent with his code. I was testing whether it made a big difference to keep it in 1e3 units, but it doesn't look ideal. I'll set it back to decimals, but it will contain 10 digits (we can't limit them because that comes from the lite.datasette internal code). I'll update it and set a reminder here. |
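To make the two scales easier to compare, here is a small sketch of the conversion. It assumes the standard definition, cosine distance = 1 - cosine similarity, which maps the 0 to 2 distance range onto the +1 to -1 similarity range. The function name and the `scale` parameter are illustrative and do not come from the web tool's code.

```python
def distance_to_similarity(distance: float, scale: float = 1.0) -> float:
    """Convert a cosine distance (0 to 2) into a cosine similarity (+1 to -1).

    `scale` undoes a unit multiplier, e.g. scale=1000 for values kept in 1e3 units.
    """
    d = distance / scale
    if not 0.0 <= d <= 2.0:
        raise ValueError(f"cosine distance out of range: {d}")
    return 1.0 - d


# If a displayed value of 1300 is a distance in 1e3 units, it corresponds to
# a similarity of roughly -0.3 on the familiar -1 to +1 scale.
print(distance_to_similarity(1300, scale=1000))
```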
Hi @tjetkaARD, just to clarify; which correlation metric did you use for the analysis? Thanks! |
Maybe even more to the point: where is the file with the similarity metrics that we currently recommend using for this paper? |
Sorry, step 3. Feel free to tag Alex for his input when you've written it up. |
I believe this issue is still waiting for three things (though the 2nd one is for ORF only, no crispr analysis needed at least for now because those profiles aren't finalized and we're pretty sure there aren't phenotypes there due to isoforms):
|
I don't think we are certain that we are using the same ORF dataset, I pointed towards it here. The simplest way to do so is for someone on the other side to check if the checksum of their file is |
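A minimal sketch of how someone could compute a checksum for their local copy and compare it. The thread does not say which algorithm was used, so SHA-256 as the default is an assumption; pass `"md5"` or another name if that is what the reference value was computed with. The function name and the file path in the usage comment are placeholders.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of the file at `path`, reading in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (placeholder path): print(file_checksum("orf.parquet"))
```

Two people hold the same file if, and practically only if, the digests match character for character.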
Here is how these connections look with the newest set of ORF and CRISPR profiles.

ORF

The previously seen relationships between the genes are still there, though some connections are weaker.

CRISPR

Most of these connections are absent in CRISPR because the genes are either not present in the dataset or do not have a phenotype. Only the PIK3R3-RAB40B connection is present in CRISPR, but it is weak (cosine similarity: -0.11).

KG |
These results are the same as what I said in the above comment: #4 (comment) The heatmap shows the percentile of the cosine similarities (1 → similar, 0 → anti-similar). The text is the maximum of the absolute KG score ( ORFCRISPR |
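For readers unfamiliar with the percentile scale used in the heatmap (1 → similar, 0 → anti-similar), here is one way such values could be computed. This is a sketch under assumptions, not the code that produced the plots: it ranks each pair's cosine similarity among all pairs in the input, and the profiles are toy vectors with placeholder gene names.

```python
import math
from bisect import bisect_left
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def similarity_percentiles(profiles):
    """Map each gene pair to its rank among all pairs, rescaled to 0..1.

    Needs at least three genes so that there is more than one pair to rank.
    """
    pairs = list(combinations(sorted(profiles), 2))
    sims = {p: cosine_similarity(profiles[p[0]], profiles[p[1]]) for p in pairs}
    ranked = sorted(sims.values())
    top = len(ranked) - 1
    return {p: bisect_left(ranked, s) / top for p, s in sims.items()}


# Toy profiles with placeholder names, not real data.
toy_profiles = {
    "GENE_A": [1.0, 0.0],
    "GENE_B": [1.0, 0.1],
    "GENE_C": [-1.0, 0.0],
    "GENE_D": [0.5, 1.0],
}
for pair, pct in sorted(similarity_percentiles(toy_profiles).items()):
    print(pair, round(pct, 2))
```

In the real analysis the ranking would presumably be taken over all gene pairs in the dataset rather than over a handful of selected genes.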
This seems worth finalizing (i.e. re-creating the clusters based on what are the nearest neighbors of the genes involved rather than including genes just because they were in the original clusters with old profiles.) |
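The idea of rebuilding a cluster from nearest neighbors can be sketched as follows: for each gene of interest, take the k most similar and the k most anti-similar genes by cosine similarity. This is an illustration with toy vectors and placeholder gene names; the function names, `k`, and the input layout are assumptions, not the project's actual pipeline.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def neighbors(query, profiles, k=2):
    """Return the k most similar and k most anti-similar genes to `query`."""
    scored = sorted(
        (cosine_similarity(profiles[query], vector), gene)
        for gene, vector in profiles.items()
        if gene != query
    )
    return {
        "anti_similar": [gene for _, gene in scored[:k]],   # lowest similarity first
        "similar": [gene for _, gene in scored[::-1][:k]],  # highest similarity first
    }


# Toy profiles with placeholder names, not real data.
toy_profiles = {
    "GENE_A": [1.0, 0.0],
    "GENE_B": [1.0, 0.1],
    "GENE_C": [-1.0, 0.0],
    "GENE_D": [0.5, 1.0],
}
print(neighbors("GENE_A", toy_profiles, k=1))
```

Taking the union of these neighbor lists over the query genes gives a candidate cluster that depends only on the current profiles, not on membership in the older heatmaps.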
Here are the recreated clusters |
This latest re-created cluster is from the ORF data only. Is it worth querying the same genes ["INSYN1", "PIK3R3","RAB40B", "RAB40C"] in the CRISPR data? |
Yes! It will be important to comment on whether the same pattern exists or
doesn’t exist in the crispr data, so either way we should take a look at
how it’s looking.
|
Also I read through a bit of the Scott Soderling Science paper and they specifically knocked down INSYN1 with CRISPR to determine its function. They provide a list of proteins that were perturbed by doing this ... would be interesting to compare. |
@niranjchandrasekaran The basic story here is that other than INSYN1, all genes have high-quality in vitro functional genomics and in vivo transcriptomics data linking their expression to cell proliferation, tumor size, and cancer prognosis. Many of these genes are specifically related to cell migration (invadosome, cytoskeleton projections, etc) and endosome/vesicle formation. The only functional data for INSYN1 is the Soderling paper linked above, which implicates it in neuronal inhibitory synapses. It's interesting that these data suggest that INSYN1 may also have a link to cell proliferation/cancer. I searched INSYN1 in all of the cancer-focused databases that Anne had listed, and didn't come up with anything. Most of these resources profile a targeted list of several thousand genes/proteins to increase throughput, and INSYN1 isn't included in these lists. My notes for this story are here: https://docs.google.com/document/d/1zKkDpBWbb3NnQhlX34LEWuuZxy5Rotre5uuuBMTxvxY/edit I'm not sure what the next steps are. Are we waiting until @afermg and I go through all stories, and then write up the most interesting ones? Are we doing more follow-up with any wet lab researchers? I'm also not sure how limited we are for space. |
I think this resource shows an association of INSYN1 with glioma, which would link the Soderling results with cancer... could you dig into that? https://www.proteinatlas.org/ENSG00000205363-INSYN1/pathology I also see on this page https://www.ncbi.nlm.nih.gov/gtr/genes/388135/ that this paper https://cgp.iiarjournals.org/content/11/4/201.long links INSYN1 to cancer, but INSYN1 doesn't show up in a search of the article and at first glance some of the Supp Data isn't available anymore! And finally I think this paper is saying an antisense RNA is associated with glioma (brain cancer) which could provide another link: And this one more generally: https://europepmc.org/article/med/36552797 Does any of that help? I think for this story the most ideal scenario is to find some supporting info from papers or from databases that makes it a more promising result (which we do not pursue by lab work here in this paper, but that has 'enough' to suggest that others do so). |
Thanks for the articles, @AnneCarpenter. I found that https://cgp.iiarjournals.org/content/11/4/201.long was published before INSYN1 had any characterized function, and uses the synonym C15orf59. I did a search specifically of C15orf59 and found more hits. New info on INSYN1 that tentatively links it to cancer:
Some random facts about INSYN1:
I think overall this is promising! We have 6 genes strongly linked to cell proliferation/cancer, plus INSYN1. Targeted searches of INSYN1 show tentative links to cancer in fairly low profile studies that re-analyzed large datasets to look for novel associations with cancer tissues. Almost all of these studies specifically call out INSYN1 as having no known function (when it was called by its ORF ID) or no known link to cancer. |
Great! I think that’s enough to write about it! Please write a paragraph
with references. You’d asked before how much room, and there’s not much
but I’d rather shorten a paragraph than lengthen it later!
|
The pattern doesn't seem to exist in CRISPR. Note: Missing genes either don't have a phenotype or are not present in the CRISPR dataset. |
I wouldn't say that exactly - of course we cannot say anything about INSYN1 which isn't in the CRISPR data, but for example NRBP1 and RAB40B are strongly anti-correlated here which is the same in ORF. There are some strongly opposite relationships too which are meaningful (RAB40B and PIK3R3 are strongly correlated in CRISPR and strongly anti-correlated in ORF). Ok I just realized I messed up most of that directionality because I mis-read the color schemes, but anyway @jessica-ewald can take this plot and add a sentence explaining this - maybe it's literally just "Although INSYN1 isn't in the CRISPR dataset, several of the other genes' relationships can also be seen in CRISPR, albeit sometimes with reversed directionality"? |
I looked at the similarity of features, grouped by feature groups, compartment and channels, between the two clusters in ORFs. The two clusters are antisimilar across all feature groups and channels. |
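A rough sketch of this kind of grouped comparison follows. It assumes feature names in the usual CellProfiler style, `Compartment_FeatureGroup_...` (for example `Cells_AreaShape_Area`), and compares two cluster-level profiles within each group. The feature names, values, and function names below are toy placeholders, not the actual analysis code or data.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def similarity_by_group(features, profile_a, profile_b, level=0):
    """Cosine similarity of two profiles within each feature subset.

    level=0 groups by compartment, level=1 by feature group (assumed naming scheme).
    """
    groups = defaultdict(list)
    for index, name in enumerate(features):
        groups[name.split("_")[level]].append(index)
    return {
        key: cosine_similarity([profile_a[i] for i in idx], [profile_b[i] for i in idx])
        for key, idx in groups.items()
    }


# Toy feature names and values, not real data.
features = [
    "Cells_AreaShape_Area",
    "Cells_Intensity_MeanIntensity_DNA",
    "Nuclei_AreaShape_Area",
    "Nuclei_Intensity_MeanIntensity_DNA",
]
cluster_1 = [1.0, 2.0, -1.0, 0.5]
cluster_2 = [-1.0, -2.0, 1.0, -0.5]
print(similarity_by_group(features, cluster_1, cluster_2, level=0))
print(similarity_by_group(features, cluster_1, cluster_2, level=1))
```

A negative value in every group is what "antisimilar across all feature groups" would look like in this representation.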
@niranjchandrasekaran, I'm going to wait here also until you pull the location info for these genes. |
I am comfortable with the location of these genes in both ORF and CRISPR plates. ORF
CRISPR
|
@jessica-ewald Can you confirm this 'story' is done? (with possible exception of ensuring the final figures match the final text, which is a step we will do for all stories later) If so you can close the issue, change from "Need to come up with a story" to one of the other options - either 'confirms' or 'new story we will include' |
Cluster of RAB40B and XLOC_I2_008134 have the opposite phenotypes of a cluster of PIK3R3 and INSYN1 - googled most connections and got nothing so it may be novel.
Emailed Scott Soderling Duke Dec 1 https://mail.google.com/mail/u/0/#sent/QgrcJHsHnNjJtWszNdwGctjDfRPknnWjVHl
Backups
XLOC_I2_008134 - nothing in google search