Find Evotec gene connections to pursue: exploration for MorphMap paper Evotec; [ORFs only] OR [CRISPRs only] #11

Closed
tjetkaARD opened this issue Jan 8, 2024 · 10 comments
Assignees
Labels
crispr Uses crispr data internal Internal discussions (but publicly accessible) orf Uses ORF data

Comments

@tjetkaARD
Collaborator

tjetkaARD commented Jan 8, 2024

Here I report the heatmap of the top similar and top anti-similar pairs that have no known evidence behind them.

Procedure:

  1. Select the top 20 similar pairs by cosine similarity of ORF profiles (from the Excel file)
  2. Select the top 20 anti-similar pairs by cosine similarity of ORF profiles (from the Excel file)
  3. Remove a pair if it has (an average Knowledge Graph score above 0.5) OR (at least one Knowledge Graph score above 0.8)
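
The three steps above can be sketched as follows. This is only an illustration, not the code used for the plot: the function name, the record layout (`gene_a`, `gene_b`, `cosine`, `kg_scores`), and the placeholder gene names and scores are all assumptions made for the example.

```python
from statistics import mean


def select_unknown_pairs(pairs, n_top=20, avg_cutoff=0.5, max_cutoff=0.8):
    """Keep top similar / anti-similar pairs that lack strong KG evidence.

    `pairs` is a list of dicts with hypothetical keys:
    'gene_a', 'gene_b', 'cosine' (float) and 'kg_scores' (list of floats).
    """
    ranked = sorted(pairs, key=lambda p: p["cosine"], reverse=True)
    top = ranked[:n_top]
    # Take the anti-similar pairs from what is left, so that a short
    # list never yields the same pair twice.
    anti = ranked[n_top:][-n_top:] if n_top else []

    def is_known(pair):
        scores = pair["kg_scores"]
        # "Known" = average KG score above 0.5 OR any single score above 0.8.
        return bool(scores) and (
            mean(scores) > avg_cutoff or max(scores) > max_cutoff
        )

    return [p for p in top + anti if not is_known(p)]


# Made-up records, for illustration only.
example = [
    {"gene_a": "A", "gene_b": "B", "cosine": 0.9, "kg_scores": [0.1, 0.2]},
    {"gene_a": "A", "gene_b": "C", "cosine": 0.8, "kg_scores": [0.6, 0.7]},
    {"gene_a": "A", "gene_b": "D", "cosine": 0.05, "kg_scores": [0.0]},
    {"gene_a": "B", "gene_b": "C", "cosine": -0.7, "kg_scores": [0.0, 0.9]},
    {"gene_a": "C", "gene_b": "D", "cosine": -0.9, "kg_scores": [0.3, 0.4]},
]
kept = select_unknown_pairs(example, n_top=2)
print([(p["gene_a"], p["gene_b"]) for p in kept])  # [('A', 'B'), ('C', 'D')]
```

In the toy data, A-C is dropped because its average score is above 0.5, and B-C is dropped because one score is above 0.8 even though its average is not.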

orfs_heatmap_cosine_Unknown Top_Anti 20_labels
Code: the value in each square indicates the average Evotec KG score.

Unfortunately, I am unable to recreate a similar plot for CRISPRs without data from Evotec (data exist only for pairs that intersect with ORFs). Alternatively, I can recreate it using the STRINGdb Knowledge Graph, but it will not be replicable.

Edit: I updated the plot with the updated data from the Evotec KG. Major change: the estimate of the known association between HOOK2 and NDE1/NDEL1/PAFAH1B1 changed, hence it was removed.

@AnneCarpenter
Contributor

Ok, what looks worth pursuing from the ORFs-only plot are the following:

@AnneCarpenter
Contributor

(perhaps @tjetkaARD can add the CRISPRs-only plot to this issue and adjust its title, since the ORFs were a simple story that is now split into other issues, and I referred to this spot as where the CRISPR-only plot will go!)

@tjetkaARD
Collaborator Author

tjetkaARD commented Jan 19, 2024

@AnneCarpenter

  1. yes, let's do it here. For CRISPR, I would need additional KG data for the gene pairs from the file, @auranic :
    crispr-cp-replicable-top-correlated.csv
    Thanks a lot!

  2. After the KG update, there is one major change for ORFs in the HOOK2/NDE1 cluster, reported in detail in "HOOK2 opposite effect than PAFAH1B1, NDE1, NDEL1: exploration for MorphMap paper (ORF)" #5

@AnneCarpenter
Contributor

For your first question I just forwarded an email with the needed information. This was my oversight in not sending it earlier, so sorry!

For the 2nd question, I think what you're saying is that we previously thought this was a novel discovery, but now it seems the connection among these genes is well known? Can we move that discussion over to #5?

@auranic
Collaborator

auranic commented Jan 20, 2024

@tjetkaARD
Please find the KG scores for the CRISPR gene pairs:
https://drive.google.com/drive/folders/1QWY8itTMeR3pGIt2kWIS5NiOPLhazg4S?usp=sharing
(let me know if you need all of them from your top list)

For 2, I answered here: #5 (comment)

@auranic
Collaborator

auranic commented Jan 20, 2024

For an overview of the ORF links to pursue, I would like to draw your attention to this slide:

image

It should more or less match the heatmaps above.

It should be self-explanatory, but I will be happy to provide more info. The network file is here: https://drive.google.com/file/d/16J5D4Wiuh-r2IuT3hA8grAQUggwS8Rc7/view?usp=sharing

@tjetkaARD changed the title from "Find Evotec gene connections to pursue: exploration for MorphMap paper Evotec; ORFs only" to "Find Evotec gene connections to pursue: exploration for MorphMap paper Evotec; [ORFs only] OR [CRISPRs only]" on Jan 20, 2024
@tjetkaARD
Collaborator Author

tjetkaARD commented Jan 20, 2024

Here I report the heatmap of the top similar and top anti-similar pairs by CRISPR similarity that have no known evidence behind them (as defined by the Knowledge Graph).

Procedure:

  • Select the top 10 similar pairs by cosine similarity of CRISPR profiles (from the Excel file)
  • Select the top 10 anti-similar pairs by cosine similarity of CRISPR profiles (from the Excel file)
  • Remove a pair if it has (an average Knowledge Graph score above 0.5) OR (at least one Knowledge Graph score above 0.8)
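
For reference, a minimal sketch of the cosine similarity that the ranking relies on, assuming each perturbation profile is a plain vector of feature values. In this thread the similarities come precomputed from the Excel file, so the function below is illustrative and its name is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length profile vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


profile = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]      # scaled copy: similarity close to 1
opposite = [-1.0, -2.0, -3.0]         # sign-flipped copy: similarity close to -1

print(cosine_similarity(profile, same_direction))
print(cosine_similarity(profile, opposite))
```

Pairs near 1 are the "similar" ones, and pairs near -1 are the "anti-similar" ones, meaning the two perturbations shift the features in opposite directions.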

Heatmap

crispr_heatmap_cosine_Unknown Top_Anti 10_labels

Code: the value in each square indicates the average Evotec KG score.
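
A minimal sketch of how such square labels could be assembled, assuming the average KG scores are stored per unordered gene pair. The function name, the `"?"` placeholder for pairs without a score, and the gene names and values are assumptions for illustration, not the plotting code used here.

```python
def kg_label_grid(genes, avg_kg):
    """Build a gene x gene grid of text labels for heatmap squares.

    `avg_kg` maps frozenset({gene_a, gene_b}) to an average KG score.
    Pairs missing from `avg_kg` get '?', and the diagonal is left empty.
    """
    grid = []
    for a in genes:
        row = []
        for b in genes:
            if a == b:
                row.append("")
                continue
            score = avg_kg.get(frozenset((a, b)))
            row.append("?" if score is None else f"{score:.2f}")
        grid.append(row)
    return grid


# Made-up genes and score, for illustration only.
genes = ["G1", "G2", "G3"]
avg_kg = {frozenset(("G1", "G2")): 0.4}
labels = kg_label_grid(genes, avg_kg)
for row in labels:
    print(row)
```

Keying the lookup on a `frozenset` makes it independent of the order in which a pair was listed, so a pair recorded as G2-G1 still fills the G1-G2 square.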

Clusters

  1. KAT5 and ZSCAN9: a histone acetyltransferase and a zinc finger protein. Both are localized in the nucleus. Data on ZSCAN9 are scarce. Nonetheless, the IntAct database indicates some evidence of a physical interaction between the two in yeast (https://www.ebi.ac.uk/intact/search?query=ENSG00000137185).
    Both are low-level, DNA-related expression regulators. Similar phenotypic profiles could in fact indicate a common mechanism behind them. There is, however, nothing to start with in this respect.

  2. ABCC10, MYH13, SRD5A1, GRK5, TBXA2R anti-correlated versus POLR2A, PSMD14, MDM2, USPL1, NXF1, EIF4A3, VCP

    • The second gene cluster is responsible for cellular and protein metabolic processes and proteolysis; it includes polymerase, proteasome, and translational process proteins.
    • The first gene cluster is much less well defined. TBXA2R, MYH13, and SRD5A1 are involved in the response to extracellular stimuli (not very sensitive). Only the link between GRK5 and TBXA2R is somewhat known. ABCC10 is a multidrug resistance-associated protein (a transporter), TBXA2R is a platelet aggregation receptor, and GRK5 is a GPCR kinase.

  3. CHST8, APOE, DYRK1B, LAIR1, NLRP9, KLK15, SLC17A7, CYP2A7, ECH1, SLC1A5, LIPE, LILRB4 anti-correlated versus DLX5, ZNF689

@auranic
Collaborator

auranic commented Jan 20, 2024

@tjetkaARD I wonder why there are still question marks in the heatmap? These pairs must be in the CRISPR data (e.g., I have just manually checked the GRK5-ECH1 pair: small similarity score, not connected in the KG).

@tjetkaARD
Collaborator Author

@auranic - yes indeed, a suboptimal procedure on my side. I will correct and update it when I have spare time.

@AnneCarpenter
Contributor

We realized the CRISPR data needs chromosome arm correction, so we will re-make the heatmaps (but the top correlation relationships aren't likely to change much, we hope).

@afermg afermg added orf Uses ORF data crispr Uses crispr data labels Feb 1, 2024
@shntnu shntnu added the internal Internal discussions (but publicly accessible) label Oct 17, 2024

6 participants