The explicit definition of "spectator" and its impact on RDChiral #45

Open
0m1n0 opened this issue Jul 26, 2023 · 8 comments

0m1n0 commented Jul 26, 2023

Hi,

I work in the bioinformatics field and I found your tool very interesting.
I saw that a reaction can be written as:

  • reactant>>product
  • reactant>spectator>product

In biology and biochemistry, a cofactor can be present in a reaction. It
is defined as follows (from Wikipedia):

A cofactor is a non-protein chemical compound or metallic ion that is required for an enzyme's role as a catalyst. Cofactors can be considered "helper molecules" that assist in biochemical transformations.

Cofactors can be divided into two major groups: organic cofactors, such as flavin or heme; and inorganic cofactors, such as the metal ions Mg2+, Cu+, Mn2+ and iron–sulfur clusters.

I assume that this term does not correspond to spectator.

So here are my questions:

  1. Could you give me an explicit definition of "spectator" as used in your code?
  2. Are the results significantly different with or without a spectator?
  3. Given how cofactors behave, is it better to include one as both a reactant and a product (the cofactor is present on both sides of the reaction, with a minor modification)? I'd like to extract templates that don't depend on a cofactor.

I'm sorry to bother you with these beginner's questions.
Any advice is welcome and appreciated.
Thank you :)

Min

@connorcoley
Owner

From the perspective of templates, spectators are completely ignored right now. A spectator is a component that does not contribute heavy atoms to the product molecule (according to the atom mapping). If a co-factor is present on both sides with no modification, I would add it in as a post-processing step.
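
To make the definition above concrete, here is a minimal sketch of how one might flag spectators in an atom-mapped reaction SMILES. It is not RDChiral's implementation: the function names are hypothetical, and it reads map numbers with a regular expression from the standard library instead of parsing molecules with RDKit, so it assumes well-formed bracket atoms and treats every "."-separated fragment as its own component.

```python
import re

# Mapped bracket atom: optional isotope, element symbol, other fields, ":<map>".
_MAPPED_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]:]*:(\d+)\]")


def heavy_atom_maps(smiles):
    """Return the map numbers carried by non-hydrogen atoms in a SMILES string."""
    return {int(num) for symbol, num in _MAPPED_ATOM.findall(smiles) if symbol != "H"}


def split_spectators(mapped_rxn):
    """Sort left-hand and agent components into contributors and spectators.

    Hypothetical helper: a component counts as a spectator when none of its
    mapped heavy atoms appear among the mapped heavy atoms of the products.
    """
    reactants, agents, products = mapped_rxn.split(">")
    product_maps = heavy_atom_maps(products)
    contributors, spectators = [], []
    # Naive split on ".": multi-fragment species such as salts are treated
    # as separate components in this sketch.
    for component in filter(None, (reactants + "." + agents).split(".")):
        if heavy_atom_maps(component) & product_maps:
            contributors.append(component)
        else:
            spectators.append(component)
    return contributors, spectators


# Acid-catalysed esterification; the unmapped sulfuric acid sits in the middle slot.
rxn = (
    "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]"
    ">O=S(=O)(O)O>"
    "[CH3:1][C:2](=[O:3])[O:6][CH3:5]"
)
contributors, spectators = split_spectators(rxn)
print(contributors)
print(spectators)
```

Under this reading, an unmapped cofactor written on the left-hand side would also land in the spectator list, which matches the idea of adding it back in a post-processing step.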

@0m1n0
Author

0m1n0 commented Jul 31, 2023

Thank you for your reply!

Clarifying chemoinformatics terms

  • A reaction can be written as (based on Daylight, "3. SMILES - A Simplified Chemical Language", section "3.5 Extensions for Reactions"):
    • reactant > agent > product
    • reactant >> product
  • I also saw "reagent" somewhere, and it looks more like a substance used to detect or indicate a reaction (based on the definition in the IUPAC Compendium of Chemical Terminology). I assume that this term is not used in this tool (and is rarely used in other chemoinformatics tools, except for metadata).

So is the "spectator" part of the "agent"?

How to place a cofactor in chemoinformatics?

If a co-factor is present on both sides with no modification, I would add it in as a post-processing step

Co-factors can be modified during a reaction; some examples are shown in this plot:

  • NADH -> NAD+
  • ATP -> ADP
  • GTP -> GDP
  • ...etc

Do you think that these modifications (such as the addition of energy by deprotonation) can be excluded from template extraction?

image

Thank you,
Min

@connorcoley
Owner

connorcoley commented Jul 31, 2023 via email

@0m1n0
Author

0m1n0 commented Jul 31, 2023

It's really kind of you to have replied so quickly, and it's very clear!

As I'm interested in the reaction chain (metabolic pathway), I'll proceed as follows:

  1. Annotate principal metabolites and co-factors and get canonical SMILES (using RDKit)
  2. Put them together in the reaction as reactants
  3. Atom mapping, i.e. atom numbering (I've used RXNMapper, but if you have another suggestion, I'd love to hear it.)
  4. Run template_extractor of RDChiral
  5. Analysis (I can re-index principal metabolites and co-factors here using step 1 and exclude co-factors if needed)
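
Steps 1, 2 and 5 above could be sketched roughly as follows. This is only an illustration under stated assumptions: `build_reaction` and `principal_only` are hypothetical helpers, the `canonicalize` hook defaults to the identity function so the block runs without RDKit, and the two nicotinamide SMILES are simplified model compounds standing in for NAD+ and NADH (the released proton is omitted).

```python
def build_reaction(reactants, products, canonicalize=lambda smi: smi):
    """Join annotated species into 'reactants>>products' and record their roles.

    `reactants` and `products` are lists of (smiles, role) pairs, with role
    being "principal" or "cofactor". `canonicalize` is a placeholder hook for
    whatever canonicalisation is used in step 1.
    """
    roles = {}

    def join_side(species):
        parts = []
        for smi, role in species:
            canonical = canonicalize(smi)
            roles[canonical] = role  # lookup table reused in the analysis step
            parts.append(canonical)
        return ".".join(parts)

    return join_side(reactants) + ">>" + join_side(products), roles


def principal_only(smiles_list, roles):
    """Keep only the species annotated as principal metabolites."""
    return [smi for smi in smiles_list if roles.get(smi) == "principal"]


# Ethanol oxidation with simplified stand-ins for NAD+ / NADH (assumption).
reactants = [("CCO", "principal"), ("C[n+]1cccc(C(N)=O)c1", "cofactor")]
products = [("CC=O", "principal"), ("CN1C=CCC(C(N)=O)=C1", "cofactor")]

rxn, roles = build_reaction(reactants, products)
print(rxn)
print(principal_only(rxn.split(">>")[1].split("."), roles))
```

In practice the hook could wrap RDKit, for example `lambda smi: Chem.MolToSmiles(Chem.MolFromSmiles(smi))`. Note that after atom mapping the SMILES carry map numbers, so they would need to be stripped and re-canonicalised before looking roles up again.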

Thank you,
Min

@connorcoley
Owner

I'm not positive what the "Analysis" step would involve for you, but this workflow sounds okay to me!

@0m1n0
Author

0m1n0 commented Jul 31, 2023

The final aim would be to group different reactions by their template. Then, given a compound (i.e. reactant), find out whether there's a match in the templates in order to obtain a potential product (compound template).

So the main steps of analysis would be:

  1. Understand why not all reactions yield templates. There are several possible causes: wrong annotations in public databases; non-canonical SMILES; issues during atom mapping (by the way, I don't think RXNMapper works with the wildcard *); etc.
  2. Re-index principal metabolites and co-factors
  3. Group reactions by template (and probably by principal metabolites)
  4. Global analysis of reaction and template relationships (e.g. number of unique reactions by template)
  5. Check whether there is a hierarchy between templates (e.g. template B is a sub-group of template A)
  6. Given a compound X, explore which template it may belong to, then retrieve potential products
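
Steps 1, 3 and 4 of this list can be sketched with the standard library alone. The helper name, the reaction IDs and the template strings below are made-up placeholders, not real RDChiral output; the sketch only assumes that each reaction ends up with either a template string or nothing.

```python
from collections import defaultdict


def group_by_template(extracted):
    """Group reaction IDs by template and collect reactions without one.

    `extracted` is an iterable of (reaction_id, template) pairs, where
    template is None or "" when extraction failed (hypothetical format).
    """
    groups = defaultdict(set)
    failed = []
    for rxn_id, template in extracted:
        if template:
            groups[template].add(rxn_id)
        else:
            failed.append(rxn_id)  # candidates for the "why no template?" check
    return dict(groups), failed


# Placeholder data for illustration only.
extracted = [
    ("R1", "[C:1]-[OH:2]>>[C:1]=[O:2]"),
    ("R2", "[C:1]-[OH:2]>>[C:1]=[O:2]"),
    ("R3", "[C:1]=[C:2]>>[C:1]-[C:2]"),
    ("R4", None),
]

groups, failed = group_by_template(extracted)
counts = {template: len(ids) for template, ids in groups.items()}
print(counts)
print(failed)
```

Checking for a hierarchy between templates (step 5) would need substructure matching on the template patterns and is outside what this sketch covers.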

I saw ASKCOS and I think it's basically the same idea. But I wanted to understand the central part and I especially wanted to work with data from biochemical reactions.

@connorcoley
Owner

Understood, thanks for the elaboration. You might be interested in some related work:

@0m1n0
Author

0m1n0 commented Aug 1, 2023

Wow, thank you!
I'll read it :)
