Correct Denominators #38

Open
cbizon opened this issue Dec 11, 2020 · 1 comment

Comments

@cbizon
Contributor

cbizon commented Dec 11, 2020

When we do the enrichment calculations, we use the type of the coalesced nodes. So if we are merging on chemicals, we assume that any chemical could occupy that spot, and we ask how likely it is that, e.g., X of those chemicals would share a particular property.

That's not wrong, exactly, but it is probably not specific enough. So for instance consider
(asthma)<-[treats]-(chemical)

We'll usually find an enriched property for the chemical like "drug", or features that tend to be more common in druglike space (like heterocyclic organic compounds). And that is correct: a chemical that treats a disease is more likely than a random chemical to be a drug. But it's not terribly interesting.

Instead, I think we'd rather use as the denominator the number of chemicals that could have inhabited that spot in the graph. So something like: out of all the chemicals with a 'treats' edge, how likely is it that this many would have property X? Now the chance of having 'drug' is pretty high in that group, so it's not returned, which is what we want.
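To make the effect of the denominator concrete, here is a minimal sketch. It assumes the enrichment is scored with a hypergeometric tail test (the issue doesn't say which test is actually used), and every count in it is made up for illustration. The function name `hypergeom_tail` and the constants are hypothetical, not taken from the codebase.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, min(successes, draws) + 1)
    )
    return hits / total


# Hypothetical counts, made up for illustration only.
ALL_CHEMICALS = 100_000       # every chemical in the graph
ALL_DRUGS = 5_000             # ...of which this many have property "drug"
TREATS_CHEMICALS = 4_000      # chemicals with at least one 'treats' edge
TREATS_DRUGS = 3_600          # ...of which this many have property "drug"

merged = 10                   # chemicals coalesced in the answer set
merged_drugs = 9              # ...of which this many have property "drug"

# Same observation, two different denominators.
p_by_type = hypergeom_tail(merged_drugs, ALL_CHEMICALS, ALL_DRUGS, merged)
p_by_edge = hypergeom_tail(merged_drugs, TREATS_CHEMICALS, TREATS_DRUGS, merged)

print(f"denominator = all chemicals:       p = {p_by_type:.3g}")
print(f"denominator = 'treats' chemicals:  p = {p_by_edge:.3g}")
```

With these invented counts, "drug" looks strongly enriched against all chemicals but unremarkable against chemicals that have a 'treats' edge, so it would drop out of the results under the narrower denominator.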

That would be doable in this case, and we could precache counts by edge. But in the general case (where there are an arbitrary number of edges coming out of the merging node), counts alone aren't enough: we'd need to cache the identities of the nodes with each edge so that we could intersect them to find the appropriate denominator.
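A rough sketch of what caching identities and intersecting them could look like. The in-memory dict of sets, the `(predicate, role)` key, and the function names are all assumptions for illustration; a real cache would presumably live in whatever store the counts are already kept in.

```python
from collections import defaultdict


def build_edge_index(edges):
    """Map (predicate, role) to the set of node ids seen in that role.

    `edges` is an iterable of (subject, predicate, object) triples.
    """
    index = defaultdict(set)
    for subject, predicate, obj in edges:
        index[(predicate, "subject")].add(subject)
        index[(predicate, "object")].add(obj)
    return index


def denominator_nodes(index, typed_nodes, constraints):
    """Nodes of the merged type that satisfy every edge constraint."""
    candidates = set(typed_nodes)
    for key in constraints:
        candidates &= index.get(key, set())
    return candidates


# Toy graph, invented for illustration.
edges = [
    ("aspirin", "treats", "asthma"),
    ("albuterol", "treats", "asthma"),
    ("albuterol", "interacts_with", "ADRB2"),
    ("water", "interacts_with", "AQP1"),
]
chemicals = {"aspirin", "albuterol", "water", "benzene"}
index = build_edge_index(edges)

# One edge out of the merging node: a cached count would have sufficed.
one_edge = denominator_nodes(index, chemicals, [("treats", "subject")])

# Two edges: the per-edge sets must be intersected, so identities are needed.
two_edges = denominator_nodes(
    index, chemicals, [("treats", "subject"), ("interacts_with", "subject")]
)

print(len(one_edge), len(two_edges))
```

The size of the returned set is the denominator, and the same set can be filtered by property X to get the expected hit count for that group.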

@patrickkwang

This makes sense. We do something similar with the specificity weighting on edges.
