When we do the enrichment calculations, we use the type of the coalesced nodes as the background. So if we are merging on chemicals, we suppose that any chemical could be in that spot, and we ask how likely it is that X of those chemicals would have a particular property.
That's not wrong, exactly, but it is probably not specific enough. For instance, consider
(asthma)<-[treats]-(chemical)
We'll usually find an enriched property for the chemical like "drug", or features that tend to be more common in druglike space (like heterocyclic organic compounds). That is correct: drugs are more likely than random chemicals to treat a disease. But it's not terribly interesting.
Instead, I think we'd rather use as the denominator the number of chemicals that could have inhabited that spot in the graph. So something like: out of all the chemicals with a 'treats' edge, how likely is it that this many would have property X? The chance of having 'drug' is pretty high in that group, so it's not returned, which is what we want.
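To make the difference between the two denominators concrete, here is a minimal sketch. It assumes the enrichment p-value is a one-sided hypergeometric tail, which this issue does not confirm, and every count in it is made up for illustration. The function name `hypergeom_sf` is hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, population: int, with_property: int, draws: int) -> float:
    """P(X >= k): chance of seeing at least k nodes with the property
    when `draws` nodes are sampled without replacement from `population`
    nodes, of which `with_property` have the property."""
    upper = min(with_property, draws)
    total = comb(population, draws)
    tail = sum(
        comb(with_property, i) * comb(population - with_property, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Illustrative (made-up) counts: 20 chemicals treat asthma, 18 of them are drugs.
observed_nodes, observed_drugs = 20, 18

# Background 1: every chemical in the graph (the current behaviour).
p_type = hypergeom_sf(observed_drugs, population=10_000, with_property=500,
                      draws=observed_nodes)

# Background 2: only chemicals that have a 'treats' edge (the proposal).
p_edge = hypergeom_sf(observed_drugs, population=600, with_property=450,
                      draws=observed_nodes)

print(f"type-wide background:       p = {p_type:.3g}")
print(f"edge-restricted background: p = {p_edge:.3g}")
```

With these invented numbers, 'drug' looks overwhelmingly enriched against all chemicals but unremarkable against chemicals that treat something, so it would drop out of the results under any usual significance threshold.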
That would be doable in this case, since we could precache counts by edge. But in the general case (where an arbitrary number of edges come out of the merging node), we'd need to cache the identities of the nodes with each edge so that we could intersect them to find the appropriate denominator.
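A rough sketch of what caching identities by edge could look like, assuming the graph is available as `(subject, predicate, object)` triples and that an in-memory dict of sets is an acceptable cache. The names `build_edge_index` and `background_nodes`, and the example triples, are hypothetical and not taken from the codebase.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
EdgeKey = Tuple[str, str]       # (predicate, role of the node: "subject" or "object")


def build_edge_index(triples: Iterable[Triple]) -> Dict[EdgeKey, Set[str]]:
    """Cache, for each (predicate, role), the identities of the nodes
    that take part in at least one such edge."""
    index: Dict[EdgeKey, Set[str]] = defaultdict(set)
    for subj, pred, obj in triples:
        index[(pred, "subject")].add(subj)
        index[(pred, "object")].add(obj)
    return index


def background_nodes(index: Dict[EdgeKey, Set[str]],
                     edges: List[EdgeKey]) -> Set[str]:
    """Nodes that could occupy the merged spot: those having every one
    of the edges attached to the merging node."""
    if not edges:
        return set()
    sets = [index.get(edge, set()) for edge in edges]
    return set.intersection(*sets)


# Made-up example graph.
triples = [
    ("chem:A", "treats", "disease:asthma"),
    ("chem:B", "treats", "disease:asthma"),
    ("chem:C", "treats", "disease:copd"),
    ("chem:A", "interacts_with", "gene:G1"),
    ("chem:C", "interacts_with", "gene:G2"),
    ("chem:D", "interacts_with", "gene:G1"),
]
index = build_edge_index(triples)

# One edge: a precached count per edge would be enough here.
one_edge = background_nodes(index, [("treats", "subject")])

# Two edges: the identities are needed so the sets can be intersected.
two_edges = background_nodes(
    index, [("treats", "subject"), ("interacts_with", "subject")]
)
print(len(one_edge), sorted(two_edges))
```

The size of the returned set is the denominator. With a single edge that size could be precached as a count, but with several edges the counts alone cannot be combined, which is why the identities have to be stored.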