True RF distance diameter:
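For a DAG which doesn't contain too many histories, it's possible to compute the true RF distance diameter by brute force: for each history, find the maximum RF distance to any other history in the DAG, then take the largest of those maxima.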
However, this is prohibitively slow for large DAGs. It's possible that there's a way to get a good estimate for diameter much more efficiently.
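A minimal sketch of the brute-force computation, on a toy collection of trees rather than a real DAG. It assumes each rooted tree is stored as the set of its non-trivial clades, so that RF distance is the size of the symmetric difference between two clade sets. The names clades, rf_distance and true_diameter are made up for illustration and are not part of any DAG library.

```python
from itertools import combinations


def clades(*groups):
    """Build a toy tree as a frozenset of clades; "AB" means the clade {A, B}."""
    return frozenset(frozenset(g) for g in groups)


def rf_distance(tree_a, tree_b):
    """RF distance: number of clades found in exactly one of the two trees."""
    return len(tree_a ^ tree_b)


def true_diameter(trees):
    """Largest pairwise RF distance; needs one distance per pair of trees."""
    return max((rf_distance(a, b) for a, b in combinations(trees, 2)), default=0)


# Five rooted topologies on leaves A-E (hypothetical example data).
trees = [
    clades("AB", "ABC", "ABCD"),   # ((((A,B),C),D),E)
    clades("AB", "ABC", "DE"),     # (((A,B),C),(D,E))
    clades("AB", "CD", "ABCD"),    # (((A,B),(C,D)),E)
    clades("DE", "CDE", "BCDE"),   # (A,(B,(C,(D,E))))
    clades("AB", "DE", "CDE"),     # ((A,B),(C,(D,E)))
]

diameter = true_diameter(trees)
print(diameter)
```

The number of pairs grows quadratically with the number of histories, which is why this only works for small DAGs.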
Under-estimate for diameter:
For any reference history ref_history, we can always get the maximum RF distance between it and any other history in the DAG.
This gives an under-estimate for the diameter, but possibly not a very good one, depending on our choice of ref_history.
I propose it may be pretty good if the reference history is a topological outlier, such as what we get from dag.trim_optimal_sum_rf_distance(dag, optimal_func=max).sample()
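A rough sketch of the under-estimate on a toy collection, again assuming trees are stored as sets of clades. Here eccentricity and sum_rf_outlier are hypothetical helpers: the second one picks the tree with the largest summed RF distance to all others by scanning every pair, which stands in for whatever the DAG does efficiently and is only meant to illustrate the idea.

```python
def clades(*groups):
    """Build a toy tree as a frozenset of clades; "AB" means the clade {A, B}."""
    return frozenset(frozenset(g) for g in groups)


def rf_distance(tree_a, tree_b):
    """RF distance: number of clades found in exactly one of the two trees."""
    return len(tree_a ^ tree_b)


def eccentricity(ref_tree, trees):
    """Maximum RF distance from ref_tree to any tree in the collection."""
    return max(rf_distance(ref_tree, t) for t in trees)


def sum_rf_outlier(trees):
    """Tree maximizing the summed RF distance to all trees (toy brute force)."""
    return max(trees, key=lambda t: sum(rf_distance(t, other) for other in trees))


# Five rooted topologies on leaves A-E (hypothetical example data).
trees = [
    clades("AB", "ABC", "ABCD"),
    clades("AB", "ABC", "DE"),
    clades("AB", "CD", "ABCD"),
    clades("DE", "CDE", "BCDE"),
    clades("AB", "DE", "CDE"),
]

# Under-estimate from an arbitrary reference vs. from the outlier.
under_from_arbitrary = eccentricity(trees[1], trees)
under_from_outlier = eccentricity(sum_rf_outlier(trees), trees)
print(under_from_arbitrary, under_from_outlier)
```

On this toy set the arbitrary reference gives 4 while the outlier gives 6, which happens to equal the true diameter here. Nothing guarantees that in general.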
Over-estimate for diameter:
Since RF distance is a proper distance, the triangle inequality tells us that given any tree a in the DAG, and a tree b that's as far as possible from a in the DAG, the diameter of the set of trees in the DAG is no more than twice the distance from a to b: for any two trees x and y in the DAG, d(x, y) <= d(x, a) + d(a, y) <= 2 d(a, b). This is true regardless of the tree a that we start with, but which tree we choose for a will determine how close the over-estimate (twice the distance from a to b) is to the true value of the diameter.
It seems that a good choice for a would be the topological median tree, because that should generally have a smaller maximum distance to any other tree in the DAG. So, perhaps a good over-estimate of the RF distance diameter for a DAG would be twice the maximum RF distance from the topological median tree to any other tree in the DAG.
Evaluating these estimates:
It would be interesting to implement these over/under-estimates, and compare them to the true diameters for a bunch of DAGs that are small enough to compute the true diameters of.
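As a starting point for that comparison, here is a small sketch that computes the under-estimate, the exact diameter and the over-estimate side by side on a toy collection. It assumes trees are stored as sets of clades, takes "median" to mean the tree minimizing the summed RF distance to all others and "outlier" the tree maximizing it, and finds both by brute force. All helper names are hypothetical; a real comparison would swap in the DAG's own methods.

```python
from itertools import combinations


def clades(*groups):
    """Build a toy tree as a frozenset of clades; "AB" means the clade {A, B}."""
    return frozenset(frozenset(g) for g in groups)


def rf_distance(tree_a, tree_b):
    """RF distance: number of clades found in exactly one of the two trees."""
    return len(tree_a ^ tree_b)


def eccentricity(ref_tree, trees):
    """Maximum RF distance from ref_tree to any tree in the collection."""
    return max(rf_distance(ref_tree, t) for t in trees)


def sum_rf_extreme(trees, pick):
    """pick=min gives a median tree, pick=max an outlier (summed RF distance)."""
    return pick(trees, key=lambda t: sum(rf_distance(t, other) for other in trees))


def true_diameter(trees):
    """Exact diameter by checking every pair of trees."""
    return max((rf_distance(a, b) for a, b in combinations(trees, 2)), default=0)


# Five rooted topologies on leaves A-E (hypothetical example data).
trees = [
    clades("AB", "ABC", "ABCD"),
    clades("AB", "ABC", "DE"),
    clades("AB", "CD", "ABCD"),
    clades("DE", "CDE", "BCDE"),
    clades("AB", "DE", "CDE"),
]

under = eccentricity(sum_rf_extreme(trees, max), trees)      # outlier as reference
over = 2 * eccentricity(sum_rf_extreme(trees, min), trees)   # twice median's reach
exact = true_diameter(trees)

print(f"under={under} exact={exact} over={over}")
```

On this toy set the bounds come out as 6 <= 6 <= 8. Running the same comparison over many small DAGs would show how tight each bound tends to be in practice.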