Why are there NAs in the feature costs? #51

berndbischl · 2014-03-05T16:55:12Z

E.g., in Sat11-Indu.

And how are we then supposed to handle those when adding feature costs in the evaluation?

mlindauer · 2014-03-06T08:59:11Z

I cite our specification:
"Put ? as an entry if the feature computation was not successful due to cut-off or unusual abort."

Unfortunately, the original SAT sets do indeed have NAs as feature runtimes :-(
In the SAT data, NAs should only occur when the step presolved the instance.
In my tool, I assume the cost is 0 in this case.
However, this cannot be right,
because presolving should also incur some kind of cost.

berndbischl · 2014-03-06T12:14:15Z

> I cite our specification:

Yeah, I knew that you would say that. It was late.

So, does an NA in the costs imply that the feature VALUES are NA as well? Because the computation did not finish?
Which would mean I cannot use the corresponding features (e.g., I have to impute them).
So a cost of 0 would be totally OK IMHO in that case.

But if THAT step PRESOLVES the instance, this completely contradicts the line of argumentation above ....?

mlindauer · 2014-03-06T12:43:20Z

my official opinion:
we can only use features if all used steps have "ok" as status.
otherwise the features cannot be used and we do not have to consider the feature costs
(with the exception of timeouts where the feature runtime cutoff should be specified in description.txt)
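
A minimal sketch of the "official" rule above, assuming each feature is mapped to the steps it depends on. The function name `usable_features`, the mapping layout, and every status string other than "ok" are hypothetical and not taken from the specification:

```python
from typing import Dict, List


def usable_features(
    step_status: Dict[str, str],
    feature_steps: Dict[str, List[str]],
) -> List[str]:
    """Return the features whose required steps all finished with status 'ok'.

    step_status:   step name -> status string ('ok' is the only usable status)
    feature_steps: feature name -> steps needed to compute it (assumed layout)
    """
    return [
        feature
        for feature, steps in feature_steps.items()
        if all(step_status.get(step) == "ok" for step in steps)
    ]


# Illustrative data: s2 did not finish with "ok", so f2 is unusable.
status = {"s1": "ok", "s2": "crash"}
needs = {"f1": ["s1"], "f2": ["s1", "s2"]}
print(usable_features(status, needs))
```

The timeout exception mentioned above is not handled here; it would need the feature runtime cutoff from description.txt.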

my real opinion:
as soon as I try to execute a step to get features,
I have to pay something;
even when it crashes, runs out of memory, ... or presolves the instance.
So there cannot be an NA in the feature costs.

There is only one case I could imagine for NAs:
If all steps are executed in a specific order
so that the next step can only be executed if the previous step was successful ("ok"),
we could have NAs at steps which were never executed because the previous step crashed.
example:
steps s1,s2,s3 in the following order:
s1 -> s2 -> s3
s2 crashes (or presolves or whatever except "ok")
then s3 was never executed and the cost is unknown but also not relevant.
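
The s1 -> s2 -> s3 example could be sketched like this, assuming a strictly sequential pipeline that stops at the first step whose status is not "ok". The helper `executed_steps` and the status string "crash" are hypothetical:

```python
from typing import Dict, List


def executed_steps(order: List[str], status: Dict[str, str]) -> List[str]:
    """Return the steps that actually ran, given a fixed execution order.

    A step runs only if every earlier step finished with status 'ok'.
    The first non-'ok' step still counts as executed (it was attempted).
    """
    ran = []
    for step in order:
        ran.append(step)
        if status.get(step) != "ok":
            break  # later steps are never started
    return ran


order = ["s1", "s2", "s3"]
status = {"s1": "ok", "s2": "crash"}  # s3 has no status: it never ran
ran = executed_steps(order, status)
never_ran = [s for s in order if s not in ran]
print(ran, never_ran)
```

Only the steps in `never_ran` would legitimately carry an NA cost under this reading.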

In the case of the SAT data,
there should be a cost,
but it was not recorded so we have to assume 0.
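
A sketch of how an evaluation could add up feature costs under this assumption, treating unrecorded entries ("?" in the files) as 0. The function names and the list-of-strings input are illustrative, not the actual tool's code:

```python
from typing import List, Optional


def parse_cost(entry: str) -> Optional[float]:
    """Turn one raw cost entry into a float, or None for the '?' marker."""
    return None if entry.strip() == "?" else float(entry)


def total_feature_cost(entries: List[str]) -> float:
    """Sum the step costs of one instance, counting missing costs as 0."""
    costs = [parse_cost(e) for e in entries]
    return sum(0.0 if c is None else c for c in costs)


# Illustrative values only: the second step has no recorded cost.
print(total_feature_cost(["1.5", "?", "0.25"]))
```

Counting NA as 0 underestimates the true cost whenever the step actually ran, which is the caveat raised above for the SAT data.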

berndbischl · 2014-03-28T18:53:39Z

I think we handle this currently in the most reasonable way.

What I do not like:
Is stuff like this explained in our READMEs?
I mean, why there are NAs in the costs? Because we demand this in the specs.

And if WE don't do it, it sets a very bad example....

berndbischl · 2014-03-28T18:54:25Z

After it is explained for the tasks where such NAs occur, we can close IMHO.

berndbischl assigned mlindauer Mar 5, 2014

berndbischl added the data integrity / format problem label Apr 22, 2014
