How will evaluators decide when we need to abort

Esta Nagy edited this page Nov 5, 2020 · 1 revision

We need to start with a disclaimer: evaluators can decide any way they want, so we can only document how each of them does. The list below covers the built-in evaluator(s). Also note that when abort suppression is active, we only count the aborts which really happened; suppressed cases are not logged.

Percentage based evaluator

A countdown abort decision is made if the answer to both of the following questions is yes.

  • Did we observe enough countdown start events to pass burn-in?
  • Did we see no countdown completed events at all?
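The two questions above can be sketched as a single boolean check. This is a minimal illustration in Python, not the library's actual code: the function and parameter names are hypothetical, and treating burn-in as passed once the start count reaches the threshold (`>=`) is an assumption.

```python
def should_abort_countdown(countdown_starts: int,
                           countdown_completions: int,
                           burn_in: int) -> bool:
    """Return True only if both countdown abort conditions hold.

    All names are hypothetical; they only mirror the two questions above.
    """
    # Assumption: burn-in counts as passed once starts reach the threshold.
    passed_burn_in = countdown_starts >= burn_in
    # No countdown has ever completed.
    never_completed = countdown_completions == 0
    return passed_burn_in and never_completed
```

For example, with a burn-in of 2, three starts and no completions would lead to an abort, while a single completion would prevent it.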

Mission abort decisions are a bit more complicated. There are two equally important questions we need to answer:

  1. Did we have enough countdown starts (or countdown completions in case there is no post-processing) to pass the burn-in?
  2. What is the percentage of failed or aborted tests compared to the total test runs?

The first question is trivial to answer, as it is a simple comparison between the burn-in threshold and the higher of the two countdown metrics (started or completed). To answer the second question, this evaluator sums the number of test failures (including post-processing failures), the number of times we made a mission abort decision, and the number of successful test runs. This sum is the 100% mark. If we subtract the percentage of successful test runs from it, we get the percentage of failed or aborted test runs. Tests will be aborted if this percentage is higher than the abort threshold we have set.
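The calculation can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the evaluator's real implementation: `MissionStats` and both function names are hypothetical, the burn-in comparison is assumed to be `>=`, and returning 0% when no test runs were recorded is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MissionStats:
    """Hypothetical container for the counters described above."""
    countdown_starts: int
    countdown_completions: int
    failures: int    # test failures, including post-processing failures
    aborts: int      # mission abort decisions that really happened
    successes: int   # successful test runs


def failed_or_aborted_percentage(stats: MissionStats) -> float:
    # The sum of failures, aborts and successes is the 100% mark.
    total = stats.failures + stats.aborts + stats.successes
    if total == 0:
        return 0.0  # assumption: nothing recorded yet means nothing failed
    success_pct = 100.0 * stats.successes / total
    # Subtracting the success percentage leaves the failed-or-aborted share.
    return 100.0 - success_pct


def should_abort_mission(stats: MissionStats,
                         burn_in: int,
                         abort_threshold_pct: float) -> bool:
    # Question 1: compare burn-in with the higher of the two countdown metrics.
    passed_burn_in = max(stats.countdown_starts,
                         stats.countdown_completions) >= burn_in
    # Question 2: abort only if the percentage is strictly above the threshold.
    return passed_burn_in and (
        failed_or_aborted_percentage(stats) > abort_threshold_pct
    )
```

With 3 failures, 1 abort and 6 successes, the failed or aborted share is 40%, so a 25% threshold would trigger an abort while a 40% threshold would not, because the percentage must be higher than the threshold.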