How to rerun failed trials when the experiment is done? #3835
-
Consider this scenario: there is an error in the code that only shows up with certain parameters, and it causes half of the trials to fail. Only a small change to the code is needed. How can I rerun the failed trials? Can the experiment roll back to the point of failure?
Replies: 2 comments
-
@kvartet - this seems to be a common question, so it is worth updating our documentation accordingly.
-
Hello @nothingeasy, sorry for my late reply.

If you want to rerun failed trials, you can use the `Customized trial` button, refer here. If the status of the experiment is `stopped`, you need to resume it, and if the status is `done`, you need to change the `maxTrialNumber`. By the way, NNI does not provide a mechanism to rerun all failed trials, so users can only submit each one manually.

However, there are other issues:

- In some algorithms (like TPE), the generation of hyperparameters depends on the historical trial results. Since some of the trials previously failed, the hyperparameters of the rerun trials may be meaningless.
- The results of customized trials will not be added to the history.
- In some training services, the modified code only takes effect after you stop and resume the experiment.

In short, rerunning the whole experiment may be the better choice in some situations.
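Since each failed trial has to be resubmitted by hand, it can help to first collect the parameter sets that failed. Below is a minimal sketch, not an NNI API: the function `failed_trial_parameters` and the record layout (a `status` field and a `hyperParameters` list of JSON strings, each with a `parameters` key) are assumptions made for illustration, so adjust them to whatever your exported trial list actually looks like.

```python
import json


def failed_trial_parameters(trial_jobs):
    """Return the parameter dict of every trial whose status is FAILED.

    `trial_jobs` is assumed to be a list of dicts with a "status" string and
    a "hyperParameters" list of JSON strings (hypothetical layout).
    """
    failed = []
    for job in trial_jobs:
        if job.get("status") != "FAILED":
            continue
        for raw in job.get("hyperParameters", []):
            # Each entry is assumed to be a JSON string with a "parameters" key.
            failed.append(json.loads(raw)["parameters"])
    return failed


# Hypothetical sample data, only to show the expected shape of the input.
sample = [
    {"id": "a1", "status": "SUCCEEDED",
     "hyperParameters": ['{"parameter_id": 0, "parameters": {"lr": 0.1}}']},
    {"id": "b2", "status": "FAILED",
     "hyperParameters": ['{"parameter_id": 1, "parameters": {"lr": 0.01}}']},
    {"id": "c3", "status": "FAILED",
     "hyperParameters": ['{"parameter_id": 2, "parameters": {"lr": 0.001}}']},
]

for params in failed_trial_parameters(sample):
    # Print one JSON object per failed trial, ready to paste into the form.
    print(json.dumps(params))
```

Each printed line is one parameter set that could then be entered through the `Customized trial` button. Keep in mind the caveats above: the results of these resubmitted trials are not added to the tuner's history.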