Generic detector tuning #27

Open
Tveten opened this issue Nov 15, 2024 · 0 comments

All detectors in skchange have either a penalty_scale or a threshold_scale parameter. Their default values are mostly decent, provided that the data is standardised with respect to the within-segment mean and standard deviation, and that these quantities are stationary.
In practice, therefore, the difficulty lies in finding good mean and standard deviation estimates, or, correspondingly, good values for the penalty or threshold scales.

Generic ways of tuning penalty_scale, threshold_scale and possibly other hyperparameters automatically from training data should therefore be developed. There should be both supervised and unsupervised/semi-supervised tuners.

Supervised tuning

In this setting, labels for changepoint or anomaly locations are available for training/tuning.

Requires:

  • Metrics to measure the performance of detections against labeled changepoints or anomalies. Change detection and anomaly detection require different metrics.
  • A search method for testing different hyperparameters. Could optuna be useful?

Open questions:

  • Can/should some form of cross-validation be used for this?
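As a starting point, a supervised tuner could be as simple as a grid search over penalty_scale that maximises a margin-based F1 score against the labels. The sketch below assumes an sktime-style detector with set_params and fit_predict returning changepoint locations; the metric and helper names (f1_score_with_margin, tune_supervised) are illustrative assumptions, not existing skchange functionality.

```python
import numpy as np


def f1_score_with_margin(true_cps, detected_cps, margin=5):
    """F1 score where a detection is a true positive if it lies within
    `margin` samples of some labeled changepoint."""
    true_cps, detected_cps = np.asarray(true_cps), np.asarray(detected_cps)
    if len(true_cps) == 0 or len(detected_cps) == 0:
        return 0.0
    tp = sum(np.min(np.abs(true_cps - d)) <= margin for d in detected_cps)
    precision = tp / len(detected_cps)
    recall = tp / len(true_cps)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def tune_supervised(detector, data, true_cps, scales):
    """Grid search: return the penalty_scale in `scales` with the best F1 score."""
    best_scale, best_score = None, -np.inf
    for scale in scales:
        detected = detector.set_params(penalty_scale=scale).fit_predict(data)
        score = f1_score_with_margin(true_cps, np.asarray(detected).ravel())
        if score > best_score:
            best_scale, best_score = scale, score
    return best_scale, best_score
```

For larger hyperparameter spaces, the plain grid search could be swapped for optuna or a similar search library, and the loop could be wrapped in cross-validation over contiguous folds if that turns out to be useful.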

Unsupervised or semi-supervised tuning

In this setting, no labels for changepoint or anomaly locations are available for training/tuning. This is often the case in practice.

A possible way of tuning the methods nonetheless is to specify a desired number of detections in the training data and search for parameters that produce that number of detections.

A common example: there is a training set that is assumed to be change- or anomaly-free, i.e., there should be no detections. It should be possible to find hyperparameters that give no detections on the training data in a "tight" way. By "tight", I mean that the resulting penalty or threshold is only as conservative as it needs to be to give no detections, but no more.
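Since the number of detections is typically non-increasing in the penalty or threshold scale, both the count-targeting and the "tight" zero-detection variants could be tuned by bisection. The sketch below illustrates this under that monotonicity assumption, reusing the same hypothetical set_params/fit_predict interface as above; it uses penalty_scale, but threshold_scale would work analogously. Setting target=0 corresponds to the change-free training set case.

```python
def n_detections(detector, data, scale):
    """Number of detections on the training data for a given penalty_scale."""
    return len(detector.set_params(penalty_scale=scale).fit_predict(data))


def tune_unsupervised(detector, data, target=0, lo=1e-3, hi=1e3, tol=1e-3):
    """Bisect for the smallest penalty_scale that gives at most `target`
    detections, i.e. a scale only as conservative as it needs to be."""
    if n_detections(detector, data, hi) > target:
        raise ValueError("Even the most conservative scale gives too many detections.")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if n_detections(detector, data, mid) > target:
            lo = mid  # too many detections: the scale must be larger
        else:
            hi = mid  # few enough detections: try a tighter (smaller) scale
    return hi
```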
