Detector components (costs, scores etc.) as classes #24

Merged: 238 commits merged into main on Nov 26, 2024

Tveten (Collaborator) commented Oct 20, 2024

Goal: Unify the detector components. Make them safer. Make the extension pattern simpler and clearer.

See #23 for discussions.

New features:

  • The BaseIntervalScorer class, inheriting from sktime.BaseEstimator. Public methods:

    • fit(self, X, y=None) -> self
    • evaluate(self, cuts: ArrayLike) -> np.ndarray
  • Four sub base classes inheriting from BaseIntervalScorer:

    • skchange.costs.BaseCost. Expects 2 columns in cuts: start, end.
    • skchange.change_scores.BaseChangeScore. Expects 3 columns in cuts: start, split, end.
    • skchange.anomaly_scores.BaseSaving. Expects 2 columns in cuts: start, end.
    • skchange.anomaly_scores.BaseLocalAnomalyScore. Expects 4 columns in cuts: outer_start, inner_start, inner_end, outer_end.
  • Classes for automatically converting costs to any of the three other score classes.

  • Convenience functions allowing either costs or an appropriate score to be used as input to all the detectors.

All existing functionality has been reimplemented within the new design, together with the additions listed above.
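
To illustrate the fit/evaluate pattern and the cost-to-score conversion described above, here is a minimal sketch in plain NumPy. It is not the skchange implementation: the class names `ToyIntervalScorer`, `ToyL2Cost` and `ToyCostBasedChangeScore`, the `expected_cut_entries` attribute, the `_fit`/`_evaluate` hooks, and the 2D return shape are all assumptions made for illustration. Only the public `fit(X, y=None)` and `evaluate(cuts)` signatures and the column layouts of `cuts` come from the description.

```python
import numpy as np


class ToyIntervalScorer:
    """Hypothetical stand-in for BaseIntervalScorer (not the skchange class)."""

    expected_cut_entries = None  # assumed attribute: columns required in `cuts`

    def fit(self, X, y=None):
        # Store the data as a 2D float array of shape (n_samples, n_columns).
        X = np.asarray(X, dtype=float)
        self._X = X.reshape(-1, 1) if X.ndim == 1 else X
        self._fit()
        return self

    def _fit(self):
        pass  # subclasses may precompute statistics here

    def evaluate(self, cuts):
        cuts = np.atleast_2d(np.asarray(cuts, dtype=int))
        if cuts.shape[1] != self.expected_cut_entries:
            raise ValueError(
                f"cuts must have {self.expected_cut_entries} columns, "
                f"got {cuts.shape[1]}"
            )
        return self._evaluate(cuts)


class ToyL2Cost(ToyIntervalScorer):
    """Squared-error cost around the interval mean; cuts are (start, end)."""

    expected_cut_entries = 2

    def _fit(self):
        # Prefix sums so that each interval is evaluated in constant time.
        zeros = np.zeros((1, self._X.shape[1]))
        self._sums = np.vstack([zeros, np.cumsum(self._X, axis=0)])
        self._sums2 = np.vstack([zeros, np.cumsum(self._X**2, axis=0)])

    def _evaluate(self, cuts):
        start, end = cuts[:, 0], cuts[:, 1]
        n = (end - start).reshape(-1, 1)
        s = self._sums[end] - self._sums[start]
        s2 = self._sums2[end] - self._sums2[start]
        return s2 - s**2 / n  # shape (n_cuts, n_columns)


class ToyCostBasedChangeScore(ToyIntervalScorer):
    """Turns a cost into a change score; cuts are (start, split, end)."""

    expected_cut_entries = 3

    def __init__(self, cost):
        self.cost = cost

    def fit(self, X, y=None):
        self.cost.fit(X, y)
        return self

    def _evaluate(self, cuts):
        start, split, end = cuts[:, 0], cuts[:, 1], cuts[:, 2]
        whole = self.cost.evaluate(np.column_stack([start, end]))
        left = self.cost.evaluate(np.column_stack([start, split]))
        right = self.cost.evaluate(np.column_stack([split, end]))
        # Cost reduction obtained by splitting the interval at `split`.
        return whole - (left + right)


X = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0])
score = ToyCostBasedChangeScore(ToyL2Cost()).fit(X)
scores = score.evaluate([[0, 3, 6], [0, 1, 6]])
```

In this toy data the split at index 3 separates the two levels exactly, so the first row of `scores` is larger than the second. A detector written against this interface only needs to call `fit` and `evaluate`, which is the decoupling the design aims for.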

Tveten and others added 30 commits October 17, 2024 21:45. Commit notes:

  • Leave it to the numba compiler for now. Checking it in a good way is complicated due to generic classes such as CostBasedChangeScore.
  • Decoupled from numba and the detectors, as opposed to other suggested designs.

Tveten (Collaborator, Author) commented Nov 25, 2024

Naming decision:

  • Rename BaseIntervalEvaluator to BaseIntervalScorer, based on a discussion with @fkiraly.

  • Rename the intervals argument of .evaluate to cuts, based on an offline discussion with @johannvk.

    • intervals suggests the input should be two entries (start, end], and hides the potential splitting info.
    • splits suggests the whole data should be split in two or more parts, and hides the interval subsetting info.
    • cuts is a word that can cover both interval subsetting and splitting. It's up to each sub base class to define what the cuts mean and how they are used internally during .evaluate.
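
As a rough illustration of how one `cuts` argument can carry both interval and split information, the sketch below validates a `cuts` array against the column layouts listed in the PR description. The `CUT_COLUMNS` mapping, the `check_cuts` helper, and the rule that indices within a row must not decrease are assumptions for illustration, not part of skchange.

```python
import numpy as np

# Column layouts as listed in the PR description; the dictionary keys are
# hypothetical labels chosen for this example.
CUT_COLUMNS = {
    "cost": ("start", "end"),
    "change_score": ("start", "split", "end"),
    "saving": ("start", "end"),
    "local_anomaly_score": (
        "outer_start",
        "inner_start",
        "inner_end",
        "outer_end",
    ),
}


def check_cuts(cuts, kind):
    """Return `cuts` as a 2D integer array after basic layout checks."""
    columns = CUT_COLUMNS[kind]
    cuts = np.atleast_2d(np.asarray(cuts))
    if not np.issubdtype(cuts.dtype, np.integer):
        raise ValueError("cuts must contain integer indices")
    if cuts.shape[1] != len(columns):
        raise ValueError(
            f"{kind} expects {len(columns)} columns {columns}, "
            f"got {cuts.shape[1]}"
        )
    # Assumption: within each row the indices must not decrease.
    if np.any(np.diff(cuts, axis=1) < 0):
        raise ValueError("each row of cuts must be non-decreasing")
    return cuts


cost_cuts = check_cuts([0, 5], "cost")  # one interval
change_cuts = check_cuts([[0, 3, 6], [2, 4, 8]], "change_score")  # two splits
```

The same array type serves every scorer. Only the number of columns and their meaning differ, which is what each sub base class would define.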

This was referenced Nov 25, 2024
@Tveten Tveten merged commit 9591d33 into main Nov 26, 2024
7 of 8 checks passed
3 participants