Investigate options for InterRowMSAS #669

Open
npatki opened this issue Nov 18, 2024 · 0 comments
Labels
data:sequential Related to timeseries datasets feature request Request for a new feature

npatki commented Nov 18, 2024

Problem Description

Right now, the InterRowMSAS metric takes the direct difference between the value in each row and the value in the next row, then averages all of these differences. As a result, the sum telescopes: every term cancels except the first and the last, so the average depends only on the endpoints of the sequence.

(row 2 - row 1) + (row 3 - row 2) + (row 4 - row 3) + ... + (row n - row n-1)
= row n - row 1
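
To make the cancellation concrete, here is a minimal sketch in plain Python. It is not the SDMetrics implementation; the helper name `mean_inter_row_diff` and the toy values are made up for illustration. Two sequences with very different row-to-row behavior but the same first and last values end up with the same averaged difference.

```python
def mean_inter_row_diff(sequence):
    """Average of (row[i+1] - row[i]) over one sequence (hypothetical helper)."""
    diffs = [nxt - cur for cur, nxt in zip(sequence, sequence[1:])]
    return sum(diffs) / len(diffs)


# Toy data: same endpoints (3 and 7), very different behavior in between.
smooth = [3.0, 4.0, 5.0, 6.0, 7.0]
jumpy = [3.0, 10.0, 4.0, 25.0, 7.0]

smooth_mean = mean_inter_row_diff(smooth)
jumpy_mean = mean_inter_row_diff(jumpy)

# The telescoped form only looks at the first and last rows.
endpoint_only = (jumpy[-1] - jumpy[0]) / (len(jumpy) - 1)

print(smooth_mean, jumpy_mean, endpoint_only)
```

All three printed values are equal, which means the averaged difference cannot tell the smooth sequence apart from the jumpy one.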

I'm filing this issue to track whether there is a different form of computation that would be more appropriate for this metric. Alternatives:

  • Do not average out the differences within each sequence. Instead, add every individual difference to an overall distribution D_r or D_s.
  • (Similar to taking a log) Apply a transform to each number, e.g. square all values and take the square root of the difference of the squares: sqrt((r+x)**2 - (r)**2)
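
A rough sketch of the first alternative, assuming D_r and D_s stand for the pooled difference distributions of the real and synthetic data (the issue does not spell this out). The function names and toy sequences are hypothetical, and the hand-rolled two-sample Kolmogorov-Smirnov statistic is only one possible way to compare the two distributions, not necessarily what the metric should use.

```python
def per_sequence_means(sequences):
    """Current behavior as described above: one averaged difference per sequence."""
    means = []
    for seq in sequences:
        diffs = [nxt - cur for cur, nxt in zip(seq, seq[1:])]
        means.append(sum(diffs) / len(diffs))
    return means


def pooled_differences(sequences):
    """Alternative: keep every row-to-row difference, pooled across sequences."""
    return [nxt - cur for seq in sequences for cur, nxt in zip(seq, seq[1:])]


def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    def ecdf(values, x):
        return sum(v <= x for v in values) / len(values)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


# Toy data: same endpoints per sequence, different behavior in between.
real = [[0.0, 1.0, 2.0, 3.0]]
synthetic = [[0.0, 10.0, -7.0, 3.0]]

# Averaging first hides the difference between the datasets ...
means_gap = ks_statistic(per_sequence_means(real), per_sequence_means(synthetic))
# ... while pooling the raw differences exposes it.
pooled_gap = ks_statistic(pooled_differences(real), pooled_differences(synthetic))

print(means_gap, pooled_gap)
```

With these toy values the per-sequence averages are identical, so the first gap is 0, while the pooled differences give a clearly nonzero gap. One trade-off of pooling is that longer sequences contribute more differences and therefore carry more weight in the overall distribution.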
npatki added the feature request and data:sequential labels on Nov 18, 2024