Adds test cases for sum and delta system macros #137

popematt · 2024-11-21T07:21:28Z

Issue #, if available:

Description of changes:

Adds test cases for sum and delta.

As I was writing the test cases, it seemed really odd that sum can have zero-to-many argument values, but delta cannot. So, I made a decision to change that, and in these test cases, delta accepts zero-to-many argument values, and the implicit initial value is zero. (Just like sum.)

If there's disagreement, I can revert to the currently spec'd behavior. If we agree, then I can update the spec accordingly.
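For readers who want the proposed semantics spelled out, here is a minimal Python sketch of how the expansion is understood in this PR. The function name `expand_delta` is hypothetical and only models the behavior described above (zero-to-many arguments, implicit initial value of zero); it is not the conformance suite or any Ion implementation.

```python
def expand_delta(*deltas):
    """Model of the proposed delta: a running total that starts from an implicit 0."""
    total = 0
    expanded = []
    for d in deltas:
        total += d              # each argument is a difference from the previous output
        expanded.append(total)
    return expanded


# Mirrors the text form in the test case: (:delta -1 -1 -1) produces -1 -2 -3
print(expand_delta(-1, -1, -1))
# With zero arguments the expansion is empty
print(expand_delta())
```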

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

zslayton · 2024-11-21T14:42:24Z

conformance/system_macros/delta.ion

+               (binary "EF 12 02 07 FF FF FF 01")
+               (text "(:delta -1 -1 -1)")
+               (produces -1 -2 -3))
+         (each "4 arguments"


Could you throw in one or two simple tests showing that delta-of-delta works too?
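As a rough illustration of what such a test would check, the sketch below models delta as a running sum (an assumption based on the `(:delta -1 -1 -1)` case above) and applies it twice. The helper `delta` is hypothetical, and the Ion text form in the comment is only a guess at how a nested invocation might be written.

```python
from itertools import accumulate


def delta(values):
    """Model delta as a running sum with an implicit initial value of 0."""
    return list(accumulate(values))


inner = delta([1, 1, 1])   # first pass: running sum of the raw arguments
outer = delta(inner)       # second pass: running sum of the first pass's output

# Assumed text equivalent: (:delta (:delta 1 1 1))
print(inner)
print(outer)
```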

zslayton · 2024-11-21T14:54:36Z

EDIT: Per offline discussion, I was confusing delta with a past version of sum that took a bias and a stream of numbers to encode with that bias. The new version of delta should work fine!

You can disregard the below, but I'll leave it for historians.

The tests look good, so I've gone ahead and approved this. We should talk about the change to delta's arguments, though.

> As I was writing the test cases, it seemed really odd that sum can have zero-to-many argument values, but delta cannot. So, I made a decision to change that, and in these test cases, delta accepts zero-to-many argument values, and the implicit initial value is zero. (Just like sum.)

For others' reference, here's the original description of delta:

I think the reason we had the initial value specified separately was in case the first value in the sequence was an outlier. If the first value is -8 and every value that follows it is 1_000_000_000, the encoding will be really suboptimal. Ideally, the seed value would be the median (or mode?) of the sequence.

The version you propose is reasonable if the writer can't/won't do any analysis of the sequence before writing. A writer aiming for compactness would probably take the time to sample the data before encoding it.

The original version of delta (with a distinct seed) can be used to define a new macro that sets 0 as the initial value.

I think modifying the definition so the second parameter (deltas) was * instead of + would be ok. If you set the seed and deltas is empty, it expands to the empty stream.

What do you think?

popematt · 2024-11-21T19:48:32Z

Summary of offline conversation in which we decided to keep delta as it is defined in this PR.

The version in this PR is aligned with the commonly known concept of directed delta encoding. It is intended to be a compact representation of groups of numbers that are relatively close to each other.

In practice, pre-analysis of the numbers to select a median/mode as the initial seed is unlikely to yield significant savings over this method, since delta encoding is designed to exploit sequences whose consecutive values differ only slightly. (As an aside, pre-analysis can be useful to determine whether delta encoding is actually beneficial; this strategy is used in some audio codecs.)
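To make the comparison concrete, here is a small Python sketch using the outlier sequence from the earlier comment (a first value of -8 followed by values of 1_000_000_000). The helpers `to_deltas` and `to_biases` are hypothetical names; they model "difference from the previous value" and "difference from a fixed seed" respectively, and say nothing about actual encoded byte sizes.

```python
def to_deltas(values):
    """Differences between consecutive values, starting from an implicit 0."""
    previous = 0
    deltas = []
    for v in values:
        deltas.append(v - previous)
        previous = v
    return deltas


def to_biases(seed, values):
    """Differences between each value and a fixed seed."""
    return [v - seed for v in values]


values = [-8, 1_000_000_000, 1_000_000_000, 1_000_000_000]

deltas = to_deltas(values)       # the outlier costs one large delta, then zeros
biases = to_biases(-8, values)   # a poorly chosen seed makes every later bias large

print(deltas)
print(biases)
```

With delta encoding, an outlier first value shows up as a single large difference, so seed selection matters much less than it does for the fixed-seed scheme.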

The "difference from initial seed" function can be constructed using a template macro such as this:

(macro biased_ints (initial biases*) (.for (b (%biases)) (.sum (%initial) (%b))))
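For comparison, a Python sketch of what this template is expected to expand to, assuming `.for` emits one `.sum` result per element of `biases`. The Python function is only a model of the macro above, not part of any Ion library.

```python
def biased_ints(initial, *biases):
    """Model of the biased_ints template: one (initial + bias) per bias."""
    return [initial + b for b in biases]


# Every output is an offset from the same seed, unlike delta's running total
print(biased_ints(100, -1, 0, 2))
# No biases means an empty expansion
print(biased_ints(5))
```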

popematt · 2024-11-21T23:01:58Z

Following an offline conversation about whether allowing an arbitrary number of operands for sum is useful, I have simplified sum to accept exactly 2 operands.

Adds test cases for sum and delta system macros

b881e19

zslayton approved these changes Nov 21, 2024

Adds changes based on PR feedback

5c78033

Improve sum test cases

7c38205

tgregg approved these changes Nov 21, 2024

popematt merged commit 14e8ce8 into amazon-ion:main Nov 21, 2024
1 check passed
