Compare `any` arrays #582

bizywizy · 2024-02-27T17:38:48Z

We can try to fix #371 with nested comparison of arrays' values but I'm not sure if it's worth it to increase amount of switch-cases for such rare (imo) case.

Also, I found that expr always goes to reflect.DeepEqual for bools and fixed it.

antonmedv · 2024-02-27T20:26:29Z

Nice catch with bools! But yes, I think we should not add cases for arrays. More research is needed here. Maybe a builtin deepEqual is needed which will be ignoring types of arrays (and maps?).

bizywizy · 2024-02-28T00:55:32Z

Okay, I opened separate PR for bools here.

But yes, I think we should not add cases for arrays.

We can add equalArrays just before reflect.DeepEqual and then we won't affect main switch-case statement

func Equal(a, b any) {
switch x := a.(type) {
    ...
}
if res, ok := equalArrays(a, b); ok {
    return res
}
return reflect.DeepEqual(a, b)

but decrease performance of non-primitive comparisons. Do you have any thoughts how frequently it happens?

Maybe a builtin deepEqual is needed which will be ignoring types of arrays (and maps?).

Could you please give me couple examples? I think I didn't get it.

antonmedv · 2024-02-28T09:58:01Z

We can add equalArrays just before reflect.DeepEqual and then we won't affect main switch-case statement

Sounds like a good solution. Make sure it is a deep equal as well: [1,[2,[3]]]==[1,[2,[3]]]

Do you have any thoughts how frequently it happens?

I don't know ¯\_(ツ)_/¯
I will be cool actually to vigure out what reflect.DeepEqual covers and add tests for those cases to make sure we will not break those in the future.

I also have an idea to test: add OpEqualInt & OpEqualBool & OpEqualString opcodes. I belive those will be the most common once. BUT, I'm not sure what adding more opcodes will be faster than OpEqual switch, so those changes should be benchmarked.

bizywizy · 2024-03-04T08:22:48Z

I don't know ¯_(ツ)_/¯
I will be cool actually to vigure out what reflect.DeepEqual covers and add tests for those cases to make sure we will not break those in the future.

The only case I found in my codebase involves fmt.Stringer objects. We can handle them at compile time (by calling .String() and then checking for equality) or simply add another switch-case to runtime.Equal.

I also have an idea to test: add OpEqualInt & OpEqualBool & OpEqualString opcodes. I belive those will be the most common once. BUT, I'm not sure what adding more opcodes will be faster than OpEqual switch, so those changes should be benchmarked.

Expr already includes OpEqualString and OpEqualInt opcodes so I compared these with OpEqual which is handled by the switch-case in runtime.Equal.

OpEqualString + OpEqualInt slightly better than OpEqual

❯ benchstat master.txt only-op-equal.txt                                                                                                                                                                                     (gp-eu-central-1/delta)
goos: darwin
goarch: arm64
pkg: github.com/expr-lang/expr
                │ master.txt  │         only-op-equal.txt          │
                │   sec/op    │   sec/op     vs base               │
_expr-8           63.39n ± 0%   64.28n ± 0%  +1.40% (p=0.000 n=10)
_expr_reuseVm-8   44.30n ± 1%   45.11n ± 0%  +1.83% (p=0.001 n=10)
geomean           52.99n        53.85n       +1.62%

So if you find a ~2% improvement worthwhile, I can add OpEqualBool. Similarly, we can do the same with OpLess/OpMore for ints and floats.

antonmedv · 2024-03-04T13:43:00Z

So if you find a ~2% improvement worthwhile

I think we can postpone implementing it. We can focus on features with >10% performance boost.

antonmedv · 2024-03-04T13:44:48Z

So what is the status of this PR? Will we separate array equal cases to a func deepArrayEquals?

bizywizy · 2024-03-04T13:56:31Z

Firstly, I'm gonna write benchmarks for runtime.Equal to see how the quantity of cases impacts the switch-case statement. Looks like that dividing several cases into separate function isn't really beneficial.

bizywizy · 2024-03-04T19:01:16Z

I compared master vs extended switch-case:

❯ benchstat master.txt switch-case.txt                                                                                                               
goos: darwin
goarch: arm64
pkg: github.com/expr-lang/expr/vm/runtime
                                         │  master.txt  │           switch-case.txt            │
                                         │    sec/op    │    sec/op     vs base                │
Equal/int_==_int-8                          2.081n ± 2%    2.119n ± 1%   +1.83% (p=0.003 n=10)
Equal/int_!=_int-8                          2.149n ± 1%    2.151n ± 0%        ~ (p=0.592 n=10)
Equal/int_==_int8-8                         2.185n ± 1%    2.201n ± 0%   +0.76% (p=0.003 n=10)
Equal/int_==_int16-8                        2.198n ± 1%    2.224n ± 1%   +1.18% (p=0.001 n=10)
Equal/int_==_int32-8                        2.179n ± 0%    2.209n ± 1%   +1.42% (p=0.000 n=10)
Equal/int_==_int64-8                        2.171n ± 1%    2.206n ± 1%   +1.59% (p=0.000 n=10)
Equal/float_==_float-8                      2.153n ± 1%    2.172n ± 1%   +0.91% (p=0.022 n=10)
Equal/float_!=_float-8                      2.162n ± 1%    2.174n ± 1%   +0.53% (p=0.027 n=10)
Equal/float_==_int-8                        2.266n ± 0%    2.317n ± 1%   +2.25% (p=0.000 n=10)
Equal/float_!=_int-8                        2.282n ± 0%    2.317n ± 0%   +1.58% (p=0.000 n=10)
Equal/string_==_string-8                    4.031n ± 0%    4.032n ± 0%        ~ (p=0.956 n=10)
Equal/string_!=_string-8                    4.308n ± 0%    4.285n ± 0%   -0.53% (p=0.002 n=10)
Equal/bool_==_bool-8                        2.253n ± 1%    2.168n ± 2%   -3.75% (p=0.000 n=10)
Equal/bool_!=_bool-8                        2.255n ± 0%    2.152n ± 1%   -4.57% (p=0.000 n=10)
Equal/[]any_==_[]int-8                      7.353n ± 1%   12.950n ± 1%  +76.13% (p=0.000 n=10)
Equal/[]any_!=_[]int-8                      7.368n ± 0%   12.920n ± 1%  +75.36% (p=0.000 n=10)
Equal/deep_[]any_==_[]any-8                103.90n ± 0%    83.84n ± 0%  -19.31% (p=0.000 n=10)
Equal/deep_[]any_!=_[]any-8                104.55n ± 1%    60.29n ± 1%  -42.34% (p=0.000 n=10)
Equal/map[string]any_==_map[string]any-8    219.1n ± 0%    214.3n ± 1%   -2.19% (p=0.000 n=10)
Equal/map[string]any_!=_map[string]any-8    56.47n ± 1%    53.26n ± 1%   -5.69% (p=0.000 n=10)
geomean                                     5.752n         5.842n        +1.57%

And extended switch-case vs separate function for arrays:

❯ benchstat switch-case.txt separate-func.txt                                                                                                        
goos: darwin
goarch: arm64
pkg: github.com/expr-lang/expr/vm/runtime
                                         │ switch-case.txt │          separate-func.txt          │
                                         │     sec/op      │   sec/op     vs base                │
Equal/int_==_int-8                             2.119n ± 1%   2.093n ± 1%        ~ (p=0.085 n=10)
Equal/int_!=_int-8                             2.151n ± 0%   2.127n ± 0%   -1.14% (p=0.000 n=10)
Equal/int_==_int8-8                            2.201n ± 0%   2.194n ± 1%        ~ (p=0.128 n=10)
Equal/int_==_int16-8                           2.224n ± 1%   2.212n ± 1%        ~ (p=0.171 n=10)
Equal/int_==_int32-8                           2.209n ± 1%   2.191n ± 1%   -0.84% (p=0.024 n=10)
Equal/int_==_int64-8                           2.206n ± 1%   2.189n ± 1%   -0.75% (p=0.008 n=10)
Equal/float_==_float-8                         2.172n ± 1%   2.172n ± 0%        ~ (p=1.000 n=10)
Equal/float_!=_float-8                         2.174n ± 1%   2.183n ± 0%   +0.44% (p=0.014 n=10)
Equal/float_==_int-8                           2.317n ± 1%   2.282n ± 1%   -1.49% (p=0.000 n=10)
Equal/float_!=_int-8                           2.317n ± 0%   2.277n ± 1%   -1.75% (p=0.000 n=10)
Equal/string_==_string-8                       4.032n ± 0%   4.024n ± 0%        ~ (p=0.324 n=10)
Equal/string_!=_string-8                       4.285n ± 0%   4.288n ± 0%        ~ (p=0.590 n=10)
Equal/bool_==_bool-8                           2.168n ± 2%   2.164n ± 1%        ~ (p=0.077 n=10)
Equal/bool_!=_bool-8                           2.152n ± 1%   2.162n ± 2%        ~ (p=0.304 n=10)
Equal/[]any_==_[]int-8                         12.95n ± 1%   16.38n ± 1%  +26.45% (p=0.000 n=10)
Equal/[]any_!=_[]int-8                         12.92n ± 1%   16.41n ± 1%  +26.97% (p=0.000 n=10)
Equal/deep_[]any_==_[]any-8                    83.84n ± 0%   98.89n ± 1%  +17.96% (p=0.000 n=10)
Equal/deep_[]any_!=_[]any-8                    60.29n ± 1%   74.37n ± 1%  +23.36% (p=0.000 n=10)
Equal/map[string]any_==_map[string]any-8       214.3n ± 1%   217.8n ± 0%   +1.66% (p=0.000 n=10)
Equal/map[string]any_!=_map[string]any-8       53.26n ± 1%   56.91n ± 1%   +6.86% (p=0.000 n=10)
geomean                                        5.842n        6.098n        +4.37%

The difference is just a tiny fraction of a nanosecond. I think we can say they're pretty much the same. What do you think?

antonmedv · 2024-03-05T07:52:06Z

I percentage on ns is definitely not major. Probably is an artifact of benchmarking. Let's just choose the more readable solution.

bizywizy · 2024-03-05T16:17:35Z

Well, IMO this version is better

Compare any arrays

f1b68c8

Add bench and tests for runtime.Equal

ef57900

bizywizy force-pushed the compare-arrays branch from 7b4e393 to ef57900 Compare March 5, 2024 16:02

antonmedv merged commit ef57900 into expr-lang:master Apr 13, 2024
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compare `any` arrays #582

Compare `any` arrays #582

bizywizy commented Feb 27, 2024

antonmedv commented Feb 27, 2024

bizywizy commented Feb 28, 2024

antonmedv commented Feb 28, 2024

bizywizy commented Mar 4, 2024

antonmedv commented Mar 4, 2024

antonmedv commented Mar 4, 2024

bizywizy commented Mar 4, 2024

bizywizy commented Mar 4, 2024

antonmedv commented Mar 5, 2024

bizywizy commented Mar 5, 2024

Compare any arrays #582

Compare any arrays #582

Conversation

bizywizy commented Feb 27, 2024

antonmedv commented Feb 27, 2024

bizywizy commented Feb 28, 2024

antonmedv commented Feb 28, 2024

bizywizy commented Mar 4, 2024

antonmedv commented Mar 4, 2024

antonmedv commented Mar 4, 2024

bizywizy commented Mar 4, 2024

bizywizy commented Mar 4, 2024

antonmedv commented Mar 5, 2024

bizywizy commented Mar 5, 2024

Compare `any` arrays #582

Compare `any` arrays #582