Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check target array length before allocating #4159

Merged
merged 5 commits into from
Dec 8, 2024

Conversation

counter2015
Copy link
Contributor

The current implementation of the array inside ArrayStack does not verify the length of the new array, and this may lead to overflow problem.

private[this] def checkAndGrow(): Unit =
if (index >= buffer.length) {
val len = buffer.length
val buffer2 = new Array[AnyRef](len * 2)
System.arraycopy(buffer, 0, buffer2, 0, len)
buffer = buffer2
}

Here we add a check on

  • Int overflow
  • Array length exceed VM limit

see also: #4135

@armanbilge
Copy link
Member

Thanks for the PR! See also:

@djspiewak
Copy link
Member

Looks like we need to run prePR. But apart from that, I'd like to sanity check this with a benchmark run. This specific function is very, very easy to knock out of its JIT fastpath, and adding this type of branching very much runs that risk. Apart from that, I think this is a very good change. If we are knocking the JIT around, we might have to either experiment with different ways of encoding the branching to avoid tripping over the optimizer, or we may need to do the check in some other area.

@counter2015
Copy link
Contributor Author

Sure, I will provide a benchmark of this change later this week.

@counter2015
Copy link
Contributor Author

counter2015 commented Oct 30, 2024

I'm not sure my implementation of benchmark is right or not. Since some branch only excuted at a very large parameter, and it cant be test on old code since it will lead to overflow problem.

And I'm not familiar with how to do comparative benchmarks between versions. benchmarks/run-benchmark ArrayStackBenchmark seems doesn't work rightly, may be I missed something.

So I just do this compare manually, here is the benchmark result on my laptop.

[info] running (fork) org.openjdk.jmh.Main -i 10 -wi 10 -f 2 -t 1 cats.effect.benchmarks.ArrayStackBenchmark
[info] # JMH version: 1.37
[info] # VM version: JDK 21.0.2, OpenJDK 64-Bit Server VM, 21.0.2+13-LTS
[info] # VM options: -Dcats.effect.tracing.mode=none -Dcats.effect.tracing.exceptions.enhanced=false
[info] # Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)

# new version
Jmh / run -i 10 -wi 10 -f 2 -t 1 cats.effect.benchmarks.ArrayStackBenchmark
[info] Benchmark                              (size)   Mode  Cnt    Score   Error  Units
[info] ArrayStackBenchmark.push  1000000  thrpt   20  126.439 3.123  ops/s

# base line
Jmh / run -i 10 -wi 10 -f 2 -t 1 cats.effect.benchmarks.ArrayStackBenchmark
[info] Benchmark                              (size)   Mode  Cnt    Score   Error  Units
[info] ArrayStackBenchmark.push  1000000  thrpt   20  126.187  2.612  ops/s

This result indicates that when the size is small, the performance of the two is comparable.

I try to test on large size such as 1147483639 which is bigger than VM_MaxArraySize / 2, but it will cause OOM probelm during jmh test. it need 8G memory I guess ?

I changed javaOptions inside project benchmarks in build.sbt, but my laptop cant allocate such memory to it.

@counter2015
Copy link
Contributor Author

If there's anything that needs improvement or additional information, please let me know. I'm happy to make any necessary changes.

@djspiewak
Copy link
Member

sbt clean prePR might do the trick on the build. Scalafix and scalafmt both have some weird caching issues.

I'll run some benchmarks on this via IO when a get a chance so we can see how it impacts inlining. The ArrayStack direct results are promising.

@djspiewak
Copy link
Member

Just when I think the JVM can't surprise me anymore...

Before

[info] Benchmark                                             (cpuTokens)   (size)   Mode  Cnt     Score     Error    Units
[info] DeepBindBenchmark.async                                       N/A    10000  thrpt   10  2648.872 ±  71.889    ops/s
[info] DeepBindBenchmark.delay                                       N/A    10000  thrpt   10  7472.865 ± 263.016    ops/s
[info] DeepBindBenchmark.pure                                        N/A    10000  thrpt   10  8595.472 ± 256.137    ops/s
[info] MapCallsBenchmark.batch120                                    N/A      N/A  thrpt   10   187.446 ±   1.039    ops/s
[info] MapCallsBenchmark.batch30                                     N/A      N/A  thrpt   10    47.649 ±   1.047    ops/s
[info] MapCallsBenchmark.one                                         N/A      N/A  thrpt   10     1.597 ±   0.021    ops/s
[info] MapStreamBenchmark.batch120                                   N/A      N/A  thrpt   10  5017.096 ±  24.569    ops/s
[info] MapStreamBenchmark.batch30                                    N/A      N/A  thrpt   10  2411.243 ±  49.510    ops/s
[info] MapStreamBenchmark.one                                        N/A      N/A  thrpt   10  2883.050 ±  21.626    ops/s
[info] ParallelBenchmark.parTraverse                               10000     1000  thrpt   10   747.384 ±  37.987    ops/s
[info] ParallelBenchmark.traverse                                  10000     1000  thrpt   10    73.515 ±   1.896    ops/s
[info] ShallowBindBenchmark.async                                    N/A    10000  thrpt   10  1939.453 ±  48.402    ops/s
[info] ShallowBindBenchmark.delay                                    N/A    10000  thrpt   10  7428.462 ±  25.963    ops/s
[info] ShallowBindBenchmark.pure                                     N/A    10000  thrpt   10  8098.391 ±  53.194    ops/s
[info] WorkStealingBenchmark.alloc                                   N/A  1000000  thrpt   10    12.481 ±   0.114  ops/min
[info] WorkStealingBenchmark.manyThreadsSchedulingBenchmark          N/A  1000000  thrpt   10    28.263 ±   5.321  ops/min
[info] WorkStealingBenchmark.runnableScheduling                      N/A  1000000  thrpt   10  2503.856 ± 157.461  ops/min
[info] WorkStealingBenchmark.runnableSchedulingScalaGlobal           N/A  1000000  thrpt   10  1957.221 ±  65.798  ops/min
[info] WorkStealingBenchmark.scheduling                              N/A  1000000  thrpt   10    28.080 ±   3.298  ops/min

After

[info] Benchmark                                             (cpuTokens)   (size)   Mode  Cnt     Score     Error    Units
[info] DeepBindBenchmark.async                                       N/A    10000  thrpt   10  2668.144 ± 114.274    ops/s
[info] DeepBindBenchmark.delay                                       N/A    10000  thrpt   10  7552.944 ± 179.589    ops/s
[info] DeepBindBenchmark.pure                                        N/A    10000  thrpt   10  8691.767 ±  39.296    ops/s
[info] MapCallsBenchmark.batch120                                    N/A      N/A  thrpt   10   191.787 ±   2.813    ops/s
[info] MapCallsBenchmark.batch30                                     N/A      N/A  thrpt   10    48.432 ±   0.281    ops/s
[info] MapCallsBenchmark.one                                         N/A      N/A  thrpt   10     1.631 ±   0.036    ops/s
[info] MapStreamBenchmark.batch120                                   N/A      N/A  thrpt   10  5206.007 ± 191.124    ops/s
[info] MapStreamBenchmark.batch30                                    N/A      N/A  thrpt   10  2414.824 ±  66.890    ops/s
[info] MapStreamBenchmark.one                                        N/A      N/A  thrpt   10  2986.513 ±  16.887    ops/s
[info] ParallelBenchmark.parTraverse                               10000     1000  thrpt   10   780.594 ±   4.945    ops/s
[info] ParallelBenchmark.traverse                                  10000     1000  thrpt   10    73.811 ±   0.322    ops/s
[info] ShallowBindBenchmark.async                                    N/A    10000  thrpt   10  1980.663 ±  78.412    ops/s
[info] ShallowBindBenchmark.delay                                    N/A    10000  thrpt   10  7543.541 ± 285.638    ops/s
[info] ShallowBindBenchmark.pure                                     N/A    10000  thrpt   10  8130.712 ±  21.583    ops/s
[info] WorkStealingBenchmark.alloc                                   N/A  1000000  thrpt   10    12.626 ±   0.110  ops/min
[info] WorkStealingBenchmark.manyThreadsSchedulingBenchmark          N/A  1000000  thrpt   10    35.189 ±   4.395  ops/min
[info] WorkStealingBenchmark.runnableScheduling                      N/A  1000000  thrpt   10  2715.206 ± 115.029  ops/min
[info] WorkStealingBenchmark.runnableSchedulingScalaGlobal           N/A  1000000  thrpt   10  2034.349 ±  40.435  ops/min
[info] WorkStealingBenchmark.scheduling                              N/A  1000000  thrpt   10    28.354 ±   2.660  ops/min

@djspiewak djspiewak merged commit 9feee5c into typelevel:series/3.x Dec 8, 2024
29 of 33 checks passed
@counter2015 counter2015 deleted the patch/array-size-check branch December 8, 2024 23:05
@armanbilge armanbilge changed the title Check target array length before allocating. Check target array length before allocating Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants