Add pthread support for Emscripten build. #842

Open
aleph1 opened this issue Nov 22, 2024 · 17 comments

Comments

@aleph1

aleph1 commented Nov 22, 2024

I opened a pull request [#841] before seeing the note regarding the preference for issues.

The current CMakeLists.txt doesn’t include the required flags for Emscripten’s pthread support, which causes the built .wasm file to fail the MultithreadingTest when run in a browser (tested in Chrome, Safari, and Firefox for MacOS). Adding these flags results in all tests passing when running in a browser environment.

@erincatto
Owner

erincatto commented Nov 29, 2024

Are you trying to build the unit tests in WASM? Did you actually get performance gains compiling enkiTS in WASM?

This does not appear to be a supported platform for enkiTS. https://github.com/dougbinks/enkiTS

I don't think it makes sense for me to try to support this without a github action that verifies it builds and runs correctly. You might also work with the enkiTS team so they run tests in WASM. I don't see such a target:

https://github.com/dougbinks/enkiTS/blob/master/.github/workflows/build.yml

I would like to support WASM better, but it is a difficult platform to work with.

@aleph1
Author

aleph1 commented Nov 29, 2024

Yes, I was trying to build the unit tests in WASM as the first step in writing WASM bindings for this library (similar to https://github.com/Birch-san/box2d-wasm for Box2D 2.x). Despite enkiTS not specifying it is WASM compatible, it appears to be with the changes I made to the Emscripten build script in [#841], as the unit tests run successfully in most modern browsers with the change. The change to the build script enables pthread support https://emscripten.org/docs/porting/pthreads.html, which is all that seems to be necessary to get enki’s TaskScheduler working as expected in WASM. However, I aim to dig deeper into this as I look at writing bindings for the WASM version.

One more caveat: Emscripten’s pthread implementation requires SharedArrayBuffer https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer in a browser context, and browsers restrict SharedArrayBuffer because of Spectre-style timing attacks. So, to load the WASM in a browser, the page embedding the code has to be served by a web server that sends the correct cross-origin isolation headers (Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy). I have a working version of this using NPM and Express [https://github.com/expressjs/express]. Would it be helpful to provide a repo with this setup?

The following is the console output from Chrome Version 131.0.6778.86 (Official Build) (arm64) for MacOS both before and after the changes to the Emscripten build process enabling pthread support. The MultithreadingTest that was failing before this change now passes, which is why I am assuming enkiTS’s TaskScheduler is successfully compiled to WASM.

Before (failing):

test.js:546 Aborted(native code called abort())
test.js:564 Uncaught (in promise) RuntimeError: Aborted(native code called abort())
at abort (http://localhost:3000/js/test.js:564:11)
at __abort_js (http://localhost:3000/js/test.js:1060:7)
at test.wasm.abort (http://localhost:3000/js/test.wasm:wasm-function[1175]:0xdb538)
at test.wasm.std::__2::__throw_system_error(int, char const*) (http://localhost:3000/js/test.wasm:wasm-function[1180]:0xdb60f)
at test.wasm.std::__2::thread::thread<void (&)(enki::ThreadArgs const&), enki::ThreadArgs, void>(void (&)(enki::ThreadArgs const&), enki::ThreadArgs&&) (http://localhost:3000/js/test.wasm:wasm-function[885]:0xca7b9)
at test.wasm.enki::TaskScheduler::StartThreads() (http://localhost:3000/js/test.wasm:wasm-function[877]:0xc9c89)
at test.wasm.enki::TaskScheduler::Initialize(enki::TaskSchedulerConfig) (http://localhost:3000/js/test.wasm:wasm-function[967]:0xd0794)
at test.wasm.enkiInitTaskSchedulerWithConfig (http://localhost:3000/js/test.wasm:wasm-function[1044]:0xd3a59)
at test.wasm.TiltedStacks (http://localhost:3000/js/test.wasm:wasm-function[21]:0x3bcc)
at test.wasm.MultithreadingTest (http://localhost:3000/js/test.wasm:wasm-function[19]:0x2321)

After (passing):

test.js:2077 Starting Box2D unit tests
test.js:2077 ======================================
test.js:2077 test passed: BitSetTest
test.js:2077 subtest passed: AABBTest
test.js:2077 test passed: CollisionTest
test.js:2659 run complete
test.js:2660 Object
test.js:2659 run complete
test.js:2660 Object
test.js:2659 run complete
test.js:2660 Object
test.js:2077 subtest passed: MultithreadingTest
test.js:2077 step = 281, hash = 0x7efc22e7
test.js:2077 subtest passed: CrossPlatformTest
test.js:2077 test passed: DeterminismTest
test.js:2077 subtest passed: SegmentDistanceTest
test.js:2077 subtest passed: ShapeDistanceTest
test.js:2077 subtest passed: ShapeCastTest
test.js:2077 subtest passed: TimeOfImpactTest
test.js:2077 test passed: DistanceTest
test.js:2077 test passed: IdTest
test.js:2077 test passed: MathTest
test.js:2077 subtest passed: ShapeMassTest
test.js:2077 subtest passed: ShapeAABBTest
test.js:2077 subtest passed: PointInShapeTest
test.js:2077 subtest passed: RayCastShapeTest
test.js:2077 test passed: ShapeTest
test.js:2077 set: count = 50086, b2ContainsKey = 0.00000 ms, ave = 0.00000 us
test.js:2077 item count = 50086, probe count = 15540, ave probe count 0.31
test.js:2077 test passed: TableTest
test.js:2077 subtest passed: TestForAmy
test.js:2077 subtest passed: HelloWorld
test.js:2077 subtest passed: EmptyWorld
test.js:2077 subtest passed: DestroyAllBodiesWorld
test.js:2077 subtest passed: TestIsValid
test.js:2077 subtest passed: TestWorldRecycle
test.js:2077 test passed: WorldTest
test.js:2077 ======================================
test.js:2077 All Box2D tests passed!

@erincatto
Owner

I think it makes sense to run the unit tests on WASM. I would like to know if building Box2D for WASM with enkiTS yields better performance than single threaded. If not, I could simply add a cmake option to exclude enkiTS from the unit test build.

It is also not clear to me if enkiTS is a good choice of task scheduler for WASM. Maybe there are other options that work better.

Maybe @dougbinks has thoughts on this.

@dougbinks

@erincatto I'm afraid I know very little about WASM. If it supports threads and atomics (as it seems to from @aleph1's tests), then there is little reason why enkiTS would not give a performance boost similar to the one it gives on a native platform.

@aleph1
Author

aleph1 commented Nov 30, 2024 via email

@erincatto
Owner

If you are able to build the unit tests on WASM, you should also be able to build the benchmark application. This allows you to specify a thread count and generates csv results.

@aleph1
Author

aleph1 commented Dec 1, 2024

I have managed to get the benchmarks compiling as WASM; however, there are two distinct issues:

  1. The logging seems suspect in that run durations are always reported as 0 (ms), and fps as inf. The same is true when running the compiled tests in Chrome and Firefox.

Log from Chrome Version 131.0.6778.86 (Official Build) (arm64) on MacOS 12.7.4 (21H1123):

Starting Box2D benchmarks
benchmark.js:2224 ======================================
benchmark.js:2224 benchmark: joint_grid, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 100 / shape 100 / contact 0 / joint 180 / stack 24192
benchmark.js:2224
benchmark.js:2224 benchmark: large_pyramid, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 211 / shape 211 / contact 590 / joint 0 / stack 99136
benchmark.js:2224
benchmark.js:2224 benchmark: many_pyramids, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 1376 / shape 1380 / contact 3625 / joint 0 / stack 609024
benchmark.js:2224
benchmark.js:2224 benchmark: rain, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 67 / shape 681 / contact 0 / joint 60 / stack 20288
benchmark.js:2224
benchmark.js:2224 benchmark: smash, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 201 / shape 201 / contact 712 / joint 0 / stack 52288
benchmark.js:2224
benchmark.js:2224 benchmark: spinner, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 501 / shape 860 / contact 80 / joint 1 / stack 130016
benchmark.js:2224
benchmark.js:2224 benchmark: tumbler, steps = 10
benchmark.js:2224 thread count: 1
benchmark.js:2224 run 0 : 0 (ms), inf (fps)
benchmark.js:2224 run 1 : 0 (ms), inf (fps)
benchmark.js:2224 run 2 : 0 (ms), inf (fps)
benchmark.js:2224 run 3 : 0 (ms), inf (fps)
benchmark.js:2224 body 402 / shape 404 / contact 0 / joint 1 / stack 105056
benchmark.js:2224
benchmark.js:2224 ======================================
benchmark.js:2224 All Box2D benchmarks complete!

Log from Firefox 133.0 (aarch64) on MacOS 12.7.4 (21H1123):

Starting Box2D benchmarks benchmark.js:2224:16
====================================== benchmark.js:2224:16
benchmark: joint_grid, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 100 / shape 100 / contact 0 / joint 180 / stack 24192 benchmark.js:2224:16
benchmark.js:2224:16
benchmark: large_pyramid, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 211 / shape 211 / contact 590 / joint 0 / stack 99136 benchmark.js:2224:16
benchmark.js:2224:16
benchmark: many_pyramids, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 1376 / shape 1380 / contact 3625 / joint 0 / stack 609024 benchmark.js:2224:16
benchmark.js:2224:16
benchmark: rain, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 67 / shape 681 / contact 0 / joint 60 / stack 20288 benchmark.js:2224:16
benchmark.js:2224:16
benchmark: smash, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 201 / shape 201 / contact 712 / joint 0 / stack 52288 benchmark.js:2224:16
benchmark.js:2224:16
benchmark: spinner, steps = 10 benchmark.js:2224:16
thread count: 1 benchmark.js:2224:16
run 0 : 0 (ms), inf (fps) benchmark.js:2224:16
run 1 : 0 (ms), inf (fps) benchmark.js:2224:16
run 2 : 0 (ms), inf (fps) benchmark.js:2224:16
run 3 : 0 (ms), inf (fps) benchmark.js:2224:16
body 501 / shape 860 / contact 80 / joint 1 / stack 130016 benchmark.js:2224:16
benchmark.js:2224:16

  2. There seems to be a conflict between the number of threads the Emscripten-compiled WASM is allowed to use (-s PTHREAD_POOL_SIZE=number) and the code that limits the maximum number of threads in benchmark/main.c (https://github.com/erincatto/box2d/blob/main/benchmark/main.c). It would seem the GetNumberOfCores function should include a check for EMSCRIPTEN and return the thread count configured through PTHREAD_POOL_SIZE.
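As a rough illustration of the second point, the sketch below shows one way a core-count helper could branch on `__EMSCRIPTEN__`. It is not the code in benchmark/main.c: `ExampleGetNumberOfCores`, `ExampleClampThreadCount`, and the `EXAMPLE_WASM_THREAD_COUNT` macro are hypothetical names, and it assumes the build passes the same number to the compiler (as `-DEXAMPLE_WASM_THREAD_COUNT=<n>`) that it passes to the linker as `-s PTHREAD_POOL_SIZE=<n>`, since the linker setting itself is not visible to C code.

```c
#if defined( _WIN32 )
#include <windows.h>
#elif !defined( __EMSCRIPTEN__ )
#include <unistd.h>
#endif

/* Hypothetical macro (not part of Box2D). Assumed to be set by the build to the
   same value that is passed to the linker as -s PTHREAD_POOL_SIZE=<n>. */
#ifndef EXAMPLE_WASM_THREAD_COUNT
#define EXAMPLE_WASM_THREAD_COUNT 4
#endif

/* Keep a requested worker count within [1, limit]. */
static int ExampleClampThreadCount( int requested, int limit )
{
    if ( requested < 1 )
    {
        return 1;
    }
    return requested > limit ? limit : requested;
}

/* Illustrative stand-in for a GetNumberOfCores-style helper. */
static int ExampleGetNumberOfCores( int maxThreads )
{
#if defined( __EMSCRIPTEN__ )
    /* In the browser, use the size of the pre-allocated worker pool. */
    return ExampleClampThreadCount( EXAMPLE_WASM_THREAD_COUNT, maxThreads );
#elif defined( _WIN32 )
    SYSTEM_INFO info;
    GetSystemInfo( &info );
    return ExampleClampThreadCount( (int)info.dwNumberOfProcessors, maxThreads );
#else
    /* sysconf returns -1 on failure, which the clamp turns into 1. */
    long count = sysconf( _SC_NPROCESSORS_ONLN );
    return ExampleClampThreadCount( (int)count, maxThreads );
#endif
}
```

Calling `ExampleGetNumberOfCores( 8 )` would then return a value between 1 and 8 on every platform, and exactly the configured pool size (capped at 8) in an Emscripten build.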

@erincatto
Owner

I'm guessing b2Timer is not getting implemented for emscripten. You could try changing this line:

#elif defined( __linux__ ) || defined( __APPLE__ )

to:

#elif defined( __linux__ ) || defined( __APPLE__ ) || defined(__EMSCRIPTEN__)

Similar for GetNumberOfCores. I simply haven't tested this stuff on WASM.
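To make the guard concrete, here is a minimal sketch of a timer that follows this pattern. It is not Box2D's b2Timer: the `Example*` names are hypothetical, the `start_sec` / `start_usec` fields mirror the Linux/Apple branch of the struct in include/box2d/base.h, and using `gettimeofday` is an assumption based on those field names. The fallback branch shows why a platform that matches no guard would report 0 ms for every run.

```c
#include <stddef.h>

#if defined( __linux__ ) || defined( __APPLE__ ) || defined( __EMSCRIPTEN__ )
#include <sys/time.h>

/* Start time split into seconds and microseconds. */
typedef struct ExampleTimer
{
    unsigned long long start_sec;
    unsigned long long start_usec;
} ExampleTimer;

static ExampleTimer ExampleCreateTimer( void )
{
    struct timeval now;
    gettimeofday( &now, NULL );
    ExampleTimer timer = { (unsigned long long)now.tv_sec, (unsigned long long)now.tv_usec };
    return timer;
}

/* Milliseconds elapsed since the timer was created. */
static double ExampleGetMilliseconds( const ExampleTimer* timer )
{
    struct timeval now;
    gettimeofday( &now, NULL );
    /* Convert to double before subtracting so a smaller tv_usec cannot wrap around. */
    double seconds = (double)now.tv_sec - (double)timer->start_sec;
    double micros = (double)now.tv_usec - (double)timer->start_usec;
    return 1000.0 * seconds + 0.001 * micros;
}

#else

/* No clock available under this guard: every measurement reads as zero. */
typedef struct ExampleTimer
{
    int unused;
} ExampleTimer;

static ExampleTimer ExampleCreateTimer( void )
{
    ExampleTimer timer = { 0 };
    return timer;
}

static double ExampleGetMilliseconds( const ExampleTimer* timer )
{
    (void)timer;
    return 0.0;
}

#endif
```

Without `__EMSCRIPTEN__` in the first condition, an Emscripten build of this sketch would take the fallback path, which is consistent with the `0 (ms), inf (fps)` lines in the benchmark logs.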

@aleph1
Author

aleph1 commented Dec 2, 2024

Thanks! I will try both of these and report back.

@aleph1
Author

aleph1 commented Dec 4, 2024

Making that change and modifying the following to include || defined(__EMSCRIPTEN__) seems to result in things running as expected in WASM in the browser:

box2d/include/box2d/base.h

Lines 118 to 120 in 2c939c2

#elif defined( __linux__ ) || defined( __APPLE__ )
unsigned long long start_sec;
unsigned long long start_usec;

Are these changes something you would be willing to commit if I update the pull request or would I need to maintain a fork?

@erincatto
Owner

I can add your changes to cmake, but I will leave a note that they are not supported until I have an emscripten build in GitHub actions. The current build script is here: https://github.com/erincatto/box2d/blob/main/.github/workflows/build.yml

Did you run the benchmarks again? Are you seeing performance improve with threading?

@aleph1
Author

aleph1 commented Dec 4, 2024

I am getting improved results. I temporarily hardcoded a value for GetNumberOfCores when EMSCRIPTEN is defined, as I don’t think sysconf can be relied upon; the Emscripten setting -s PTHREAD_POOL_SIZE=number should probably be used instead.

The following is the log from Chrome with the WASM compiled with support for 4 threads, running on my M1 MacBook Pro. The only result that seems strange is the benchmark for smash, as it still reports 0 for ms and inf for fps.

Starting Box2D benchmarks
benchmark.js:2244 ======================================
benchmark.js:2244 benchmark: joint_grid, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 13 (ms), 769.231 (fps)
benchmark.js:2244 run 1 : 13 (ms), 769.231 (fps)
benchmark.js:2244 run 2 : 13 (ms), 769.231 (fps)
benchmark.js:2244 run 3 : 13 (ms), 769.231 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 1 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 2 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 3 : 8 (ms), 1250 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 1 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 2 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 3 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 1 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 2 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 3 : 5 (ms), 2000 (fps)
benchmark.js:2244 body 100 / shape 100 / contact 0 / joint 180 / stack 24192
benchmark.js:2244
benchmark.js:2244 benchmark: large_pyramid, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 106 (ms), 94.3396 (fps)
benchmark.js:2244 run 1 : 105 (ms), 95.2381 (fps)
benchmark.js:2244 run 2 : 104 (ms), 96.1538 (fps)
benchmark.js:2244 run 3 : 105 (ms), 95.2381 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 61 (ms), 163.934 (fps)
benchmark.js:2244 run 1 : 60 (ms), 166.667 (fps)
benchmark.js:2244 run 2 : 60 (ms), 166.667 (fps)
benchmark.js:2244 run 3 : 61 (ms), 163.934 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 57 (ms), 175.439 (fps)
benchmark.js:2244 run 1 : 52 (ms), 192.308 (fps)
benchmark.js:2244 run 2 : 49 (ms), 204.082 (fps)
benchmark.js:2244 run 3 : 50 (ms), 200 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 40 (ms), 250 (fps)
benchmark.js:2244 run 1 : 44 (ms), 227.273 (fps)
benchmark.js:2244 run 2 : 47 (ms), 212.766 (fps)
benchmark.js:2244 run 3 : 49 (ms), 204.082 (fps)
benchmark.js:2244 body 211 / shape 211 / contact 590 / joint 0 / stack 99136
benchmark.js:2244
benchmark.js:2244 benchmark: many_pyramids, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 607 (ms), 16.4745 (fps)
benchmark.js:2244 run 1 : 626 (ms), 15.9744 (fps)
benchmark.js:2244 run 2 : 612 (ms), 16.3399 (fps)
benchmark.js:2244 run 3 : 616 (ms), 16.2338 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 325 (ms), 30.7692 (fps)
benchmark.js:2244 run 1 : 327 (ms), 30.581 (fps)
benchmark.js:2244 run 2 : 326 (ms), 30.6748 (fps)
benchmark.js:2244 run 3 : 327 (ms), 30.581 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 235 (ms), 42.5532 (fps)
benchmark.js:2244 run 1 : 250 (ms), 40 (fps)
benchmark.js:2244 run 2 : 262 (ms), 38.1679 (fps)
benchmark.js:2244 run 3 : 263 (ms), 38.0228 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 231 (ms), 43.29 (fps)
benchmark.js:2244 run 1 : 242 (ms), 41.3223 (fps)
benchmark.js:2244 run 2 : 239 (ms), 41.841 (fps)
benchmark.js:2244 run 3 : 246 (ms), 40.6504 (fps)
benchmark.js:2244 body 1376 / shape 1380 / contact 3625 / joint 0 / stack 609024
benchmark.js:2244
benchmark.js:2244 benchmark: rain, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 13 (ms), 769.231 (fps)
benchmark.js:2244 run 1 : 12 (ms), 833.333 (fps)
benchmark.js:2244 run 2 : 10 (ms), 1000 (fps)
benchmark.js:2244 run 3 : 10 (ms), 1000 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 1 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 2 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 3 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 1 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 2 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 3 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 1 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 2 : 7 (ms), 1428.57 (fps)
benchmark.js:2244 run 3 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 body 67 / shape 681 / contact 0 / joint 60 / stack 20288
benchmark.js:2244
benchmark.js:2244 benchmark: smash, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 0 (ms), inf (fps)
benchmark.js:2244 run 1 : 0 (ms), inf (fps)
benchmark.js:2244 run 2 : 1 (ms), 10000 (fps)
benchmark.js:2244 run 3 : 0 (ms), inf (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 1 (ms), 10000 (fps)
benchmark.js:2244 run 1 : 0 (ms), inf (fps)
benchmark.js:2244 run 2 : 0 (ms), inf (fps)
benchmark.js:2244 run 3 : 0 (ms), inf (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 0 (ms), inf (fps)
benchmark.js:2244 run 1 : 0 (ms), inf (fps)
benchmark.js:2244 run 2 : 0 (ms), inf (fps)
benchmark.js:2244 run 3 : 0 (ms), inf (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 0 (ms), inf (fps)
benchmark.js:2244 run 1 : 0 (ms), inf (fps)
benchmark.js:2244 run 2 : 1 (ms), 10000 (fps)
benchmark.js:2244 run 3 : 0 (ms), inf (fps)
benchmark.js:2244 body 201 / shape 201 / contact 712 / joint 0 / stack 52288
benchmark.js:2244
benchmark.js:2244 benchmark: spinner, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 21 (ms), 476.19 (fps)
benchmark.js:2244 run 1 : 21 (ms), 476.19 (fps)
benchmark.js:2244 run 2 : 20 (ms), 500 (fps)
benchmark.js:2244 run 3 : 20 (ms), 500 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 17 (ms), 588.235 (fps)
benchmark.js:2244 run 1 : 18 (ms), 555.556 (fps)
benchmark.js:2244 run 2 : 17 (ms), 588.235 (fps)
benchmark.js:2244 run 3 : 17 (ms), 588.235 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 17 (ms), 588.235 (fps)
benchmark.js:2244 run 1 : 16 (ms), 625 (fps)
benchmark.js:2244 run 2 : 15 (ms), 666.667 (fps)
benchmark.js:2244 run 3 : 15 (ms), 666.667 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 15 (ms), 666.667 (fps)
benchmark.js:2244 run 1 : 15 (ms), 666.667 (fps)
benchmark.js:2244 run 2 : 16 (ms), 625 (fps)
benchmark.js:2244 run 3 : 16 (ms), 625 (fps)
benchmark.js:2244 body 501 / shape 860 / contact 80 / joint 1 / stack 130016
benchmark.js:2244
benchmark.js:2244 benchmark: tumbler, steps = 10
benchmark.js:2244 thread count: 1
benchmark.js:2244 run 0 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 1 : 8 (ms), 1250 (fps)
benchmark.js:2244 run 2 : 9 (ms), 1111.11 (fps)
benchmark.js:2244 run 3 : 8 (ms), 1250 (fps)
benchmark.js:2244 thread count: 2
benchmark.js:2244 run 0 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 1 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 2 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 run 3 : 6 (ms), 1666.67 (fps)
benchmark.js:2244 thread count: 3
benchmark.js:2244 run 0 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 1 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 2 : 4 (ms), 2500 (fps)
benchmark.js:2244 run 3 : 4 (ms), 2500 (fps)
benchmark.js:2244 thread count: 4
benchmark.js:2244 run 0 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 1 : 4 (ms), 2500 (fps)
benchmark.js:2244 run 2 : 5 (ms), 2000 (fps)
benchmark.js:2244 run 3 : 5 (ms), 2000 (fps)
benchmark.js:2244 body 402 / shape 404 / contact 0 / joint 1 / stack 105056
benchmark.js:2244
benchmark.js:2244 ======================================
benchmark.js:2244 All Box2D benchmarks complete!

@erincatto
Owner

erincatto commented Dec 4, 2024

Ah, it thinks you are running debug. Can you make a release build?

#ifdef NDEBUG
		int stepCount = benchmarks[benchmarkIndex].totalStepCount;
#else
		int stepCount = 10;
#endif

This also affects the benchmark content:

	int columns = BENCHMARK_DEBUG ? 20 : 120;
	int rows = BENCHMARK_DEBUG ? 10 : 80;

Maybe this will fix the Smash benchmark.

@aleph1
Author

aleph1 commented Dec 4, 2024

I managed to compile a release build, but in order to do so I had to add another Emscripten setting, -s ALLOW_MEMORY_GROWTH; otherwise an error (abortOnCannotGrowMemory) is thrown while attempting to run the first test. My CMakeLists.txt now includes this for Emscripten:

if(EMSCRIPTEN)
    set(EMSCRIPTEN_PTHREADS_COMPILER_FLAGS "-pthread -s USE_PTHREADS=1")
    set(EMSCRIPTEN_PTHREADS_LINKER_FLAGS "${EMSCRIPTEN_PTHREADS_COMPILER_FLAGS} -s ALLOW_MEMORY_GROWTH")
    string(APPEND CMAKE_C_FLAGS " ${EMSCRIPTEN_PTHREADS_COMPILER_FLAGS}")
    string(APPEND CMAKE_CXX_FLAGS " ${EMSCRIPTEN_PTHREADS_COMPILER_FLAGS}")
    string(APPEND CMAKE_EXE_LINKER_FLAGS " ${EMSCRIPTEN_PTHREADS_LINKER_FLAGS}")
endif()

There is a significant improvement in performance when more than a single thread is used. The following log is from Chrome version 131.0.6778.86 running on an M1 MacBook Pro (17,1) with 8 cores (4 performance and 4 efficiency) and 16 GB of RAM:

Starting Box2D benchmarks
benchmark.js:1 ======================================
benchmark.js:1 benchmark: joint_grid, steps = 500
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 6812 (ms), 73.3999 (fps)
benchmark.js:1 run 1 : 6779 (ms), 73.7572 (fps)
benchmark.js:1 run 2 : 6750 (ms), 74.0741 (fps)
benchmark.js:1 run 3 : 6752 (ms), 74.0521 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 4013 (ms), 124.595 (fps)
benchmark.js:1 run 1 : 4006 (ms), 124.813 (fps)
benchmark.js:1 run 2 : 4008 (ms), 124.75 (fps)
benchmark.js:1 run 3 : 4010 (ms), 124.688 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 2707 (ms), 184.706 (fps)
benchmark.js:1 run 1 : 2698 (ms), 185.322 (fps)
benchmark.js:1 run 2 : 2698 (ms), 185.322 (fps)
benchmark.js:1 run 3 : 2697 (ms), 185.391 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 2207 (ms), 226.552 (fps)
benchmark.js:1 run 1 : 2198 (ms), 227.48 (fps)
benchmark.js:1 run 2 : 2200 (ms), 227.273 (fps)
benchmark.js:1 run 3 : 2200 (ms), 227.273 (fps)
benchmark.js:1 body 10000 / shape 10000 / contact 0 / joint 19800 / stack 2598208
benchmark.js:1
benchmark.js:1 benchmark: large_pyramid, steps = 500
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 4844 (ms), 103.22 (fps)
benchmark.js:1 run 1 : 4853 (ms), 103.029 (fps)
benchmark.js:1 run 2 : 4848 (ms), 103.135 (fps)
benchmark.js:1 run 3 : 4842 (ms), 103.263 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 2558 (ms), 195.465 (fps)
benchmark.js:1 run 1 : 2560 (ms), 195.312 (fps)
benchmark.js:1 run 2 : 2557 (ms), 195.542 (fps)
benchmark.js:1 run 3 : 2560 (ms), 195.312 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 1844 (ms), 271.15 (fps)
benchmark.js:1 run 1 : 1838 (ms), 272.035 (fps)
benchmark.js:1 run 2 : 1840 (ms), 271.739 (fps)
benchmark.js:1 run 3 : 1839 (ms), 271.887 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 1452 (ms), 344.353 (fps)
benchmark.js:1 run 1 : 1436 (ms), 348.189 (fps)
benchmark.js:1 run 2 : 1435 (ms), 348.432 (fps)
benchmark.js:1 run 3 : 1437 (ms), 347.947 (fps)
benchmark.js:1 body 5051 / shape 5051 / contact 14950 / joint 0 / stack 2511616
benchmark.js:1
benchmark.js:1 benchmark: many_pyramids, steps = 200
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 7498 (ms), 26.6738 (fps)
benchmark.js:1 run 1 : 7505 (ms), 26.6489 (fps)
benchmark.js:1 run 2 : 7494 (ms), 26.688 (fps)
benchmark.js:1 run 3 : 7500 (ms), 26.6667 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 3930 (ms), 50.8906 (fps)
benchmark.js:1 run 1 : 3931 (ms), 50.8776 (fps)
benchmark.js:1 run 2 : 3933 (ms), 50.8518 (fps)
benchmark.js:1 run 3 : 3941 (ms), 50.7485 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 2812 (ms), 71.1238 (fps)
benchmark.js:1 run 1 : 2827 (ms), 70.7464 (fps)
benchmark.js:1 run 2 : 2810 (ms), 71.1744 (fps)
benchmark.js:1 run 3 : 2812 (ms), 71.1238 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 2199 (ms), 90.9504 (fps)
benchmark.js:1 run 1 : 2195 (ms), 91.1162 (fps)
benchmark.js:1 run 2 : 2194 (ms), 91.1577 (fps)
benchmark.js:1 run 3 : 2194 (ms), 91.1577 (fps)
benchmark.js:1 body 22001 / shape 22020 / contact 58000 / joint 0 / stack 9744000
benchmark.js:1
benchmark.js:1 benchmark: rain, steps = 1000
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 16712 (ms), 59.8372 (fps)
benchmark.js:1 run 1 : 16700 (ms), 59.8802 (fps)
benchmark.js:1 run 2 : 16712 (ms), 59.8372 (fps)
benchmark.js:1 run 3 : 16805 (ms), 59.5061 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 9840 (ms), 101.626 (fps)
benchmark.js:1 run 1 : 9839 (ms), 101.636 (fps)
benchmark.js:1 run 2 : 9829 (ms), 101.74 (fps)
benchmark.js:1 run 3 : 9843 (ms), 101.595 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 7783 (ms), 128.485 (fps)
benchmark.js:1 run 1 : 7764 (ms), 128.8 (fps)
benchmark.js:1 run 2 : 7763 (ms), 128.816 (fps)
benchmark.js:1 run 3 : 7773 (ms), 128.65 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 6332 (ms), 157.928 (fps)
benchmark.js:1 run 1 : 6326 (ms), 158.078 (fps)
benchmark.js:1 run 2 : 6334 (ms), 157.878 (fps)
benchmark.js:1 run 3 : 6350 (ms), 157.48 (fps)
benchmark.js:1 body 11001 / shape 15505 / contact 19903 / joint 10000 / stack 2522016
benchmark.js:1
benchmark.js:1 benchmark: smash, steps = 300
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 3644 (ms), 82.3271 (fps)
benchmark.js:1 run 1 : 3639 (ms), 82.4402 (fps)
benchmark.js:1 run 2 : 3643 (ms), 82.3497 (fps)
benchmark.js:1 run 3 : 3645 (ms), 82.3045 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 2279 (ms), 131.637 (fps)
benchmark.js:1 run 1 : 2275 (ms), 131.868 (fps)
benchmark.js:1 run 2 : 2277 (ms), 131.752 (fps)
benchmark.js:1 run 3 : 2282 (ms), 131.464 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 1796 (ms), 167.038 (fps)
benchmark.js:1 run 1 : 1790 (ms), 167.598 (fps)
benchmark.js:1 run 2 : 1794 (ms), 167.224 (fps)
benchmark.js:1 run 3 : 1793 (ms), 167.317 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 1510 (ms), 198.675 (fps)
benchmark.js:1 run 1 : 1504 (ms), 199.468 (fps)
benchmark.js:1 run 2 : 1496 (ms), 200.535 (fps)
benchmark.js:1 run 3 : 1493 (ms), 200.938 (fps)
benchmark.js:1 body 9601 / shape 9601 / contact 7487 / joint 0 / stack 6924128
benchmark.js:1
benchmark.js:1 benchmark: spinner, steps = 1400
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 9329 (ms), 150.07 (fps)
benchmark.js:1 run 1 : 9329 (ms), 150.07 (fps)
benchmark.js:1 run 2 : 9339 (ms), 149.909 (fps)
benchmark.js:1 run 3 : 9322 (ms), 150.182 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 5721 (ms), 244.712 (fps)
benchmark.js:1 run 1 : 5722 (ms), 244.67 (fps)
benchmark.js:1 run 2 : 5719 (ms), 244.798 (fps)
benchmark.js:1 run 3 : 5721 (ms), 244.712 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 4415 (ms), 317.101 (fps)
benchmark.js:1 run 1 : 4425 (ms), 316.384 (fps)
benchmark.js:1 run 2 : 4421 (ms), 316.67 (fps)
benchmark.js:1 run 3 : 4415 (ms), 317.101 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 3661 (ms), 382.409 (fps)
benchmark.js:1 run 1 : 3641 (ms), 384.51 (fps)
benchmark.js:1 run 2 : 3642 (ms), 384.404 (fps)
benchmark.js:1 run 3 : 3638 (ms), 384.827 (fps)
benchmark.js:1 body 3040 / shape 3399 / contact 9377 / joint 1 / stack 1746208
benchmark.js:1
benchmark.js:1 benchmark: tumbler, steps = 750
benchmark.js:1 thread count: 1
benchmark.js:1 run 0 : 4454 (ms), 168.388 (fps)
benchmark.js:1 run 1 : 4449 (ms), 168.577 (fps)
benchmark.js:1 run 2 : 4471 (ms), 167.748 (fps)
benchmark.js:1 run 3 : 4476 (ms), 167.56 (fps)
benchmark.js:1 thread count: 2
benchmark.js:1 run 0 : 2693 (ms), 278.5 (fps)
benchmark.js:1 run 1 : 2660 (ms), 281.955 (fps)
benchmark.js:1 run 2 : 2663 (ms), 281.637 (fps)
benchmark.js:1 run 3 : 2662 (ms), 281.743 (fps)
benchmark.js:1 thread count: 3
benchmark.js:1 run 0 : 2121 (ms), 353.607 (fps)
benchmark.js:1 run 1 : 2115 (ms), 354.61 (fps)
benchmark.js:1 run 2 : 2110 (ms), 355.45 (fps)
benchmark.js:1 run 3 : 2120 (ms), 353.774 (fps)
benchmark.js:1 thread count: 4
benchmark.js:1 run 0 : 1724 (ms), 435.035 (fps)
benchmark.js:1 run 1 : 1715 (ms), 437.318 (fps)
benchmark.js:1 run 2 : 1725 (ms), 434.783 (fps)
benchmark.js:1 run 3 : 1712 (ms), 438.084 (fps)
benchmark.js:1 body 2027 / shape 2029 / contact 11001 / joint 1 / stack 2215584
benchmark.js:1
benchmark.js:1 ======================================
benchmark.js:1 All Box2D benchmarks complete!
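To put a number on the improvement, the timings in the log can be turned into a speedup and a parallel efficiency. The helpers below are a small sketch with hypothetical names, not part of the benchmark application. The fps formula is inferred from the log, where the reported values match 1000 * steps / ms.

```c
/* Frames per second for a run of `steps` simulation steps that took `ms` milliseconds.
   With IEEE 754 arithmetic, a 0 ms measurement produces inf, as in the debug logs. */
static double ExampleFramesPerSecond( int steps, double ms )
{
    return 1000.0 * (double)steps / ms;
}

/* How many times faster the multithreaded run is than the single-threaded run. */
static double ExampleSpeedup( double singleThreadMs, double multiThreadMs )
{
    return singleThreadMs / multiThreadMs;
}

/* Fraction of ideal linear scaling achieved: 1.0 means N threads run N times faster. */
static double ExampleEfficiency( double speedup, int threadCount )
{
    return speedup / (double)threadCount;
}
```

Taking run 0 of joint_grid from the log above (6812 ms with 1 thread, 2207 ms with 4 threads), `ExampleSpeedup( 6812.0, 2207.0 )` gives a speedup of about 3.09, and `ExampleEfficiency` of that value with 4 threads is about 0.77.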

@erincatto
Owner

Very exciting! I definitely want to support this. For comparison here are the M2 results.
https://box2d.org/files/benchmark_results.html

@aleph1
Author

aleph1 commented Dec 4, 2024 via email

@erincatto
Owner

erincatto commented Dec 4, 2024

Yeah, something to help people get going with WASM would be great. I will add your changes to the CMake files.

I played around with emscripten a while back, but it is kind of painful to work with on Windows.
