Virtual Thread support #125

He-Pin · 2024-05-19T19:19:27Z

Quick Summary:
230.51 requests/s (worker threads)

vs

1545.18 requests/s (virtual threads)

Motivation:
Fix #90

I did some analysis of cask and found that it currently uses Undertow. Undertow itself uses Object monitors as locks in the XNIO codebase (http://www.jboss.org/xnio); we could contribute there to remove the monitors and migrate to ReentrantLock, which should be better than waiting for the next Java LTS (maybe Java 25).

So for now, I think we can just add virtual thread support in cask, and then either contribute upstream or migrate to another network layer, maybe Netty?
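The monitor-to-ReentrantLock migration mentioned above can be illustrated with a minimal sketch. The two counter classes are hypothetical and are not taken from XNIO or Undertow; they only show the shape of the change. On Java 21, a virtual thread that blocks while inside a synchronized block pins its carrier thread, whereas a virtual thread waiting on a ReentrantLock can unmount.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical classes for illustration only; not XNIO or Undertow code.

// Guarded by the object's monitor. On Java 21, a virtual thread that blocks
// while inside `synchronized` pins its carrier thread.
final class MonitorCounter {
  private var count = 0
  def increment(): Int = this.synchronized {
    count += 1
    count
  }
}

// Same behavior guarded by a ReentrantLock. A virtual thread waiting on the
// lock can unmount and free its carrier thread.
final class LockCounter {
  private val lock = new ReentrantLock()
  private var count = 0
  def increment(): Int = {
    lock.lock()
    try {
      count += 1
      count
    } finally lock.unlock()
  }
}
```

Both classes behave the same for callers; only the locking primitive differs, which is why such a migration can be done upstream without changing the public API.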

The current changes are quite straightforward.

Modification:

1. Update the dependencies: Mill, libraries, Scala.
2. Because I'm using Windows 11 and the file addressed in "chore: Rename page 1 #124" causes some issues there, I have to rename/delete it locally for now.
3. I fixed some hint problems (mainly inside Main.scala) to make the code look better before starting; others can follow after this PR.
4. Add VirtualThreadSupport, which only works on runtimes where virtual threads are available; otherwise it returns null.
5. Add NewThreadPerTaskExecutor, which starts a new virtual thread for every Command/Runnable; Undertow needs this.
6. Add VirtualThreadBlockingHandler, which works like BlockingHandler but delegates to the NewThreadPerTaskExecutor from 5 instead of the connection's worker executor.
7. Add a cask.virtualThread.enabled system property with expected value true|false, and use virtual threads to run routes if and only if the runtime supports them and cask.virtualThread.enabled is set to true. I'm not exposing an executor or anything similar, because that would not fit the philosophy of cask; instead the runtime behavior is hidden behind cask.virtualThread.enabled, just as I did not know that cask runs on Undertow.
8. Do some testing with ab. I changed the handler code to add Thread.sleep(1000) to simulate a slow endpoint:

  @cask.get("/")
  def hello(): String = {
    Thread.sleep(1000) // Simulate a slow endpoint
    "Hello World! Hello World! Hello World!"
  }
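The availability check and the thread-per-task executor from steps 4 and 5 can be sketched roughly as follows. This is not the code from this PR: the object name VirtualThreadSketch and its members are hypothetical, it uses plain reflection rather than method handles, and it returns an Option instead of null. The only JDK call assumed is the static Thread.startVirtualThread(Runnable), which exists on Java 21+.

```scala
import java.util.concurrent.Executor
import scala.util.Try

// Hypothetical sketch; names are illustrative and not taken from the PR.
object VirtualThreadSketch {
  // Look up Thread.startVirtualThread(Runnable) reflectively so this compiles
  // on older JDKs. The probe call makes sure the method is actually usable.
  private val startVirtualThread: Option[java.lang.reflect.Method] =
    Try {
      val m = classOf[Thread].getMethod("startVirtualThread", classOf[Runnable])
      m.invoke(null, new Runnable { def run(): Unit = () })
      m
    }.toOption

  // True only on runtimes where virtual threads can be started.
  def isAvailable: Boolean = startVirtualThread.isDefined

  // An executor that starts one new virtual thread per submitted task,
  // or None when the runtime has no virtual thread support.
  def newThreadPerTaskExecutor(): Option[Executor] =
    startVirtualThread.map { m =>
      new Executor {
        def execute(command: Runnable): Unit = {
          m.invoke(null, command)
          ()
        }
      }
    }
}
```

A caller would use `VirtualThreadSketch.newThreadPerTaskExecutor().getOrElse(workerExecutor)`, where `workerExecutor` stands for whatever executor is used today.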

Results:
For the first run, I think it works just as expected. I also asserted on the thread type with Thread.isVirtual during testing.
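A check like the Thread.isVirtual assertion mentioned above could look like the sketch below when the code must still compile on JDKs older than 21. The ThreadKind object is a hypothetical helper, not part of cask; it looks up Thread.isVirtual reflectively and reports false when the method does not exist.

```scala
import scala.util.Try

// Hypothetical helper; not part of cask.
object ThreadKind {
  // Thread.isVirtual only exists on newer JDKs, so look it up reflectively.
  private val isVirtualMethod: Option[java.lang.reflect.Method] =
    Try(classOf[Thread].getMethod("isVirtual")).toOption

  // Returns false on runtimes without virtual threads.
  def isVirtual(thread: Thread): Boolean =
    isVirtualMethod.exists(m => m.invoke(thread).asInstanceOf[Boolean])
}
```

Inside a route, `assert(ThreadKind.isVirtual(Thread.currentThread()))` would then fail whenever the handler runs on a platform thread.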

Quick Summary: 230.51 requests/s (worker threads) vs 1545.18 requests/s (virtual threads)

My laptop: i9-14900HX + 64 GB RAM

Result with connection worker threads on Java 21:

C:\Users\hepin>ab -n 10000 -c 10000 "http://localhost:8080/"
This is ApacheBench, Version 2.3 <$Revision: 1913912 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:
Server Hostname:        localhost
Server Port:            8080

Document Path:          /
Document Length:        38 bytes

Concurrency Level:      10000
Time taken for tests:   43.381 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1740000 bytes
HTML transferred:       380000 bytes
Requests per second:    230.51 [#/sec] (mean)
Time per request:       43381.358 [ms] (mean)
Time per request:       4.338 [ms] (mean, across all concurrent requests)
Transfer rate:          39.17 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.2      0      16
Processing:  1435 19619 10914.5  19634   38933
Waiting:     1011 19598 10928.7  19604   38933
Total:       1435 19619 10914.6  19634   38933

Percentage of the requests served within a certain time (ms)
  50%  19634
  66%  25433
  75%  29280
  80%  31199
  90%  35040
  95%  36965
  98%  37949
  99%  37957
 100%  38933 (longest request)

VS

Result with Virtual Threads on Java 21

C:\Users\hepin>ab -n 10000 -c 10000 "http://localhost:8080/"
This is ApacheBench, Version 2.3 <$Revision: 1913912 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:
Server Hostname:        localhost
Server Port:            8080

Document Path:          /
Document Length:        38 bytes

Concurrency Level:      10000
Time taken for tests:   6.472 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1740000 bytes
HTML transferred:       380000 bytes
Requests per second:    1545.18 [#/sec] (mean)
Time per request:       6471.734 [ms] (mean)
Time per request:       0.647 [ms] (mean, across all concurrent requests)
Transfer rate:          262.56 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.2      0      16
Processing:  1576 2120 202.6   2175    2527
Waiting:     1005 1631 364.6   1517    2279
Total:       1576 2120 202.6   2175    2527

Percentage of the requests served within a certain time (ms)
  50%   2175
  66%   2255
  75%   2306
  80%   2311
  90%   2327
  95%   2337
  98%   2341
  99%   2342
 100%   2527 (longest request)
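
The headline numbers can be re-derived from the two ab reports above. The sketch below only redoes the arithmetic; the object name AbSummary is made up for illustration. The total times in milliseconds come from the "Time per request (mean)" lines, which equal the total test time here because the concurrency level equals the number of requests.

```scala
// Re-derives the summary figures from the two ab runs above.
object AbSummary {
  val requests = 10000
  val workerTotalMs = 43381.358  // worker threads run
  val virtualTotalMs = 6471.734  // virtual threads run

  def requestsPerSecond(totalMs: Double): Double = requests / (totalMs / 1000.0)

  val workerRps: Double = requestsPerSecond(workerTotalMs)   // about 230.51
  val virtualRps: Double = requestsPerSecond(virtualTotalMs) // about 1545.18
  val speedup: Double = virtualRps / workerRps               // about 6.7
}
```

So for this synthetic Thread.sleep(1000) endpoint, the virtual thread run finished roughly 6.7 times faster than the worker thread run.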

Future Actions:

Add tests for NewThreadPerTaskExecutor, and test that routes run inside a virtual thread when everything is set up.
Add benchmarks with Gatling/ab, or run them with GitHub Actions (nightly build).
Add GitHub Actions for Java 21 and virtual threads.
Adjust the code with feedback.

lihaoyi · 2024-05-19T23:12:05Z

cask/src/cask/internal/VirtualThreadBlockingHandler.scala

+}
+
+private[cask] object VirtualThreadBlockingHandler {
+  private lazy val EXECUTOR = new NewThreadPerTaskExecutor(


Let's make this Executor live in the Main object, rather than global. That would let it be more easily configured separately for unit tests or other multi-server scenarios.

Makes sense. I was trying to keep it simple, but with virtual threads this should be some kind of concurrency limiter.

I have a question about this:
When we expose the executor as handlerExecutor, the user can just override it with Executors.newThreadPerTaskExecutor on Java 21, and then the MethodHandle approach can be avoided.

I was expecting to support virtual threads out of the box via cask.virtualThread.enabled, if and only if this system property is set to true and the application is running on Java 21.

So I think the solution should be:

1. Add handlerExecutor, just as you suggested.

2. Add a cask.virtualThread.enabled system property, and use virtual threads when the user sets it to true and is running on Java 21. Otherwise, without 2, all the MethodHandle things seem unneeded.

I was busy with other things, sorry for the delay. What's your input on 2?

Update: It's possible to wrap virtual threads on top of an executor, using a method handle too. I think that would be a nice way to go, and I will update this PR that way.

cask/src/cask/internal/NewThreadPerTaskExecutor.scala

cask/src/cask/main/Main.scala

cask/src/cask/internal/VirtualThreadSupport.scala

lihaoyi · 2024-05-19T23:28:46Z

Left some comments on the code. Thanks for looking into this!

Some high-level things to do:

Include virtual thread configuration in a dedicated example in the examples/ folder, and then similarly include it in the doc-site. That would help ensure that it remains discoverable for people who want to try it out, which I expect will be an increasing number as Java 21+ is adopted.
Embed your benchmark logic into the build system somehow, so other people can reproduce it. You can download and cache the ab binary via requests.get/os.write in Mill and then define a small ScalaModule to spin up Mill and run the benchmark with and without virtual threads. This will easily let people reproduce the benchmark or make adjustments to it as necessary
Once you've got your benchmarking logic setup, run the benchmarks with/without virtual threads on a few of our other examples: perhaps on minimalApplication as a tiny example, and todo as a big example, and staticFiles as a non-compute bound example. We expect the benefits to be not as dramatic as with your synthetic example using Thread.sleep, but it is still good to know whether there is any improvement, no improvement, or even regression. That would help us advise users what kind of use cases would benefit from using virtual threads and which ones won't
How much of the improvement is due to virtual threads, and how much is due to the NewThreadPerTaskExecutor we began using? We should run a benchmark with NewThreadPerTaskExecutor enabled but virtual threads disabled to verify that the virtual threads are the benefit, and not just the swapping out of the Executor strategy. Does the default one also spawn a NewThreadPerTask, or is it using a fixed thread pool of some sort?
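
For the control experiment described in the last point, one option is a thread-per-task executor that uses platform threads, so that the executor strategy stays the same and only the thread kind changes. The class below is a hypothetical sketch for that comparison, not code from this PR.

```scala
import java.util.concurrent.Executor
import java.util.concurrent.atomic.AtomicLong

// Hypothetical control for the benchmark: same "new thread per task"
// strategy, but with ordinary platform threads instead of virtual threads.
final class PlatformThreadPerTaskExecutor(prefix: String) extends Executor {
  private val counter = new AtomicLong()

  def execute(command: Runnable): Unit = {
    val thread = new Thread(command, s"$prefix-${counter.getAndIncrement()}")
    thread.setDaemon(true) // do not keep the JVM alive for in-flight requests
    thread.start()
  }
}
```

Running the same ab command against this executor would separate the effect of the executor strategy from the effect of virtual threads.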

He-Pin · 2024-05-20T01:26:25Z

Great, thanks for the quick and detailed review.

I will update this after work. We do see a 10~15pt performance improvement and a 5pt RAM reduction in internal stress testing. Those numbers vary with the workload, but in most cases virtual threads are a good choice, and Nima is using virtual threads by default now.

He-Pin · 2024-05-27T18:02:34Z

Update:
As @lihaoyi said, we need to compare the performance with and without virtual threads, so I have to screen the executor. I tested it locally and it works well.

Even when the backing scheduler of the virtual threads is not the default ForkJoinPool, it still works.

VirtualThread[#91,cask-handler-executor-virtual-thread-0]/runnable@pool-1-thread-1
VirtualThread[#93,cask-handler-executor-virtual-thread-1]/runnable@pool-1-thread-2

With this setup, I think I can continue with the other parts of this PR.

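import java.util.concurrent.{Executor, Executors}

object Compress extends cask.MainRoutes {

  // Backing executor for the handlers: a fixed pool with one thread per core
  protected override val handlerExecutor: Executor = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors())

  @cask.decorators.compress
  @cask.get("/")
  def hello(): String = {
    println(Thread.currentThread()) // Shows which thread runs the route
    "Hello World! Hello World! Hello World!"
  }

  initialize()
}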

He-Pin · 2024-05-27T18:10:18Z

cask/src/cask/main/Main.scala

+    else if (System.getProperty("cask.virtualThread.enabled", "true").toBoolean) {
+      Util.createVirtualThreadExecutor(executor).getOrElse(executor)
+    } else executor
+  }


@lihaoyi Wdyt about this?

I will update the PR later. I still need to update the CI to Java 21 to run the benchmarks.

I think we should run CI on both 11 and 21, with the virtual threads test suites only enabled on 21

That's the plan.
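
The selection logic in the diff above can be sketched in isolation as follows. This is an illustration under assumptions: ExecutorSelection and its members are hypothetical names, the virtual thread executor is passed in as an Option rather than created through the PR's Util.createVirtualThreadExecutor, and the property is parsed with Boolean.parseBoolean, which treats any value other than "true" (ignoring case) as false.

```scala
import java.util.concurrent.Executor

// Hypothetical sketch of the decision only; not the code from the diff.
object ExecutorSelection {
  val PropertyName = "cask.virtualThread.enabled"

  // Reads the system property, using `default` when it is not set.
  def virtualThreadsRequested(default: Boolean): Boolean =
    Option(System.getProperty(PropertyName))
      .map(s => java.lang.Boolean.parseBoolean(s))
      .getOrElse(default)

  // Use the virtual thread executor only when it was requested and the
  // runtime could provide one; otherwise fall back to the given executor.
  def choose(requested: Boolean, virtual: Option[Executor], fallback: Executor): Executor =
    if (requested) virtual.getOrElse(fallback) else fallback
}
```

Passing `default = true` mirrors the diff above, while `default = false` matches the opt-in behavior described in the original PR description.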

lihaoyi · 2024-06-05T01:50:47Z

@He-Pin let me know when this is ready for review, looking forward to using virtual threads in cask!

He-Pin · 2024-06-05T13:22:10Z

@lihaoyi I will continue it this weekend, but currently I need a PR [https://github.com//pull/124] to get merged, because I'm using Windows 11 and I don't have a Mac/Linux machine for now.

lihaoyi · 2024-06-05T13:28:11Z

@He-Pin merged it

lihaoyi · 2024-07-02T07:21:58Z

Bump on this @He-Pin :)

He-Pin · 2024-07-02T09:04:26Z

@lihaoyi Sorry for being late. We encountered a virtual thread deadlock at work, related to the classloader. When a virtual thread and another platform thread (not the carrier thread) both try to load the same classes, or when a virtual thread has been notified but there is no carrier thread available to run it (they are all pinned), a deadlock occurs.

So we just changed the classloader, both in the JVM C++ code and in the library.

But cask is a library, so the best we can do is add some documentation about this case. A simple mitigation is to limit the maximum number of concurrent virtual threads.

I'm sorry for not updating this sooner. I was a little busy at work, and we had this issue and fixed it with a JVM modification.

So what do you think about:

Add some documentation about this.
Add some best practices about virtual threads.
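
Limiting the maximum number of concurrent virtual threads, as suggested above, could be done with a semaphore around whatever executor starts the threads. The BoundedExecutor below is a hypothetical sketch, not part of cask. It rejects work when the limit is reached instead of blocking the submitting thread, which is one possible design choice; a real integration would have to decide how to answer rejected requests.

```scala
import java.util.concurrent.{Executor, RejectedExecutionException, Semaphore}

// Hypothetical sketch; not part of cask. Caps the number of tasks that may
// be in flight on the wrapped executor at the same time.
final class BoundedExecutor(delegate: Executor, maxConcurrent: Int) extends Executor {
  private val permits = new Semaphore(maxConcurrent)

  def availablePermits: Int = permits.availablePermits()

  def execute(command: Runnable): Unit = {
    // Reject instead of blocking, so the submitting thread is never stalled.
    if (!permits.tryAcquire())
      throw new RejectedExecutionException(s"more than $maxConcurrent tasks in flight")
    try {
      delegate.execute(new Runnable {
        // Give the permit back once the task finishes, even if it throws.
        def run(): Unit = try command.run() finally permits.release()
      })
    } catch {
      case e: Throwable =>
        permits.release() // the task never started, so return its permit
        throw e
    }
  }
}
```

Wrapping a thread-per-task executor as `new BoundedExecutor(perTaskExecutor, 1000)` would cap the number of live handler threads at 1000; the value 1000 is only an example.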

lihaoyi · 2024-07-04T01:27:16Z

@He-Pin I think documentation and best practices are fine. We do not have the power to change the JVM, so best we can do is tell people if they want to use JVM virtual threads with Cask what the best way of doing so is

lihaoyi · 2024-07-16T12:56:44Z

Bump on this. @He-Pin if you're not able to pick this up, I'll put it up again in the next set of bounties

He-Pin · 2024-07-17T02:49:03Z

Sorry for the delay. I was working with our internal JVM team to work around the synchronization in the classloader, and another issue I'm currently encountering is com-lihaoyi/mill#3168 on Windows.

I will continue this week. I did spend some time learning the toolchain.

He-Pin · 2024-07-17T03:19:39Z

Because privateLookupIn was introduced in Java 9 and Cask currently supports Java 8, I think I have to fall back to reflection.

lihaoyi · 2024-07-17T03:37:22Z

Got it, if you're still on it I'll leave it to you then!

Feel free to bump the required JVM version if necessary. We don't need to support Java 8 forever, e.g. requests-scala already moved to Java 11. Java is already on 21 or 22, so even jumping to Java 17 would be reasonable.

lihaoyi · 2024-10-21T23:41:57Z

Closing due to inactivity

lihaoyi reviewed May 19, 2024

cask/src/cask/internal/NewThreadPerTaskExecutor.scala (outdated, resolved)
cask/src/cask/main/Main.scala (outdated, resolved)
cask/src/cask/internal/VirtualThreadSupport.scala (outdated, resolved)

He-Pin force-pushed the vt branch from 4af0209 to dc26b6f Compare May 26, 2024 19:35

He-Pin marked this pull request as draft May 27, 2024 18:25

He-Pin mentioned this pull request May 28, 2024

MatchError when do mill.idea.GenIdea/idea com-lihaoyi/mill#3168

Closed

He-Pin added 11 commits June 12, 2024 02:32

chore: Update mill to 0.11.7-86-18d144

ba1a3b3

chore: Update scala versions

e5bfe99

chore: Update mima and acyclic

45fceae

chore: add type annotation to caskMetadata

196eb87

chore: Fix hint problem in Main.scala

2b5bbc5

chore: bump dependencies

5505631

feat: Add virtual threads support.

4ac756f

chore: update mill to 0.11.7-107-9ec9bc

d57d766

wip: Move helper methods to ThreadBlockingHandler

c38e3ba

chore: run virtual threads with other scheduler

ba0d0bc

update mill version

b48c433

He-Pin force-pushed the vt branch from 17dbf69 to b48c433 Compare June 11, 2024 18:32

update utest and requests version

0262d63

chore: bump mill to 0.11.8

d905516

lihaoyi closed this Oct 21, 2024

jodersky mentioned this pull request Nov 27, 2024

java.lang.ExceptionInInitializerError when Run mill __.compile with Java 21 #153

Closed
