[QUESTION] URing weird CPU utilization with write requests #1260

Open
HazyMrf opened this issue Oct 8, 2024 · 4 comments

HazyMrf commented Oct 8, 2024

I've been measuring the performance of my io_uring usage across the application and found some unexpected results:

  1. io_uring affinity is set to 4 cores. The application uses io_uring only for writes. The load is distributed unevenly across the cores:

tg_image_3237405771 (2)

  2. For comparison, another application uses io_uring only for reads. There the load is distributed evenly:

tg_image_2428212738 (1)

Could you please explain the reason for this behaviour?

HazyMrf changed the title from "[QUESTION]" to "[QUESTION] URing bad performance on write requests" on Oct 8, 2024
HazyMrf changed the title from "[QUESTION] URing bad performance on write requests" to "[QUESTION] URing weird CPU utilization with write requests" on Oct 8, 2024

Author

HazyMrf commented Oct 8, 2024

The application uses neither IORING_SETUP_IOPOLL nor IORING_SETUP_SQPOLL. It sets the IOSQE_ASYNC flag on every write operation. Submissions are done with io_uring_submit(). Here is a perf trace sample of the application's io_uring_enter calls:

$ sudo perf trace --tid 1321722 -e 'io_uring_enter'  -- sleep 1
     0.000 ( 0.010 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8)                         = 1
     0.182 ( 0.003 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8)                         = 1
     0.362 ( 0.006 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8)                         = 1
     1.328 ( 0.005 ms): io_uring_enter(fd: 4, to_submit: 2, argsz: 8)                         = 2
     3.264 ( 0.016 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8)                         = 1
     7.107 ( 0.004 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8)                         = 1
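
One way to read the sample above: six io_uring_enter calls carry seven SQEs over roughly 7 ms, so almost every syscall submits a single request. The sketch below is not part of the original report. It computes that summary from perf trace lines, assuming exactly the line layout shown above; `summarize_trace` and `struct submit_stats` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Summary of a perf trace excerpt (hypothetical helper type). */
struct submit_stats {
    int calls;          /* number of io_uring_enter() lines seen */
    int sqes;           /* sum of to_submit across those lines */
    double max_gap_ms;  /* largest gap between consecutive calls */
};

/*
 * Parse lines shaped like
 *   "  0.182 ( 0.003 ms): io_uring_enter(fd: 4, to_submit: 1, argsz: 8) = 1"
 * The leading number is the timestamp in ms. Lines without a
 * "to_submit: " field (such as the shell prompt) are skipped.
 */
static struct submit_stats summarize_trace(const char *const *lines, int n)
{
    struct submit_stats st = {0, 0, 0.0};
    double prev_ts = 0.0;

    assert(lines != NULL);
    for (int i = 0; i < n; i++) {
        double ts;
        int to_submit;
        const char *p = strstr(lines[i], "to_submit: ");

        if (p == NULL || sscanf(lines[i], "%lf", &ts) != 1 ||
            sscanf(p, "to_submit: %d", &to_submit) != 1)
            continue;   /* not an io_uring_enter line */

        if (st.calls > 0 && ts - prev_ts > st.max_gap_ms)
            st.max_gap_ms = ts - prev_ts;
        prev_ts = ts;
        st.calls++;
        st.sqes += to_submit;
    }
    return st;
}
```

Fed the six lines above, this reports 6 calls and 7 SQEs, with the largest gap between calls being about 3.8 ms (between the last two entries).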

Owner

axboe commented Oct 8, 2024

Don't set IOSQE_ASYNC, it'll generally just slow things down. On anything more recent (eg 6.x kernels), it'll just do more harm than good. Anything that needs to punt to a worker thread will do so internally anyway, forcing it is usually not a good idea.
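
For illustration only, here is a minimal sketch of what "not setting IOSQE_ASYNC" means at the SQE level. It is not code from the application in question. It fills a write SQE using only the kernel UAPI header, roughly mirroring what liburing's io_uring_prep_write() does, so that it compiles without liburing. `fill_write_sqe` and its `force_async` parameter are hypothetical names; the parameter exists only so the two variants can be compared in a benchmark.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/io_uring.h>

/*
 * Prepare a plain write request in *sqe.
 * With force_async == 0 the flags field stays clear, so the kernel
 * decides on its own whether the request has to go to a worker thread.
 * With force_async != 0 the request is always punted (the behaviour
 * the comment above advises against).
 */
static void fill_write_sqe(struct io_uring_sqe *sqe, int fd, const void *buf,
                           unsigned len, uint64_t off, int force_async)
{
    assert(sqe != NULL);
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_WRITE;
    sqe->fd = fd;
    sqe->addr = (uint64_t)(uintptr_t)buf;  /* user buffer address */
    sqe->len = len;                        /* bytes to write */
    sqe->off = off;                        /* file offset */
    if (force_async)
        sqe->flags |= IOSQE_ASYNC;
}
```

With liburing itself, the equivalent is to call io_uring_prep_write() on the SQE and simply not follow it with io_uring_sqe_set_flags(sqe, IOSQE_ASYNC).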

Author

HazyMrf commented Oct 8, 2024

Okay, I will try this and post the measured results here, thanks. Is there any other general advice for making io_uring faster on newer kernels? What I am aiming for is low latency and a balanced load across many CPUs.

Author

HazyMrf commented Oct 9, 2024

Hello @axboe, I tested your suggestion of removing IOSQE_ASYNC and it sadly didn't help, so that flag is not the cause of the performance degradation. I still observe uneven load distribution for write requests after removing the IOSQE_ASYNC flag:
photo_2024-10-09 11 50 11
