Improve performance for inflight and payload queue #337

Merged
cnderrauber merged 5 commits into master from queue_optimize on Jul 1, 2024

cnderrauber
Member

payloadQueue.updateSortedKeys accounts for more than 40% of CPU time when sending SCTP packets at a high rate over a network with loss or reordering. The function is called frequently and sorts the entire map each time to generate the sorted slice. This change uses separate queue structs for the inflight queue and the payload queue. The inflight queue's chunk TSNs are always consecutive, so a plain queue holds the chunks and is always ordered. The payload queue uses a bitmask to track received TSNs, from which the cumulative TSN and SACK are calculated.
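
To illustrate the bitmask idea, here is a minimal sketch of tracking received TSNs with one bit each, so the cumulative TSN and the SACK gap ack blocks can be derived without sorting. All names (`receiveQueue`, `push`, `gapAckBlocks`) are hypothetical and this is not the code in this PR. It assumes the window size is a power of two between 64 and 32768, and it advances the cumulative TSN on push for brevity.

```go
package main

import "fmt"

// receiveQueue tracks received TSNs above the cumulative TSN in a circular
// bitmask. Hypothetical sketch; size must be a power of two in [64, 32768]
// so indexing survives uint32 wraparound and offsets fit in 16 bits.
type receiveQueue struct {
	bits          []uint64 // one bit per TSN in the window
	size          uint32   // window size in TSNs
	cumulativeTSN uint32   // highest TSN received with no gap before it
}

// gapAckBlock holds start/end offsets relative to the cumulative TSN.
type gapAckBlock struct{ start, end uint16 }

func newReceiveQueue(size, initialCumulativeTSN uint32) *receiveQueue {
	return &receiveQueue{
		bits:          make([]uint64, size/64),
		size:          size,
		cumulativeTSN: initialCumulativeTSN,
	}
}

// index maps a TSN to its word and bit mask inside the circular bitmask.
func (q *receiveQueue) index(tsn uint32) (int, uint64) {
	pos := tsn % q.size
	return int(pos / 64), uint64(1) << (pos % 64)
}

// push records tsn and reports whether it was new and inside the window.
func (q *receiveQueue) push(tsn uint32) bool {
	offset := tsn - q.cumulativeTSN // unsigned subtraction handles wraparound
	if offset == 0 || offset > q.size {
		return false // already acknowledged, or beyond the window
	}
	word, mask := q.index(tsn)
	if q.bits[word]&mask != 0 {
		return false // duplicate
	}
	q.bits[word] |= mask
	// Advance the cumulative TSN over any contiguous run, clearing bits so
	// the slots can be reused by later TSNs.
	for {
		w, m := q.index(q.cumulativeTSN + 1)
		if q.bits[w]&m == 0 {
			break
		}
		q.bits[w] &^= m
		q.cumulativeTSN++
	}
	return true
}

// gapAckBlocks scans the window once and returns the runs of received TSNs
// above the cumulative TSN. A bit-by-bit scan keeps the sketch short.
func (q *receiveQueue) gapAckBlocks() []gapAckBlock {
	var blocks []gapAckBlock
	inBlock := false
	var start uint32
	for off := uint32(1); off <= q.size; off++ {
		w, m := q.index(q.cumulativeTSN + off)
		set := q.bits[w]&m != 0
		switch {
		case set && !inBlock:
			inBlock, start = true, off
		case !set && inBlock:
			inBlock = false
			blocks = append(blocks, gapAckBlock{uint16(start), uint16(off - 1)})
		}
	}
	if inBlock {
		blocks = append(blocks, gapAckBlock{uint16(start), uint16(q.size)})
	}
	return blocks
}

func main() {
	q := newReceiveQueue(128, 0)
	for _, tsn := range []uint32{1, 2, 4, 5, 8} {
		q.push(tsn)
	}
	fmt.Println(q.cumulativeTSN, q.gapAckBlocks()) // 2 [{2 3} {6 6}]
}
```

Each push touches a single bit plus whatever contiguous run it completes, so out-of-order arrivals no longer trigger a sort of the whole queue.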

Queue benchmark:

BenchmarkOldPayloadQueue-10        	     332	   3773029 ns/op	 3416218 B/op	   67217 allocs/op
BenchmarkReceivePayloadQueue-10    	    3188	    347581 ns/op	   19488 B/op	    2101 allocs/op

Running the SCTP send/recv tester with a CPU-limited receiver over a 50 ms RTT network also shows lower CPU usage and higher throughput:

target data rate 10MB/s, 1024 message size, 50ms rtt
original: receiver 110% cpu usage, sender 50% cpu usage, result 6.36MB/s
changed: receiver 70% cpu usage, sender 30% cpu usage, result 8.75MB/s

codecov bot commented Jun 27, 2024

Codecov Report

Attention: Patch coverage is 98.34254% with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.52%. Comparing base (e90e787) to head (86ae184).

| Files | Patch % | Lines |
| queue.go | 91.66% | 1 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #337      +/-   ##
==========================================
+ Coverage   81.24%   81.52%   +0.28%     
==========================================
  Files          49       51       +2     
  Lines        3247     3324      +77     
==========================================
+ Hits         2638     2710      +72     
- Misses        468      470       +2     
- Partials      141      144       +3     
| Flag | Coverage Δ |
| go | 81.52% <98.34%> (+0.28%) ⬆️ |
| wasm | 67.59% <95.02%> (+0.82%) ⬆️ |

@Sean-Der
Member

Amazing! Those numbers are really exciting @cnderrauber

Would it be possible to drop the dependency?

@edaniels
Member

fantastic improvement

@cnderrauber
Member Author

> Amazing! Those numbers are really exciting @cnderrauber
>
> Would it be possible to drop the dependency?

@Sean-Der It can be done by implementing a queue/ring buffer ourselves. Are we trying to avoid all third-party dependencies where possible, or is the concern just this newly imported deque?
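
For reference, a dependency-free queue along these lines could look like the sketch below. It is a hedged illustration, not the code that ended up in this PR: `ring`, `inflightQueue`, and `chunk` are hypothetical names, and it assumes Go 1.18+ for generics. Because inflight TSNs are consecutive, a chunk can be looked up by its offset from the oldest TSN instead of through a map.

```go
package main

import "fmt"

// ring is a growable FIFO queue backed by a circular slice (hypothetical).
type ring[T any] struct {
	buf   []T
	head  int // index of the oldest element
	count int // number of stored elements
}

// grow doubles the capacity and copies elements into FIFO order from index 0.
func (r *ring[T]) grow() {
	newCap := 2 * len(r.buf)
	if newCap == 0 {
		newCap = 8
	}
	buf := make([]T, newCap)
	for i := 0; i < r.count; i++ {
		buf[i] = r.buf[(r.head+i)%len(r.buf)]
	}
	r.buf, r.head = buf, 0
}

// pushBack appends v at the tail, growing the buffer when it is full.
func (r *ring[T]) pushBack(v T) {
	if r.count == len(r.buf) {
		r.grow()
	}
	r.buf[(r.head+r.count)%len(r.buf)] = v
	r.count++
}

// popFront removes and returns the oldest element; ok is false when empty.
func (r *ring[T]) popFront() (v T, ok bool) {
	var zero T
	if r.count == 0 {
		return zero, false
	}
	v = r.buf[r.head]
	r.buf[r.head] = zero // drop the reference so it can be collected
	r.head = (r.head + 1) % len(r.buf)
	r.count--
	return v, true
}

// at returns the i-th oldest element without removing it.
func (r *ring[T]) at(i int) (v T, ok bool) {
	var zero T
	if i < 0 || i >= r.count {
		return zero, false
	}
	return r.buf[(r.head+i)%len(r.buf)], true
}

func (r *ring[T]) size() int { return r.count }

// chunk stands in for a data chunk; only the TSN matters for this sketch.
type chunk struct{ tsn uint32 }

// inflightQueue relies on consecutive TSNs: the chunk for a TSN sits at
// offset (tsn - oldest TSN), so no map or sorting is needed.
type inflightQueue struct{ chunks ring[chunk] }

func (q *inflightQueue) get(tsn uint32) (chunk, bool) {
	first, ok := q.chunks.at(0)
	if !ok {
		return chunk{}, false
	}
	return q.chunks.at(int(tsn - first.tsn))
}

func main() {
	q := &inflightQueue{}
	for tsn := uint32(100); tsn < 105; tsn++ {
		q.chunks.pushBack(chunk{tsn: tsn})
	}
	c, ok := q.get(102)
	fmt.Println(c.tsn, ok) // 102 true
	q.chunks.popFront() // acknowledges TSN 100
	_, ok = q.get(100)
	fmt.Println(ok) // false
}
```

Out-of-range lookups return `false` rather than panicking, which keeps error handling with the caller.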

@Sean-Der
Member

@cnderrauber I try and avoid all third-party dependencies. Currently we only have 3rd party dependencies for testing. Downstream breaking/doing unexpected things has been a problem

Contributor

@paulwe paulwe left a comment

lgtm 🚀

@cnderrauber
Member Author

> @cnderrauber I try and avoid all third-party dependencies. Currently we only have 3rd party dependencies for testing. Downstream breaking/doing unexpected things has been a problem

Sounds good, will remove it

Generating the sorted slice is slow when the queue
is large, which happens when the data rate is high
and some packets are lost or out of order. Use
separate queue structs for the inflight and payload
queues. The inflight queue's chunk TSNs are always
consecutive, so use a queue to hold the chunks; it
is always ordered. Use a bitmask to hold the payload
TSN queue to calculate the cumulative TSN and SACK.

@boks1971 boks1971 left a comment

lgtm! Awesome stuff @cnderrauber

Fix gap block in edge condition
Remove third-party dequeue
Add copyright
Remove panics in queue
@cnderrauber cnderrauber requested a review from Sean-Der July 1, 2024 07:55
@cnderrauber cnderrauber merged commit a8bc9b8 into master Jul 1, 2024
13 checks passed
@cnderrauber cnderrauber deleted the queue_optimize branch July 1, 2024 07:55
5 participants