This project aims to benchmark various methods of copying data between TCP sockets, measuring both performance and CPU usage. The goal is to identify the fastest methods with the least amount of CPU overhead.
The benchmark tests the following methods:
- IoPipe: Utilizes the io.Pipe function for data transfer.
- IoCopy: Uses the io.Copy function.
- IoCopyBuffer: Uses io.CopyBuffer for buffered copying.
- Syscall: Direct system calls for data transfer.
- IoCopyDirect: Direct copy using io.Copy.
- UnixSyscall: Unix-specific system calls.
- Bufio: Buffered I/O using the bufio package.
- Splice: Linux splice system call.
- Sendfile: Uses the sendfile system call.
- ReadvWritev: Vectorized I/O operations using readv and writev.
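To make the difference between the IoCopy and IoCopyBuffer entries concrete, here is a minimal sketch that pushes a payload through a loopback TCP connection and drains it with either function. The helper name copyOverTCP and the use of an OS-assigned port are assumptions made for illustration; the benchmark's own implementation is not shown in this README and may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// copyOverTCP (hypothetical helper) sends payload over a loopback TCP
// connection and returns what the receiving side collected.
// With a non-nil buf the receiver calls io.CopyBuffer, otherwise io.Copy.
// Note: io.CopyBuffer does not use buf when src implements io.WriterTo or
// dst implements io.ReaderFrom, and it panics on a non-nil zero-length buf.
func copyOverTCP(payload, buf []byte) ([]byte, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return nil, err
	}
	defer ln.Close()

	type result struct {
		data []byte
		err  error
	}
	done := make(chan result, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			done <- result{nil, err}
			return
		}
		defer conn.Close()
		var sink bytes.Buffer
		if buf != nil {
			_, err = io.CopyBuffer(&sink, conn, buf)
		} else {
			_, err = io.Copy(&sink, conn)
		}
		done <- result{sink.Bytes(), err}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, err
	}
	if _, err := conn.Write(payload); err != nil {
		conn.Close()
		return nil, err
	}
	conn.Close() // closing signals EOF so the copy on the receiving side returns

	r := <-done
	return r.data, r.err
}

func main() {
	payload := bytes.Repeat([]byte("x"), 10*1024)
	for _, buf := range [][]byte{nil, make([]byte, 32*1024)} {
		got, err := copyOverTCP(payload, buf)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("received bytes:", len(got))
	}
}
```

Both calls should deliver the same bytes. Whether the caller-supplied buffer is actually used depends on the concrete reader and writer types, which is one reason the two entries can behave alike in practice.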
We use the net package to create a TCP server and client for data transfer. The server listens on a specified port, and the client connects to the server to send and receive data.
The payload size is set to 10 KB (10240 bytes), and the number of iterations (numClients) is set to 5000.
    const (
        address    = "localhost:12345"
        numClients = 5000
        bufferSize = 32 * 1024
    )
    var (
        message       = generateRandomString(10240) // Generate a 10 KB random string
        messageLength = len(message)
    )
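The configuration above calls generateRandomString, whose body is not shown here. One possible implementation, offered only as an assumption about what such a helper might do (the alphabet and the use of math/rand are illustrative choices, not the project's confirmed code):

```go
package main

import (
	"fmt"
	"math/rand"
)

// letters is an assumed alphabet; the real helper may use a different one.
const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// generateRandomString returns a string of n random ASCII characters.
// Hypothetical implementation: the benchmark's actual helper is not shown.
func generateRandomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	message := generateRandomString(10 * 1024)
	fmt.Println(len(message)) // prints 10240
}
```

Because every character is a single-byte ASCII letter or digit, len(message) equals the requested payload size in bytes.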
Tested on a base Hetzner instance with the following specifications:
- CPU: Intel Xeon (Skylake, IBRS, no TSX) (4) @ 2.099GHz
- RAM: 7747MiB
- OS: Ubuntu 22.04.4 LTS x86_64
The benchmark results are summarized as follows:
- UnixSyscall: 1580 ms
- IoPipe: 3140 ms
- IoCopy: 5940 ms
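To put these timings side by side, the short sketch below expresses each result relative to the fastest one. It uses only the three figures listed above; the function name slowdown is illustrative.

```go
package main

import "fmt"

// slowdown reports how many times longer tMs is than baselineMs.
func slowdown(baselineMs, tMs float64) float64 {
	return tMs / baselineMs
}

func main() {
	const unixSyscall, ioPipe, ioCopy = 1580.0, 3140.0, 5940.0
	fmt.Printf("IoPipe: %.2fx\n", slowdown(unixSyscall, ioPipe)) // IoPipe: 1.99x
	fmt.Printf("IoCopy: %.2fx\n", slowdown(unixSyscall, ioCopy)) // IoCopy: 3.76x
}
```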
The IoPipe method stands out as a native solution working with the net.Conn interface, providing a balance between performance and CPU usage. However, methods such as UnixSyscall show the potential for further optimization by directly interfacing with the underlying system calls.
To execute the benchmarks and analyze the results, use the following commands:
go test -bench=. test/tcp_test.go && go run analyse.go
The above commands will run the benchmark tests and generate a detailed analysis of each method's performance.
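For readers unfamiliar with Go benchmarks, the sketch below shows the general shape of a benchmark function that go test -bench would pick up. It is not taken from test/tcp_test.go: the function name, the loopback setup, and the discard-only receiver are assumptions for illustration. Here it is driven through testing.Benchmark so it can run as a standalone program; in a _test.go file it would be named BenchmarkXxx and run by go test.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"testing"
)

// benchmarkWrite (hypothetical) measures how fast a 10 KB payload can be
// written to a loopback TCP connection whose far end discards what it reads.
func benchmarkWrite(b *testing.B) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		b.Fatal(err)
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		io.Copy(io.Discard, conn) // drain until the client closes
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		b.Fatal(err)
	}
	defer conn.Close()

	payload := make([]byte, 10*1024)
	b.SetBytes(int64(len(payload))) // lets the result report MB/s
	b.ResetTimer()                  // exclude connection setup from the timing
	for i := 0; i < b.N; i++ {
		if _, err := conn.Write(payload); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	res := testing.Benchmark(benchmarkWrite)
	fmt.Println(res.String(), res.MemString())
}
```

The timer is reset after the connection is established so that only the write loop is measured. A benchmark of connection setup cost would move the dial inside the loop instead.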
The native UnixSyscall method provides the best performance with the least CPU overhead. The IoPipe method comes second, taking roughly twice as long (3140 ms versus 1580 ms), but it offers a more straightforward implementation. The choice of method depends on the specific requirements of the application, balancing performance and resource utilization.