💡PROPOSAL: stateful L4 LoadBalancer #303

Open
4 tasks
ymmt2005 opened this issue Oct 24, 2024 · 1 comment

Comments


ymmt2005 commented Oct 24, 2024

What

Add a stateful L4 LoadBalancer implementation that keeps connections alive
across Node or Pod restarts and when Nodes or Pods are added or removed.

Although we can combine Coil with a third-party LoadBalancer implementation such as MetalLB,
I think implementing it in Coil makes sense for the following reasons.

  • MetalLB is stateless, so it cannot keep connections when BGP nodes are added or removed.
  • Coil has IP address pools, which are also useful for managing LoadBalancer IP addresses.

How

Research is needed, but the basic idea is as follows.

  • Coil has a central database.
    • The database has a collection of mappings between the source IP address and the target Pod. The mapping is created for each load balancer IP.
    • This database could be Redis or something similar.
  • For each load balancer IP, Coil runs a set of Pods called LB Pods.
    • Coil programs routing for the LB IP to LB Pods.
    • LB Pods have a cluster-local IP address and enable IP forwarding to route packets.
  • On each LB Pod, a program synchronizes the mapping for the LB IP into an eBPF map.
    • The eBPF map will be used to look up the destination Pod.
  • When a packet to an LB IP reaches an LB Pod,
    • An eBPF program looks up the target Pod from the eBPF map.
    • It forwards the packet with some encapsulation (FoU?) to the target Pod.
  • When the encapsulated packet reaches the target Pod,
    • The Pod decapsulates the packet. For this, target Pods should also have the LB IP on a dummy link.
    • Since the inner packet keeps the client's source address, the target Pod replies directly to the client, which gives direct server return (DSR).

One concern is MTU. With FoU encapsulation, the MTU must be lowered somewhere along the path,
but it is not clear where the appropriate place is. Maybe the target Pod's veth?
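
To put numbers on the MTU concern: plain FoU prepends an outer IP header and an 8-byte UDP header to the inner packet. The sketch below computes the resulting inner MTU. It assumes an outer IPv4 header without options (20 bytes) or the fixed IPv6 header (40 bytes), and no additional GUE header; the function name is illustrative only.

```go
package main

import "fmt"

const (
	ipv4HeaderLen = 20 // outer IPv4 header without options
	ipv6HeaderLen = 40 // outer IPv6 fixed header
	udpHeaderLen  = 8  // FoU places the inner packet right after the UDP header
)

// fouInnerMTU returns the largest inner packet that fits into a single
// outer packet on a link with the given MTU.
func fouInnerMTU(linkMTU int, outerIPv6 bool) int {
	overhead := ipv4HeaderLen + udpHeaderLen
	if outerIPv6 {
		overhead = ipv6HeaderLen + udpHeaderLen
	}
	return linkMTU - overhead
}

func main() {
	fmt.Println(fouInnerMTU(1500, false)) // 1472
	fmt.Println(fouInnerMTU(1500, true))  // 1452
}
```

On a standard 1500-byte link, the interface that clients' packets are sized against would therefore need an MTU of at most 1472 (IPv4 outer) or 1452 (IPv6 outer) to avoid fragmenting the encapsulated packets.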

Checklist

  • Finish implementation of the issue
  • Test all functions
  • Have enough logs to trace activities
  • Notify developers of necessary actions

ymmt2005 commented Oct 24, 2024

We should discuss whether statefulness is always desirable.

For load balancers that accept packets from anywhere on the Internet, stateful
load balancing can be quite costly: the mapping can grow as large as the
IP address space. Stateless load balancing is clearly the better choice in that case.

However, we have a use case where all clients of a load balancer are internal.
In this case, statefulness is preferable: connections are not interrupted
by load balancer maintenance, and the mapping stays small and predictable
in size.

So, we should consider implementing both stateful and stateless load balancers,
and allow users to choose by, for example, annotating Service objects.
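
As a rough sketch of how such a choice might be read from a Service, the code below resolves the mode from an annotation map. The annotation key `example.com/lb-mode` and the values `stateful` and `stateless` are hypothetical placeholders; no actual key or value set has been decided in this proposal.

```go
package main

import "fmt"

// Mode is the load balancing mode requested for a Service.
type Mode string

const (
	ModeStateful  Mode = "stateful"
	ModeStateless Mode = "stateless"
)

// modeAnnotation is a hypothetical annotation key used only for illustration.
const modeAnnotation = "example.com/lb-mode"

// modeFor returns the mode requested in a Service's annotations,
// falling back to def when the annotation is absent.
func modeFor(annotations map[string]string, def Mode) (Mode, error) {
	v, ok := annotations[modeAnnotation]
	if !ok {
		return def, nil
	}
	switch m := Mode(v); m {
	case ModeStateful, ModeStateless:
		return m, nil
	default:
		return "", fmt.Errorf("unknown %s value %q", modeAnnotation, v)
	}
}

func main() {
	annotations := map[string]string{modeAnnotation: "stateful"}
	mode, err := modeFor(annotations, ModeStateless)
	fmt.Println(mode, err)
}
```

Rejecting unknown values instead of silently falling back to the default makes a mistyped annotation visible to the user.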

Stateless load balancing can still make a best-effort choice of backend Pod.
See the following articles:
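
Independently of those articles, one widely used best-effort technique is rendezvous (highest-random-weight) hashing: each client is sent to the backend with the highest hash score, so removing a backend only moves the clients that were assigned to it. The sketch below is an illustrative assumption, not something this proposal has settled on; the function names, the FNV-1a hash, and the addresses are all placeholders.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (client, backend) pair into a 64-bit weight.
func score(client, backend string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(client))
	h.Write([]byte{0}) // separator so that "ab"+"c" and "a"+"bc" hash differently
	h.Write([]byte(backend))
	return h.Sum64()
}

// pick returns the backend with the highest score for client.
// Ties are broken by backend name, so the result does not depend on
// the order of backends. It returns "" when backends is empty.
func pick(client string, backends []string) string {
	best, bestScore := "", uint64(0)
	for i, b := range backends {
		s := score(client, b)
		if i == 0 || s > bestScore || (s == bestScore && b < best) {
			best, bestScore = b, s
		}
	}
	return best
}

func main() {
	backends := []string{"10.64.0.2", "10.64.0.3", "10.64.0.4"}
	fmt.Println(pick("10.0.0.5", backends))
}
```

No per-client state is stored: every LB instance that sees the same backend list computes the same answer for a given client.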
