Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed the issue of IP resources being exhausted when using multiple NICs #4359

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ty-dc
Copy link
Collaborator

@ty-dc ty-dc commented Dec 3, 2024

Thanks for contributing!

Notice:

What issue(s) does this PR fix:

Fixes #

Special notes for your reviewer:

Problem 1:

When ipam.spidernet.io/ippools does not specify an interface name, the network card information recorded in spiderEndpoint is incorrect.

Especially for statefulset applications, if interface is equal to "", the endpoint will be deleted.

  current:
    ips:
    - cleanGateway: false
      interface: eth0
      ipv4: 172.18.40.5/16
      ipv4Gateway: 172.18.0.1
      ipv4Pool: default-v4-ippool
      vlan: 0
    - cleanGateway: false
      interface: ""
      ipv4: 172.20.40.200/16
      ipv4Gateway: 172.20.0.1
      ipv4Pool: test-v4-ippool
      vlan: 0

Problem 2:

IP ​​resources are exhausted,The 4102bd95-8f63-4f66-9b64-d7050e91b6c5 UID occupies IP resources indefinitely.

kubectl get sp default-v4-ippool -ojsonpath={.status.allocatedIPs} | jq 
{
  "172.18.40.10": {
    "pod": "default/ippool-test-app-58b7766c7c-mvwpf",
    "podUid": "4009cb83-bd17-4b08-92f0-2ea0cd3d64fc"
  },
  "172.18.40.11": {
    "pod": "default/ippool-test-app-58b7766c7c-5jp5d",
    "podUid": "4e57d4ee-fd3f-44fa-8c80-1db2c6ae0e98"
  },
  "172.18.40.12": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.13": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.19": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.20": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  }
}

Problem 3:

Statefulset cannot be created without NICs

  Warning  FailedCreatePodSandBox  13s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a0d4fe8eb1b4d9a115997df44488c4b297d5cb845234dd069f6db8decb282b98": plugin type="multus" name="multus-cni-network" failed (add): [default/test-sts-0/07793483-79f5-47ff-acdd-4a67d697ef85:macvlan-vlan100]: error adding container to network "macvlan-vlan100": spiderpool IP allocation error: [POST /ipam/ip][500] postIpamIpFailure  failed to retrieve the existing IP allocation: failed to update SpiderEndpoint allocation details NIC name net1, error: Operation cannot be fulfilled on spiderendpoints.spiderpool.spidernet.io "test-sts-0": StorageError: invalid object, Code: 4, Key: /registry/spiderpool.spidernet.io/spiderendpoints/default/test-sts-0, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 7be12fb3-6b7c-4d6f-8536-e16b924305e4, UID in object meta:

@ty-dc ty-dc added the pr/not-ready not ready for merging label Dec 3, 2024
@ty-dc ty-dc force-pushed the fix/ip-allocate branch 2 times, most recently from e9fbecf to 94c0ea0 Compare December 4, 2024 03:44
@@ -505,6 +505,11 @@ func (i *ipam) allocateIPFromCandidate(ctx context.Context, c *PoolCandidate, ni
continue
}

if ip == nil || ip.Address == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ip == nil || ip.Address == nil 为什么代表 “ its UID %s already exists in pool ”
并且 break

这块的代码可读性 不高

@@ -115,6 +115,11 @@ func (im *ipPoolManager) AllocateIP(ctx context.Context, poolName, nic string, p
if err != nil {
return err
}
if allocatedIP == nil {
// The Pod has already obtained an IP address in the pool.
// We skip this allocation and return nil.
Copy link
Collaborator

@weizhoublue weizhoublue Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

从这个 api 定义来看,为什么不是 把 分配的 ip return 出去,或者 在 相关的 return error 参数中 说明这种 case

而是用 ip为空 这种 非常有歧义的 方式来代表 ip 已经分配了

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

嗯。

@ty-dc ty-dc force-pushed the fix/ip-allocate branch 2 times, most recently from b429b11 to 839477e Compare December 4, 2024 09:42
Copy link

codecov bot commented Dec 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.85%. Comparing base (e3b8408) to head (f05a801).
Report is 16 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4359      +/-   ##
==========================================
+ Coverage   79.56%   79.85%   +0.28%     
==========================================
  Files          54       54              
  Lines        6362     6369       +7     
==========================================
+ Hits         5062     5086      +24     
+ Misses       1103     1082      -21     
- Partials      197      201       +4     
Flag Coverage Δ
unittests 79.85% <100.00%> (+0.28%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
pkg/ippoolmanager/ippool_manager.go 84.61% <100.00%> (-2.03%) ⬇️
...orkloadendpointmanager/workloadendpoint_manager.go 100.00% <ø> (ø)

... and 1 file with indirect coverage changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/not-ready not ready for merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants