feat: Add ability to specify range of VM IDs to use #286

Merged
merged 24 commits into from
Dec 5, 2024

Conversation

rybnico
Contributor

@rybnico rybnico commented Sep 10, 2024

Description of changes:

This feature allows you to specify a vmidRange on a ProxmoxCluster object. A free ID between vmidRange.start and vmidRange.end will then be determined and used for a new VM.
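
To illustrate the selection described above, here is a minimal Go sketch. It is not the PR's actual implementation. The `VMIDRange` type, the `firstFreeVMID` helper, and the `used` map are hypothetical, the range is assumed to be inclusive on both ends, and the error name `ErrNoVMIDInRangeFree` is taken from the review discussion (its message text is an assumption).

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoVMIDInRangeFree signals that every ID in the range is already taken.
// The name comes from the review thread; the message is an assumption.
var ErrNoVMIDInRangeFree = errors.New("no free VM ID in range")

// VMIDRange is a hypothetical stand-in for the vmidRange field.
// Both bounds are assumed to be inclusive.
type VMIDRange struct {
	Start int64
	End   int64
}

// firstFreeVMID returns the lowest ID in r that is not marked in used.
func firstFreeVMID(r VMIDRange, used map[int64]bool) (int64, error) {
	if r.Start <= 0 || r.End < r.Start {
		return 0, fmt.Errorf("invalid VM ID range %d-%d", r.Start, r.End)
	}
	for id := r.Start; id <= r.End; id++ {
		if !used[id] {
			return id, nil
		}
	}
	return 0, ErrNoVMIDInRangeFree
}

func main() {
	used := map[int64]bool{1000: true, 1001: true}
	id, err := firstFreeVMID(VMIDRange{Start: 1000, End: 1100}, used)
	fmt.Println(id, err) // 1002 <nil>
}
```

In a real controller the `used` set would come from Proxmox itself rather than from an in-memory map.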

We use ranges of VM IDs in Proxmox to differentiate between projects/customers/stages. I suspect many Proxmox users do this.

While developing this feature I found a bug in luthermonson/go-proxmox, which will be fixed in its next release. Because of the size of this PR and the changes you may request before merging it, I have opened it now with a dependency on the current main branch of go-proxmox. Of course, it shouldn't be merged until the new release of go-proxmox is out.

What do you think of this feature and its implementation?

Testing performed:

I created tests for the new features. We also tested the functionality on our Proxmox test cluster.

api/v1alpha1/proxmoxcluster_types.go (outdated, resolved)
internal/service/vmservice/vm.go (outdated, resolved)
@@ -311,10 +315,15 @@ func getMachineAddresses(scope *scope.MachineScope) ([]clusterv1.MachineAddress,
}

func createVM(ctx context.Context, scope *scope.MachineScope) (proxmox.VMCloneResponse, error) {
	vmid, err := getVMID(ctx, scope)
	if err != nil {
		return proxmox.VMCloneResponse{}, err
Member

there's only one remaining thing:
you will probably need to fail the machine provisioning here.

if err != nil {
  if errors.Is(err, ErrNoVMIDInRangeFree) {
    // fail the machine, by setting failureReason, and failureMessage
  }
}

Member

Did you have time to check my comment above?

Member

Without failing the machine here, the user will not know what is happening, and the reconciliation will keep being queued.
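
A rough Go sketch of what that could look like, under stated assumptions: `machineStatus` is a simplified, hypothetical stand-in for the machine's status (only the `failureReason` and `failureMessage` fields mentioned above), `failIfRangeExhausted` is a made-up helper name, and the reason string `NoFreeVMID` is invented for illustration. It is not the code that was merged.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoVMIDInRangeFree is the sentinel error named in the review comment.
// The message text is an assumption.
var ErrNoVMIDInRangeFree = errors.New("no free VM ID in range")

// machineStatus is a hypothetical, simplified stand-in for the machine status.
type machineStatus struct {
	FailureReason  *string
	FailureMessage *string
}

// failIfRangeExhausted marks the machine as failed when err wraps
// ErrNoVMIDInRangeFree. It reports whether the failure is terminal,
// so the caller can stop requeueing instead of retrying forever.
func failIfRangeExhausted(status *machineStatus, err error) bool {
	if !errors.Is(err, ErrNoVMIDInRangeFree) {
		return false
	}
	reason := "NoFreeVMID" // hypothetical reason string
	message := err.Error()
	status.FailureReason = &reason
	status.FailureMessage = &message
	return true
}

func main() {
	status := &machineStatus{}
	err := fmt.Errorf("create VM: %w", ErrNoVMIDInRangeFree)
	if failIfRangeExhausted(status, err) {
		fmt.Println(*status.FailureReason, "-", *status.FailureMessage)
	}
}
```

Using `errors.Is` rather than `==` keeps the check working when the error has been wrapped with `%w` on its way up the call stack.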

Collaborator

@65278 65278 left a comment

Disregard the nitpicks in the review.

My problem is with the API itself:
Putting the VMID range into the cluster object works for your use case, but other people may have different use cases.
The only sensible place to put VM ID ranges is in the ProxmoxMachineTemplates, as people may want to set VM ID ranges per machine kind.

Of course this means that if you want to reuse ProxmoxMachineTemplates but not VM IDs, you can't, but that's a small price to pay.

@rybnico
Contributor Author

rybnico commented Sep 24, 2024

My problem is with the API itself: Putting the VMID range into the cluster object works for your use case, but other people may have different use cases. The only sensible place to put VM ID ranges is in the ProxmoxMachineTemplates, as people may want to set VM ID ranges per machine kind.

Very good point. I tested the functionality by specifying different vmidRanges for my control plane and worker nodes and liked it immediately.
I moved spec.vmidRange from ProxmoxCluster to ProxmoxMachine (which is used in ProxmoxMachineTemplate) and updated the tests accordingly.

@mcbenjemaa
Member

that's what i got.

      vmIDRange:
        start: 1000
        end: 1100
      vmIDRange:
        start: 2000
        end: 2100

Screenshot 2024-11-06 at 12 18 25

@mcbenjemaa mcbenjemaa removed the request for review from lubedacht November 6, 2024 12:07
@rybnico
Contributor Author

rybnico commented Nov 6, 2024

that's what i got.

I cannot reproduce the issue. When I create a new VM, the controller reliably picks the first free VM ID from the vmIDRange, on both control plane and worker nodes.

Can you please have a look in the debug log at the API call to /api2/json/cluster/nextid that is made to determine the next free VM ID?

It should look something like this and it happens just before the VM clone task is started:

I1106 23:20:05.653862   68929 find.go:63] "vmid doesn't exist yet" controller="proxmoxmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ProxmoxMachine" ProxmoxMachine="default/poc01-control-plane-6zp2l" namespace="default" name="poc01-control-plane-6zp2l" reconcileID="7c8f9f1f-038f-4bcc-ad63-e513ddb96a67" machine="default/poc01-control-plane-6zp2l" cluster="default/poc01"
I1106 23:20:05.653904   68929 logger.go:52] SEND: GET - https://proxmoxtest01:8006/api2/json/cluster/status
I1106 23:20:05.734977   68929 logger.go:52] RECV: 200 - 200 OK
I1106 23:20:05.734997   68929 logger.go:52] BODY: {"data":[{"level":"","type":"node","name":"proxmoxtest01","local":1,"nodeid":0,"online":1,"id":"node/proxmoxtest01","ip":"10.1.103.50"}]}
I1106 23:20:05.735016   68929 logger.go:52] SEND: GET - https://proxmoxtest01:8006/api2/json/cluster/nextid?vmid=1000
I1106 23:20:05.750891   68929 logger.go:52] RECV: 200 - 200 OK
I1106 23:20:05.750898   68929 logger.go:52] BODY: {"data":"1000"}
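
For readers who want to reproduce that check outside the controller, here is a hedged Go sketch of the probe seen in the log. The `vmidIsFree` helper and the `fakeProxmox` test double are hypothetical. The sketch assumes that a 200 response echoing the requested ID means the ID is free and that HTTP 400 means it is already taken. Authentication and TLS setup are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// vmidIsFree asks /api2/json/cluster/nextid whether id is unused.
// Assumption: 200 with the same ID in "data" means free, 400 means taken.
func vmidIsFree(c *http.Client, baseURL string, id int64) (bool, error) {
	resp, err := c.Get(fmt.Sprintf("%s/api2/json/cluster/nextid?vmid=%d", baseURL, id))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		// Body shape as seen in the log: {"data":"1000"}
		var body struct {
			Data string `json:"data"`
		}
		if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
			return false, err
		}
		return body.Data == strconv.FormatInt(id, 10), nil
	case http.StatusBadRequest:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// fakeProxmox is a local test double, not a real Proxmox server.
// It rejects the IDs listed in taken with HTTP 400.
func fakeProxmox(taken map[string]bool) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.URL.Query().Get("vmid")
		if taken[id] {
			http.Error(w, "VM already exists", http.StatusBadRequest)
			return
		}
		_ = json.NewEncoder(w).Encode(map[string]string{"data": id})
	}))
}

func main() {
	srv := fakeProxmox(map[string]bool{"1000": true})
	defer srv.Close()

	for _, id := range []int64{1000, 1001} {
		free, err := vmidIsFree(srv.Client(), srv.URL, id)
		fmt.Println(id, free, err)
	}
}
```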

@rybnico
Contributor Author

rybnico commented Nov 14, 2024

@mcbenjemaa Have you had a chance to test this again with debug output enabled? Is there anything I can do to help you test this branch?

@mcbenjemaa
Member

I will test it again and let you know

@mcbenjemaa
Member

I will test it again and let you know

Testing with VMIDRanges is okay.
But, without ranges, there's something strange!

@rybnico
Contributor Author

rybnico commented Nov 28, 2024

Testing with VMIDRanges is okay. But, without ranges, there's something strange!

Ok, I will also test without VMIDRanges. Can you be a bit more specific about "something strange"?

@mcbenjemaa
Member

Testing with VMIDRanges is okay. But, without ranges, there's something strange!

Ok, I will also test without VMIDRanges. Can you be a bit more specific about "something strange"?

The machines are failing and are a bit slow. Honestly, I don't know whether that is because of the code or because of issues in my setup.
Let's run the e2e tests again.

@rybnico
Contributor Author

rybnico commented Dec 2, 2024

@mcbenjemaa I can't see from the logs why the e2e tests failed.

The logs from the Proxmox cluster would be interesting, though:

STEP: Dumping logs from the "capmox-e2e-rzk79z" workload cluster @ 12/02/24 14:35:32.973
Unable to get logs for workload Cluster create-workload-cluster-1hh797/capmox-e2e-rzk79z: log collector is nil.

Any idea why it failed?

Member

@mcbenjemaa mcbenjemaa left a comment

Thanks for your contribution.

LGTM

Collaborator

@65278 65278 left a comment

The PR looks good to me (I don't even have a nitpick). I'll give it a try and then greenlight.

@mcbenjemaa mcbenjemaa enabled auto-merge (squash) December 5, 2024 09:31
@mcbenjemaa mcbenjemaa merged commit 5f3ba2f into ionos-cloud:main Dec 5, 2024
9 checks passed
5 participants