Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

neonvm: prevent migrating VMs to the nodes with non-matching architecture #1123

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mikhail-sakhnov
Copy link
Contributor

@mikhail-sakhnov mikhail-sakhnov commented Oct 24, 2024

Add node affinity to the VM spec during the live migration pending state, if it doesn't have.

#1083

Copy link

github-actions bot commented Oct 24, 2024

No changes to the coverage.

HTML Report

Click to open

@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/architecture-fence-for-livemigration branch 2 times, most recently from 5446383 to a036d85 Compare October 24, 2024 09:46
@mikhail-sakhnov mikhail-sakhnov marked this pull request as ready for review October 24, 2024 09:46
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/architecture-fence-for-livemigration branch 3 times, most recently from 893c635 to 8e751ea Compare October 31, 2024 00:58
@sharnoff sharnoff self-assigned this Oct 31, 2024
Copy link
Member

@sharnoff sharnoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

pkg/neonvm/controllers/vmmigration_controller.go Outdated Show resolved Hide resolved
if vm.Spec.Affinity != nil && vm.Spec.Affinity.NodeAffinity != nil && vm.Spec.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution != nil {
for _, term := range vm.Spec.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms {
for _, expr := range term.MatchExpressions {
if expr.Key == "kubernetes.io/arch" && expr.Operator == corev1.NodeSelectorOpIn && len(expr.Values) > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: does len(expr.Values) > 0 mean that this function will return true for VMs that have an architecture affinity for e.g. "either amd64 or arm64" ? IIUC, would that mean that we could accidentally migrate between x86 and ARM?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Values are checked in the loop body, if I understand correctly your question

tests/e2e/vm-migration/01-assert.yaml Show resolved Hide resolved
Comment on lines +187 to +195
if err := addArchitectureAffinity(vm, sourceNode); err != nil {
log.Error(err, "Failed to add architecture affinity to VM", "sourceNodeStatus", sourceNode.Status)
}
if err = r.Update(ctx, vm); err != nil {
log.Error(err, "Failed to update node affinity of the source vm")
return ctrl.Result{RequeueAfter: time.Second}, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: IIUC, does this mean that the VM will permanently get affinity for the node it starts with?

Is there a reason we can't just do it on the target pod for the migration? (I guess, potential race conditions?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a particular reason, I just feel it is a bit safer to make the guard on the very first step of the migration.

Plus if we going to have multiarch clusters it is a bit unsafe to even have VM without architecture affinity, may be in later it should be even added during the VM creation, in the vmcontroller?

@sharnoff sharnoff assigned mikhail-sakhnov and unassigned sharnoff Nov 18, 2024
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/architecture-fence-for-livemigration branch from a0219ef to c5ba550 Compare November 26, 2024 12:27
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/architecture-fence-for-livemigration branch from c5ba550 to f5246f7 Compare November 26, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants