Implement Unix domain socket support for VLAN #17473

Closed
wants to merge 2 commits into from

Conversation

arixmkii
Contributor

@arixmkii arixmkii commented Feb 10, 2023

Fixes #15852

This change adds support for the new QEMU stream netdev introduced in 7.2.0. It is implemented as an opt-in mode on previously supported platforms and is the only supported mode on Windows.

The old FD netdev changes only on the Podman side. Instead of the previously used QMP socket address, it now uses a dedicated VLAN socket address, which makes both implementations more similar.

Because the VLAN socket address has a short lifespan (it exists only after the forwarder has been started and before QEMU has finished startup), it is not promoted to the persisted machine settings; it is calculated inside the Start method instead.

Signed-off-by: Arthur Sengileyev [email protected]

I tried to highlight everything significant in the commit message. This does fix #15852, but it adds a lot more than the change required to fix it. I needed the other changes to actually verify the desired behavior of the fix for the issue mentioned above. If the team insists, I can split it up into a series of commits (I would prefer not to, though 😅).

The VLAN socket address is not persisted anywhere in the machine settings, so the logic decides on the mode by checking the stored command line. In fact, the way the QEMU command line is stored dictates a lot of how QEMU, and in turn gvproxy and Podman, should operate. It might look dirty, but to me it looked like a reasonable implementation (at least at this point in time).

Machine Init

In SOCKET VLAN mode (on platforms where FD is supported, it has to be enabled via an env var), Init creates a QEMU command line compatible with the new QEMU 7.2.0 feature.
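A minimal sketch of how Init could pick the `-netdev` value for the two modes. The two netdev strings and the `CONTAINERS_USE_SOCKET_VLAN` variable are taken from the test log further down; the helper names and the use of `strconv.ParseBool` to interpret the variable are assumptions made for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// envSocketVLAN is the opt-in variable shown in the test log of this PR.
const envSocketVLAN = "CONTAINERS_USE_SOCKET_VLAN"

// useSocketVLAN is a hypothetical helper: Windows always uses the socket VLAN,
// other platforms only when the variable parses as a true boolean (assumption).
func useSocketVLAN(goos, envValue string) bool {
	if goos == "windows" {
		return true
	}
	enabled, err := strconv.ParseBool(envValue)
	return err == nil && enabled
}

// socketVLANFromEnvironment applies the rule above to the running process.
func socketVLANFromEnvironment() bool {
	return useSocketVLAN(runtime.GOOS, os.Getenv(envSocketVLAN))
}

// netdevArg returns the value passed to QEMU's -netdev option for each mode.
func netdevArg(socketVLAN bool, vlanSockPath string) string {
	if socketVLAN {
		// Stream netdev (QEMU 7.2.0+): QEMU connects to the unix socket itself.
		return fmt.Sprintf("stream,id=vlan,server=off,addr.type=unix,addr.path=%s", vlanSockPath)
	}
	// Legacy mode: the connection is inherited as file descriptor 3.
	return "socket,id=vlan,fd=3"
}
```

Passing the platform and the variable value as parameters keeps the decision testable without changing the process environment.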

Machine Start

Podman ignores the current settings and relies on what was created by Machine Init (it analyzes the stored command line).

Named pipes

gvproxy now also exposes a named pipe on Windows (a no-op for non-Windows paths). The pipe name has the format qemu-podman-... For the default machine this results in qemu-podman-machine-default. The prefix is needed to prevent a name collision with the WSL machine. Clients can read the pipe name from the podman machine inspect output (the change is already merged into Podman Desktop).

Mode switch

The current implementation is opt-in via an env var, but alternative implementations (such as checking the QEMU version or config files) are possible. Socket VLAN is more cross-platform, but it can't replace the old implementation, because it is only available starting from QEMU 7.2.0 (which means a baseline of RHEL 10 😅).

Configuration mismatch warning

When the mode has been switched between Init and Start (for example, the machine was created with FD settings, but Start is then called with SOCKET VLAN requested), a WARN message describes what could be changed to fix the mismatch. This works in both directions.
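The mode detection and mismatch hint can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, and the mode is inferred only from whether the stored `-netdev` value starts with `stream,`. The message text mirrors the WARN line in the test log below.

```go
package main

import (
	"fmt"
	"strings"
)

// storedNetdev returns the value following "-netdev" in a stored QEMU command line.
func storedNetdev(cmdLine []string) (string, bool) {
	for i := 0; i+1 < len(cmdLine); i++ {
		if cmdLine[i] == "-netdev" {
			return cmdLine[i+1], true
		}
	}
	return "", false
}

// isSocketVLAN reports whether a netdev value uses the stream backend.
func isSocketVLAN(netdev string) bool {
	return strings.HasPrefix(netdev, "stream,")
}

// mismatchWarning returns a hint when the stored mode differs from the
// requested one, and an empty string when both agree or no netdev is stored.
func mismatchWarning(cmdLine []string, wanted string) string {
	stored, ok := storedNetdev(cmdLine)
	if !ok || isSocketVLAN(stored) == isSocketVLAN(wanted) {
		return ""
	}
	return fmt.Sprintf("Consider replacing %q with %q in machine config", stored, wanted)
}
```

Because the comparison is symmetric, the same function covers both directions (FD machine started in socket mode, and the reverse).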

Socket VLAN gvproxy readiness check

It is impossible to use Dial, because then we could not pass the connection forward as an FD. The most common reason for a failed start would be an unavailable socket address, in which case gvproxy terminates. The readiness check algorithm looks like this:

  1. Initial delay (small sleep) to account for gvproxy starting up in the background.
  2. Start a loop of up to max attempts.
  3. Check if the process is alive using its PID.
    3.1. If not, short-circuit to failure.
    3.2. If the process is running, continue to the next check.
  4. Check if we can os.Stat the socket address.
    4.1. If not, sleep and go to the next loop iteration.
    4.2. If ok, break the loop and continue.

Refactorings

  • The startHostNetworking private method is now more sophisticated: it accepts an input parameter, additionally returns Process data, and has logic changes to account for the named pipe functionality.

Additional fixes

  • use "reasonable" UID on Windows (no platform check, but we rather check for the invalid return value)
  • fixed type in Errorf in checkProcessStatus in _unix sources
  • use stored v.ReadySock.Path instead of making it up again in Start method using the same rules

Local testing

Acceptance test: start the machine and run nginx with port forwarding to host machine.

  • Tested on Windows applying all other pending changes for Windows support first
  • Tested on Apple Silicon Mac

Output from Apple Silicon Mac:
Running FD created machine (created with release version of Podman) w/ FD mode

podman % ./bin/darwin/podman --log-level=debug machine start
INFO[0000] ./bin/darwin/podman filtering at log level debug
Starting machine "podman-machine-default"
[/Users/username/git/podman/bin/libexec/podman/gvproxy -listen-qemu unix:///var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/vlan_podman-machine-default.sock -pid-file /var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_proxy.pid -ssh-port 64042 -forward-sock /Users/username/.local/share/containers/podman/machine/podman-machine-default/podman.sock -forward-dest /run/user/501/podman/podman.sock -forward-user core -forward-identity /Users/username/.ssh/podman-machine-default --debug]
DEBU[0000] qemu cmd: [/opt/homebrew/bin/qemu-system-aarch64 -m 8192 -smp 4 -fw_cfg name=opt/com.coreos/config,file=/Users/username/.config/containers/podman/machine/qemu/podman-machine-default.ign -qmp unix:/var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/qmp_podman-machine-default.sock,server=on,wait=off -netdev socket,id=vlan,fd=3 -device virtio-net-pci,netdev=vlan,mac=5a:94:ef:e4:0c:ee -device virtio-serial -chardev socket,path=/var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_ready.sock,server=on,wait=off,id=apodman-machine-default_ready -device virtserialport,chardev=apodman-machine-default_ready,name=org.fedoraproject.port.0 -pidfile /var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_vm.pid -accel hvf -accel tcg -cpu host -M virt,highmem=on -drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -drive file=/Users/username/.local/share/containers/podman/machine/qemu/podman-machine-default_ovmf_vars.fd,if=pflash,format=raw -virtfs local,path=/Users/username,mount_tag=vol0,security_model=mapped-xattr -drive if=virtio,file=/Users/username/.local/share/containers/podman/machine/qemu/podman-machine-default_fedora-coreos-37.20230110.2.0-qemu.aarch64.qcow2] 
Waiting for VM ...
...

Running FD created machine (created with release version of Podman) w/ socket VLAN mode

podman % CONTAINERS_USE_SOCKET_VLAN=true ./bin/darwin/podman --log-level=debug machine start
INFO[0000] ./bin/darwin/podman filtering at log level debug 
Starting machine "podman-machine-default"
WARN[0000] Stored Podman Machine configuration doesn't match current settings 
WARN[0000] Consider replacing "socket,id=vlan,fd=3" with "stream,id=vlan,server=off,addr.type=unix,addr.path=/var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/vlan_podman-machine-default.sock" in machine config 
[/Users/username/git/podman/bin/libexec/podman/gvproxy -listen-qemu unix:///var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/vlan_podman-machine-default.sock -pid-file /var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_proxy.pid -ssh-port 64042 -forward-sock /Users/username/.local/share/containers/podman/machine/podman-machine-default/podman.sock -forward-dest /run/user/501/podman/podman.sock -forward-user core -forward-identity /Users/username/.ssh/podman-machine-default --debug]
DEBU[0000] qemu cmd: [/opt/homebrew/bin/qemu-system-aarch64 -m 8192 -smp 4 -fw_cfg name=opt/com.coreos/config,file=/Users/username/.config/containers/podman/machine/qemu/podman-machine-default.ign -qmp unix:/var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/qmp_podman-machine-default.sock,server=on,wait=off -netdev socket,id=vlan,fd=3 -device virtio-net-pci,netdev=vlan,mac=5a:94:ef:e4:0c:ee -device virtio-serial -chardev socket,path=/var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_ready.sock,server=on,wait=off,id=apodman-machine-default_ready -device virtserialport,chardev=apodman-machine-default_ready,name=org.fedoraproject.port.0 -pidfile /var/folders/9n/358_gdmn3fxfnyttwys8k1dm0000gn/T/podman/podman-machine-default_vm.pid -accel hvf -accel tcg -cpu host -M virt,highmem=on -drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -drive file=/Users/username/.local/share/containers/podman/machine/qemu/podman-machine-default_ovmf_vars.fd,if=pflash,format=raw -virtfs local,path=/Users/username,mount_tag=vol0,security_model=mapped-xattr -drive if=virtio,file=/Users/username/.local/share/containers/podman/machine/qemu/podman-machine-default_fedora-coreos-37.20230110.2.0-qemu.aarch64.qcow2] 
Waiting for VM ...
...

[NO NEW TESTS NEEDED]

Does this PR introduce a user-facing change?

Podman Machine now supports QEMU 7.2.0 stream netdev for VLAN socket

@arixmkii
Contributor Author

It should not alter any default behavior, and everything else is either behind an opt-in toggle or on a not yet supported platform (QEMU on Windows), so I was leaning towards adding the NO NEW TESTS tag. However, the changeset is not small or trivial, so the team might disagree.

This changeset, together with all the others pending review, is 99% of the changes needed to enable QEMU on Windows.

But yeah, I don't expect this PR would be the fast one to get through.

@arixmkii
Contributor Author

arixmkii commented Feb 27, 2023

I'm not sure if this part behaved on Windows as expected, but for sure it didn't affect happy path in my testing. https://github.com/containers/podman/blob/06e85f0f6093081e8707b30e191ba190613090b1/pkg/machine/qemu/machine.go#L502-L512

@arixmkii
Contributor Author

Rebased to main to resolve conflicts from another merged PR.

@arixmkii arixmkii force-pushed the unix-vlan branch 2 times, most recently from 64e2869 to 753b34c Compare February 28, 2023 17:41
@arixmkii
Contributor Author

I'm still a bit lost on whether it is possible to write a test for this one, as it requires either running on Windows or being able to modify an environment variable for a specific test.

@arixmkii
Contributor Author

@rhatdan maybe you can help me find some reviewers for this one? At least to get some comments on whether it is OK conceptually.

@rhatdan
Member

rhatdan commented Mar 11, 2023

@baude @Luap99 @n1hility @ashley-cui PTAL

return err
}

for i := 0; i < 6; i++ {
Member

why 6? Personal nit, I'd prefer a variable with a name that explained why 6.

Contributor Author

Just because it was already used in this file in 2 places https://github.com/containers/podman/blob/main/pkg/machine/qemu/machine.go#L515 and https://github.com/containers/podman/blob/main/pkg/machine/qemu/machine.go#L604

It should probably be refactored into a more generic utility that could be used in all places in this file. I have no strong opinion on whether this should be done as part of this PR. If it is agreed to add the refactoring here, I will do it in a separate commit, as it is not related to the implementation.

Contributor Author

@arixmkii arixmkii Mar 20, 2023

Added named constants. Didn't manage to get full refactoring to unify similar code paths.

@TomSweeneyRedHat
Member

The changes in general LGTM, but this is all way out of my wheelhouse

@n1hility
Member

I'm not sure if this part behaved on Windows as expected, but for sure it didn't affect happy path in my testing.

https://github.com/containers/podman/blob/06e85f0f6093081e8707b30e191ba190613090b1/pkg/machine/qemu/machine.go#L502-L512

This should be safe on Windows. The biggest difference is that SIGTERM can't be eaten and the program kept alive, you can only use it for cleanup or exiting with an exit code like this block is doing, so this part looks right to me.

@arixmkii arixmkii force-pushed the unix-vlan branch 7 times, most recently from 90d001f to 2ee867a Compare March 21, 2023 10:54
@rhatdan
Member

rhatdan commented Mar 21, 2023

Either add tests or [NO NEW TESTS NEEDED]

@arixmkii
Contributor Author

I will add [NO NEW TESTS NEEDED] and force push. My reasoning:

  • The Windows QEMU machine is not enabled yet (and it will require additional setup for tests, which is a topic of its own)
  • The alternative to FD mode will be purely opt-in and has stricter requirements on the minimum supported QEMU version; it looks like it can currently only be tested reliably on macOS with brew

@arixmkii
Contributor Author

arixmkii commented Mar 21, 2023

I observed some sort of regression, at least in my test Windows build, so I might apply some fixes to the commits, but the concept is not expected to change.

Update: removed the additional refactoring for now. It is probably beyond what I can quickly reimplement in a complex Go codebase. The removed commit is available at arixmkii@1321726

@arixmkii
Contributor Author

@TomSweeneyRedHat @n1hility are any additional changes needed?

@arixmkii
Contributor Author

Failed "Verify Win Installer Build" looks like a flake to me (at least from build log and searching the web for that specific error).

@n1hility
Member

n1hility commented Apr 2, 2023

Looking into the verify installer failure: we have other runs where it's working, while repeated runs are failing here, so I'm looking into why that happens.

@@ -557,7 +593,10 @@ func (v *MachineVM) Start(name string, opts machine.StartOptions) error {
 defer dnw.Close()

 attr := new(os.ProcAttr)
-files := []*os.File{dnr, dnw, dnw, fd}
+files := []*os.File{dnr, dnw, dnw}
+if fd != nil {
Member

comment on this check.

-// cleanupVMProxyProcess kills the proxy process and removes the VM's pidfile
-func (v *MachineVM) cleanupVMProxyProcess(proxyProc *os.Process) error {
+// cleanupVMProxyProcess kills the proxy process
+func (v *MachineVM) cleanupVMProxyProcess(proxyPid int) error {
Member

out of curiosity, why did you choose to change the type here from something reasonably strong to something more generic like an int? im not sure i have an objection with it, i would however like to know.

Contributor Author

Because internally this os.Process pointer was used only to call the Kill method https://pkg.go.dev/os#Process.Kill, which really doesn't work for us: we need to send SIGTERM for correct cleanup, and it also doesn't work on Windows. There is no point in extracting the pid from os.Process here, since we previously acquired the process by looking up the pid value we had already stored as part of the state. os.Process really brings no benefit there if we can't use its API.

@@ -921,7 +961,11 @@ func (v *MachineVM) stopLocked() error {
 }

 fmt.Println("Waiting for VM to exit...")
-for isProcessAlive(vmPid) {
+for {
Member

is it possible that we never exit this loop

Contributor Author

Added retry limit
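A bounded version of such a wait loop might look like the sketch below. It is illustrative only: the function name, the injected `alive` callback, and the parameters are assumptions, not the code that was pushed.

```go
package main

import (
	"fmt"
	"time"
)

// waitForExit polls the liveness check at most maxChecks times and returns
// an error instead of looping forever when the process never goes away.
func waitForExit(pid int, alive func(int) bool, maxChecks int, interval time.Duration) error {
	for check := 0; check < maxChecks; check++ {
		if !alive(pid) {
			return nil
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("process %d still running after %d checks", pid, maxChecks)
}
```

Passing the limit and the interval as named values also addresses the earlier review note about unexplained magic numbers.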

Timeout: timeout,
}
return monitor, nil
return define.NewMachineFile(filepath.Join(rtDir, name+".sock"), nil)
Member

fyi, there are actually plans for podman 5 to make a sister to VMFile but more like VMSocket or MachineSocket.

Contributor Author

I will refactor this PR if that change lands before this one is merged.

 }

 cmd := gvproxy.NewGvproxyCommand()
-cmd.AddQemuSocket(fmt.Sprintf("unix://%s", v.QMPMonitor.Address.GetPath()))
+cmd.AddQemuSocket(fmt.Sprintf("unix://%s", filepath.ToSlash(vlanSocket.GetPath())))
Member

nice!
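As a standalone illustration of the changed line, the listen address can be built like this. The helper name is hypothetical; `filepath.ToSlash` replaces the platform's path separator with forward slashes, which is a no-op where the separator is already `/`.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// listenQemuURL formats a socket path as the unix:// address given to
// gvproxy's -listen-qemu flag, normalizing separators to forward slashes.
func listenQemuURL(sockPath string) string {
	return fmt.Sprintf("unix://%s", filepath.ToSlash(sockPath))
}
```

For an absolute Unix path this yields the three-slash form seen in the logs above, for example `unix:///var/folders/...`.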

@baude
Member

baude commented Dec 11, 2023

Ok, reviewed. Many of the comments are fairly soft and if you wish to not do them, thats ok, just be sure to respond to the comments so it can be reconciled later.

I have two stronger review comments that I will put in subsequently because then we can respond to them individually.

@baude
Member

baude commented Dec 11, 2023

I would love, where possible, to see more tests that lean heavily on your code. Unit tests where possible as well. This is a lot of code to change and it is going to get heavily reworked and broken in the very near future. Would you consider going back and assessing your comfort with the tests and that regressions are not introduced?

@baude
Member

baude commented Dec 11, 2023

And finally, I don't recall seeing any sort of version checking but yet this requires a newer QEMU if I understood things correctly. Podman, Podman Desktop, Brew, Distros are really struggling with package level interdependencies. Do you feel that we should somehow do something in your code to avoid this?

@baude
Member

baude commented Dec 11, 2023

ps. i think we just about have hyperv and wsl passing tests again (maybe a flake here and there) ... i would prefer that we do not merge this PR until #20953 is merged.

@arixmkii
Contributor Author

Would you consider going back and assessing your comfort with the tests and that regressions are not introduced?

Yes, I will try to add tests after addressing your other comments.

ps. i think we just about have hyperv and wsl passing tests again (maybe a flake here and there)

I'm counting on it. It was just a coincidence that I needed a rebase (because line endings were changed in README.md from CRLF to LF).

And finally, I don't recall seeing any sort of version checking but yet this requires a newer QEMU if I understood things correctly.

That newer version is one, which was released a year ago :) I have a version specific breakdown in this comment #17473 (comment) Also, this code path is enabled unconditionally only for Windows code and is behind Env var feature toggle for others. I will think about if it is viable to add version check within current changes. Will either add them or create an issue for a follow up and assign to me.
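For discussion, a version gate could be as small as the sketch below. It is purely hypothetical and not part of this PR: it assumes the first line printed by the QEMU binary's `--version` flag has the form `QEMU emulator version X.Y.Z`, and it leaves out how that output is obtained.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// qemuVersionRe extracts major and minor from "... version X.Y[.Z]" (assumed format).
var qemuVersionRe = regexp.MustCompile(`version (\d+)\.(\d+)`)

// supportsStreamNetdev reports whether the given version output names
// QEMU 7.2 or newer, the first release with the stream netdev.
func supportsStreamNetdev(versionOutput string) (bool, error) {
	m := qemuVersionRe.FindStringSubmatch(versionOutput)
	if m == nil {
		return false, fmt.Errorf("cannot parse QEMU version from %q", versionOutput)
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return false, err
	}
	minor, err := strconv.Atoi(m[2])
	if err != nil {
		return false, err
	}
	return major > 7 || (major == 7 && minor >= 2), nil
}
```

A check like this would let Start fall back to FD mode or fail with a clear message on older QEMU builds, rather than letting QEMU reject the `stream` netdev at launch.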

@arixmkii arixmkii force-pushed the unix-vlan branch 2 times, most recently from b7e68be to 0d27e95 Compare December 11, 2023 21:32
@arixmkii
Contributor Author

arixmkii commented Dec 11, 2023

Addressed comments. Will try to add unit tests later this week.

@arixmkii
Contributor Author

@baude I rebased onto the latest changes and added unit tests, which are similar to the existing ones. I see that there are currently not many unit tests and that machines are mostly tested e2e.

I wanted to update the code to use your new utilities from sockets, but I have a question: why is the new ready socket different from the QMP socket? The QMP one uses a different runtime directory depending on rootless or rootful configuration

address, err := define.NewMachineFile(filepath.Join(rtDir, "qmp_"+name+".sock"), nil)
but the new Ready socket always uses the temp directory, without any conditional checks on the isRootful state
if err := sockets.SetSocket(&vm.ReadySocket, sockets.ReadySocketPath(runtimeDir+"/podman/", vm.Name), &symlink); err != nil {
Could you clarify which one I should use for a VLAN type of socket? Or is it a bug introduced with the refactoring to ReadySocket?

@arixmkii
Contributor Author

arixmkii commented Feb 9, 2024

Will be continued in #21594

@arixmkii arixmkii closed this Feb 9, 2024
@arixmkii arixmkii deleted the unix-vlan branch March 1, 2024 13:58
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label May 31, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators May 31, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use dedicated configured path for QEMU and gvproxy vlan socket
6 participants