K0s workers require the mount binary to be in their PATH #3386

Closed
4 tasks done
twz123 opened this issue Aug 16, 2023 · 13 comments · Fixed by #3409

@twz123
Member

twz123 commented Aug 16, 2023

Before creating an issue, make sure you've checked the following:

  • You are running the latest released version of k0s
  • Make sure you've searched for existing issues, both open and closed
  • Make sure you've searched for PRs too, a fix might've been merged already
  • You're looking at docs for the released version, "main" branch docs are usually ahead of released versions.

Platform

Linux 6.1.43 #1-NixOS SMP PREEMPT_DYNAMIC Thu Aug  3 08:24:19 UTC 2023 x86_64 GNU/Linux
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.05.20230806.61676e4"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.05 (Stoat)"
SUPPORT_END="2023-12-31"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.05 (Stoat)"
VERSION_CODENAME=stoat
VERSION_ID="23.05"

Version

main

Sysinfo

`k0s sysinfo`
Machine ID: "d02e3609003cf4b2d96bf16e7f1729c02fa3be9f8e0eab85d9ec38b0aa11528f" (from machine) (pass)
Total memory: 14.9 GiB (pass)
Disk space available for /var/lib/k0s: 91.7 GiB (pass)
Operating system: Linux (pass)
  Linux kernel release: 6.1.43 (pass)
  Max. file descriptors per process: current: 524288 / max: 524288 (pass)
  AppArmor: unavailable (pass)
  Executable in path: modprobe: /run/current-system/sw/bin/modprobe (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (pass)
    cgroup controller "memory": available (pass)
    cgroup controller "devices": available (assumed) (pass)
    cgroup controller "freezer": available (assumed) (pass)
    cgroup controller "pids": available (pass)
    cgroup controller "hugetlb": available (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
      CONFIG_NETFILTER_NETLINK: module (pass)
      CONFIG_NF_NAT: module (pass)
      CONFIG_IP_SET: IP set support: module (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
      CONFIG_IP_VS: IP virtual server support: module (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
        CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
        CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
        CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
      CONFIG_NF_DEFRAG_IPV4: module (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
      CONFIG_NF_DEFRAG_IPV6: module (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
      CONFIG_LLC: module (pass)
      CONFIG_STP: module (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: module (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

K0s fails to reach node readiness without mount being in the PATH. This is an undocumented hard external runtime dependency.

Steps to reproduce

Run a k0s worker without the mount binary in its PATH. One possibility: Using the smoke tests, e.g. by applying this blunt patch:

diff --git a/inttest/common/launchdelegate.go b/inttest/common/launchdelegate.go
index 918f3dbf0..4e6e4790d 100644
--- a/inttest/common/launchdelegate.go
+++ b/inttest/common/launchdelegate.go
@@ -65,7 +65,7 @@ func (s *standaloneLaunchDelegate) InitController(ctx context.Context, conn *SSH
 		fmt.Fprintf(&script, "umask %d\n", s.controllerUmask)
 	}
 	fmt.Fprintf(&script, "export ETCD_UNSUPPORTED_ARCH='%s'\n", runtime.GOARCH)
-	fmt.Fprintf(&script, "%s controller --debug %s </dev/null >>/tmp/k0s-controller.log 2>&1 &\n", s.k0sFullPath, strings.Join(k0sArgs, " "))
+	fmt.Fprintf(&script, "env -u PATH %s controller --debug %s </dev/null >>/tmp/k0s-controller.log 2>&1 &\n", s.k0sFullPath, strings.Join(k0sArgs, " "))
 	fmt.Fprintln(&script, "disown %1")
 
 	if err := conn.Exec(ctx, "cat >/tmp/start-k0s && chmod +x /tmp/start-k0s", SSHStreams{

(Yes, it removes the entire PATH, but the culprit is the mount executable.)

Then run make check-singlenode.

Expected behavior

Works - k0s is zero dependency 🙈

Actual behavior

The test will fail because the node won't get ready.

Screenshots and logs

The controller logs will contain some typical log lines:

  • cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/k0s/containerd/io.containerd.snapshotter.v1.overlayfs"" component=kubelet

  • kubelet.go:1400] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"" component=kubelet

  • mount_linux.go:232] Mount failed: exec: "mount": executable file not found in $PATH" component=kubelet
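
For reference, the last log line is the error that Go's os/exec package produces when a bare command name cannot be resolved through PATH. Here is a minimal sketch that reproduces the same message outside of kubelet. It assumes nothing about kubelet beyond the fact that it runs mount by name; the helper lookPathWith is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// lookPathWith is a hypothetical helper: it resolves name via exec.LookPath
// while PATH is temporarily set to path, then restores the previous value.
func lookPathWith(path, name string) (string, error) {
	old, had := os.LookupEnv("PATH")
	os.Setenv("PATH", path)
	defer func() {
		if had {
			os.Setenv("PATH", old)
		} else {
			os.Unsetenv("PATH")
		}
	}()
	return exec.LookPath(name)
}

func main() {
	// An empty PATH mimics the "env -u PATH" environment from the patch above.
	_, err := lookPathWith("", "mount")
	fmt.Println(err)                              // prints: exec: "mount": executable file not found in $PATH
	fmt.Println(errors.Is(err, exec.ErrNotFound)) // prints: true
}
```

The error wraps exec.ErrNotFound, so a caller could tell "binary missing" apart from "mount ran and failed".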

Additional context

The dependency is not with k0s itself, but with kubelet. We need to decide what to do:

  • Ship another embedded binary?
  • Search well-known paths and modify the component's PATH accordingly?
  • Add this to the docs and move on. (Probably a sysinfo probe should be added.)
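
To make the second and third options a bit more concrete, here is a rough sketch of a probe for required executables and of a PATH extension with well-known directories. This is not how k0s implements its sysinfo probes: the function names and the directory list are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// wellKnownDirs is an assumed fallback list, not taken from k0s.
var wellKnownDirs = []string{
	"/usr/local/sbin", "/usr/local/bin",
	"/usr/sbin", "/usr/bin", "/sbin", "/bin",
}

// missingExecutables returns the names for which found reports false.
// found is injectable so the probe can be tested without touching PATH.
func missingExecutables(names []string, found func(string) bool) []string {
	var missing []string
	for _, name := range names {
		if !found(name) {
			missing = append(missing, name)
		}
	}
	return missing
}

// extendPath appends every directory from extra that is not yet part of
// path and returns the resulting PATH-style list.
func extendPath(path string, extra []string) string {
	dirs := filepath.SplitList(path)
	seen := make(map[string]bool, len(dirs))
	for _, dir := range dirs {
		seen[dir] = true
	}
	for _, dir := range extra {
		if !seen[dir] {
			dirs = append(dirs, dir)
			seen[dir] = true
		}
	}
	return strings.Join(dirs, string(os.PathListSeparator))
}

func main() {
	inPath := func(name string) bool {
		_, err := exec.LookPath(name)
		return err == nil
	}
	// The list of names is illustrative and can be extended as needed.
	if missing := missingExecutables([]string{"mount"}, inPath); len(missing) > 0 {
		fmt.Println("not found in PATH:", strings.Join(missing, ", "))
	}
	// Candidate PATH for a child process such as kubelet.
	fmt.Println(extendPath(os.Getenv("PATH"), wellKnownDirs))
}
```

A probe only reports the problem. Extending PATH changes what the component can execute, so it is the more invasive of the two options.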
@twz123 twz123 added the bug Something isn't working label Aug 16, 2023
@jnummelin
Member

> Add this to the docs and move on. (Probably a sysinfo probe should be added.)

This gets my vote

@twz123
Member Author

twz123 commented Aug 16, 2023

> > Add this to the docs and move on. (Probably a sysinfo probe should be added.)
>
> This gets my vote

Where do we draw the line between what to embed and what not? We went quite far to not rely on external executables and libraries like libc, coreutils, iptables. What's different about mount in that regard?

@jnummelin
Member

I'd draw the line on something like:
"Is the needed bin/lib something that can be considered present in EVERY Linux distro, and has it not proven to cause version incompatibilities?"

@twz123
Member Author

twz123 commented Aug 16, 2023

Alright, let's be pragmatic. There's probably not a good alternative to shelling out to mount that we could propose as a PR upstream, as we did for find, du and nice back in the day.

(Although it'd be cool to be able to run k0s from an empty environment. The mount binary would be the only blocker. All the other binaries are optional/used for corner cases. /cc #271 😄)

@kke
Contributor

kke commented Aug 18, 2023

The mount_linux.go also runs umount.

jnummelin added a commit to jnummelin/k0s that referenced this issue Aug 23, 2023
@twz123
Member Author

twz123 commented Aug 24, 2023

@ncopa wanted to have a look at whether the dependency can be replaced upstream.

@twz123 twz123 reopened this Aug 24, 2023
@kke
Contributor

kke commented Aug 25, 2023

I think it should be possible with syscall.Mount
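
A rough sketch of what that could look like, assuming Linux and only a handful of options. The names flagOptions, splitOptions and mountDirect are made up for illustration and are not taken from kubelet or mount-utils. Note that mount(8) does more than the bare syscall (for example, filesystem-specific helpers and fstab lookups), none of which is covered here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"syscall"
)

// flagOptions maps a few mount(8)-style option names to mount(2) flags.
// The list is deliberately incomplete.
var flagOptions = map[string]uintptr{
	"ro":      syscall.MS_RDONLY,
	"bind":    syscall.MS_BIND,
	"remount": syscall.MS_REMOUNT,
	"nosuid":  syscall.MS_NOSUID,
	"nodev":   syscall.MS_NODEV,
	"noexec":  syscall.MS_NOEXEC,
}

// splitOptions separates the options that map to flags from those that are
// passed through to the filesystem as the comma-separated data string.
func splitOptions(options []string) (flags uintptr, data string) {
	var rest []string
	for _, opt := range options {
		if flag, ok := flagOptions[opt]; ok {
			flags |= flag
		} else {
			rest = append(rest, opt)
		}
	}
	return flags, strings.Join(rest, ",")
}

// mountDirect is a hypothetical stand-in for exec'ing the mount binary.
func mountDirect(source, target, fstype string, options []string) error {
	flags, data := splitOptions(options)
	return syscall.Mount(source, target, fstype, flags, data)
}

func main() {
	flags, data := splitOptions([]string{"ro", "nosuid", "size=64m"})
	fmt.Printf("flags=%#x data=%q\n", flags, data)

	// With two arguments, try a bind mount of the first onto the second.
	// This needs root (CAP_SYS_ADMIN).
	if len(os.Args) == 3 {
		if err := mountDirect(os.Args[1], os.Args[2], "", []string{"bind"}); err != nil {
			fmt.Fprintln(os.Stderr, "mount failed:", err)
			os.Exit(1)
		}
	}
}
```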

@github-actions
Contributor

The issue is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added the Stale label Sep 24, 2023
@twz123 twz123 removed the Stale label Sep 25, 2023
@github-actions
Contributor

The issue is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added the Stale label Oct 25, 2023
@kke kke removed the Stale label Oct 26, 2023
@github-actions
Contributor

The issue is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added the Stale label Nov 25, 2023
@twz123 twz123 removed the Stale label Nov 26, 2023
@jnummelin
Member

jnummelin commented Dec 20, 2023

Looking at the repo https://github.com/kubernetes/mount-utils where all the mount stuff is implemented, replacing (u)mount calls with syscalls is only a small part of the problem. There are plenty of other exec calls it makes: resizing filesystems, systemd dependencies, FS initialization, etc. Replacing all of these is not really feasible. So replacing just mount doesn't really get us anywhere IMO. xref: kubernetes/mount-utils#13

@twz123
Member Author

twz123 commented Dec 20, 2023

Oh, wow! That's something we need to keep an eye on, and we should improve our docs and maybe the sysinfo probes. So we're not that zero-dependency, unfortunately...

@ncopa
Collaborator

ncopa commented Dec 20, 2023

I believe this was fixed with 19c800d
