fix: use FQDN with toolbox prefix #1086

Open: akdev1l wants to merge 1 commit into main

@akdev1l akdev1l commented Aug 17, 2022

Issue: #969

Setting the hostname to toolbox causes timeouts whenever anything tries
to resolve the name of the machine - for example sudo does this.

This change makes it so the FQDN is set to
${container_name}.${hostname} as recommended in the linked issue.

After this change commands can properly resolve the local FQDN.

I removed the symlink to /run/host/etc/hosts because podman already copies that information in. We can then use --add-host to map the container's FQDN to localhost, so that calling ping $(hostname) does what is expected.
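To illustrate the naming scheme described above, here is a minimal Go sketch. It is not the code from this PR: `containerFQDN` and `hostnameArgs` are hypothetical helper names, and the only podman flags assumed are `--hostname` and `--add-host` (in its `host:ip` form).

```go
package main

import (
	"fmt"
	"strings"
)

// containerFQDN joins the container name and the host's hostname,
// following the ${container_name}.${hostname} scheme from the description.
func containerFQDN(containerName, hostHostname string) string {
	return containerName + "." + hostHostname
}

// hostnameArgs is a hypothetical helper. It returns the podman flags that
// set the container hostname and add an /etc/hosts entry mapping that
// name to 127.0.0.1.
func hostnameArgs(fqdn string) []string {
	return []string{
		"--hostname", fqdn,
		"--add-host", fqdn + ":127.0.0.1",
	}
}

func main() {
	fqdn := containerFQDN("test1", "canzuk.hq.akdev.xyz")
	fmt.Println(strings.Join(hostnameArgs(fqdn), " "))
}
```

With the sample values from the output below, this yields the same name that `hostname` prints inside the container.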

Pending PRs:

  1. Allow toolbox containers to be created with a custom hostname #1007 - this adds a new flag for setting the hostname of the toolbox. I think it should just follow the container-name.hostname convention; otherwise it seems confusing
  2. https://github.com/containers/toolbox/pull/771/files - similar but sets the hostname of the container to be equal to the container name - the name is still unresolvable however.
  3. https://github.com/containers/toolbox/pull/383/files - same as above but tries to sanitize the container name
  4. https://github.com/containers/toolbox/pull/573/files - Obsolete, PR against bash toolbox

None of these PRs address my issue with delays due to unresolvable hostnames. So this one tries to do that.

Sample Output

[akdev@canzuk toolbox]$ ./build/src/toolbox create -i docker.io/akdev1l/ubuntu-toolbox:22.04 test1
Created container: test1
Enter with: toolbox enter test1
[akdev@canzuk toolbox]$ ./build/src/toolbox enter test1
⬢[akdev@test1 toolbox]$ hostname
test1.canzuk.hq.akdev.xyz
⬢[akdev@test1 toolbox]$ cat /etc/hosts
127.0.0.1	test1.canzuk.hq.akdev.xyz
127.0.0.1	localhost localhost.localdomain localhost4 localhost4.localdomain4 toolbox
::1	localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.1.100	host.containers.internal

@softwarefactory-project-zuul

Build succeeded.

✔️ unit-test SUCCESS in 6m 54s
✔️ system-test-fedora-rawhide SUCCESS in 16m 52s
✔️ system-test-fedora-36 SUCCESS in 9m 51s
✔️ system-test-fedora-35 SUCCESS in 10m 05s

@akdev1l
Author

akdev1l commented Aug 17, 2022

this seems to trigger a minor bug: if I press ctrl+c at the prompt and then exit the toolbox, toolbox falsely prints out an empty Error: message

I'll have to dig into that

@softwarefactory-project-zuul

Build succeeded.

✔️ unit-test SUCCESS in 6m 57s
✔️ system-test-fedora-rawhide SUCCESS in 11m 04s
✔️ system-test-fedora-36 SUCCESS in 9m 57s
✔️ system-test-fedora-35 SUCCESS in 10m 10s

@softwarefactory-project-zuul

Build failed.

unit-test NODE_FAILURE in 0s
✔️ system-test-fedora-rawhide SUCCESS in 13m 16s
✔️ system-test-fedora-36 SUCCESS in 13m 16s
✔️ system-test-fedora-35 SUCCESS in 13m 51s

@akdev1l
Author

akdev1l commented Aug 19, 2022

sorry for the noise the bug I hit seems to be because toolbox exits with return code 130 (SIGTERM) whenever ctrl+c is pressed - seriously wondering how my changes triggered this

@akdev1l
Author

akdev1l commented Aug 19, 2022

capsh --caps= -- -c exec "$@" /bin/sh /bin/bash -l

why is this so convoluted ... it would seem this is equivalent

capsh --caps= -- -c 'exec /bin/bash -l'

mm from further experimentation it seems that bash just does this and toolbox should just ignore the 130 error code

toolbox without modification does this:

[akdev@canzuk toolbox]$ toolbox enter test10
⬢[akdev@test10 toolbox]$ ^C
⬢[akdev@test10 toolbox]$ ^C
⬢[akdev@test10 toolbox]$ ^C
⬢[akdev@test10 toolbox]$
logout
[akdev@canzuk toolbox]$ echo $?
0
[akdev@canzuk toolbox]$ bash
[akdev@canzuk toolbox]$ ^C
[akdev@canzuk toolbox]$ ^C
[akdev@canzuk toolbox]$
exit
[akdev@canzuk toolbox]$ echo $?
130

imo this should keep the exit code of the last command executed, so it should be 130 at the end to match the behaviour of bash - we could probably simplify those capsh shenanigans, looks like there's at least one unnecessary fork there

@softwarefactory-project-zuul

Build failed.

unit-test FAILURE in 7m 15s
system-test-fedora-rawhide FAILURE in 16m 10s
system-test-fedora-36 FAILURE in 10m 29s
system-test-fedora-35 FAILURE in 10m 34s

@softwarefactory-project-zuul

Build failed.

✔️ unit-test SUCCESS in 7m 07s
system-test-fedora-rawhide FAILURE in 15m 36s
system-test-fedora-36 FAILURE in 10m 27s
system-test-fedora-35 FAILURE in 10m 41s

@softwarefactory-project-zuul

Build failed.

✔️ unit-test SUCCESS in 7m 03s
system-test-fedora-rawhide FAILURE in 15m 52s
system-test-fedora-36 FAILURE in 10m 13s
system-test-fedora-35 FAILURE in 10m 30s

@softwarefactory-project-zuul

Build failed.

✔️ unit-test SUCCESS in 7m 10s
system-test-fedora-rawhide FAILURE in 16m 00s
system-test-fedora-36 FAILURE in 10m 19s
system-test-fedora-35 FAILURE in 10m 54s

@akdev1l
Author

akdev1l commented Aug 25, 2022

I have continued development of toolbox on my own, as I don't expect this to be merged in a reasonable time frame.

my fork is at: https://github.com/akdev1l/toolbox/tree/akdev

I have fixed some long-standing issues and added some features (in particular, I have enabled a static build and containerized toolbox itself, resolved the DNS issues and added basic export support) - feel free to ping me if there's any interest in that.

Otherwise I will leave this PR to die - feel free to close.

Member

@debarshiray debarshiray left a comment

It would be good if the commit message had some explanation of exit code 130 and SIGTERM, particularly the user-facing behaviour.

src/cmd/run.go Outdated
@@ -477,10 +479,10 @@ func constructExecArgs(container string,

 	execArgs = append(execArgs, []string{
 		container,
-		"capsh", "--caps=", "--", "-c", "exec \"$@\"", "/bin/sh",
+		"capsh", "--caps=", "--", "-c",
Member

This doesn't seem to have anything to do with what the commit message describes. :)

@debarshiray
Member

debarshiray commented Nov 17, 2022

sorry for the noise the bug I hit seems to be because toolbox
exits with return code 130 (SIGTERM) whenever ctrl+c is
pressed - seriously wondering how my changes triggered this

I don't think your changes introduced that. :)

A few months ago, Toolbx started propagating the exit code of the command, which triggered this. You wouldn't have encountered this behaviour before that.

From the bash(1) manual:

When a command terminates on a fatal signal N,
bash uses the value of 128+N as the exit status.

Ctrl+c is SIGINT (not SIGTERM) and its numerical value is 2:

$ kill -L
 1) SIGHUP	 2) SIGINT ...

Hence, 130 (= 128 + 2) as the exit code.

Member

@debarshiray debarshiray left a comment

Thanks for working on this, @akdev1l ; and my apologies for the delay.

@debarshiray
Member

Setting the hostname to toolbox causes timeouts whenever
anything tries to resolve the name of the machine - for example
sudo does this.

I am curious. I have never experienced delays when using sudo(8) inside a Toolbx container, but maybe you are doing something that I never do. Could you please describe this in a bit more detail?

@debarshiray
Member

I have enabled a static build

The way things stand today, we are unlikely to build statically or disable CGO. See: #832

Of course, this may change if our realities change or if new information comes to light.

@akdev1l
Author

akdev1l commented Nov 26, 2022

@debarshiray hi! thanks for your comments - I'll clean this up and just deal with the fqdn change

you gave me some historical background on toolbx and I appreciate that.

I am curious. I have never experienced delays when using sudo(8) inside a Toolbx container, but maybe you are doing something that I never do. Could you please describe this in a bit more detail?

this happened to me because I build/distribute my own toolbx images (https://github.com/akdev1l/toolbox-images) - there is a requirement that isn't specified in the documentation: the toolbox image needs nss-myhostname installed and enabled in /etc/nsswitch.conf. Without that, sudo tries to resolve the container hostname and fails. (using the fqdn as originally intended in this PR solves this issue too, as the name is resolvable; that was the original motivation for this change)
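As a rough illustration of the requirement described above, the following Go sketch checks whether the `hosts:` line of an nsswitch.conf lists the `myhostname` module. It is not part of Toolbx. `hostsUsesMyhostname` is a hypothetical name, and the parsing is simplified: it assumes the database name is written as `hosts:` with no space before the colon.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hostsUsesMyhostname reports whether the "hosts:" database line in the
// given nsswitch.conf contents lists the myhostname module.
func hostsUsesMyhostname(conf string) bool {
	scanner := bufio.NewScanner(strings.NewReader(conf))
	for scanner.Scan() {
		line := scanner.Text()
		// Strip trailing comments before splitting into fields.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != "hosts:" {
			continue
		}
		for _, source := range fields[1:] {
			if source == "myhostname" {
				return true
			}
		}
	}
	return false
}

func main() {
	// Sample contents; a real check would read /etc/nsswitch.conf
	// from inside the toolbox image.
	withModule := "passwd: files\nhosts: files myhostname dns\n"
	withoutModule := "passwd: files\nhosts: files dns\n"

	fmt.Println(hostsUsesMyhostname(withModule))
	fmt.Println(hostsUsesMyhostname(withoutModule))
}
```

Running such a check at image build time could catch a missing module before it surfaces as slow sudo calls.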

@debarshiray
Member

this happened to me because I build/distribute my own toolbx
images (https://github.com/akdev1l/toolbox-images) - there is
a requirement that isn't specified in the documentation which
is having nss-myhostname installed and enabled in
/etc/nsswitch.conf in the toolbox image. Without that sudo
tries to resolve the container hostname and fails. (using the
fqdn as originally intended in this PR solves this issue too as the
name is resolvable, hence that was original drive for this change)

nod

@HarryMichal explained that to me later in real life. My apologies, I forgot to mention that here.

@akdev1l
Author

akdev1l commented Dec 5, 2022

@debarshiray I cleaned this up; this should be the minimal change required for a proper FQDN - looks like it passes the tests

I still think we should change the prefix from "toolbox" to the name of the toolbox. bash prompts show the first component of the FQDN, so with this change there won't be any user-visible prompt changes, as the first component is hardcoded to toolbox for now

[akdev@toronto toolbox]$ echo $PS1
[\u@\h \W]\$

so this would solve the issue of distinguishing the containers very elegantly imo (no changes required on the images, no big changes required on toolbox, no custom scripts nor custom prompts for bash or other shells)
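To make the prompt behaviour concrete: bash's `\h` escape expands to the hostname up to the first dot. The Go sketch below mimics that rule. `promptHost` is a hypothetical name used only for illustration, and the hostnames are taken from the sample output earlier in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// promptHost mimics bash's \h prompt escape: it returns the hostname
// up to, but not including, the first dot.
func promptHost(hostname string) string {
	short, _, _ := strings.Cut(hostname, ".")
	return short
}

func main() {
	// With the fixed "toolbox" prefix, every container shows the same prompt host.
	fmt.Println(promptHost("toolbox.canzuk.hq.akdev.xyz"))

	// With the container name as the first component, prompts differ per container.
	fmt.Println(promptHost("test1.canzuk.hq.akdev.xyz"))
	fmt.Println(promptHost("test10.canzuk.hq.akdev.xyz"))
}
```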

@akdev1l akdev1l changed the title fix: don't use toolbox as constant for FQDN fix: use FQDN with toolbox prefix Dec 5, 2022
@softwarefactory-project-zuul

Build succeeded.

✔️ unit-test SUCCESS in 8m 14s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 8m 26s
✔️ system-test-fedora-rawhide SUCCESS in 32m 01s
✔️ system-test-fedora-36 SUCCESS in 11m 08s
✔️ system-test-fedora-35 SUCCESS in 12m 08s

@softwarefactory-project-zuul

Build succeeded.

✔️ unit-test SUCCESS in 8m 14s
✔️ unit-test-migration-path-for-coreos-toolbox SUCCESS in 8m 14s
✔️ system-test-fedora-rawhide SUCCESS in 31m 57s
✔️ system-test-fedora-36 SUCCESS in 11m 00s
✔️ system-test-fedora-35 SUCCESS in 11m 52s

@jkemp814

jkemp814 commented Feb 8, 2023

#98 (comment)

@jnohlgard

@debarshiray Does this PR need anything else before merging?
This PR fixes a quite annoying problem #1059 when using GUI apps inside toolbox when running KDE on the host. Having the host's host name as the domain for the toolbox container's host name fixes that issue.

@nievesmontero
Collaborator

nievesmontero commented Jul 5, 2023

Hey @akdev1l, sorry for the delay on this PR. I am reviewing everything to merge it as soon as possible.

I just wanted to ask a quick question. At the beginning of the issue, in the initial explanation, you say that the expected container hostname is the following: ${container_name}.${hostname}. However, after reviewing the code I realised that it is actually toolbox.${hostname}, am I right?

@runiq

runiq commented Jul 5, 2023

At the beginning of the issue, in the initial explanation, you say that the expected container hostname is the following: ${container_name}.${hostname}. However, after reviewing the code I realised that it is actually toolbox.${hostname}, am I right?

As someone who uses multiple toolboxes (one for regular development work, and a "grab-bag" one for other stuff), I would prefer the container name. (But it's not a big deal, honestly, since I can always change the prompt from within the toolbox itself.)

@nievesmontero
Collaborator

I also wanted to reproduce those timeouts you were talking about. Could you please let me know which commands led you to those timeouts so that I can reproduce them on my machine?

@jnohlgard

I also wanted to reproduce those timeouts you were talking about. Could you please let me know which commands led you to those timeouts so that I can reproduce them on my machine?

Personally I encountered them mostly when using JetBrains IDEs, e.g. IntelliJ, CLion, from within toolbox on a Fedora Kinoite (KDE Plasma) system, but it should be possible to reproduce with other GUI apps on a KDE system, because the problems seem to be with kwin trying to do some DNS resolution in relation to putting the host name in the title bar of the windows ("xterm <@toolbox>"). Sorry, I don't have a more exact recipe for this though.

@juhp
Contributor

juhp commented Aug 25, 2023

+1 for using the container name for the hostname.
Distrobox already does this I believe.
With multiple toolboxes it makes much more sense.

@runiq

runiq commented Oct 5, 2023

I just realized something: With the Foot terminal (and others), I can emit OSC 7 on directory change to have new terminals spawn in that directory. However, if you look at the linked code snippet:

printf \e\]7\;file://%s%s\e\\ $hostname (string escape --style=url $PWD)

You can see that this depends on the correct hostname (so that it doesn't erroneously catch directory changes over SSH connections, for example).

So, the toolbox not having the host's hostname breaks at least that bit of functionality.
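For readers unfamiliar with OSC 7, here is a small Go sketch of the escape sequence that the fish snippet above emits: ESC ] 7 ; followed by a file:// URL containing the hostname and the percent-encoded working directory, terminated by ESC \. The function name `osc7` is hypothetical, and Go's `net/url` escaping may differ in detail from fish's `string escape --style=url`.

```go
package main

import (
	"fmt"
	"net/url"
)

// osc7 builds an OSC 7 "current directory" sequence for the given hostname
// and directory. The terminal compares the hostname in the URL with its own
// to decide whether the directory is local.
func osc7(hostname, dir string) string {
	u := url.URL{Scheme: "file", Host: hostname, Path: dir}
	return "\x1b]7;" + u.String() + "\x1b\\"
}

func main() {
	// %q prints the control characters in escaped form so they stay visible.
	fmt.Printf("%q\n", osc7("toolbox", "/home/akdev/my project"))
}
```

If the container reports a hostname that differs from the host's, the URL no longer matches what a terminal running on the host expects, which is the breakage described above.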

@castedo

castedo commented Oct 10, 2024

+1 for using the container name for the hostname. Distrobox already does this I believe. With multiple toolboxes it makes much more sense.

@juhp Distrobox has since stopped setting the hostname to the container name, because changing the hostname caused too many problems. See 89luca89/distrobox#62
