Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runInLinuxVM: refactor structuredAttrs support, fix disko #360413

Merged
merged 6 commits into from
Nov 30, 2024

Conversation

wolfgangwalther
Copy link
Contributor

@wolfgangwalther wolfgangwalther commented Nov 30, 2024

In #354535 we fixed runInLinuxVM with structuredAttrs. However, this broke some out of tree users, e.g. disko: nix-community/disko#900.

In this series of commits I'm going through the structuredAttrs support for runInLinuxVM step-by-step again. After some cleanups and improvements, I decided to change the approach a bit. To make .attrs.sh available in stage2, we mounted the /build folder instead of only /tmp/xchg in #354535. I'm reverting back to /tmp/xchg here and instead am copying the attrs files into that folder as well.

This makes cases like disko work, which:

This was previously not picked up anymore after changing the folder to /build.

Could you test / confirm that this branch fixes it for you, @iFreilicht?

cc @Ma27

Resolves #359782

Things done

  • Built on platform(s)
    • x86_64-linux:
      • nix-build pkgs/build-support/vm/test.nix -A buildStructuredAttrsHelloInVM -A buildHelloInVM
      • nix-build -A apptainer.tests.image-hello-cowsay
      • nix-build -A nixosTests.systemd-boot
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

Those were left-over after 97ed6b4.

This also cleans up some confusion around TMPDIR. We had the following
lines:

  mkdir xchg
  ...
  cd $TMPDIR
  ...
  path=$TMPDIR/xchg

Those only worked because the **current directory** is the same as
$TMPDIR. Both are /build by default. To refer to the same directory in
two different ways is very confusing at best.
The export to saved-env was very intentionally done at the very
beginning of vmRunCommand, even jumping through extra hoops just to
avoid the PATH variable from polluting the saved variable.

In 26eba25 we loaded stdenv in the
wrong place, we should do it after saving the previous environment. This
is also more consistent with the order of how we load those values back
in stage2Init.
In [1] we started sourcing stdenv/setup in stage2Init to allow for
structuredAttrs. We failed to take the changed NIX_BUILD_TOP etc.
variables into account. We need to load stdenv/setup after changing
them, because the structuredAttrs startup code makes use of it.

[1]: 97ed6b4
This makes it simpler to copy more files to xchg for the structuredAttrs
case in the next commit.
…tory

The approach taken in [1] breaks down as soon as vmRunCommand is
manually called with an overriden TMPDIR, like disko does it. /build
will just not be available.

By moving the .attrs.sh file into the xchg folder explicitly, we can all
the "exchange infrastructure" the same as before, thus avoid more
breakage.

This reverts some parts of [1].

[1]: 97ed6b4
@Ma27
Copy link
Member

Ma27 commented Nov 30, 2024

I'm pretty sure this also fixes #359782 because of

I'm reverting back to /tmp/xchg here and instead am copying the attrs files into that folder as well.

Sorry, doing it the other way round was a bad decision.

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Nov 30, 2024
@wolfgangwalther
Copy link
Contributor Author

I'm pretty sure this also fixes #359782

Yes, I think so, too.

Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed makes the reproducer from #359782 build (as well as hello with/without structured attrs and the systemd-boot tests also look good).

As mentioned before, I think this is the better approach.
@Mindavi does this also fix the issues you had? If so, I'd go ahead and merge.

Sorry for the inconvenience!

@iFreilicht
Copy link
Contributor

iFreilicht commented Nov 30, 2024

Thank you so much! This mostly fixes my issue, but I'm still getting this error when running the script generated by disko:

EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
loading kernel modules...
mounting Nix store...
mounting host's temporary directory...
starting stage 2 (/nix/store/pwj58yhz8c0l0c19r5872kbyk4hw9ykc-vm-run-stage2)
/nix/store/pwj58yhz8c0l0c19r5872kbyk4hw9ykc-vm-run-stage2: line 16: stdenv: unbound variable
[    0.367587] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100

The line source $stdenv/setup this error originates from was added in the original PR.

I don't even know how it's possible for stdenv to be unbound. We're using makeScriptWriter, maybe that's incompatible somehow?

Call order is diskoImagesScript -> writeCheckedBash -> makeScriptWriter

Maybe that line could just be wrapped with an if [ -v stdenv ]?

@wolfgangwalther
Copy link
Contributor Author

Maybe that line could just be wrapped with an if [ -v stdenv ]?

I'd rather not do that, because this would silently hide other problems. We should find out why $stdenv is not set here.

@wolfgangwalther
Copy link
Contributor Author

I think for disko, this line is the problem:

https://github.com/nix-community/disko/blob/b71e3faca99b40fb801f03fd950fbefbbba691a4/lib/make-disk-image.nix#L204C7-L204C62

This clears all other env-vars, that we had saved previously - including $stdenv.

Maybe make > a >> here, so that you only override origBuilder?

Copy link
Contributor

@iFreilicht iFreilicht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Yes, that was it! So from my side, this is good to merge 👍

iFreilicht added a commit to nix-community/disko that referenced this pull request Nov 30, 2024
Fixes #900

This was caused by NixOS/nixpkgs#354535
originally. The breaking changes introduced there have been resolved by
NixOS/nixpkgs#360413, but one addition survived,
which was the line `source $stdenv/setup`.

Because we used `>` instead of `>>`, `saved-env` was overwritten, so
even with the second PR, the script failed with the following error:

    /nix/store/pw...ykc-vm-run-stage2: line 16: stdenv: unbound variable

Once this and the second PR mentioned above are merged, #903 will be
unblocked.
@Ma27 Ma27 merged commit eecda1a into NixOS:master Nov 30, 2024
20 of 21 checks passed
@wolfgangwalther wolfgangwalther deleted the structured-attrs-run-in-vm branch December 1, 2024 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make-disk-image.nix: cannot initialize fsdev 'sa'
3 participants