-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QEMU backend #195
QEMU backend #195
Conversation
af41390
to
e854a34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really cool work!
I would refactor the Makefile to put version numbers into variables to ease updating the Makefile, if needed.
lib/store_spec.ml
Outdated
@@ -60,7 +66,7 @@ let store_t = Arg.conv (of_string, pp) | |||
let store ?docs names = | |||
Arg.opt Arg.(some store_t) None @@ | |||
Arg.info | |||
~doc:"$(docv) must be one of $(b,btrfs:/path), $(b,rsync:/path), $(b,xfs:/path), $(b,overlayfs:/path), $(b,zfs:pool) or $(b,docker:path) for the OBuilder cache." | |||
~doc:"$(docv) must be one of $(b,btrfs:/path), $(b,rsync:/path), $(b,xfs:/path), $(b,overlayfs:/path), $(b,zfs:pool), $(b,qmu:/path) or $(b,docker:path) for the OBuilder cache." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
~doc:"$(docv) must be one of $(b,btrfs:/path), $(b,rsync:/path), $(b,xfs:/path), $(b,overlayfs:/path), $(b,zfs:pool), $(b,qmu:/path) or $(b,docker:path) for the OBuilder cache." | |
~doc:"$(docv) must be one of $(b,btrfs:/path), $(b,rsync:/path), $(b,xfs:/path), $(b,overlayfs:/path), $(b,zfs:pool), $(b,qemu:/path) or $(b,docker:path) for the OBuilder cache." |
doc/qemu.md
Outdated
``` | ||
|
||
Moving on to the next stage in the build which is the `run` directive. | ||
First, `qemu-img` creates a snapshot of the current `resuilt` layer into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, `qemu-img` creates a snapshot of the current `resuilt` layer into | |
First, `qemu-img` creates a snapshot of the current `result` layer into |
This is pretty cool! If you install Docker in your base image, you could even have obuilder-in-obuilder working :p |
* Added RISCV64 support * Added OpenBSD support
@@ -1,12 +1,18 @@ | |||
|
|||
clean: | |||
rm -f *.qcow2 | |||
rm -f *.qcow2 *.iso |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rm -f *.qcow2 *.iso | |
rm -f ./*.qcow2 ./*.iso |
# busybox | ||
|
||
seed.iso: busybox.yaml | ||
cp $< user-data.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cp $< user-data.yaml | |
cp -p $< user-data.yaml |
tar -czf mirror/pub/OpenBSD/7.6/amd64/site76.tgz install.site | ||
for f in BUILDINFO SHA256.sig base76.tgz bsd bsd.mp bsd.rd comp76.tgz game76.tgz man76.tgz pxeboot xbase76.tgz xfont76.tgz xserv76.tgz xshare76.tgz ; do curl -C - -L https://cdn.openbsd.org/pub/OpenBSD/7.6/amd64/$$f -o mirror/pub/OpenBSD/7.6/amd64/$$f ; done | ||
cd mirror/pub/OpenBSD/7.6/amd64 && ls -l > index.txt | ||
cp install.conf disklabel mirror |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cp install.conf disklabel mirror | |
cp -p install.conf disklabel mirror |
mkdir -p mirror/pub/OpenBSD/7.6/amd64 | ||
tar -czf mirror/pub/OpenBSD/7.6/amd64/site76.tgz install.site | ||
for f in BUILDINFO SHA256.sig base76.tgz bsd bsd.mp bsd.rd comp76.tgz game76.tgz man76.tgz pxeboot xbase76.tgz xfont76.tgz xserv76.tgz xshare76.tgz ; do curl -C - -L https://cdn.openbsd.org/pub/OpenBSD/7.6/amd64/$$f -o mirror/pub/OpenBSD/7.6/amd64/$$f ; done | ||
cd mirror/pub/OpenBSD/7.6/amd64 && ls -l > index.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's index.txt
used for? ls -l
is a bit brittle, maybe find . -print
would work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is included in the OpenBSD distribution. For example, see https://cdn.openbsd.org/pub/OpenBSD/7.6/amd64/index.txt
The FAQ says it should be updated when a custom site76.tgz
is added. Therefore, I think ls -l
is the expected format.
Note: If you intend to provide the sets over HTTP(s), place site76.tgz in your source directory and include it in your
index.txt
. It will then be an option at install time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part way thru reviewing! Really fun to read thru, and I'm learning quite a bit about QEMU along the way. I have a few (probably quite naive) questions, not having used QEMU before.
I'll resume reviewing tomorrow!
doc/qemu.md
Outdated
- windows-server-2022-amd64-ocaml-4.14.2.qcow2 | ||
- windows-server-2022-amd64-ocaml-5.2.0.qcow2 | ||
|
||
The base images build automatically using Cloud Init on Ubuntu, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure how to parse this, as I assume the base images will only be build on command, and not on their own. But does it mean
The base images build automatically using Cloud Init on Ubuntu, | |
The build of base images will automatically use Cloud Init on Ubuntu, |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The base images build are build using make ubuntu
, make windows
or make openbsd
. The builds are unattended builds requiring no manual intervention. Cloud Init is used on Ubuntu, autounattend.xml
on Windows and autoinstall
on OpenBSD.
| Ok _ -> Lwt_result.ok (Lwt.return ()) | ||
| _ -> Lwt_unix.sleep 1. >>= fun _ -> loop (n - 1) in | ||
Lwt_unix.sleep 5. >>= fun _ -> | ||
loop t.qemu_boot_time >>= fun _ -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iiuc, we ignore the possible error that the loop could return if it runs of out of retries here. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters. A failure here would indicate that qemu_boot_time
is too low or the base image is bad. The next ssh
commands would also fail and we would then orderly clean up the QEMU machine with the quit
command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally have a preference for failing early, as it helps with debugging and can sometimes make it easier to improve fault tolerance later on. But I am convinced it won't cause problems here, so won't press the point.
A comment here to explain that reasoning may help future readers, but I don't insist on that either in case you think it's not needed.
lib/qemu_sandbox.ml
Outdated
| 0 -> Lwt_result.fail (`Msg "No connection") | ||
| n -> | ||
Os.exec_result ~pp (ssh @ ["exit"]) >>= function | ||
| Ok _ -> Lwt_result.ok (Lwt.return ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be
| Ok _ -> Lwt_result.ok (Lwt.return ()) | |
| Ok _ -> Lwt.return_ok () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Updated.
Lwt_unix.sleep 1. >>= fun () -> | ||
loop (n - 1) | ||
else Lwt.return () in | ||
loop t.qemu_boot_time >>= fun _ -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit surprised we use the CLI configured "boot time" also to determine how long to to wait before issuing the quit here. Is it because the boot up time and time before it is responsive to a poweroff times are generally symmetrical?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in testing, the times when a longer boot time was needed coincided with times when a longer shutdown time was required. Notably, this was needed for RISCV-emulated machines, which are generally slower. Your second sentence is slightly wrong about what is happening here. An orderly shutdown is requested via an ACPI event or an OS shutdown command, but if the machine doesn't shut down in a timely manner, quit
is issued to the emulator which terminates the machine akin to powering it off at the wall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice. I guess that symmetry may make sense. Given that the variable and flag is called "boot time", I think adding a comment to that effect here could help readability in the future. Feel free to disregard tho.
|
||
# busybox | ||
|
||
seed.iso: busybox.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The busybox.yml
target seems to be missing, giving make: *** No rule to make target 'busybox.yaml', needed by 'seed.iso'. Stop.
is it possible this file needs to be committed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - I missed it from the commit. It has been added now.
|
||
let strf = Printf.sprintf | ||
|
||
let running_as_root = Unix.getuid () = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: I can see that this occurs in several places in the code base. But, it seems like it is a misnomer, since afaict the value here gets decided at compile time. So it seem like it should either be
let running_as_root () = Unix.getuid () = 0
or
let compiled_as_root = Unix.getuid () = 0
Any ideas on the intended meaning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does look to be wrong. It is supposed to be used to decide whether to sudo
or not, but since we run as root anyway this has gone unnoticed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions around how the new tar
value is exposed
lib/qemu_sandbox.ml
Outdated
| Linux -> None | ||
| OpenBSD -> Some ["gtar"; "-xf"; "-"] | ||
| Windows -> Some ["/cygdrive/c/Windows/System32/tar.exe"; "-xf"; "-"; "-C"; "/"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To illustrate my suggestion around cleaning up the API for the new tar value, this would look something like
| Linux -> None | |
| OpenBSD -> Some ["gtar"; "-xf"; "-"] | |
| Windows -> Some ["/cygdrive/c/Windows/System32/tar.exe"; "-xf"; "-"; "-C"; "/"] | |
| Linux -> default_tar | |
| OpenBSD -> ["gtar"; "-xf"; "-"] | |
| Windows -> ["/cygdrive/c/Windows/System32/tar.exe"; "-xf"; "-"; "-C"; "/"] |
lib/s.ml
Outdated
val shell : t -> string list option | ||
(** [shell] optional value to be used as the default shell. *) | ||
|
||
val tar : t -> string list option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val tar : t -> string list option | |
val tar : t -> string list |
lib/s.ml
Outdated
(** [shell] optional value to be used as the default shell. *) | ||
|
||
val tar : t -> string list option | ||
(** [tar] tar command for this sandbox. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(** [tar] tar command for this sandbox. *) | |
(** [tar t] is the command to use for this sandbox. *) |
lib/sandbox.jail.ml
Outdated
@@ -168,6 +168,10 @@ let create ~state_dir:_ _c = | |||
let finished () = | |||
Lwt.return () | |||
|
|||
let shell _ = None | |||
|
|||
let tar _ = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let tar _ = None | |
let tar _ = default_tar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished passing eyes over everything. Awesome work! LMK if you have any questions about my suggestions or questions.
lib/qemu_store.mli
Outdated
(** [create ~path] creates a new overlayfs store where everything will | ||
be stored under [path]. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(** [create ~path] creates a new overlayfs store where everything will | |
be stored under [path]. *) | |
(** [create ~root] creates a new overlayfs store where everything will | |
be stored under [root]. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but with s/overlayfs/QEMU/g
lib/sandbox.macos.ml
Outdated
@@ -114,6 +114,10 @@ let finished () = | |||
Os.sudo [ "zfs"; "mount"; "obuilder/result" ] >>= fun () -> | |||
Lwt.return () | |||
|
|||
let shell _ = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious what it means when sell x
is None
? I haven't been able to figure that out in the course of reviewing. Could you point out how this shell value ends up being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context is that all the sandboxes invoke command x
by running /usr/bin/env bash -c x
where that default value comes from lib/build.ml
by passing no parameter for shell
. [Hence why my tar
implementation used the same concept]. Shell is defined as:
let shell = Option.value ~default:(if Sys.win32 then ["cmd"; "/S"; "/C"] else ["/usr/bin/env"; "bash"; "-c"]) shell in
This only makes sense when Sys
is the machine that we are building on, but in this case we build on a different architecture to the one we target. Also since commands are invoked via SSH, they already have a default shell so adding an extra one isn't needed.
I've reworked the original implementation of the default shell to follow the tar implementation...
Check main.ml
for how it is used and of course ocluster-worker
qemu-img create -f qcow2 -b $< -F qcow2 $@ 20G | ||
qemu-system-riscv64 -m 16G -smp 8 -machine type=virt -nographic \ | ||
-bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.bin \ | ||
-kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved this by installing a system dep, thanks to Mark's guidance.
@@ -0,0 +1,10 @@ | |||
/ 2G |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A readme in the qemu
dir that explains what these different files are used for would be really helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is as far as I've gotten with documenting the requirements I've needed to try to make ubuntu
:
# Utilities to support the QEMU backend
See [../doc/qemu.md](../doc/qemu.md) for documentation and usage
## Prerequisites
In order to use the targets defined in the [Makefile](./Makefile), at least the
following must be installed,
- [`qemu`](https://www.qemu.org/download/), including
- A full installation, e.g.,
- MacOS: `brew install qemu`
- Debian: `apt-get install qemu-system`
- Arch: `pacman -Syu qemu-full`
- u-boot-qemu:
- [Debian](https://packages.debian.org/stable/admin/u-boot-qemu)
- [Arch](https://aur.archlinux.org/packages/u-boot-qemu-bin)
- `cloud-localds` form [cloud-utils](https://github.com/canonical/cloud-utils)
I suggest putting this (or something like it) in qemu/README.hd
, and also including some guidance on what the various different files in this dir are used for. LMK if you'd like me to make a commit into this branch adding the start of the readme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great! Thanks for the careful explanations in the docs and in reply to my comments.
I have a few minor followup suggestions, but mostly purely optional stuff about where I think comments could help.
The one change I'd still really like to see included is a README with content covering the kinds of stuff I suggested in #195 (review)
I was able to build the ubuntu base images! 🎉 Awesome 👯
| Ok _ -> Lwt_result.ok (Lwt.return ()) | ||
| _ -> Lwt_unix.sleep 1. >>= fun _ -> loop (n - 1) in | ||
Lwt_unix.sleep 5. >>= fun _ -> | ||
loop t.qemu_boot_time >>= fun _ -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally have a preference for failing early, as it helps with debugging and can sometimes make it easier to improve fault tolerance later on. But I am convinced it won't cause problems here, so won't press the point.
A comment here to explain that reasoning may help future readers, but I don't insist on that either in case you think it's not needed.
Lwt_unix.sleep 1. >>= fun () -> | ||
loop (n - 1) | ||
else Lwt.return () in | ||
loop t.qemu_boot_time >>= fun _ -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice. I guess that symmetry may make sense. Given that the variable and flag is called "boot time", I think adding a comment to that effect here could help readability in the future. Feel free to disregard tho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark and I agreed that we can add the README in a followup PR. All else looks good to me! Awesome work here. Maybe have a look at my last comments in case there are things there you'd like to incorporate.
This PR adds a QEMU sandbox which allows testing in Windows (and Linux) virtual machines.