Skip to content

Commit

Permalink
merge: #3178
Browse files Browse the repository at this point in the history
3178: feat: add PoolNoodle to manage firecracker jails r=sprutton1 a=sprutton1

PoolNoodle lives!

When using Firecracker, this will spin up a background thread responsible for ensuring that jails are available for consumption for functions. It favors creating new jails from the total available and will instead clean up old jails if none are available. 

```
    /// Starts the loop responsible for jail lifetimes. The loop works by:
    /// 1. Check if we have fewer ready jails than the `[MINUMUM_JAIL_PERCENTAGE]` of `[pool_size]`
    /// 2. If so, go get an unprepared jail and prepare it!
    /// 3. If not, check if there are any jails that need to be cleaned.
    /// 4. If so, clean them and move them to `[unprepared]` so they can be made ready.
    /// 5. If not, do nothing!
```

I am _certain_ there are edge cases here that I have not covered. Feel free to bring them up in the PR. The biggest obvious issue is "what if we run out of jails?". This is a real issue we have today. PoolNoodle should help mitigate it some, but it's not bulletproof. I definitely see a more robust implementation involving changing how we think about pooling altogether (and whether or not Deadpool is right for our needs here). I think this will get us by for now, though.

Also, this is still leveraging the old bash scripts to do the jail prep/clean. Porting these over to rust is a non-trivial undertaking, so I'd like to push that out a bit.

```rust
---------------------------------------------------------------------
---------------------------------------------------------------------
---------------------------:::::::::::::::::::::---------------------
---------------------:::::::::::::::::::::-------::::::--------------
-----------------:::::::::::::::::::==========------::::::-----------
----#*+#-----::::::::::::::::::::::::---:::--==++++-:::::::::--------
---+#%`@@#----::::::::::::::::::===========++++***#*=::::::::::::-----`
--=+*`@@@@@*::::::::::::::::::::========++++****###*=::::::::::::::---`
--=`@@@@@@@@%+::::::::::::::::::-=======+++++**###%#+:::::::::::::::--`
----=`@@@@@@@@%=:::::::::::::::::=========++++*###%#+-::::::::::::::::`
------#`@@@@@@@@%-:::::::::::::::-....:===:....:*#%#+-::::::::::::::::`
-----:::#`@@@@@@@@%-::::::::::::..:-=.::+.:=**:-==%#*=::::::::::::::::`
----::::::%`@@@@@@@@%=::::::::::.-#%@@--=.#%@@#*#*%%*=::::::::::::::::`
---:::::::::`@@@@@@@@@%=::::::..:-*@@***#+-+####*%%%#+::::::::::::::::`
-:::::::::::%`@@@@@@@@@@%#:.......+***#%%###**#%%%%%#+::::::::::::::::`
:::::::::::*#**#`@@@@@@@@@%:......=+*#%%%%%@%#%%%%%%#*-:::::::::::::::`
::::::::::#`@*+**%@@@@@@@@@=......:++*******####%%%%%*=:::::::::::::::`
:::::::::::=%##%`@@@#%@%@@+........++++******###%%%%%*=:::::::::::::::`
:::::::::::=+*#`@@%#%%%%@*.........++++++*****###%%%%#+.::::::::::::::`
::::::::::=++===+*%*===+=.........++++++*****####%%%#**=:::::::::::::
:::::::::===++++*#`@*+++**=........-+++++******###%%%%#%###*-:::::::::`
::::::::-++===*%`@@@@%%%##:........++++++******###%%%%%%%###**+:::::::`
:::::::::*%***%`@@@@@%%%#:.......=+=+++++******####%%%%#%###****+:::::`
::::::::::*`@@@@@@@%%%%%#......-+==+++*********####%%%%+:.=##****+-:::`
:::::::::::::%`@@@-%%%%%#*:.:++++++************####%%%%+::::-***+**-::`
--::::::::::::::::-#%%%###********#-**********####%%%#*.:::::**++**-:
--::::::::::::::::::*#%%%#######*:..**********####%%%#*.:::::=*++**=:
---:::::::::::::::::::-#%%###*-.....**********#####%%##::::::=*++**+:
---::::::::::::::::::::::::::.......+*********#####%%%#-:::::+++**#=:
---::::::::::::::::::::::::::::.....-********######%%%#+::::*++**##::
----:::::::::::::::::::::::::::::::::#*******######%%%#*****++**##=::
------::::::::::::::::::::::::::::::.********######%%%%##**+***#%=:::
---------::::::::::::::::::::::::::::********######%%%%%*****##%-::::
---------::::::::::::::::::::::::::::+*******######%%%%%#**##%=:::---
----------:::::::::::::::::::::::::::-#*****#######%%%%%-:+#=::::----
:------------:::::::::::::::::::::::::##****#######%%%%%+:::::::-----
=:----------::::::::::::::::::::::::::*#***########%%%%%*::::::------
```

Co-authored-by: Scott Prutton <[email protected]>
  • Loading branch information
si-bors-ng[bot] and sprutton1 authored Jan 19, 2024
2 parents c2ad11a + f0e618f commit 5fb601d
Show file tree
Hide file tree
Showing 12 changed files with 389 additions and 103 deletions.
13 changes: 6 additions & 7 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 6 additions & 1 deletion bin/veritech/scripts/stop.sh
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,12 @@ userdel jailer-$SB_ID || true

# Remove directories and files
umount /srv/jailer/firecracker/$SB_ID/root/rootfs.ext4 || true
dmsetup remove rootfs-overlay-$SB_ID || true
until ! mountpoint -q /srv/jailer/firecracker/$SB_ID/root/rootfs.ext4
do
echo "waiting for unmount"
sleep .1
done
dmsetup remove --retry rootfs-overlay-$SB_ID

# We are not currently creating these.
# umount /srv/jailer/firecracker/$SB_ID/root/image-kernel.bin || true
Expand Down
10 changes: 9 additions & 1 deletion lib/deadpool-cyclone/src/instance.rs
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,12 @@ pub trait Spec {
/// # });
/// # Ok::<(), SpawnError>(())
/// ```
async fn spawn(&self) -> result::Result<Self::Instance, Self::Error>;
async fn spawn(&self, id: u32) -> result::Result<Self::Instance, Self::Error>;

/// whether to enable pool_noodle when using this spec
fn use_pool_noodle(&self) -> bool;
/// the size of the pool
fn pool_size(&self) -> u16;
}

/// A type which implements the [Builder pattern] and builds a [`Spec`].
Expand Down Expand Up @@ -311,6 +316,9 @@ pub trait Instance {
/// # Ok::<(), TerminationError>(())
/// ```
async fn terminate(&mut self) -> result::Result<(), Self::Error>;

/// Get the id of the underlying child runtime
fn id(&self) -> u32;
}

// async fn spawn<B, E, I, S>(builder: &B) -> Result<impl Instance<Error = E>, E>
Expand Down
28 changes: 3 additions & 25 deletions lib/deadpool-cyclone/src/instance/cyclone/firecracker-setup.sh
Original file line number Diff line number Diff line change
Expand Up @@ -293,7 +293,7 @@ execute_cleanup() {

}

prepare_jailers() {
clean_jails() {
ITERATIONS="${1:-100}" # Default to 100 jails
DOWNLOAD_ROOTFS="${2:-false}" # Default to false
DOWNLOAD_KERNEL="${3:-false}" # Default to false
Expand All @@ -315,32 +315,10 @@ prepare_jailers() {
fi
/firecracker-data/stop.sh $iter &> /dev/null &
done
wait
echo
echo "Elapsed: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"
fi

if test -f "/firecracker-data/prepare_jailer.sh"; then
IN_PARALLEL=1
SECONDS=0
for (( iter=0; iter<$ITERATIONS; iter++ ))
do
echo -ne "Validating jail $(($iter + 1 )) out of $ITERATIONS ... \r"
# this ensures we only run n jobs in parallel at a time to avoid
# process locks. This is an unreliable hack.
# TODO(scott): we need to walk through the processes called in this script
# and understand where locking could occur. Parallelization can be
# dangerous here, but testing implies that it works.
if [ $(jobs -r | wc -l) -ge $IN_PARALLEL ]; then
wait $(jobs -r -p | head -1)
fi
/firecracker-data/prepare_jailer.sh $iter &
done
echo
echo "Elapsed: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"
else
echo "prepare_jailer.sh script not found, skipping jail creation."
exit 1
fi
}

# -----------------------------------------
Expand Down Expand Up @@ -407,5 +385,5 @@ check_params_set && echo -e "Installation Values found to be:\n - $VARIABLES_FIL
check_os_release && echo -e "Operating System found to be:\n - $OS_VARIANT"
install_pre_reqs
execute_configuration_management $DOWNLOAD_ROOTFS $DOWNLOAD_KERNEL
prepare_jailers $JAILS_TO_CREATE $DOWNLOAD_ROOTFS $DOWNLOAD_KERNEL $FORCE_CLEAN_JAILS
clean_jails $JAILS_TO_CREATE $DOWNLOAD_ROOTFS $DOWNLOAD_KERNEL $FORCE_CLEAN_JAILS
execute_cleanup
12 changes: 11 additions & 1 deletion lib/deadpool-cyclone/src/instance/cyclone/local_http.rs
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,9 @@ impl Instance for LocalHttpInstance {

Ok(())
}
fn id(&self) -> u32 {
0
}
}

#[async_trait]
Expand Down Expand Up @@ -293,10 +296,17 @@ impl Spec for LocalHttpInstanceSpec {
type Instance = LocalHttpInstance;
type Error = LocalHttpInstanceError;

fn use_pool_noodle(&self) -> bool {
false
}
fn pool_size(&self) -> u16 {
0
}

async fn setup(&self) -> result::Result<(), Self::Error> {
Ok(())
}
async fn spawn(&self) -> result::Result<Self::Instance, Self::Error> {
async fn spawn(&self, _id: u32) -> result::Result<Self::Instance, Self::Error> {
let socket_addr = socket_addr_from(&self.socket_strategy).await?;
let mut cmd = self.build_command(&socket_addr);

Expand Down
57 changes: 33 additions & 24 deletions lib/deadpool-cyclone/src/instance/cyclone/local_uds.rs
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,9 @@ impl Instance for LocalUdsInstance {

Ok(())
}
fn id(&self) -> u32 {
self.runtime.id()
}
}

#[async_trait]
Expand Down Expand Up @@ -345,6 +348,16 @@ impl Spec for LocalUdsInstanceSpec {
type Instance = LocalUdsInstance;
type Error = LocalUdsInstanceError;

fn use_pool_noodle(&self) -> bool {
matches!(
self.runtime_strategy,
LocalUdsRuntimeStrategy::LocalFirecracker
)
}
fn pool_size(&self) -> u16 {
self.pool_size
}

async fn setup(&self) -> result::Result<(), Self::Error> {
match self.runtime_strategy {
LocalUdsRuntimeStrategy::LocalDocker => Ok(()),
Expand All @@ -353,9 +366,9 @@ impl Spec for LocalUdsInstanceSpec {
}
}

async fn spawn(&self) -> result::Result<Self::Instance, Self::Error> {
async fn spawn(&self, id: u32) -> result::Result<Self::Instance, Self::Error> {
let (temp_path, socket) = temp_path_and_socket_from(&self.socket_strategy)?;
let mut runtime = runtime_instance_from_spec(self, &socket).await?;
let mut runtime = runtime_instance_from_spec(self, &socket, id).await?;

runtime.spawn().await?;
//TODO(scott): Firecracker requires the client to add a special connection detail. We
Expand Down Expand Up @@ -571,6 +584,7 @@ impl Default for LocalUdsRuntimeStrategy {

#[async_trait]
pub trait LocalInstanceRuntime: Send + Sync {
fn id(&self) -> u32;
fn socket(&mut self) -> PathBuf;
async fn spawn(&mut self) -> result::Result<(), LocalUdsInstanceError>;
async fn terminate(&mut self) -> result::Result<(), LocalUdsInstanceError>;
Expand Down Expand Up @@ -623,6 +637,9 @@ impl LocalProcessRuntime {

#[async_trait]
impl LocalInstanceRuntime for LocalProcessRuntime {
fn id(&self) -> u32 {
0
}
fn socket(&mut self) -> PathBuf {
self.socket.to_path_buf()
}
Expand Down Expand Up @@ -766,6 +783,9 @@ impl LocalDockerRuntime {

#[async_trait]
impl LocalInstanceRuntime for LocalDockerRuntime {
fn id(&self) -> u32 {
0
}
fn socket(&mut self) -> PathBuf {
self.socket.to_path_buf()
}
Expand Down Expand Up @@ -798,22 +818,13 @@ impl LocalInstanceRuntime for LocalDockerRuntime {
struct LocalFirecrackerRuntime {
cmd: Command,
child: Option<Child>,
vm_id: u32,
socket: PathBuf,
}

impl LocalFirecrackerRuntime {
async fn build() -> Result<Box<dyn LocalInstanceRuntime>> {
// todo(scott): the pid is a naive method for removing serially creating jails.
// we read the pid, use that as the vm_id, and then increment it.
// There are obvious contention problems here, but we should address
// this within deadpool itself.
let pid = Path::new("/firecracker-data/pid");
let vm_id = &std::fs::read_to_string(pid)
.map_err(LocalUdsInstanceError::FirecrackerPidRead)?
.parse::<i32>()
.map_err(LocalUdsInstanceError::FirecrackerPidParse)?;
std::fs::write(pid, format!("{}", vm_id + 1))
.map_err(LocalUdsInstanceError::FirecrackerPidWrite)?;
async fn build(id: u32) -> Result<Box<dyn LocalInstanceRuntime>> {
let vm_id = id;

let mut cmd = Command::new("/usr/bin/jailer");
cmd.arg("--cgroup-version")
Expand All @@ -836,13 +847,17 @@ impl LocalFirecrackerRuntime {
Ok(Box::new(LocalFirecrackerRuntime {
cmd,
child: None,
vm_id,
socket: sock,
}))
}
}

#[async_trait]
impl LocalInstanceRuntime for LocalFirecrackerRuntime {
fn id(&self) -> u32 {
self.vm_id
}
fn socket(&mut self) -> PathBuf {
self.socket.to_path_buf()
}
Expand Down Expand Up @@ -894,6 +909,7 @@ impl LocalInstanceRuntime for LocalFirecrackerRuntime {
async fn runtime_instance_from_spec(
spec: &LocalUdsInstanceSpec,
socket: &PathBuf,
id: u32,
) -> Result<Box<dyn LocalInstanceRuntime>> {
match spec.runtime_strategy {
LocalUdsRuntimeStrategy::LocalProcess => {
Expand All @@ -902,14 +918,13 @@ async fn runtime_instance_from_spec(
LocalUdsRuntimeStrategy::LocalDocker => {
LocalDockerRuntime::build(socket, spec.clone()).await
}
LocalUdsRuntimeStrategy::LocalFirecracker => LocalFirecrackerRuntime::build().await,
LocalUdsRuntimeStrategy::LocalFirecracker => LocalFirecrackerRuntime::build(id).await,
}
}

async fn setup_firecracker(spec: &LocalUdsInstanceSpec) -> Result<()> {
let script_bytes = include_bytes!("firecracker-setup.sh");
let command = Path::new("/firecracker-data/firecracker-setup.sh");
let pid = Path::new("/firecracker-data/pid");

// we need to ensure the file is in the correct location with the correct permissions
std::fs::create_dir_all(
Expand All @@ -922,20 +937,14 @@ async fn setup_firecracker(spec: &LocalUdsInstanceSpec) -> Result<()> {
std::fs::set_permissions(command, std::fs::Permissions::from_mode(0o755))
.map_err(LocalUdsInstanceError::FirecrackerSetupPermissions)?;

// firecracker pid is used to serialize jail creation
std::fs::create_dir_all(
pid.parent()
.expect("This should never happen. Did you remove the path from the string above?"),
)
.map_err(LocalUdsInstanceError::FirecrackerSetupCreate)?;
std::fs::write(pid, "0").map_err(LocalUdsInstanceError::FirecrackerSetupWrite)?;
// Spawn the shell process
let _status = Command::new("sudo")
.arg(command)
.arg("-j")
.arg(&spec.pool_size.to_string())
.arg("-rk")
.status();
.status()
.await;

Ok(())
}
Expand Down
Loading

0 comments on commit 5fb601d

Please sign in to comment.