The TMC sandbox consists of the following:
- A User-Mode Linux kernel.
- A minimal Linux root disk image with compilers and other build tools. Currently based on Debian 6 and built with Multistrap, though something smaller might be nice.
- An initrd that layers a ramdisk on top of the read-only root disk (using aufs).
- An optional Rack webservice.
Install the following prerequisites:
- build-essential
- squashfs-tools
- multistrap
If you're on a Debian derivative, you may need to install Debian's archive key:
curl -L http://ftp-master.debian.org/archive-key-6.0.asc | sudo apt-key add -
Now build with sudo make. Root access is needed by multistrap since it chroots.
You can test the sandbox with ./run-test-exercise.sh or ./run-bash.sh under uml/.
The sandbox is invoked by starting uml/output/linux.uml with at least the following kernel parameters:
- initrd=initrd.img - the initrd.
- ubdarc=rootfs.squashfs - the rootfs (the rc meaning read-only shared).
- ubdbr=runnable.tar - an uncompressed tar file containing an executable tmc-run.
- ubdc=output.tar - a zeroed file with a reasonable amount of space for the output. test_output.txt, exit_code.txt, stdout.txt and stderr.txt will be written there as a tar-file.
- mem=xyzM - the memory limit.
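As an illustration, the parameter list above can be assembled programmatically. The following Python sketch is not part of the sandbox: the helper names build_uml_command and create_zeroed_file, the 256 MB memory limit and the 16 MiB output size are all assumptions made for the example. It only builds the argument list and creates a zero-filled output file; it does not start UML.

```python
import os
import tempfile


def create_zeroed_file(path, size_bytes):
    """Create a file of size_bytes that reads back as all zero bytes."""
    with open(path, "wb") as f:
        # Extending an empty file with truncate() yields zero-filled content.
        f.truncate(size_bytes)


def build_uml_command(uml_binary, initrd, rootfs, runnable_tar, output_tar, mem_mb):
    """Return the argv list for the UML kernel with the documented parameters."""
    return [
        uml_binary,
        f"initrd={initrd}",
        f"ubdarc={rootfs}",        # rc = read-only shared
        f"ubdbr={runnable_tar}",   # tar containing the tmc-run executable
        f"ubdc={output_tar}",      # zeroed file that receives the output tar
        f"mem={mem_mb}M",
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        output_tar = os.path.join(tmp, "output.tar")
        create_zeroed_file(output_tar, 16 * 1024 * 1024)  # hypothetical size
        cmd = build_uml_command(
            "uml/output/linux.uml", "initrd.img", "rootfs.squashfs",
            "runnable.tar", output_tar, 256,
        )
        print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run once the initrd, rootfs and runnable.tar files exist at the given paths.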
The normal boot process is skipped. The initrd invokes a custom init script that prepares a very minimal environment, calls tmc-run, flushes output and halts the virtual machine.
Note: the RAM given to the sandbox is mmap'ed from /run/shm, a tmpfs that defaults to half of your actual RAM. Make sure your /run/shm has enough space for all the sandboxes you are running, or the sandboxes may suffer kernel panics as they try to allocate memory they think they have available.
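A quick capacity check along these lines can catch the problem before any sandbox is started. This is a minimal sketch, not something the project ships: shm_is_sufficient is a hypothetical helper, the sandbox count and per-sandbox memory are example values, and it assumes the M suffix in mem=xyzM means mebibytes.

```python
import shutil


def shm_is_sufficient(free_bytes, sandbox_count, mem_mb_per_sandbox):
    """Return True if free_bytes covers the memory of all sandboxes."""
    needed = sandbox_count * mem_mb_per_sandbox * 1024 * 1024
    return free_bytes >= needed


if __name__ == "__main__":
    try:
        free = shutil.disk_usage("/run/shm").free
    except OSError:
        print("/run/shm is not available on this system")
    else:
        # Example values: 4 sandboxes with mem=256M each.
        if shm_is_sufficient(free, 4, 256):
            print("enough space in /run/shm")
        else:
            print("not enough space in /run/shm")
```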
There's a simple Rack webservice under web/.
The service implements the following protocol.
POST /tasks.json
Expects multipart form data with these parameters:
- file: task file as plain tar file
- notify: URL for notification when done
- token: token to post to notification URL
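A client request with these three parameters might be built as follows. This is a sketch using only the Python standard library; the host and port, the notify URL, the token value, the task.tar filename and the application/x-tar content type are assumptions for illustration, and only the path /tasks.json and the field names file, notify and token come from the protocol above.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    body = bytearray()
    for name, value in fields.items():
        body += f"--{boundary}\r\n".encode()
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        body += value.encode() + b"\r\n"
    body += f"--{boundary}\r\n".encode()
    body += (
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/x-tar\r\n\r\n"
    ).encode()
    body += file_bytes + b"\r\n"
    body += f"--{boundary}--\r\n".encode()
    return bytes(body), f"multipart/form-data; boundary={boundary}"


def build_task_request(base_url, tar_bytes, notify_url, token):
    """Build (but do not send) a POST /tasks.json request."""
    body, content_type = encode_multipart(
        {"notify": notify_url, "token": token}, "file", "task.tar", tar_bytes
    )
    return urllib.request.Request(
        base_url.rstrip("/") + "/tasks.json",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, port, notify URL and token.
    req = build_task_request(
        "http://localhost:3001", b"<tar bytes>", "http://example.com/done", "secret"
    )
    print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Passing the request to urllib.request.urlopen would submit it to a running instance of the service.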
It runs the task in the sandbox and sends a POST request to the notify URL with the following JSON object:
- status: one of 'finished', 'failed', 'timeout'.
  - 'finished' iff tmc-run completed successfully with exit code 0.
  - 'timeout' if tmc-run took too long to complete.
  - 'failed' in any other case.
- exit_code: the exit code of tmc-run, or null if not applicable.
- token: the token given in the request.
- test_output: the test_output.txt created by the task. May be empty.
- stdout: the stdout.txt created by the task. May be empty.
- stderr: the stderr.txt created by the task. May be empty.
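On the receiving side, a notification handler would typically verify the token and then branch on the status. The sketch below assumes the JSON text has already been extracted from the incoming POST request (how it is carried in the request is not specified here); parse_notification is a hypothetical helper name.

```python
import json

VALID_STATUSES = {"finished", "failed", "timeout"}


def parse_notification(raw_json, expected_token):
    """Validate a notification and return its fields as a dict."""
    data = json.loads(raw_json)
    if data.get("token") != expected_token:
        raise ValueError("token mismatch")
    status = data.get("status")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return {
        "status": status,
        "exit_code": data.get("exit_code"),  # None when not applicable
        "test_output": data.get("test_output") or "",
        "stdout": data.get("stdout") or "",
        "stderr": data.get("stderr") or "",
    }


if __name__ == "__main__":
    sample = json.dumps({
        "status": "finished", "exit_code": 0, "token": "secret",
        "test_output": "", "stdout": "ok\n", "stderr": "",
    })
    print(parse_notification(sample, "secret"))
```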
Only a limited number of tasks may run per instance of this webservice.
If it is busy, it responds with an HTTP 500 and a JSON object {status: 'busy'}.
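Because the busy case shares the 500 status code with real server errors, a client has to look at the response body to tell them apart. The following sketch shows one way to detect the busy response and retry. The retry count, the delay and the submit callable (expected to return an (http_status, body_text) pair) are assumptions of the example, not part of the service.

```python
import json
import time


def is_busy_response(http_status, body_text):
    """Return True only for an HTTP 500 whose JSON body has status 'busy'."""
    if http_status != 500:
        return False
    try:
        data = json.loads(body_text)
    except ValueError:
        return False  # a 500 without a JSON body is a real error
    return isinstance(data, dict) and data.get("status") == "busy"


def submit_with_retry(submit, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call submit() until the service is not busy or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        http_status, body_text = submit()
        if not is_busy_response(http_status, body_text):
            break
        if attempt < attempts - 1:
            sleep(delay_seconds)
    # Returns the last response, which may still be the busy one.
    return http_status, body_text
```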
First, read through the configuration file in site.defaults.yml.
Install dependencies with bundle install and compile the small C extension with rake ext.
Run tests by doing sudo rake test under web/. It requires e2fsprogs and e2tools to be installed.
Start the service with sudo webapp.rb run and stop it with Ctrl-C.
That script does the extra setup needed for network support, if configured, and then invokes rackup on the configured HTTP port as the configured user account.
The service may be installed as an init script by doing sudo rake init:install (or rvmsudo ...).
The service should definitely be secured by a firewall or network segregation.
The web service can be configured to provide very limited network access to the sandboxes. It uses a TAP device, dnsmasq and squid to give access via an HTTP proxy only. The required software is included and started/stopped automatically. TAP devices are also created and configured on demand and destroyed on exit.
On Ubuntu, you may need to comment out the line dns=dnsmasq in /etc/NetworkManager/NetworkManager.conf to avoid a conflict with the system's own dnsmasq.
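For scripted setups, the edit can be expressed as a small text transformation. This is only a sketch: comment_out_dnsmasq is a hypothetical helper that works on the file's contents as a string, and reading the file, writing it back as root and restarting NetworkManager are left to the caller.

```python
def comment_out_dnsmasq(conf_text):
    """Prefix any uncommented 'dns=dnsmasq' line with '#'."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        if line.strip() == "dns=dnsmasq":
            out.append("#" + line)
        else:
            out.append(line)  # includes lines that are already commented out
    return "".join(out)


if __name__ == "__main__":
    sample = "[main]\nplugins=ifupdown,keyfile\ndns=dnsmasq\n"
    print(comment_out_dnsmasq(sample), end="")
```

Applying the function twice gives the same result as applying it once, so it is safe to run repeatedly.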
Running Maven projects efficiently is tricky because downloading dependencies can take a lot of time. We found that a simple HTTP cache outside UML doesn't help much. For fast execution, the dependencies should already be in the local repository.
We don't want untrusted code to have write access to the repository.
To solve this, the webservice has an optional plugin
that inspects incoming exercises and starts a background process to
download their dependencies to a cache. This way, a project needs to download
its dependencies in the actual sandbox only on the first run (or the first
few runs if unlucky), when the cache is not yet populated.
The cache may also be populated by a pom.xml file upload to /maven_cache/populate.json.
The technical details are documented in web/plugins/maven_cache.rb.
The cache must be explicitly enabled in site.yml.