docs: Update usage.md
ayrtonm committed Sep 29, 2023
1 parent 086a5e6 commit 82ee0a0
Showing 1 changed file with 103 additions and 238 deletions.
341 changes: 103 additions & 238 deletions docs/usage.md
@@ -4,290 +4,155 @@ To compartmentalize a program, cross-compartment calls must go through call gate
wrappers which change PKRU, switch stacks and scrub unused registers.
Compartments are the dynamic shared objects (DSOs) a program is comprised of and
they are either assigned one of 15 protection keys or default to the untrusted
protection key. This doc walks through how to compartmentalize a program.
protection key. This doc walks through how to compartmentalize a program using
our source rewriter.

## Direct calls

Cross-compartment direct calls go through call gate wrappers defined in shims
generated by the header-rewriter. Running the rewriter on a compartment's
exported headers produces a source file for a shim specific to the compartment.
The shim library must be specific to both the compartment and the caller so
`CALLER_PKEY` must be set to `0-15` with the `-D` flag when compiling the shim.
Note that `-DCALLER_PKEY=0` assigns the untrusted protection key to the caller's
compartment. The rewriter also produces a file with linker flags which must be
used with `-Wl,@$ARGS_FILE` when linking the caller's shared objects.
## Build process

Consider a main binary which exports `foo.h` and a library which exports `bar.h`
and `baz.h` with direct calls in both directions. To wrap calls in both
directions we invoke the rewriter twice to generate two shim libraries as shown
below. Then the original shared objects must be rebuilt with the linker flags
generated by the rewriter.
The build process for a compartmentalized program is to first run the sources
through our source rewriter, then compile with any standard C toolchain with a
few additional flags. Instead of rewriting sources in-place, the rewriter
creates a set of new, intermediate source (and header) files. Since the rewriter
only accepts a list of `.c` source files, the set of intermediate headers that
will be created is controlled by the `--root-directory` and `--output-directory`
command-line flags. Any file from subdirectories of the root directory
which is `#include`d in an input `.c` is copied over to the output directory
under the same subdirectory. Any `#include`d file which is not under the root
directory is treated as a system header and does not get rewritten.
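
The rule above can be sketched as a small path-mapping helper. This is only an illustration of the behaviour described in this section, not the rewriter's actual code: `map_include` and its return convention are hypothetical, the sketch compares path prefixes textually without resolving `..` or symlinks, and it assumes that files directly in the root directory are treated the same as files in its subdirectories.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the rewriter's interface).
 * Returns 1 and writes the intermediate path to `out` when `include_path`
 * lies under `root`, 0 when the file would be treated as a system header,
 * and -1 when `out` is too small to hold the result.
 */
int map_include(const char *root, const char *out_dir,
                const char *include_path, char *out, size_t out_len) {
    assert(root && out_dir && include_path && out);

    /* Ignore trailing slashes on the root so "/src/" and "/src" match. */
    size_t n = strlen(root);
    while (n > 1 && root[n - 1] == '/') n--;

    /* The include must start with the root followed by a path separator. */
    if (strncmp(include_path, root, n) != 0 || include_path[n] != '/')
        return 0;

    /* Keep the path relative to the root, re-based onto the output directory. */
    int written = snprintf(out, out_len, "%s%s", out_dir, include_path + n);
    if (written < 0 || (size_t)written >= out_len)
        return -1;
    return 1;
}
```

With a root of `/src/proj` and an output directory of `/build/rewritten`, this maps `/src/proj/include/foo.h` to `/build/rewritten/include/foo.h` and leaves `/usr/include/stdio.h` alone.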

```
+----------------+
| library shim |
/------>| libbar_shim.so |------\
calls to bar | | __wrap_bar() { | | __wrap_bar calls the
become calls | | bar(); | | original bar
to __wrap_bar| | } | |
| +----------------+ V
+---------------+ +----------------+
| | | |
| | | shared library |
| main binary | | libbar.so |
| foo() { ... } | | bar() { ... } |
| | | |
+---------------+ +----------------+
^ +----------------+ | calls to foo
__wrap_foo | | main | | become calls
calls the | | binary shim | | to __wrap_foo
original foo\-------| libmain_shim.so|<-----/
| __wrap_foo() { |
| foo(); |
| } |
+----------------+
```

Let's assign the main binary protection key 1 and give the library the default,
untrusted protection key. We must specify each compartment's protection key by
passing `--compartment-pkey=$N` to the rewriter when generating their
corresponding shim sources. Then we generate the shim sources and linker args
files with the following. This also modifies `foo.h`, `bar.h` and `baz.h`
in-place and creates backups of the original headers.

```
# Generate main_shim.c and main_shim.c.args and specify the main compartment's
# pkey.
header-rewriter --compartment-pkey=1 main_shim.c foo.h -- -I $SYS_HEADERS
The rewriter also takes an optional `--output-prefix` for naming the build
artifacts it generates and a list of source files. Generally you'll want to
generate and use a compilation database (`compile_commands.json`) to ensure the
rewriter preprocesses each source file with the same command-line arguments as
when it is compiled.

# May omit --compartment-pkey or use --compartment-pkey=0 since the library has
# the default pkey.
header-rewriter libbar_shim.c bar.h baz.h -- -I $SYS_HEADERS
```

To compile the shim sources we specify the shim caller's pkey. We also always disable
lazy binding.
## Manual source changes

```
# CALLER_PKEY for the main shim is 0 since the library has the untrusted pkey
gcc -shared main_shim.c -Wl,-z,now -DCALLER_PKEY=0 -I $IA2_INCLUDE_DIR \
-o libmain_shim.so
### Defining compartments

# CALLER_PKEY for the library shim is the main binary's pkey
gcc -shared libbar_shim.c -Wl,-z,now -Wl,--version-script,libbar_shim.c.syms \
-DCALLER_PKEY=1 -I $IA2_INCLUDE_DIR -o libbar_shim.so
```

We now modify the main binary's source to initialize our runtime using
`INIT_RUNTIME` and assign it a protection key with `IA2_COMPARTMENT`.
`INIT_RUNTIME` must be invoked once in the main binary. To assign a protection
key to another shared object, only `IA2_COMPARTMENT` must be added to one of
the object's source files.
The compartments for each DSO are declared with macros in one of their
constituent source files. We also need to declare the number of pkeys used by
the runtime with another macro. Consider a main binary `main.c` which we want to
put in compartment 1 and a library `foo.c` in compartment 2.

```
// main.c
// This header defines INIT_RUNTIME and IA2_COMPARTMENT
#include <ia2.h>
// Initialize the runtime and allocate 1 protection key.
INIT_RUNTIME(1);
INIT_RUNTIME(2); // This is the number of pkeys needed
// Assign protection key 1 to the main binary.
// This must be defined before including the following line
#define IA2_COMPARTMENT 1
#include <ia2_compartment_init.inc>
```
Then we rebuild the library and ensure it's linked against the main shim. We
must also include the modified header(s) corresponding to the main shim source.
In this case that means including the modified `foo.h`, but the original
`bar.h` and `baz.h`.
// foo.c
```
gcc -shared bar.c libmain_shim.so -Wl,-z,now -I $IA2_INCLUDE_DIR -fPIC \
-o libbar.so
```

Finally we rebuild the main binary. This time we include the modified `bar.h`
and `baz.h` since they correspond to the library shim and the original `foo.h`.
#include <ia2.h>
```
gcc main.c libbar.so libbar_shim.so libmain_shim.so -Wl,-z,now \
-I $IA2_INCLUDE_DIR -fPIC -Wl,-rpath=. -Wl,-T/$REPO_ROOT/libia2/padding.ld
#define IA2_COMPARTMENT 2
#include <ia2_compartment_init.inc>
```

All read-only and relro sections loaded from objects on disk will be shared with
all compartments. Binaries on disk are not considered secret, and tampering with
read-only sections is prohibited (TODO: syscall filtering). Relro sections are
read-only after the dynamic linker has applied relocations to the section, and
we do not consider address space layout (and therefore relocation contents) to
be secret. This means that secret data such as keys MUST NOT be embedded in a
binary, e.g. as a string literal.

Shared objects that are assigned a protection key must have certain
sections page-aligned and padded. This includes `ia2_shared_data`,
`.dynamic`, and other sections that must be accessible
to any compartment. The `padding.ld` linker script ensures this. This linker
script may augment other linker scripts, but it is the user's responsibility to
ensure these shared sections are page-aligned and padded. Failure to do this
causes the runtime to terminate the program during initialization.

If direct calls occur in only one direction (e.g. libraries rarely call the main
binary directly), only one shim is required. To wrap calls between two shared
libraries in different compartments, the process is the same.

We currently cannot wrap variadic (varargs) functions correctly. To switch
stacks in that case, we would need to know how many arguments need to be passed
on the stack, and that requires application-specific knowledge (see #18 for
details). We emit a warning for variadic functions in processed headers, but we
preserve the function declaration as-is. This will result in calls to that
function not switching compartments and running with the caller's permission.

## Indirect calls

Cross-compartment indirect calls go through call gate wrappers defined by macros
which must be manually added to the program source. We assume a compartment's
exported headers define all function pointer types which it may receive from or
send to another compartment. With this assumption the rewriter has an
`--output-header` option which generates a header with the indirect wrapper
macros. The rewritten exported headers will include the output header by its
full path, so using the indirect wrappers does not require adding any new
includes.

Let's again consider a main binary with pkey 1 and a library with untrusted pkey
0. This time the library exports `ptr.h` with the following
Note that this must only be included in one source file per DSO.

```
typedef int(*Fn)(int);
void set_fn_ptr(Fn f);
Fn get_fn_ptr(void);
```
### Sharing data

Running the rewriter on `ptr.h` will change `Fn` from a typedef for a function
pointer to a typedef for an opaque struct. The output header will then have the
macros needed to manually wrap function pointers in the main binary's source.
Function pointers that will be sent to another compartment or are visible to the
library (e.g. global variables) must be wrapped as follows. Failure to manually
wrap all cross-compartment function pointers or using the incorrect mangled type
name, will give a warning when compiling the main binary. This should be
converted to a hard-error with `-Werror=incompatible-pointer-types`.
Some statically-allocated data must be made accessible to all compartments to
avoid significant refactoring. In these cases `IA2_SHARED_DATA` can be used to
place variables in a special ELF section which is explicitly accessible from all
compartments. Note that we assume that on-disk data (i.e. read-only variables)
is not sensitive so this is only needed for some read-write variables.
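
As a rough picture of what placing a variable in a special ELF section looks like, the sketch below uses the GCC/Clang `section` attribute directly. `EXAMPLE_SHARED_DATA` is a hypothetical stand-in, not the definition of `IA2_SHARED_DATA` (see `ia2.h` for that), and using `ia2_shared_data` as the section name is an assumption. It targets ELF platforms only.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for IA2_SHARED_DATA: asks the compiler to emit the
 * variable into a named ELF section instead of the default .data/.bss.
 * The real macro and section name may differ.
 */
#define EXAMPLE_SHARED_DATA __attribute__((section("ia2_shared_data")))

/* A read-write global that every compartment is meant to be able to access. */
EXAMPLE_SHARED_DATA int shared_counter = 0;

/* Increment the shared counter and return its new value. */
int bump_shared_counter(void) {
    assert(shared_counter < INT_MAX); /* guard against signed overflow */
    shared_counter += 1;
    return shared_counter;
}
```

Read-only data does not need this treatment, since on-disk data is already assumed to be non-sensitive and shared with all compartments.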

To send a function pointer to another compartment, you must first define a wrapper with
### Signal handlers

```
IA2_DEFINE(target_fn, mangled_fnptr_type, target_pkey);
```
Signal handlers require special call gate wrappers defined by the
`IA2_DEFINE_SIGACTION` or `IA2_DEFINE_SIGHANDLER` macros. To reference the
special call gate wrapper defined for a given function `foo` use
`IA2_SIGHANDLER(foo)`.

This creates a wrapper to call `target_fn` and transition from untrusted pkey 0
to `target_pkey`. `mangled_fnptr_type` is the mangled function pointer type
which can be found by looking at the definition of `Fn` in the rewritten
`ptr.h`. The mangled function pointer type may also be found in the errors shown
when attempting to compile the source with the rewritten headers, though this
depends on the compiler and how the function pointer is used.
### Interaction with system headers

The wrapper may then be referenced (and passed to a function) with
Function pointers defined in system headers (or those outside
`--root-directory`) do not get rewritten as opaque types. Due to limitations in
the rewriter, this means that some annotations will be inserted and trigger
compiler errors due to type mismatches. For example, consider a compartment that
calls `qsort`. This is a libc function with the signature
```
IA2_WRAPPER(target_fn, target_pkey)
void qsort(void *ptr, size_t count, size_t size, int (*comp)(const void *, const void *));
```

Since it is declared in a system header, the fourth argument will remain a
function-pointer instead of being rewritten as an opaque struct. However the
rewriter currently sees the function-pointer type passed in as the fourth
argument and rewrites it as
```
-qsort(ptr, count, size, cmp_fn)
+qsort(ptr, count, size, IA2_FN(cmp_fn))
```
// main.c
#include <ia2.h>
// The rewritten ptr.h includes ia2.h for the IA2_* macros. It also includes the
// output header which defines the mangled type macros (i.e. _ZTSPFiiE).
#include "ptr.h"
INIT_RUNTIME(1);
#define IA2_COMPARTMENT 1
#include <ia2_compartment_init.inc>
// This creates an opaque struct set to NULL.
Fn uninit = IA2_NULL_FNPTR(_ZTSPFiiE);

int incr(int x) { return x + 1; }
To avoid this annotation the `IA2_IGNORE` macro must be added around `cmp_fn` to
ensure it is not modified by the rewriter.

int main() {
// With the modified ptr.h this will fail to compile.
// set_fn_ptr(incr);
```
qsort(ptr, count, size, IA2_IGNORE(cmp_fn))
```

// This defines a wrapper to call `incr` and change the PKRU from the
// untrusted pkey to pkey 1.
IA2_DEFINE_WRAPPER(incr, _ZTSPFiiE, 1);
The same logic applies if a system header defines a struct which contains
function pointers. Note that this temporary limitation of the rewriter will
always cause compiler errors at the sites that need to be changed.
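
A fuller version of the `qsort` case might look like the sketch below. The fallback definition of `IA2_IGNORE` as an identity macro is an assumption so the snippet compiles without `ia2.h`; in a real build the macro comes from `ia2.h`. `cmp_int` and `sort_ints` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed fallback so this sketch builds standalone; ia2.h provides the
 * real macro, whose definition may differ. */
#ifndef IA2_IGNORE
#define IA2_IGNORE(x) x
#endif

/* Ordinary comparison callback with the signature qsort expects. */
static int cmp_int(const void *a, const void *b) {
    int lhs = *(const int *)a;
    int rhs = *(const int *)b;
    return (lhs > rhs) - (lhs < rhs);
}

/* Sort `count` ints in ascending order using libc's qsort. */
void sort_ints(int *values, size_t count) {
    if (count == 0)
        return;
    assert(values != NULL);
    /* qsort is declared in a system header, so its callback parameter stays
     * a plain function pointer; IA2_IGNORE marks cmp_int so the rewriter
     * leaves it unmodified instead of wrapping it in IA2_FN. */
    qsort(values, count, sizeof *values, IA2_IGNORE(cmp_int));
}
```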

// If the wrapper was already defined in another source file, you must
// instead declare it to avoid multiple definition linker errors
// IA2_DECLARE_WRAPPER(incr, _ZTSPFiiE, 1);
### Function pointer annotations in macros

// This passes the wrapper defined/declared on the previous line as an
// argument to set_fn_ptr
set_fn_ptr(IA2_WRAPPER(incr, 1));
While the rewriter can rewrite object-like macros, automatically rewriting
function-like macros is currently not supported. Again, sites that need manual
changes will trigger compiler errors due to type mismatches. The following
macros are usually adequate to handle these cases and are documented in more
detail in `ia2.h`.

// Alternatively IA2_DEFINE_WRAPPER_FN_SCOPE may be used as follows to both
// define a wrapper and get a pointer to it. This reduces the number of
// changes that need to be made, but may only be used in functions.
// set_fn_ptr(IA2_DEFINE_WRAPPER_FN_SCOPE(incr, _ZTSPFiiE, 1));
}
```

To call a function pointer received from another compartment use
```
IA2_CALL(target_fn, mangled_fnptr_type, caller_pkey)(args)
IA2_FN_ADDR(func) - Get the address of the wrapper for `func`
IA2_ADDR(opaque) - Get the address of the wrapper pointed to by the struct `opaque`
IA2_AS_PTR(opaque) - Same as IA2_ADDR but may be used on the LHS for assignment
IA2_FN(func) - Reference the wrapper for `func` as an opaque struct
IA2_CALL(opaque, id) - Calls the wrapper which `opaque` points to
IA2_CAST(func, ty) - Get a struct of type `ty` pointing to `func`'s wrapper
```

This creates a wrapper to call `target_fn` and transition from `caller_pkey` to
untrusted pkey 0.
## Building the callgate shim

Running the source rewriter produces a `.c` which will be used to build the
callgate shim library. To compile it use

```
// main.c
#include "ptr.h"
int main() {
Fn decr = get_fn_ptr();
// This will fail to compile with modified ptr.h
// decr();
if (!IA2_FNPTR_IS_NULL(decr)) {
// This defines a wrapper to call decr and change the PKRU from pkey 1
// to the untrusted pkey. The function pointer that this expands to must
// be called immediately after invoking the macro.
IA2_CALL(decr, _ZTSPFiiE, 1)();
}
}
$CC -shared -fPIC -Wl,-z,now callgate_wrapper.c -I /path/to/libia2/include -o libcallgates.so
```

In both these examples since one side is using untrusted pkey 0 (the library),
function pointers only need to be wrapped on the side with the trusted pkey.
However, we currently assume that trusted compartments all mutually distrust
each other. This means that to send function pointers between compartments with
two different trusted pkeys, function pointers need to be wrapped on both sides.

Also as shown above, ia2.h contains additional `IA2_*` macros for added
flexibility. In particular the IA2_*_FN_SCOPE macros are generally more
ergonomic, but may not be used in the global scope. See the documentation in
ia2.h for more info.

## Shared headers
## Compiling and linking the program

The examples above assume that each compartment exports a unique set of headers
which define its interface. However it's not always possible to assign each
header to one compartment. For example, a header may contain declarations for
functions defined in one compartment and types used by multiple compartments.
Assuming all functions declared in a header belong to a single compartment, the
rewriter handles this with the `shared-headers` option.


Using the example shown above let's say that `ptr.h` is the same, but
`set_fn_ptr` and `get_fn_ptr` are defined in the main binary. The library uses
the `Fn` typedef so its headers include `ptr.h`. When we run the rewriter we
must pass in `ptr.h`, but also blacklist the function declarations from being
rewritten.
In addition to the flags normally used to build the sources, the following flags
are also required

```
header-rewriter lib_shim.c lib.h ptr.h --shared-headers ptr.h
```
# For all DSOs
-fPIC
-DPKEY=$PKEY
-DIA2_ENABLE=1
-include /path/to/generated_output_header.h
-Werror=incompatible-pointer-types
-Wl,--wrap=pthread_create
-pthread
-Wl,-z,now
-Wl,-z,relro
-Wl,-T/path/to/libia2/padding.ld
This will again modify `lib.h` and `ptr.h` in-place, but avoids changing the
function declarations in `ptr.h` since the library does not define those
functions. Then by including the modified `ptr.h` in the main binary source, we
can wrap function pointers with `IA2_FNPTR_WRAPPER(foo, _ZTSPFiiE, ...)`.
# For the DSO that initializes the runtime
-Wl,--wrap=main
-Wl,--dynamic-list=/path/to/libia2/dynsym.syms
-Wl,--export-dynamic
```

TODO: Add blurbs on public vs private headers and `$SYS_HEADERS`
Also if the rewriter produces a linker args file for a given compartment (i.e. a
.ld file) you must include `-Wl,@/path/to/generated_linker_args_$PKEY.ld` when
linking that DSO.
