From 82ee0a09e4b50716aca18f33c0022ff125dce2bd Mon Sep 17 00:00:00 2001 From: Ayrton Munoz Date: Fri, 29 Sep 2023 18:32:07 -0400 Subject: [PATCH] docs: Update usage.md --- docs/usage.md | 341 +++++++++++++++----------------------------------- 1 file changed, 103 insertions(+), 238 deletions(-) diff --git a/docs/usage.md b/docs/usage.md index da5373d68e..a3bc42dc27 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -4,290 +4,155 @@ To compartmentalize a program, cross-compartment calls must go through call gate wrappers which change PKRU, switch stacks and scrub unused registers. Compartments are the dynamic shared objects (DSOs) a program is comprised of and they are either assigned one of 15 protection keys or default to the untrusted -protection key. This doc walks through how to compartmentalize a program. +protection key. This doc walks through how to compartmentalize a program using +our source rewriter. -## Direct calls -Cross-compartment direct calls go through call gate wrappers defined in shims -generated by the header-rewriter. Running the rewriter on a compartment's -exported headers produces a source file for a shim specific to the compartment. -The shim library must be specific to both the compartment and the caller so -`CALLER_PKEY` must be set to `0-15` with the `-D` flag when compiling the shim. -Note that `-DCALLER_PKEY=0` assigns the untrusted protection key to the caller's -compartment. The rewriter also produces a file with linker flags which must be -used with `-Wl,@$ARGS_FILE` when linking the caller's shared objects. +## Build process -Consider a main binary which exports `foo.h` and a library which exports `bar.h` -and `baz.h` with direct calls in both directions. To wrap calls in both -directions we invoke the rewriter twice to generate two shim libraries as shown -below. Then the original shared objects must be rebuilt with the linker flags -generated by the rewriter. 
+The build process for a compartmentalized program is to first run the sources +through our source rewriter, then compile with any standard C toolchain with a +few additional flags. Instead of rewriting sources in-place, the rewriter +creates a set of new, intermediate source (and header) files. Since the rewriter +only accepts a list of `.c` source files, the set of intermediate headers that +will be created is controlled by the `--root-directory` and `--output-directory` +command-line flags. Any file from subdirectories of the root directory +which is `#include`d in an input `.c` is copied over to the output directory +under the same subdirectory. Any `#include`d file which is not under the root +directory is treated as a system header and does not get rewritten. -``` - +----------------+ - | library shim | - /------>| libbar_shim.so |------\ -calls to bar | | __wrap_bar() { | | __wrap_bar calls the -become calls | | bar(); | | original bar -to __wrap_bar| | } | | - | +----------------+ V -+---------------+ +----------------+ -| | | | -| | | shared library | -| main binary | | libbar.so | -| foo() { ... } | | bar() { ... } | -| | | | -+---------------+ +----------------+ - ^ +----------------+ | calls to foo -__wrap_foo | | main | | become calls -calls the | | binary shim | | to __wrap_foo -original foo\-------| libmain_shim.so|<-----/ - | __wrap_foo() { | - | foo(); | - | } | - +----------------+ -``` - -Let's assign the main binary protection key 1 and give the library the default, -untrusted protection key. We must specify each compartment's protection key by -passing `--compartment-pkey=$N` to the rewriter when generating their -corresponding shim sources. Then we generate the shim sources and linker args -files with the following. This also modifies `foo.h`, `bar.h` and `baz.h` -in-place and creates backups of the original headers. - -``` -# Generate main_shim.c and main_shim.c.args and specify the main compartment's -# pkey. 
-header-rewriter --compartment-pkey=1 main_shim.c foo.h -- -I $SYS_HEADERS +The rewriter also takes an optional `--output-prefix` for naming the build +artifacts it generates and a list of source files. Generally you'll want to +generate and use a compile commands JSON so that the rewriter preprocesses each +source file with the same command-line arguments that are used to compile it. -# May omit --compartment-pkey or use --compartment-pkey=0 since the library has -# the default pkey. -header-rewriter libbar_shim.c bar.h baz.h -- -I $SYS_HEADERS -``` -To compile the shim sources we specify the shim caller's pkey. We also always disable -lazy binding. +## Manual source changes -``` -# CALLER_PKEY for the main shim is 0 since the library has the untrusted pkey -gcc -shared main_shim.c -Wl,-z,now -DCALLER_PKEY=0 -I $IA2_INCLUDE_DIR \ - -o libmain_shim.so +### Defining compartments -# CALLER_PKEY for the library shim is the main binary's pkey -gcc -shared libbar_shim.c -Wl,-z,now -Wl,--version-script,libbar_shim.c.syms \ - -DCALLER_PKEY=1 -I $IA2_INCLUDE_DIR -o libbar_shim.so -``` - -We now modify the main binary's source to initialize our runtime using -`INIT_RUNTIME` and assign it a protection key with `IA2_COMPARTMENT`. -`INIT_RUNTIME` must be invoked once in the main binary. To assign a protection -key to another shared object, only `IA2_COMPARTMENT` must be added to one of -the object's source files. +The compartments for each DSO are declared with macros in one of their +constituent source files. We also need to declare the number of pkeys used by +the runtime with another macro. Consider a main binary `main.c` which we want to +put in compartment 1 and a library `foo.c` in compartment 2. ``` // main.c -// This header defines INIT_RUNTIME and IA2_COMPARTMENT #include -// Initialize the runtime and allocate 1 protection key. -INIT_RUNTIME(1); +INIT_RUNTIME(2); // This is the number of pkeys needed -// Assign protection key 1 to the main binary. 
+// This must be defined before including the following line #define IA2_COMPARTMENT 1 #include -``` -Then we rebuild the library and ensure it's linked against the main shim. We -must also include the modified header(s) corresponding to the main shim source. -In this case that means including the modified `foo.h`, but the original -`bar.h` and `baz.h`. +// foo.c -``` -gcc -shared bar.c libmain_shim.so -Wl,-z,now -I $IA2_INCLUDE_DIR -fPIC \ - -o libbar.so -``` - -Finally we rebuild the main binary. This time we include the modified `bar.h` -and `baz.h` since they correspond to the library shim and the original `foo.h`. +#include -``` -gcc main.c libbar.so libbar_shim.so libmain_shim.so -Wl,-z,now \ - -I $IA2_INCLUDE_DIR -fPIC -Wl,-rpath=. -Wl,-T/$REPO_ROOT/libia2/padding.ld +#define IA2_COMPARTMENT 2 +#include ``` -All read-only and relro sections loaded from objects on disk will be shared with -all compartments. Binaries on disk are not considered secret, and tampering with -read-only sections is prohibited (TODO: syscall filtering). Relro sections are -read-only after the dynamic linker has applied relocations to the section, and -we do not consider address space layout (and therefore relocation contents) to -be secret. This means that secret data such as keys MUST NOT be embedded in a -binary, e.g. as a string literal. - -Shared objects that are assigned a protection key must have certain -sections page-aligned and padded. This includes `ia2_shared_data`, -`.dynamic`, and other sections that must be accessible from -to any compartment. The `padding.ld` linker script ensures this. This linker -script may augment other linker scripts, but it is the user's responsibility to -ensure these shared sections are page-aligned and padded. Failure to do this -causes the runtime to terminate the program during initialization. - -If direct calls occur in only one direction (e.g. libraries rarely call the main -binary directly), only one shim is required. 
To wrap calls between two shared -libraries in different compartments, the process is the same. - -We currently cannot wrap variadic (varargs) functions correctly. To switch -stacks in that case, we would need to know how many arguments need to be passed -on the stack, and that requires application-specific knowledge (see #18 for -details). We emit a warning for variadic functions in processed headers, but we -preserve the function declaration as-is. This will result in calls to that -function not switching compartments and running with the caller's permission. - -## Indirect calls - -Cross-compartment indirect calls go through call gate wrappers defined by macros -which must be manually added to the program source. We assume a compartment's -exported headers define all function pointer types which it may receive from or -send to another compartment. With this assumption the rewriter has an -`--output-header` option which generates a header with the indirect wrapper -macros. The rewritten exported headers will include the output header by its -full path, so using the indirect wrappers does not require adding any new -includes. - -Let's again consider a main binary with pkey 1 and a library with untrusted pkey -0. This time the library exports `ptr.h` with the following +Note that this must only be included in one source file per DSO. -``` -typedef int(*Fn)(int); -void set_fn_ptr(Fn f); -Fn get_fn_ptr(void); -``` +### Sharing data -Running the rewriter on `ptr.h` will change `Fn` from a typedef for a function -pointer to a typedef for an opaque struct. The output header will then have the -macros needed to manually wrap function pointers in the main binary's source. -Function pointers that will be sent to another compartment or are visible to the -library (e.g. global variables) must be wrapped as follows. Failure to manually -wrap all cross-compartment function pointers or using the incorrect mangled type -name, will give a warning when compiling the main binary. 
This should be -converted to a hard-error with `-Werror=incompatible-pointer-types`. +Some statically-allocated data must be made accessible to all compartments to +avoid significant refactoring. In these cases `IA2_SHARED_DATA` can be used to +place variables in a special ELF section which is explicitly accessible from all +compartments. Note that we assume that on-disk data (i.e. read-only variables) +is not sensitive so this is only needed for some read-write variables. -To send a function pointer to another compartment, you must first define a wrapper with +### Signal handlers -``` -IA2_DEFINE(target_fn, mangled_fnptr_type, target_pkey); -``` +Signal handlers require special call gate wrappers defined by the +`IA2_DEFINE_SIGACTION` or `IA2_DEFINE_SIGHANDLER` macros. To reference the +special call gate wrapper defined for a given function `foo` use +`IA2_SIGHANDLER(foo)`. -This creates a wrapper to call `target_fn` and transition from untrusted pkey 0 -to `target_pkey`. `mangled_fnptr_type` is the mangled function pointer type -which can be found by looking at the definition of `Fn` in the rewritten -`ptr.h`. The mangled function pointer type may also be found in the errors shown -when attempting to compile the source with the rewritten headers, though this -depends on the compiler and how the function pointer is used. +### Interaction with system headers -The wrapper may then be referenced (and passed to a function) with +Function pointers defined in system headers (or those outside +`--root-directory`) do not get rewritten as opaque types. Due to limitations in +the rewriter, this means that some annotations will be inserted and trigger +compiler errors due to type mismatches. For example, consider a compartment that +calls `qsort`. 
This is a libc function with the signature ``` -IA2_WRAPPER(target_fn, target_pkey) +void qsort(void *ptr, size_t count, size_t size, int (*comp)(const void *, const void *)); ``` +Since it is declared in a system header, the fourth argument will remain a +function-pointer instead of being rewritten as an opaque struct. However the +rewriter currently sees the function-pointer type passed in as the fourth +argument and rewrites it as +``` +-qsort(ptr, count, size, cmp_fn) ++qsort(ptr, count, size, IA2_FN(cmp_fn)) ``` -// main.c -#include -// The rewritten ptr.h includes ia2.h for the IA2_* macros. It also includes the -// output header which defines the mangled type macros (i.e. _ZTSPFiiE). -#include "ptr.h" - -INIT_RUNTIME(1); -#define IA2_COMPARTMENT 1 -#include - -// This creates an opaque struct set to NULL. -Fn uninit = IA2_NULL_FNPTR(_ZTSPFiiE); -int incr(int x) { return x + 1; } +To avoid this annotation the `IA2_IGNORE` macro must be added around `cmp_fn` to +ensure it is not modified by the rewriter. -int main() { - // With the modified ptr.h this will fail to compile. - // set_fn_ptr(incr); +``` +qsort(ptr, count, size, IA2_IGNORE(cmp_fn)) +``` - // This defines a wrapper to call `incr` and change the PKRU from the - // untrusted pkey to pkey 1. - IA2_DEFINE_WRAPPER(incr, _ZTSPFiiE, 1); +The same logic applies if a system header defines a struct which contains +function pointers. Note that this temporary limitation of the rewriter will +always cause compiler errors at the sites that need to be changed. 
- // If the wrapper was already defined in another source file, you must - // instead declare it to avoid multiple definition linker errors - // IA2_DECLARE_WRAPPER(incr, _ZTSPFiiE, 1); +### Function pointer annotations in macros - // This passes the wrapper defined/declared on the previous line as an - // argument to set_fn_ptr - set_fn_ptr(IA2_WRAPPER(incr, 1)); +While the rewriter can rewrite object-like macros, automatically rewriting +function-like macros is currently not supported. Again, sites that need manual +changes will trigger compiler errors due to type mismatches. The following +macros are usually adequate to handle these cases and are documented in more +detail in ia2.h. - // Alternatively IA2_DEFINE_WRAPPER_FN_SCOPE may be used as follows to both - // define a wrapper and get a pointer to it. This reduces the number of - // changes that need to be made, but may only be used in functions. - // set_fn_ptr(IA2_DEFINE_WRAPPER_FN_SCOPE(incr, _ZTSPFiiE, 1)); -} ``` - -To call a function pointer received from another compartment use -``` -IA2_CALL(target_fn, mangled_fnptr_type, caller_pkey)(args) +IA2_FN_ADDR(func) - Get the address of the wrapper for `func` +IA2_ADDR(opaque) - Get the address of the wrapper pointed to by the struct `opaque` +IA2_AS_PTR(opaque) - Same as IA2_ADDR but may be used on the LHS for assignment +IA2_FN(func) - Reference the wrapper for `func` as an opaque struct +IA2_CALL(opaque, id) - Calls the wrapper which `opaque` points to +IA2_CAST(func, ty) - Get a struct of type `ty` pointing to `func`'s wrapper ``` -This creates a wrapper to call `target_fn` and transition from `caller_pkey` to -untrusted pkey 0. +## Building the callgate shim + +Running the source rewriter produces a `.c` which will be used to build the +callgate shim library. 
To compile it use ``` -// main.c -#include "ptr.h" - -int main() { - Fn decr = get_fn_ptr(); - // This will fail to compile with modified ptr.h - // decr(); - - if (!IA2_FNPTR_IS_NULL(decr)) { - // This defines a wrapper to call decr and change the PKRU from pkey 1 - // to the untrusted pkey. The function pointer that this expands to must - // be called immediately after invoking the macro. - IA2_CALL(decr, _ZTSPFiiE, 1)(); - } -} +$CC -shared -fPIC -Wl,-z,now callgate_wrapper.c -I /path/to/libia2/include -o libcallgates.so ``` -In both these examples since one side is using untrusted pkey 0 (the library), -function pointers only need to be wrapped on the side with the trusted pkey. -However, we currently assume that trusted compartments all mutually distrust -each other. This means that to send function pointers between compartments with -two different trusted pkeys, function pointers need to be wrapped on both sides. - -Also as shown above, ia2.h contains additional `IA2_*` macros for added -flexibility. In particular the IA2_*_FN_SCOPE macros are generally more -ergonomic, but may not be used in the global scope. See the documentation in -ia2.h for more info. - -## Shared headers +## Compiling and linking the program -The examples above assume that each compartment exports a unique set of headers -which define its interface. However it's not always possible to assign each -header to one compartment. For example, a header may contain declarations for -functions defined in one compartment and types used by multiple compartments. -Assuming all functions declared in a header belong to a single compartment, the -rewriter handles this with the `shared-headers` option. - - -Using the example shown above let's say that `ptr.h` is the same, but -`set_fn_ptr` and `get_fn_ptr` are defined in the main binary. The library uses -the `Fn` typedef so it's headers include `ptr.h`. 
When we run the rewriter we -must pass in `ptr.h`, but also blacklist the function declarations from being -rewritten. +In addition to the flags normally used to build the sources, the following flags +are also required ``` -header-rewriter lib_shim.c lib.h ptr.h --shared-headers ptr.h -``` +# For all DSOs +-fPIC +-DPKEY=$PKEY +-DIA2_ENABLE=1 +-include /path/to/generated_output_header.h +-Werror=incompatible-pointer-types +-Wl,--wrap=pthread_create +-pthread +-Wl,-z,now +-Wl,-z,relro +-Wl,-T/path/to/libia2/padding.ld -This will again modify `lib.h` and `ptr.h` in-place, but avoids changing the -function declarations in `ptr.h` since the library does not define those -functions. Then by including the modified `ptr.h` in the main binary source, we -can wrap function pointers with `IA2_FNPTR_WRAPPER(foo, _ZTSPFiiE, ...)`. +# For the DSO that initializes the runtime +-Wl,--wrap=main +-Wl,--dynamic-list=/path/to/libia2/dynsym.syms +-Wl,--export-dynamic +``` -TODO: Add blurbs on public vs private headers and `$SYS_HEADERS` +Also if the rewriter produces a linker args file for a given compartment (i.e. a +.ld file) you must include `-Wl,@/path/to/generated_linker_args_$PKEY.ld` when +linking that DSO.