-
Notifications
You must be signed in to change notification settings - Fork 0
A C interpreter for a purely functional language, bolted to a very unsafe FFI
License
andrewbuss/crisp
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
crisp: a purely functional language interpreter bolted to a very unsafe FFI Repo contents: crisp.h : header file with core declarations crisp.c : the core, including cell allocation and evaluation parse.c : routines for converting between cells and strings ffi.c : routines supporting the foreign function interface interpreter.c : REPL modules/std : a module containing important functions which are | not required to implement the minimal interpreter + std.c : wrappers around zip, concat, apply, assoc which conform | to the native function interface, plus hash and ispair, | plus asc, sum, product, and modulus + std.crisp : declare the above native functions in the global env This also contains many useful lambda functions: nil, not, and, nand, or, nor, neq, dec, inc, void, makerec, defrec, test, testwith, list-equal, c{a,d}{a,d}r, reduce, map, all, any, none, filter, reverse-concat, reverse, reversed-range, range, repeat, zip, len tests.crisp : an assortment of tests and additional syntax examples libc_demo.crisp : a few examples using the FFI with libc bintree.crisp : an implementation of a basic persistent binary tree map fuzz_parser.c : a small program to exercise the parser and verify round-trip stability. Only useful for fuzzing Building: git submodule update --init ln -sr libatomic_ops bdwgc cmake . make # optionally, install with all modules; read Modules below sudo make install Some syntax examples and testing code are in the tests file. Run tests: crisp < tests.crisp Run as a REPL: crisp Modules: crisp.c contains the minimal evaluation logic, but many common functions are implemented in the std module. A module is a crisp script, optionally combined with any number of compiled object files, linked into a shared library. When a module is loaded with the `import` function, the script is executed in an environment with a few extra mappings: this : an FFI_LIBRARY, the module currently being loaded native-function : convert an FFI_SYMBOL into a NATIVE_FUNCTION In practice, a module will contain `def`'s to define new functions in the global environment. For example: def hash native-function this.hash will expose `hash` for use after the module is imported. Private functions need not be declared in this way. Modules are loaded at runtime using the `import` function. Modules may import other modules similarly, although crisp does not attempt to detect or resolve circular dependencies. Modules are installed by default to /usr/lib/ with filenames of the form `libstd.crisp.so`. In addition to the default library paths, `import` tries to find a library named `std` in: $CRISP_MODULE_PATH/modules/libstd.crisp.so $CRISP_MODULE_PATH/std/libstd.crisp.so where CRISP_MODULE_PATH is an environment variable that may be provided at runtime, and defaults to `./modules/` if not provided. If imports are failing, check that modules are reachable from the current directory. Syntax notes: Whitespace is generally ignored. An exception, in order to work as a REPL, is that lines are evaluated after each newline, unless there are unclosed parentheses. Example: sum 1 2 3 ; result: 6 4 5 ; result: 4 5 (sum 1 2 3 4 5) ; result: 15 The if function takes two required arguments: a predicate and a clause to be evaluated if the predicate is non-nil. If only two arguments are supplied and the predicate is nil, nil is returned. If the predicate is nil, everything after the second argument is evaluated. This allows an easier syntax for chaining if's. The lambda function works similarly; evaluating everything after the first argument as a referent. Example: def fizzbuzz (lambda x with fizz (equal (modulus x 3) 0) with buzz (equal (modulus x 5) 0) if (and fizz buzz) FizzBuzz if fizz Fizz if buzz Buzz x) map fizzbuzz (range 20) ; output: ; FizzBuzz 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz 16 17 Fizz 19 Evaluation notes: A list is evaluated by examining its head. If the head is callable, and isn't a special function (examples include def, quote, lambda), which holds evaluation for its arguments, the arguments are evaluated and the function is applied. If the head is not callable, each element of the list is independently evaluated and the resulting list is returned. The behavior of native functions is unconstrained, although the interface is defined: a native function accepts a cell which may be either NIL or a PAIR, and returns a cell which may be of any type. Little is specified about the behavior of FFI functions - the interface is not stable enough to describe yet. Roughly, an FFI_FN can be applied to integer or symbol arguments. Symbol arguments are passed as pointers to the underlying strings, so (libc.puts foo) prints "foo\n". The return value of an FFI_FN is always assumed to be an integer. This means that (libc.malloc 4096) will return an integer pointer which can be passed to libc.gets or libc.free, for example. Finally, lambdas are evaluated by evaluating the body of a lambda in a new environment where the names in the lambda args list are mapped to the corresponding values to which the lambda is being applied. This new environment is stacked atop the environment in which the lambda was created, so variables may override other local variables, but global definitions are still accessible. For example, to evaluate with z 4 (def f lambda (x y) sum y (product z x)) f 3 7 the interpreter first pairs x with 3 and y with 7 to yield a new env concatenated with the existing environment: (x . 3) (y . 7) (z . 4) ... (f . (lambda (x y) sum y (product z x))) Then it evaluates the body of the lambda in this environment, obtaining: sum y (product z x) -> y -> 7 product z x -> z -> 4 x -> 3 -> apply product 4 3 -> 12 -> apply sum 7 12 -> 19 Implementation notes: At one point crisp was implemented in a purely functional style (as far as that's possible in C). Later work, particularly work on optimizing tail calls, has added enough complex control flow that this is no longer really true, but the goal of moving as much functionality from the interpreter into the language remains. There is no special "environment" struct. The environment is a normal list containing dotted-pair mappings between variable names and their values. Evaluation contexts are stacked by concatenating a list of name/value pairs to the previous environment. The intention is to implement only the minimum functionality in C required to implement higher-level functionality in the crisp language, such as recursion, map, filter, defrec, and, via the FFI, libc, and glib, I/O of some sort. At present, crisp uses the Boehm-Demers-Weiser (http://www.hboehm.info/gc/) garbage collector rather than including one of its own. A reference counted system is a goal of later work. The FFI functionality is exciting because it allows reuse of libraries from many other projects with very little extra C code in crisp. Using libc from crisp allows all sorts of low-level I/O. Symbols are interned when they are parsed. Fuzzing: Automated fuzzing is a fun way to catch bugs. CMake targets are included building crisp without FFI or GC. FFI yields a lot of false positives when a fuzzer discovers it can pass a garbage pointer to libc.free for example. Allowing the fuzzer to call arbitrary external code is also simply a worrying idea. GC tends to clog up the fuzzer with extra logic and control flow, and isn't needed for short executions. Build the slimmer crisp: cd fuzzing cmake .. -DCMAKE_C_COMPILER=afl-gcc make crisp_fuzz Prepare directories for afl: mkdir inputs afl_state echo "(lambda (x y) y x) 4 2" > inputs/simple Run afl with a 200 MB memory limit: afl-fuzz -m 200 -x afl_dict.txt -i inputs -o afl_state ./crisp_fuzz
About
A C interpreter for a purely functional language, bolted to a very unsafe FFI
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published