-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
216 lines (160 loc) · 8.61 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
crisp: a purely functional language interpreter bolted to a very unsafe FFI
Repo contents:
crisp.h : header file with core declarations
crisp.c : the core, including cell allocation and evaluation
parse.c : routines for converting between cells and strings
ffi.c : routines supporting the foreign function interface
interpreter.c : REPL
modules/std : a module containing important functions which are
| not required to implement the minimal interpreter
+ std.c : wrappers around zip, concat, apply, assoc which conform
| to the native function interface, plus hash and ispair,
| plus asc, sum, product, and modulus
+ std.crisp : declare the above native functions in the global env
This also contains many useful lambda functions:
nil, not, and, nand, or, nor, neq, dec, inc,
void, makerec, defrec, test, testwith, list-equal,
c{a,d}{a,d}r, reduce, map, all, any, none, filter,
reverse-concat, reverse, reversed-range, range,
repeat, zip, len
tests.crisp : an assortment of tests and additional syntax examples
libc_demo.crisp : a few examples using the FFI with libc
bintree.crisp : an implementation of a basic persistent binary tree map
fuzz_parser.c : a small program to exercise the parser and verify
round-trip stability. Only useful for fuzzing
Building:
git submodule update --init
ln -sr libatomic_ops bdwgc
cmake .
make
# optionally, install with all modules; read Modules below
sudo make install
Some syntax examples and testing code are in the tests file. Run tests:
crisp < tests.crisp
Run as a REPL:
crisp
Modules:
crisp.c contains the minimal evaluation logic, but many common functions
are implemented in the std module. A module is a crisp script, optionally
combined with any number of compiled object files, linked into a shared
library. When a module is loaded with the `import` function, the script is
executed in an environment with a few extra mappings:
this : an FFI_LIBRARY, the module currently being loaded
native-function : convert an FFI_SYMBOL into a NATIVE_FUNCTION
In practice, a module will contain `def`'s to define new functions in the
global environment. For example:
def hash native-function this.hash
will expose `hash` for use after the module is imported. Private functions
need not be declared in this way.
Modules are loaded at runtime using the `import` function. Modules may
import other modules similarly, although crisp does not attempt to detect
or resolve circular dependencies.
Modules are installed by default to /usr/lib/ with filenames of the form
`libstd.crisp.so`. In addition to the default library paths, `import`
tries to find a library named `std` in:
$CRISP_MODULE_PATH/modules/libstd.crisp.so
$CRISP_MODULE_PATH/std/libstd.crisp.so
where CRISP_MODULE_PATH is an environment variable that may be provided at
runtime, and defaults to `./modules/` if not provided. If imports are
failing, check that modules are reachable from the current directory.
Syntax notes:
Whitespace is generally ignored. An exception, in order to work as a REPL,
is that lines are evaluated after each newline, unless there are unclosed
parentheses. Example:
sum 1 2 3 ; result: 6
4 5 ; result: 4 5
(sum 1 2 3
4 5) ; result: 15
The if function takes two required arguments: a predicate and a clause to
be evaluated if the predicate is non-nil. If only two arguments are
supplied and the predicate is nil, nil is returned. If the predicate is
nil, everything after the second argument is evaluated. This allows an
easier syntax for chaining if's. The lambda function works similarly;
evaluating everything after the first argument as a referent. Example:
def fizzbuzz (lambda x
with fizz (equal (modulus x 3) 0)
with buzz (equal (modulus x 5) 0)
if (and fizz buzz)
FizzBuzz
if fizz
Fizz
if buzz
Buzz
x)
map fizzbuzz (range 20)
; output:
; FizzBuzz 1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz
11 Fizz 13 14 FizzBuzz 16 17 Fizz 19
Evaluation notes:
A list is evaluated by examining its head. If the head is callable, and
isn't a special function (examples include def, quote, lambda), which
holds evaluation for its arguments, the arguments are evaluated and the
function is applied. If the head is not callable, each element of the list
is independently evaluated and the resulting list is returned.
The behavior of native functions is unconstrained, although the interface
is defined: a native function accepts a cell which may be either NIL or a
PAIR, and returns a cell which may be of any type.
Little is specified about the behavior of FFI functions - the interface
is not stable enough to describe yet. Roughly, an FFI_FN can be
applied to integer or symbol arguments. Symbol arguments are passed as
pointers to the underlying strings, so (libc.puts foo) prints "foo\n".
The return value of an FFI_FN is always assumed to be an integer.
This means that (libc.malloc 4096) will return an integer pointer which
can be passed to libc.gets or libc.free, for example.
Finally, lambdas are evaluated by evaluating the body of a lambda in a new
environment where the names in the lambda args list are mapped to the
corresponding values to which the lambda is being applied. This new
environment is stacked atop the environment in which the lambda was
created, so variables may override other local variables, but global
definitions are still accessible.
For example, to evaluate
with z 4 (def f lambda (x y) sum y (product z x))
f 3 7
the interpreter first pairs x with 3 and y with 7 to yield a new env
concatenated with the existing environment:
(x . 3) (y . 7) (z . 4) ... (f . (lambda (x y) sum y (product z x)))
Then it evaluates the body of the lambda in this environment, obtaining:
sum y (product z x) ->
y -> 7
product z x ->
z -> 4
x -> 3
-> apply product 4 3 -> 12
-> apply sum 7 12 -> 19
Implementation notes:
At one point crisp was implemented in a purely functional style
(as far as that's possible in C). Later work, particularly work on
optimizing tail calls, has added enough complex control flow that this
is no longer really true, but the goal of moving as much functionality
from the interpreter into the language remains.
There is no special "environment" struct. The environment is a normal list
containing dotted-pair mappings between variable names and their values.
Evaluation contexts are stacked by concatenating a list of name/value pairs
to the previous environment.
The intention is to implement only the minimum functionality in C required
to implement higher-level functionality in the crisp language, such as
recursion, map, filter, defrec, and, via the FFI, libc, and glib, I/O of
some sort.
At present, crisp uses the Boehm-Demers-Weiser (http://www.hboehm.info/gc/)
garbage collector rather than including one of its own. A reference counted
system is a goal of later work.
The FFI functionality is exciting because it allows reuse of libraries from
many other projects with very little extra C code in crisp. Using libc from
crisp allows all sorts of low-level I/O.
Symbols are interned when they are parsed.
Fuzzing:
Automated fuzzing is a fun way to catch bugs. CMake targets are included
building crisp without FFI or GC. FFI yields a lot of false positives when
a fuzzer discovers it can pass a garbage pointer to libc.free for example.
Allowing the fuzzer to call arbitrary external code is also simply a
worrying idea. GC tends to clog up the fuzzer with extra logic and control
flow, and isn't needed for short executions.
Build the slimmer crisp:
cd fuzzing
cmake .. -DCMAKE_C_COMPILER=afl-gcc
make crisp_fuzz
Prepare directories for afl:
mkdir inputs afl_state
echo "(lambda (x y) y x) 4 2" > inputs/simple
Run afl with a 200 MB memory limit:
afl-fuzz -m 200 -x afl_dict.txt -i inputs -o afl_state ./crisp_fuzz