Alex Wood edited this page Nov 9, 2022 · 21 revisions

This is an informational page describing how Clasp's build process works, mostly intended for Clasp developers. If you just want to know how to build Clasp, see Building and Installing from Source.

Build system

Clasp is built using Ninja along with a custom metabuilder called Koga, which lives in src/koga. Koga is a Lisp program that will generally be run in SBCL. Based on the configuration it's given, Koga produces .ninja files containing the actual "low level" build instructions, plus various other script files needed by the build. Ninja then uses these to build Clasp. The same mechanism handles both C++ code and Lisp code. File lists and library dependencies are normally specified in cscript.lisp files, but for Lisp code built late in the process, the file lists may instead come from normal ASDF system definitions.

Koga's directions

Koga puts together build instructions primarily by looking for files called cscript.lisp within Clasp's source directories. These are Lisp files consisting mainly of calls to koga:sources, which registers source files and directories with Koga.
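As a rough illustration of the metabuilder idea (registering sources, then emitting Ninja build statements for them), here is a toy sketch. It is not Koga's implementation: register-sources, object-name, write-ninja, the :iclasp target keyword as used here, the c++ command line, and the file paths are all made up for the example.

```lisp
;;; Toy metabuilder sketch. NOT Koga's code: every name below is hypothetical.
(defvar *sources* '()
  "List of (target . path) pairs registered so far, most recent first.")

(defun register-sources (target &rest paths)
  "Record PATHS as belonging to TARGET, loosely like a cscript.lisp entry."
  (dolist (path paths)
    (push (cons target path) *sources*)))

(defun object-name (path)
  "Map a source path such as \"src/core/foo.cc\" to \"src/core/foo.o\"."
  (let ((dot (position #\. path :from-end t)))
    (concatenate 'string (subseq path 0 dot) ".o")))

(defun write-ninja (stream)
  "Emit one compile rule plus one build statement per registered source."
  (format stream "rule cxx~%  command = c++ -c $in -o $out~%~%")
  ;; *sources* is in reverse registration order, so reverse it back.
  (dolist (entry (reverse *sources*))
    (let ((path (cdr entry)))
      (format stream "build ~a: cxx ~a~%" (object-name path) path))))

;; Usage: register two (made-up) files, then print the generated Ninja text.
(register-sources :iclasp "src/core/foo.cc" "src/core/bar.cc")
(write-ninja *standard-output*)
```

The point of the split is that the file lists stay declarative and close to the sources, while the low-level build statements are generated in one place.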

Scraping

The first step of the build process is the "scraper". The scraper, which lives in src/scraper, is a Lisp program (run in SBCL) that runs the Clang preprocessor on the C++ portions of Clasp's source code in order to build up the initially exposed Lisp environment. This allows some of Clasp's Lisp-accessible functions and classes to be written in C++. See Source Scraping Markup for more information on scraper markup.

The scraper will output several C++ header files into the build's generated/ directory. When building with precise garbage collection, it will also output clasp_gc.cc, which contains information for the garbage collector on how objects are laid out. This garbage collection information is gathered by a different process; see "The Static Analyzer" below.

Bootstrap Stages

Using the scraper output, Clasp can now be compiled, starting with iclasp.

iclasp

iclasp is the initial version of Clasp, written in C++. iclasp is a mostly functional Common Lisp, but some standard facilities are missing, e.g. CLOS and compile-file, as they will be defined later in Lisp. The evaluator at this stage is written to use the virtual machine: A compiler written in C++ compiles Lisp, targeting the VM, which is also written in C++. See Virtual machine design elsewhere on the wiki for more detail on the virtual machine.

iclasp is already capable of calling the scraped C++ functions, including the LLVM functions (the llvm-sys package).

While the virtual machine compiler implements all of the basic Lisp semantics, and performs some general optimizations, it is more geared towards simple and fast compilation than fully optimizing.

cclasp

Once iclasp is built, it is used to compile and load the standard library, as well as the Cleavir-based compiler. iclasp with these components built is called cclasp, or just Clasp, and it is a full Common Lisp. When building without extensions, this will be the final product. Clasp uses Cleavir to compile Lisp to native code, but the VM is still used when heavy optimization is not required.

The "cclasp" stage in Koga also includes two modules, ASDF and serve-event.

eclasp

When Clasp is compiled with extensions such as Cando, an additional image called eclasp is built. This image is built from the compiled results of the standard library (the cclasp image) along with the additional Lisp source files in each extension's ASDF definition. For extended Clasp implementations such as Cando, this makes startup much faster, since the standard library and the Lisp code needed by extensions are precompiled into a single image.

Other development builds

A few other processes are relevant to building, but should only have to be used by Clasp developers.

The static analyzer

The static analyzer is a Clang-based Lisp program that performs a deeper analysis of C++ source code than the preprocessor-based scraper is capable of. The analyzer looks through class and structure definitions and outputs a .sif file describing them. As mentioned above, this Scraper InFormation is used by the scraper to generate code for precise garbage collection. This analysis must be run separately for every combination of Clasp and extensions (e.g. Clasp by itself has one sif, and Clasp+Cando has another sif) as extensions can add C++ classes/structures with new layout information.

The analyzer program must itself be run in Clasp, because it uses Clang's ASTMatchers and other analysis libraries directly through Clasp's C++ interoperation rather than through a foreign function interface. As such, to run the analyzer, you will generally need to build Clasp without precise garbage collection (i.e. the "boehm" targets), and then use this imprecise Clasp to analyze Clasp's source files.

The analysis takes minutes to hours even on a good machine, and only needs to be done when the memory layout of a Lisp-accessible structure or class changes, so the sif is checked into version control (as src/analysis/clasp_gc.sif, etc.) rather than being regenerated on each build. If you do need to run the analyzer, Koga generates an analysis target for ninja, or you can use the analyze script to run the analyzer on both Clasp and Cando.

Virtual machine definitions

The virtual machine, the "dtree" virtual machine used for discriminating functions, and the "literals" virtual machine used to construct objects during FASO loading are all defined in src/lisp/kernel/cmp/bytecode-machines.lisp. During build, a function in this file, generate-virtual-machine-header, is called to produce virtualMachine.h, which contains the definitions used in C++.

Note that these definitions include the instruction opcodes and formats, not the actual machine operations. Those are implemented in, respectively, bytecode.cc, funcallableInstance.cc, and compiler.cc, all in src/core.
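To make the "definitions in Lisp, header for C++" arrangement concrete, here is a minimal sketch of emitting C++ opcode constants from a Lisp-side table. It is only an illustration of the technique: the instruction names, the toy_opcode enum, and the functions c-identifier and emit-opcode-header are hypothetical and are not taken from bytecode-machines.lisp or virtualMachine.h.

```lisp
;;; Hypothetical sketch: emit C++ opcode constants from a Lisp table.
;;; The instruction names and the emitted layout are illustrative only.
(defparameter *toy-instructions* '("ref" "const" "call" "return")
  "Instruction names; an instruction's opcode is its position in the list.")

(defun c-identifier (name)
  "Upcase NAME and turn hyphens into underscores."
  (substitute #\_ #\- (string-upcase name)))

(defun emit-opcode-header (stream)
  "Write a C++ enum with one enumerator per entry of *TOY-INSTRUCTIONS*."
  (format stream "#pragma once~%enum class toy_opcode : unsigned char {~%")
  (loop for name in *toy-instructions*
        for opcode from 0
        do (format stream "  ~a = ~d,~%" (c-identifier name) opcode))
  (format stream "};~%"))

;; Usage: print the generated header text.
(emit-opcode-header *standard-output*)
```

Keeping a single table on the Lisp side means the compiler (which emits opcodes) and the C++ interpreter (which decodes them) cannot silently drift apart.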

Unicode

Clasp uses Unicode. To keep up with the periodic changes in Unicode, we have a subsystem to automatically generate Unicode-related Lisp definitions from the canonical UnicodeData.txt and other files. This is hooked up in Koga so that just building the update-unicode target will automatically download the Unicode files from upstream Unicode, parse them, and generate the files. The generated files are checked into source control as src/core/character-generated.cc and tools-for-build/character-names.sexp. This is not run automatically when building Clasp, as it only needs to be done when the Unicode definition has changed.
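For a sense of what parsing involves, UnicodeData.txt is a plain-text file with one character per line and semicolon-separated fields, the first three being the code point in hexadecimal, the character name, and the general category. The sketch below parses one such line; split-fields and parse-unicode-data-line are hypothetical helper names for this example, not functions from Clasp's Unicode subsystem.

```lisp
;;; Hypothetical sketch of parsing one UnicodeData.txt line.
(defun split-fields (line)
  "Split LINE on semicolons, keeping empty fields."
  (loop with start = 0
        for end = (position #\; line :start start)
        collect (subseq line start end)
        while end
        do (setf start (1+ end))))

(defun parse-unicode-data-line (line)
  "Return a plist with the code point, name, and general category of LINE."
  (let ((fields (split-fields line)))
    (list :code-point (parse-integer (first fields) :radix 16)
          :name (second fields)
          :general-category (third fields))))

;; Usage: the entry for U+0041.
(parse-unicode-data-line "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
;; => (:CODE-POINT 65 :NAME "LATIN CAPITAL LETTER A" :GENERAL-CATEGORY "Lu")
```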

Notes

Relevant IRC discussions

  • here drmeister is talking about plans on how to refactor the current startup into saving images