This repository serves as a mirror for the source code of Codex. Codex is two things:
- Codex semantics Library: An OCaml library to write static analysers based on abstract interpretation;
- Sound static analysers based on this library. Two are currently available:
Codex has notably been used for automated verification of security properties of a whole embedded kernels directly from its binary executable [RTAS’21], to verify the absence of spatial memory corruption in challenging code manipulating data structure libraries [VMCAI’22]. The structure of the analysis is based on the relational composition of expressions, and we are in the early stages of establishing its theoretical foundations, as detailed in [POPL’23].
- [RTAS’21]
- No Crash, No Exploit: Automated Verification of Embedded Kernels. O. Nicole, M. Lemerre, S. Bardin, X. Rival (Best paper)
- [VMCAI’22]
- Lightweight Shape Analysis based on Physical Types, O. Nicole, M. Lemerre, X. Rival
- [POPL’23]
- SSA Translation Is an Abstract Interpretation, M. Lemerre (Distinguished paper).
The repository example contains the results of a simple program (computing the absolute value of an integer) in both C and machine code analysis. The results of the analysis can be presented as a standalone html file, that you can find for the C analysis here. The binary analysis example (here) analyses the machine code version of same file, but is also given a type specification (which can be found here) that restricts the possible values at the start of the program to be in some interval.
Codex is capable of handling much more than these simple examples! We will soon provide tutorials to help you perform your own analyses, but you can already try Codex by replacing these examples by your own code.
This is our first beta release, not intended to be a plug-and-play static analysis tool. The theory is sound and the tool has been used in several important case studies, but please be aware of the current lack of comprehensive documentation or polished user interface. Codex is rapidly evolving and the interface of some internal components and command line options may also change in the future. If you are require assistance, feel free to reach out (matthieu dot lemerre at cea dot fr).
We have now entered a phase where the primary objective is to enhance the tool’s usability beyond our internal team, so anticipate forthcoming improvements in user-friendliness and updates to the documentation.
Additionally, with this release, we aim to initiate the exportation of abstract interpretation components that may prove beneficial to other static analysis developers. We welcome feedback to aid us in refining and enhancing the tool.
The installation instructions is summarized in the Dockerfile (that you can use to build with docker).
The simplest way to try Binsec/Codex and Frama-C/Codex is to download the compiled version. It should work on any platform supporting the execution of x86-64bit linux executables (including WSL)
(note that framac_codex.exe is a stripped version of Frama-C, with no plugins and support for compiler builtins removed)
The easiest way is to use opam ( https://opam.ocaml.org/ ).
Just locally pin the package with:
$ opam pin . -n
You can then install the different packages:
$ opam install codex # For the codex library
$ opam install binsec_codex # For the binsec_codex executable
$ opam install frama_c_codex # For the frama_c_codex executable.
You can also install the Frama-c/Codex plugin if you already have a version of Frama-C installed.
$ opam install frama_c_codex_plugin
You can test that frama_c_codex is working by typing:
frama_c_codex -no-plugin-autoload -machdep x86_32 -codex test.c -codex-use-type-domain -codex-type-file frama-c/test.types -codex-html-dump output.html
This should output an output.html with the results of the analysis, that you can inspect using your browser.
Install opam:
sudo apt install opam # For debian
opam init --compiler 4.14.1 # Takes a while
eval $(opam env)
Install the dependencies:
opam install frama-c binsec lmdb qcheck-core ppx_inline_test sexplib mlcuddidl bheap pacomb
Note: there is a bug in Frama-C >= 28.0 that breaks some of the testsg, so stick to 27.1 if this matters to you (e.g. if you develop new Codex features).
dune build
opam install binsec
dune build binsec/binsec_codex.exe
We need a patched version of Frama-C to link statically with cudd, that cannot be loaded dynamically.
You can do it manually by adding “cudd” in the “libraries” field for the “frama_c_kernel” target in the file src/dune in Frama-C, which you should do if you otherwise use Frama-C.
Otherwise, you can install a stripped-down version of Frama-C using this command:
make opam-install-frama-c-with-cudd
You can build the docker image with
docker build -t codex -f Dockerfile.debian .
(replace with Dockerfile.alpine for the alpine version).
You can then run the docker image with
docker run -ti –network host codex bash
We describe the structure of the library top-down:
- binsec
- The BINSEC/Codex plugin, doing binary code analysis using BINSEC as a frontend.
- frama_c
- The Frama-C/Codex plugin, doing C code analysis using Frama-C as a frontend.
- codex
- The main library, used to implement static analyses.
- codex.fixpoint
- Fixpoint engines (for forward flow-sensitive analyses).
- codex.domains
- Memory and numerical domains of Codex.
- codex.types
- Code used by the type-based abstract domains , e.g. parser or types
- codex.single_value_abstraction
- Abstraction of scalar values (e.g. interval, congruence, tristate/bitwise abstraction, etc.) including transfer functions.
- codex.lattices
- Lattices used in single_value_abstraction.
- codex.constraints
- Symbolic expressions; this corresponds to the SSA expressions described in the “SSA Translation Is an Abstract Interpretation” POPL 2023 paper.
- codex.smtbackend
- Translating the SSA/Symbolic constraints into a SMT formula for symbolic model-checking.
- codex.transfer_functions
- Definitions of function symbols, syntax and semantics and their implementation in the concrete. The main file is Transfer_functions_sig.ml. Transfer_functions/term allows building terms containing Codex expressions.
- codex.codex_log
- Logging facilities.
- codex.codex_config
- Compile-time configuration options.
- utils
- a collection of reusable datastructures and algorithms suitable to abstract interpretation, such as maps, sets etc. We are in the process of outsourcing some of them to make them more easily reusable.
- ext
- is a collection of code taken from other projects, and the place where we install the patched version of Frama-C.