The SafePKT project is implemented in the context of the European NGI LEDGER program (see https://ledger-3rd-open-call.fundingbox.com/).
SafePKT is a project to study and build static analysis technology for the Rust code used in the PKT project. PKT aims at disintermediating telecom monopolies by decoupling the role of infrastructure operation from that of (internet) service provision. PKT uses a blockchain based on the Bitcoin protocol and a proof-of-work algorithm called PacketCrypt, which requires bandwidth in order to mine. The vision of PacketCrypt is to drive investment into network infrastructure by creating artificial demand for bandwidth.
You can find out more about the overall PKT project by going to: https://pkt.cash/
In the SafePKT project we are focused on improving software development efficiency (and therefore time to market) for software used within the PKT ecosystem (e.g. PacketCrypt / cjdns). As security breaches in cryptocurrency software often lead to irrecoverable losses, such projects have higher than normal security requirements. However, in this innovative and competitive space, time to market is also critical to a project's overall success. With the help of cutting-edge academic research, we are developing improved software verification tools which are easier to use and more helpful to the developers who will apply them to improve software development efficiency and security in projects within the PKT ecosystem.
An online demo of Rust-based smart contract verification is available from:
Rust-based smart contract analysis and verification is also available to developers and researchers by installing VS Code and the SafePKT Verifier extension from the VS Code Marketplace.
We're very grateful to the following organizations, projects and people:
- the Project Oak maintainers, for making Rust Verification Tools (RVT) a dual-licensed open-source project (MIT / Apache). The RVT tools allowed us to integrate with industrial-grade verification tools very effectively.
- the KLEE Symbolic Execution Engine maintainers
- the Rust community at large
- the JavaScript and NuxtJS community at large
- The University of Reunion Island and the University of Verona
- All members of the NGI-Ledger Consortium for accompanying us
The technology of PKT overall includes the PacketCrypt bandwidth-hard proof of work, the Lightning Network, and the cjdns networking protocol.
SafePKT technology consists of a web-based frontend and a server-side backend, which work together to provide software developers with reports about potential bugs or security issues in their code.
Independently of the two previous components, both a command-line (CLI) application and a Visual Studio Code extension offer reports with the same level of detail as the backend when it comes to analyzing a Rust-based smart contract written on top of Parity's ink! eDSL - see https://github.com/paritytech/ink/tree/v2.1.0.
While the backend is responsible for all the transformation logic applied to a program source file or to a Rust library-oriented or binary-oriented project, the frontend (a single-page application) enables developers and researchers to trigger the verification process from a web browser before receiving reports which would usually be formatted for command-line shells.
In the same vein as the frontend component, the command-line binary emitted when compiling the backend provides the capability to verify a Rust program, the constraints needed by the verifier being checked on a per-instruction basis. Such a verification process consists in successively running containerized jobs covering the following steps:
- Scaffolding of a new Rust project, uniquely identified from a Rust program source file
- Generation of LLVM bitcode from the newly created project
- Symbolic execution of the program, based on two strategies:
  - declaring the assumed input value types before having the program's intermediate representation (obtained from the previous step) run by the KLEE symbolic execution engine
  - adding tests, also run by KLEE, for library-oriented projects (which is what we eventually decided to focus on)
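The chaining of these containerized jobs can be sketched as a simple ordered pipeline. The enum and function below are purely illustrative (our own naming, not the backend's actual types):

```rust
// Illustrative sketch only: the ordered, containerized steps described above,
// modeled as a simple pipeline. Names are ours, not the backend's actual types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Step {
    ScaffoldProject,      // scaffold a Rust project identified by the source file
    GenerateLlvmBitcode,  // emit LLVM bitcode from the scaffolded project
    RunSymbolicExecution, // have KLEE run the intermediate representation
}

// Each step only makes sense once the previous one has completed.
fn pipeline() -> Vec<Step> {
    vec![
        Step::ScaffoldProject,
        Step::GenerateLlvmBitcode,
        Step::RunSymbolicExecution,
    ]
}

fn main() {
    for step in pipeline() {
        println!("running {:?}", step);
    }
}
```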
SafePKT depends on the following technologies:
- Single-page application:
- Server-side back-end and command-line application:
- VS Code extension:
- Implementation of the backend application skeleton, responding to HTTP requests matching the following routes:
  - POST `/source` to upload program source code (a Rust-based program from which a binary is compiled)
  - GET `/steps` to list verification steps
  - POST `/llvm-bitcode-generation/{project_id}` to run LLVM bitcode generation from a previously uploaded source
  - GET `/llvm-bitcode-generation/{project_id}/report` to get a report for some LLVM bitcode generation
  - GET `/llvm-bitcode-generation/{project_id}/progress` to get the completion status of some LLVM bitcode generation
  - POST `/symbolic-execution/{project_id}` to run KLEE symbolic execution on a previously uploaded program, after its intermediate representation has been generated with LLVM
  - GET `/symbolic-execution/{project_id}/report` to get a report for some symbolic execution
  - GET `/symbolic-execution/{project_id}/progress` to get the completion status of some symbolic execution
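All the per-project routes above follow the same `/{step}/{project_id}[/{action}]` shape; for illustration, a client could derive them with a helper like the one below (this function and the project id are ours, not part of the SafePKT codebase):

```rust
// Hypothetical helper (not part of SafePKT): build the per-project API paths
// listed above from a step name, a project id and an optional action.
fn route(step: &str, project_id: &str, action: Option<&str>) -> String {
    match action {
        Some(a) => format!("/{}/{}/{}", step, project_id, a),
        None => format!("/{}/{}", step, project_id),
    }
}

fn main() {
    // POST target starting a symbolic execution job (illustrative project id)
    println!("{}", route("symbolic-execution", "42aeb5c", None));
    // GET target polling its completion status
    println!("{}", route("symbolic-execution", "42aeb5c", Some("progress")));
}
```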
- Implementation of a frontend application enabling a developer or a researcher to chain the commands required to verify a program written in Rust using the Rust Verification Tools, from the comfort of a web browser.
In the initial phase, our web-based prototype clearly revealed some limitations in terms of usability and ergonomics: there were too many steps involved (three of them, which had to be taken in a specific and tedious order). However, working with a web application also allowed us to better modularize and encapsulate the logic behind the verification process. It turned out that three distinct steps were enough to close a verification job for a trivial program (a multiplication of two 32-bit unsigned integers).
This basic arithmetic example, taken from the RVT project, let us discover the KLEE primitives implemented by the RVT maintainers for verifying Rust programs. It also prevented us from wandering off in too many directions given the complexity of the tools involved: LLVM and KLEE are both important pieces of complex technology required by this project, and the incredible work and deep documentation efforts of the RVT authors saved us a tremendous amount of time and headaches.
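To give a rough idea of the kind of property involved, here is a plain-Rust sketch of such a 32-bit multiplication check. Under RVT/KLEE the two operands would be symbolic values covering all possible inputs; here we only exercise the function with concrete values, and the function name is ours:

```rust
// Plain-Rust sketch of the trivial arithmetic example: under RVT/KLEE the two
// operands would be symbolic, covering all 32-bit inputs, whereas here we
// simply exercise the function with concrete values.
fn checked_multiply(a: u32, b: u32) -> Option<u32> {
    // checked_mul returns None instead of panicking on overflow,
    // which is precisely the class of bug symbolic execution can surface.
    a.checked_mul(b)
}

fn main() {
    assert_eq!(checked_multiply(6, 7), Some(42));
    assert_eq!(checked_multiply(u32::MAX, 2), None); // overflow detected
    println!("overflow handled for the concrete inputs tried");
}
```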
Based on what we learnt in Phase 1, we realized from the feedback we received (especially thanks to Dyne's feedback while accompanying us all along the way) that:
- there were too many steps involved in the verification process
- the web-based UI was costly for large programs (without further optimization of the editor)
- the verification process was quite fragile, since it depended on the intermediate representation obtained from a Rust program through LLVM linking and bitcode generation
- the most recent programs would not be well supported, as the KLEE version we could rely upon was not compiled against the most up-to-date version of LLVM available to the rustc compiler.
As a result, the Rust compiler version we used to leverage the Rust Verification Tools was most of the time incompatible with the targeted programs, which were in essence always quite cutting-edge given the rapid pace of evolution in the field of blockchain smart contract implementation.
All these concerns led us to make the following decisions:
- 🎯 To reduce the number of steps down to the strict minimum.
From 3 steps at the beginning of the project, we simplified the verification process down to 1 step by combining all of them in the backend component and optimizing intermediate operations such as downloading and caching project dependencies.
- 🎯 To move away from the web interface by implementing a command-line interface (CLI). The web app was great for prototyping and collecting feedback without having to install the whole suite of tools, which would take at best 20 minutes to set up. However, as stated before, large programs made the UI sluggish, and supporting them would have required further optimization and the selection of new libraries.
As a consequence, the HTTP API exposed by the backend inherited two additional steps (program verification and source restoration), whereas all previous routes, except the one listing the available steps, have been deprecated:
- POST `/program-verification/{project_id}` to run a smart contract verification
- GET `/program-verification/{project_id}/report` to get a report for some program verification
- GET `/program-verification/{project_id}/progress` to get the completion status of a verification
- POST `/source-restoration/{project_id}` to restore a previously uploaded source
- GET `/source-restoration/{project_id}/report` to get a report for some source restoration
- GET `/source-restoration/{project_id}/progress` to get the completion status of a source restoration
Additional routes have been added to the backend to facilitate project analysis management:
- POST `/uploaded_sources_listing/{project_id}` to run a container executing `ls` in the directory where uploaded sources are stored
- GET `/uploaded_sources_listing/{project_id}/report` to get the logs of the container, whose output contains all the uploaded source ids (`project_id`)
- DELETE `/program-verification/{project_id}` to stop the container executing the verification process for the previously uploaded source whose project id is passed as a parameter
Optimizing the construction of the RVT-based toolset itself was a subtask described in this public issue. No official Docker image could be attached to the project-oak/rvt project, but we eventually managed to publish our own set of images and tags to ease setting up the backend component from the official Docker registry, with the latest tag weighing about 2 GB of compressed image layers, available from https://hub.docker.com/repository/docker/safepkt/rvt
In the end, here is a screenshot showing some of the differences between the first iteration and the latest UI release:
- A more significant program (a real-world case based on `ink!`) instead of a trivial multiplication example
- The reduction of steps down to a single-step program verification (responsible for source upload, with LLVM bitcode generation and symbolic execution with KLEE being abstracted away by the backend).
We also put some effort into converting a smart contract's Rust source into WebAssembly, before translating it back to plain C code by relying on the WebAssembly Binary Toolkit (wabt). However, we quickly had to acknowledge that such an approach would be a dead end for our particular use case, given the amount of C code generated from wasm, for instance for contracts like:
- a simple NFT implementation, or
- an implementation relying on the Hyperledger Sawtooth SDK in Rust.
At some point, we ended up with either:
- too much code to be analyzed with KLEE (the symbolic execution engine running under the hood), or
- too many failures which were hard to debug because of the numerous dependencies of the SDK we tried to analyze.
Most of the experiments which led us to this conclusion can be found or replayed as targets of a Makefile, available in our contribution clone of the substrate-nft project, which is based on the ink! eDSL for writing smart contracts, depending on the paritytech substrate platform.
Since the CLI application was working fine by the time our second internal deadline was reached (see the Initial Mentoring Plan to learn more about the overall chronology and the plan proposed at the beginning of the project), we decided to follow up with the VS Code extension implementation, which we had planned to deliver if we were on schedule with the previous critical steps (the verification of smart contracts itself and the presentation of reports to fellow users, developers and researchers).
As the SafePKT verifier was designed and implemented from the beginning as a set of loosely coupled components, it was fairly easy (all things considered, and after having worked out how best to rely on the Rust Verification Tools) for us:
- to reduce the number of steps in the overall verification process (from the backend and/or the frontend, by considering deployment of both components or only one of them)
- to add new steps to the verification pipeline, each matching a new route of the HTTP API exposed by the backend
- to remove deprecated steps once everything was working as expected from the frontend
- to introduce the SafePKT CLI command, communicating internally with the backend API. The SafePKT library is consumed by two separate entrypoints:
  - the CLI entrypoint
  - the HTTP entrypoint, serving the responses one can expect for each compliant HTTP request
- to port the logic implemented in the frontend, when reaching out to the API exposed by the backend, to a VS Code extension
Here are some of the learnings we gathered all along the project:
- The use of type-safe languages like Rust and TypeScript for the backend, frontend, CLI application and VS Code extension was very helpful for our frequent refactoring operations (text editors and IDEs can make our lives so much better by signaling modifications which violate more or less explicitly declared types)
- The use of a memory-safe language like Rust was of course also critical in gaining confidence while implementing our initial design and its continual series of refinements over the course of those 4 months
- Writing tests for the backend also helped us to better separate the backend's responsibilities when it came down to standardizing:
  - the scaffolding of a Rust-based project from a library or program source code and a template manifest, with the specific set of dependencies mandatory for executing RVT against specific versions of LLVM, KLEE and rustc nightly flavors
  - the execution of arbitrary commands in Docker containers, used as controlled environments where all the needed tools could be set up and torn down at will
  - the collection of reports and job completion statuses
When a test calls a function which should panic (annotated with `#[should_panic]`), KLEE runs the intermediate representation of this function in the background; since KLEE is not yet made aware at this point that a panicking program is the expected outcome, false positives are raised in these cases. In order to prevent such false positives from ruining the development experience, it is now possible to add a conventional suffix to the end of a test function name: e.g. appending `_fails` to the end of `zero_requirement_construction` hints to the backend component that this failing test is expected, and should be treated as equivalent to a passing test.
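The logic behind this naming convention can be sketched as follows; the functions below are illustrative, not SafePKT's actual implementation:

```rust
// Illustrative sketch (not the actual SafePKT implementation) of the naming
// convention: a test whose name ends in `_fails` is expected to panic, so a
// failing KLEE verdict for it should be reported as a success.
fn is_expected_to_fail(test_name: &str) -> bool {
    test_name.ends_with("_fails")
}

// Invert the raw KLEE verdict for tests which are expected to panic.
fn final_verdict(test_name: &str, klee_says_passed: bool) -> bool {
    if is_expected_to_fail(test_name) {
        !klee_says_passed
    } else {
        klee_says_passed
    }
}

fn main() {
    // A `#[should_panic]` test renamed with the `_fails` suffix:
    // KLEE reports a failure, but that is the expected outcome.
    assert!(final_verdict("zero_requirement_construction_fails", false));
    println!("expected failure treated as a pass");
}
```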
Discarding the web UI in the end let us focus on the researcher / developer experience when it comes to relying on the verification results provided by KLEE.
Downloading the VS Code text editor and installing the SafePKT verifier extension is a matter of a couple of minutes, whereas running the single-page application can take longer at the moment because of the limitations of the embedded text editor (VS Code is an actual optimized text editor, and our first online version is very far from perfect at supporting medium to large programs when it comes to activating basic options like syntax highlighting or symbol navigation).
However, we could imagine replacing the current editor component in the future with something more production-ready, like the Monaco Editor, if it made sense to keep offering such a solution as an alternative to editing programs in a developer tool like VS Code.
Since program execution is required for analysis, there is still the open question of how to execute programs securely so that they cannot do anything malicious. This question has not been solved yet, and we are not quite sure that it can be, as the attack surface is so large. For now, we consider isolating program execution within unprivileged containers to be a very first step.
From a very naive perspective, there would be a need to drastically limit I/O and networking operations, along with CPU and memory allocation. All of these resources can be made available or restricted with options activated or deactivated through the Docker engine. We have also been thinking about using Firecracker to further isolate the verification process; for now, we consider that interested parties will be able to run verification pipelines on their own internal infrastructure, while we offer a basic hard-coded example test project verification demo from one of our dedicated servers.
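Part of that resource limiting can be expressed declaratively with Docker Compose. The fragment below is a hypothetical sketch of such hardening (the service name and limit values are illustrative, not the project's actual configuration; only the image reference comes from our registry):

```yaml
# Hypothetical sketch of container-level hardening for a verification job;
# service name and limit values are illustrative, not SafePKT's actual config.
services:
  verifier:
    image: safepkt/rvt:verifier
    network_mode: none        # no networking from inside the job
    read_only: true           # immutable root file system
    cap_drop: [ALL]           # drop all Linux capabilities
    pids_limit: 256           # bound process creation
    mem_limit: 2g             # cap memory allocation
    cpus: 1.0                 # cap CPU allocation
    security_opt:
      - no-new-privileges:true
```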
A list of possible improvements could be as long as a Christmas wishlist, so let us instead make it as short as possible by focusing on the essentials:
- Addition of single-test runs by providing a RegExp pattern
- Parallelization of test-based verification by the backend
- Upgrade / fix of the Rust dependencies (such as the `ink!` eDSL) mandatory for running:
  - compilation with rustc nightly, emitting LLVM bitcode compatible with one of the most recent versions of LLVM
  - symbolic execution with KLEE, to be made compatible with the latest version of LLVM
  See this issue from project-oak/rvt about `cargo verify` being incompatible with LLVM 11
- Port of the verifier for continuous delivery with GitHub Actions pipelines
- Port of the verifier as an extension for JetBrains products
- Port of the verifier as a SonarQube plugin
The default branch (`buggy-smart-contract`), available from the SafePKT smart contract example, offers a clear demonstration of how to leverage the SafePKT verification tools in order to discover issues in a Rust-based smart contract. The smart contract example here is based on the ERC-20 technical standard and can be found in the following repository:
https://github.com/LedgerProject/safepkt_smart-contract-example
Installing a backend separately from a web server proxying HTTP requests can prove to be tricky because of access-list permissions.
In a configuration where we have:
- an nginx service running as the `www-data` user, proxying HTTP requests to a SafePKT backend instance running as a user named `rvt`, and
- a reliance on system `uid:gid` pairs to make an explicit mapping between the host and container file systems:
  - `1000:1000` for `www-data` in this case
  - `1001:1001` for `rvt`

a working strategy consists in ensuring that:
- the `safepkt_backend` repository clone belongs to `rvt:www-data`
- `rvt` is the user running containers and writing to the file system mounted via the Docker engine
- `rvt` belongs to the `www-data` group
- `rvt` belongs to the `docker` group
- `www-data` is the user running the nginx instance
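The `uid:gid` mapping can also be pinned on the container side. This Compose fragment is a hypothetical sketch of that mapping (the service name and mount paths are illustrative, not the project's actual configuration):

```yaml
# Hypothetical sketch: pin the container to the host's `rvt` uid:gid so that
# files written to the bind mount stay readable by the `www-data` group.
services:
  backend:
    image: safepkt/rvt:verifier
    user: "1001:1001"                 # rvt's uid:gid on the host
    volumes:
      - ./uploaded-sources:/uploads   # illustrative paths
```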
As of today, the verification process takes about 90 seconds when executed from our dedicated server for a suite of about 30 tests, without fuzzing.
Provided that such a verification involves:
- program compilation emitting LLVM bitcode,
- KLEE symbolic execution of a test suite, and
- report extraction from a tool writing to the standard output of a container running for each verification request (an actual HTTP request being sent to a remote backend for the whole test suite on which the software verification is based),
we consider that further optimization would indeed be mandatory in order to decrease the current latency significantly enough for a daily, industry-grade feedback loop to be considered satisfying.
The upside is that the packaging and publication of new versions of the extension is now pretty much semi-automated with the VS Code Extension Manager.
The backend deployment remains to be fully automated, even though the binaries pre-built with a GitHub Actions workflow save us some time (no more manual compilation is needed, and a nice history of all previously available versions of both the backend and CLI applications can be found on the backend project's releases page).
The overall project is maintainable, and contributions are very welcome: clone the https://github.com/LedgerProject/safepkt repository, then follow the instructions from its README documentation.
This repository carries pointers to all the other repositories of the project by leveraging git submodules.
If we were to develop the web frontend further, we would only need to make sure that our backend scales properly, since the frontend is continuously delivered: whenever we push to the frontend's main branch, the project builds with vercel.com, allowing us to try variations of our app upfront pretty easily thanks to nuxt.js's out-of-the-box, airtight integration with Vercel.
Since our backend component is responsible for all the heavy lifting, we have opted to rely on a plain description of the web and job-oriented services involved in the verification process: the SafePKT web service description following the Compose API reference. Please see also the Compose v3 API reference.
As a result, when the backend component needs to be installed before first being deployed on a dedicated server or among other collocated services, we no longer need to build the verifier Docker image run by the backend for each verification job; we only need to download it once from the official Docker registry by running the Docker command required to pull the needed image (`safepkt/rvt:verifier`).
Relying on a ready-to-use container image is now a matter of configuring the backend by setting an environment variable holding the name and tag of the image expected to be pulled from the registry.
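In practice this boils down to a single line in an environment file; the variable name below is hypothetical (check the backend's README for the actual setting), only the image reference itself comes from our registry:

```shell
# Hypothetical variable name (see the backend's README for the actual one);
# the value is the image name and tag pulled from the official Docker registry.
RUST_VERIFICATION_IMAGE=safepkt/rvt:verifier
```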
- The main SafePKT project repository is made of documentation and git submodules pointing to