
Dev Guide

1) Overview

Directory Structure

.
├── data                                # directory with scanner results
│
├── kalm_benchmark                      # source code of the package
│   ├── cli.py                          # command line interface
│   ├── evaluation                      # module for evaluating the scanner results
│   │   ├── scanner                     # module containing a dedicated script per scanner for parsing its results
│   │   │   ├── ...
│   │   │   └── scanner_evaluator.py    # the base for the scanner-specific scripts
│   │   └── evaluation.py               # the core of the evaluation module
│   ├── manifest_generator              # module for generating the benchmark manifests
│   │   ├── cdk8s_imports               # (generated) imports of k8s definitions generated by cdk8s
│   │   ├── constants.py                # collection of constants shared across manifests
│   │   ├── gen_manifests.py            # entry point for generating the manifests
│   │   ├── ...
│   │   └── workload                    # definition of workload-related manifests
│   └── ui                              # module for the visualization of the evaluation
│
├── manifests                           # (generated) the target directory for generated manifests
├── notebooks                           # folder containing all notebooks used for the analysis
├── tests                               # all unit tests, mirroring the source code structure
└── tox.ini                             # tox file with settings for flake8 and for running tox

The project consists of two main modules:

  • manifest_generator: the code for generating and managing the manifests of the benchmark
  • evaluation: all the code for the scanners and their evaluation (the scanner-specific parsers build on a common base, sketched below)
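
Each parser in evaluation/scanner builds on the shared base in scanner_evaluator.py. The following is a minimal sketch of this pattern; the class and method names are illustrative, not the project's actual API:

from abc import ABC, abstractmethod
from pathlib import Path


class EvaluatorBase(ABC):
    """Illustrative stand-in for the shared base in scanner_evaluator.py."""

    @abstractmethod
    def parse_results(self, raw: str) -> list[dict]:
        """Turn scanner-specific output into a list of check records."""

    def load_results(self, path: Path) -> list[dict]:
        # Loading is shared; parsing is delegated to the concrete scanner class.
        return self.parse_results(path.read_text())


class DummyScanner(EvaluatorBase):
    def parse_results(self, raw: str) -> list[dict]:
        # A real implementation would parse the scanner's actual output format.
        return [{"check_id": line.strip()} for line in raw.splitlines() if line.strip()]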

Setup

1) Installation

To install the project and all its dependencies, execute:

poetry install

After the installation, the pre-commit hooks must be installed:

poetry run pre-commit install

This sets up the following tools, each with minor project-specific adjustments:

  • black to format the code
  • flake8 to lint the code
  • isort to sort the import statements
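
The hooks then run automatically on every commit. To trigger them manually on the whole codebase, run:

poetry run pre-commit run --all-files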

2) Dependency management

Poetry is used to manage this project. Thus, to install a new dependency, use Poetry instead of pip:

poetry add <dependency>
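
Development-only tools can be added to a dedicated dependency group instead, for example:

poetry add --group dev <dependency>

Note that the --group flag requires Poetry 1.2 or newer; older versions use the --dev flag instead.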

3) Run Tests

pytest is used for the unit tests. To run all of them, enter:

poetry run pytest
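
The usual pytest selection options work as well, e.g. running only the tests whose name matches a keyword (the keyword below is just an example):

poetry run pytest -k "evaluation"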

Evaluation pipeline

The evaluation of the scanner results runs through a unified pipeline. At the beginning, the results of a scanner are loaded and parsed. These steps are specific to every scanner and can be customized in the respective implementation. Afterwards, the results are structured as a table and any missing check IDs are imputed. In parallel, the benchmark table is created by implicitly generating the manifests using cdk8s.

Both tables are merged on their check_id column and the respective columns containing information about the checked path. Finally, the outcome is post-processed by removing duplicates and missing values and by categorizing the checks.
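
The merge step can be pictured with a small pandas example. The frames below are simplified stand-ins; only check_id is named above, and checked_path is an assumed name for the column with the path information:

import pandas as pd

# Simplified stand-ins for the two tables produced by the pipeline.
# "checked_path" is an assumed name for the column with the path information.
scanner_results = pd.DataFrame({
    "check_id": ["CHK-1", "CHK-2"],
    "checked_path": [".spec.a", ".spec.b"],
    "got": ["alert", "pass"],
})
benchmark = pd.DataFrame({
    "check_id": ["CHK-1", "CHK-3"],
    "checked_path": [".spec.a", ".spec.c"],
    "expected": ["alert", "alert"],
})

# An outer merge keeps checks that appear in only one of the two tables,
# so missing values can be handled in the post-processing steps afterwards.
merged = scanner_results.merge(benchmark, on=["check_id", "checked_path"], how="outer")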

The full pipeline is as follows:

flowchart TD;
    subgraph scanner
        direction LR;
        A[Load Scan Result] --> scan_parse[Parse Results JSON]
        scan_parse --> impute[Impute Check ID]
        impute --> tab_scan[Tabulate Results]
    end
    subgraph benchmark
        B[Load Benchmark] --> tab_bench[Tabulate Benchmark]
    end
    tab_scan --> merge{Merge <br/> dataframes}
    tab_bench --> merge
    merge --> drop_redundants[Drop Redundant Checks]
    drop_redundants --> drop_dups[Drop Duplicates]
    drop_dups --> single_name[Unify Names]
    single_name --> cat[Categorize Check]
    cat --> oos[Filter Out of Scope Results]
    oos --> fill_na[Fill NAs]
    fill_na --> classify[Classify Benchmark Result]
    classify --> Z([Done])