This repository contains the code for the Embedded Machine Learning Lab Challenge. The focus of the lab was to speed up inference of TinyYOLOv2 by applying several steps: fine-tuning, layer fusion, pruning, and other optimizations.
TODO:
- refactor training
- tune hyperparameters (starting with the learning rate)
- add early stopping to avoid overfitting
- save a model checkpoint after each epoch (see the training-loop sketch after this list)
- add a quantized version of YOLO
- add fusion and quantization with the PyTorch quantization API (model 1: post-training quantization, PTQ; model 2: quantization-aware training, QAT); see the PTQ sketch after this list
- test model 1
- train and test model 2
- add a self-implemented fused conv_bn layer and a quantized TinyYOLOv2 built on it (model 3); see the BN-folding sketch after this list
- train model 3
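A minimal sketch of the refactored training loop with early stopping and per-epoch checkpoints. Everything model-specific (the network, dataloaders, and loss) is passed in as a placeholder; the `checkpoints/` path and the patience value are assumptions, not values taken from this repo.

```python
import copy
import os

import torch

def train(model, train_loader, val_loader, loss_fn, epochs=50, lr=1e-3, patience=5):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    os.makedirs("checkpoints", exist_ok=True)
    best_val, best_state, stale = float("inf"), None, 0

    for epoch in range(epochs):
        model.train()
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()

        # validation pass used for early stopping
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for images, targets in val_loader:
                val_loss += loss_fn(model(images), targets).item()
        val_loss /= len(val_loader)

        # save a checkpoint after every epoch
        torch.save(model.state_dict(), f"checkpoints/epoch_{epoch:03d}.pt")

        # early stopping: stop once validation loss has not improved
        # for `patience` consecutive epochs
        if val_loss < best_val:
            best_val, best_state, stale = val_loss, copy.deepcopy(model.state_dict()), 0
        else:
            stale += 1
            if stale >= patience:
                break

    model.load_state_dict(best_state)
    return model
```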
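A hedged sketch of model 1 (PTQ) with PyTorch's eager-mode quantization API. The module names passed to `fuse_modules` are hypothetical and must match TinyYOLOv2's actual attribute names, and eager-mode quantization additionally assumes the model wraps its forward pass in `QuantStub`/`DeQuantStub`.

```python
import torch
import torch.ao.quantization as tq

def quantize_ptq(model, calib_loader):
    model.eval()
    torch.backends.quantized.engine = "qnnpack"  # ARM/Jetson backend; use "fbgemm" on x86
    # 1) fuse conv + batchnorm pairs so each is quantized as a single op
    #    (module names here are placeholders)
    model = tq.fuse_modules(model, [["conv1", "bn1"], ["conv2", "bn2"]])
    # 2) attach a quantization config matching the chosen backend
    model.qconfig = tq.get_default_qconfig("qnnpack")
    # 3) insert observers that record activation ranges
    tq.prepare(model, inplace=True)
    # 4) calibrate on a few representative batches
    with torch.no_grad():
        for images, _ in calib_loader:
            model(images)
    # 5) swap observed modules for real int8 kernels
    return tq.convert(model)
```

Model 2 (QAT) follows the same outline but uses `tq.get_default_qat_qconfig` and `tq.prepare_qat` on a model in train mode, fine-tunes for a few epochs, and only then calls `tq.convert`.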
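For model 3, the self-implemented fused conv_bn layer comes down to folding the BatchNorm statistics into the convolution weights and bias. A sketch of the standard folding algebra, not necessarily the exact implementation used here:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      dilation=conv.dilation, groups=conv.groups, bias=True)
    # per-output-channel scale: gamma / sqrt(running_var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    # fused weight: W' = W * scale (broadcast over each output channel)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    # fused bias: b' = (b - running_mean) * scale + beta
    fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused
```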
- add pruning (see the pruning sketch below)
- export to ONNX
- test inference with ONNX Runtime (see the export sketch below)
- add the detection pipeline to the camera loop
- add frame-rate measurements (see the camera-loop sketch below)
- test each model for the demo
- add TensorBoard logging for every step
- add visuals for the different models
- see the graphs from this paper
- compare ROC curves of the different models (one plot with all curves)
- compare model sizes (state_dicts) after each step (histogram)
- implement a test for the inference-time improvement (see the last cell in the quantization notebook and the benchmark sketch below)
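For the pruning item, a sketch of unstructured magnitude pruning with `torch.nn.utils.prune`; the 30% sparsity level is an arbitrary example, and the Taylor-pruning repo linked under Resources is a structured alternative.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_convs(model, amount=0.3):
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # zero out the `amount` fraction of smallest-magnitude weights
            prune.l1_unstructured(module, name="weight", amount=amount)
            # fold the mask back into the weight tensor so the model
            # saves and exports like a normal module
            prune.remove(module, "weight")
    return model
```

Note that unstructured zeros do not by themselves speed up dense kernels; a short fine-tuning pass after pruning usually recovers most of the lost accuracy.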
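For the ONNX items, a sketch of exporting the model and checking that ONNX Runtime reproduces the PyTorch output. The 416x416 input is the usual TinyYOLOv2 resolution and is an assumption here.

```python
import numpy as np
import torch
import onnxruntime as ort

def export_and_check(model, path="tinyyolo.onnx"):
    model.eval()
    dummy = torch.randn(1, 3, 416, 416)  # assumed input resolution
    torch.onnx.export(model, dummy, path,
                      input_names=["input"], output_names=["output"],
                      opset_version=13)
    session = ort.InferenceSession(path)
    onnx_out = session.run(None, {"input": dummy.numpy()})[0]
    torch_out = model(dummy).detach().numpy()
    # the two runtimes should agree up to small numerical error
    print("max abs diff:", np.abs(onnx_out - torch_out).max())
```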
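For the camera-loop items, a sketch of the demo loop with a frame-rate overlay. `detect` and `draw_boxes` are hypothetical stand-ins for this repo's detection pipeline, and camera index 0 is an assumption.

```python
import time
import cv2

cap = cv2.VideoCapture(0)
frames, t0 = 0, time.perf_counter()
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # boxes = detect(model, frame)       # hypothetical detection call
    # frame = draw_boxes(frame, boxes)   # hypothetical overlay helper
    frames += 1
    fps = frames / (time.perf_counter() - t0)  # average FPS since loop start
    cv2.putText(frame, f"{fps:.1f} FPS", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("demo", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```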
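For the inference-time test and the size comparison, a sketch of two small helpers; the run counts, input shape, and temporary file name are arbitrary choices.

```python
import os
import time

import torch

@torch.no_grad()
def mean_latency_ms(model, runs=100, shape=(1, 3, 416, 416)):
    model.eval()
    x = torch.randn(shape)
    for _ in range(10):    # warm-up iterations, not timed
        model(x)
    t0 = time.perf_counter()
    for _ in range(runs):  # on GPU, call torch.cuda.synchronize() before/after timing
        model(x)
    return (time.perf_counter() - t0) / runs * 1e3

def state_dict_size_mb(model, path="_tmp.pt"):
    torch.save(model.state_dict(), path)
    size_mb = os.path.getsize(path) / 1e6
    os.remove(path)
    return size_mb
```

Calling these on the float baseline and on each optimized variant gives the numbers for the latency comparison and the size histogram.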
Resources:
- Alternative pruning method: https://github.com/NVlabs/Taylor_pruning
- ONNX Runtime graph optimizations: https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html
- Up-to-date Jetson Docker containers: https://github.com/dusty-nv/jetson-containers