Rethinking Efficient Lane Detection via Curve Modeling (CVPR 2022)
State-of-the-art lane detectors are typically based on semantic segmentation (SCNN, RESA) or point detection (LaneATT). However, semantic segmentation requires customized post-processing and cannot handle a variable number of lanes, while point detection methods are currently anchor-based and rely on NMS as post-processing. A more natural approach is to predict a curve representation directly. Methods like LSTR took the first steps in this direction, but did not achieve performance comparable to SOTA methods. BézierLaneNet uses a fully convolutional network to predict cubic Bézier curves; the ease of optimizing Bézier control points makes it possible for direct curve methods to compete with SOTA methods. A fusion of flipped feature maps is also employed to exploit symmetry in the car's front view. BézierLaneNet (ResNet-34) achieves 75.6 F1 on CULane and attained 1st place (among all published methods) on the LLAMAS leaderboard at the time, while running at 150 FPS in our benchmark.
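To make the curve representation concrete, here is a minimal sketch of how four control points define a cubic Bézier curve and how lane points could be sampled from it. It uses only the standard Bernstein form of a cubic Bézier curve. The function names, the normalized image coordinates, and the control-point values are illustrative assumptions, not this repo's API or actual network outputs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cubic_bezier(control_points: Sequence[Point], t: float) -> Point:
    """Evaluate a cubic Bézier curve at parameter t in [0, 1]."""
    if len(control_points) != 4:
        raise ValueError("a cubic Bézier curve needs exactly 4 control points")
    s = 1.0 - t
    # Degree-3 Bernstein basis polynomials; they sum to 1 for any t.
    weights = (s ** 3, 3 * s ** 2 * t, 3 * s * t ** 2, t ** 3)
    x = sum(w * p[0] for w, p in zip(weights, control_points))
    y = sum(w * p[1] for w, p in zip(weights, control_points))
    return (x, y)


def sample_lane(control_points: Sequence[Point], num_points: int = 5) -> List[Point]:
    """Sample num_points evenly spaced in t, including both curve endpoints."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    return [cubic_bezier(control_points, i / (num_points - 1))
            for i in range(num_points)]


# Hypothetical control points in normalized (x, y) image coordinates.
lane = [(0.50, 0.40), (0.45, 0.60), (0.38, 0.80), (0.30, 1.00)]
points = sample_lane(lane, num_points=5)
for x, y in points:
    print(f"({x:.4f}, {y:.4f})")
```

The curve always passes through the first and last control points, while the two middle control points only shape it. A lane is therefore fully described by 8 numbers, whatever the number of points sampled afterwards.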
For an earlier attempt at learning Bézier curves for lane detection under (almost) the same name, BezierLaneNet, please refer to the wiki entry "9. BézierLaneNet disclaimer" and this repo.
Training time is estimated with a single 2080 Ti.
All models use ImageNet pre-training; results report the average and best of 3 runs.
TuSimple:
backbone | aug | resolution | training time | precision | accuracy (avg) | accuracy (best) | FP | FN | | |
---|---|---|---|---|---|---|---|---|---|---|
ResNet18 | level 1b | 360 x 640 | 5.5h | full | 95.01% | 95.41% | 0.0531 | 0.0458 | model | shell |
ResNet34 | level 1b | 360 x 640 | 6.5h | full | 95.17% | 95.65% | 0.0513 | 0.0386 | model | shell |
CULane:
backbone | aug | resolution | training time | precision | F1 (avg) | F1 (best) | normal | crowded | night | no line | shadow | arrow | dazzle light | curve | crossroad (FP) | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ResNet18 | level 1b | 288 x 800 | 9.9h | mix | 73.36 | 73.67 | 90.22 | 71.55 | 68.70 | 45.30 | 70.91 | 84.09 | 62.49 | 58.98 | 996 | model | shell |
ResNet34 | level 1b | 288 x 800 | 11.0h | mix | 75.30 | 75.57 | 91.59 | 73.20 | 69.90 | 48.05 | 76.74 | 87.16 | 69.20 | 62.45 | 888 | model | shell |
LLAMAS (val):
backbone | aug | resolution | training time | precision | F1 (avg) | F1 (best) | TP | FP | FN | Precision | Recall | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ResNet18 | level 1b | 360 x 640 | 5.5h | mix | 95.42 | 95.52 | 70515 | 3102 | 3520 | 95.79 | 95.25 | model | shell |
ResNet34 | level 1b | 360 x 640 | 6.1h | mix | 96.04 | 96.11 | 70959 | 2667 | 3076 | 96.38 | 95.85 | model | shell |
Their test performance can be found at the LLAMAS leaderboard.
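As a sanity check on the table above, precision, recall, and F1 can be recomputed from the TP, FP, and FN counts with the standard definitions. The short sketch below does this for both backbones; the helper name `precision_recall_f1` is illustrative and not part of this repo.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP, FN) copied from the table above.
counts = {
    "ResNet18": (70515, 3102, 3520),
    "ResNet34": (70959, 2667, 3076),
}

for backbone, (tp, fp, fn) in counts.items():
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"{backbone}: precision={100 * p:.2f} recall={100 * r:.2f} F1={100 * f1:.2f}")
```

Rounded to two decimals, the recomputed values agree with the Precision, Recall, and F1 columns listed for each backbone.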
FPS is the best per-trial average among 3 trials on a 2080 Ti.
backbone | resolution | FPS | FLOPS(G) | Params(M) |
---|---|---|---|---|
ResNet18 | 360 x 640 | 212.83 | 14.77 | 4.10 |
ResNet34 | 360 x 640 | 149.52 | 29.85 | 9.49 |
ResNet18 | 288 x 800 | 210.79 | 14.66 | 4.10 |
ResNet34 | 288 x 800 | 144.65 | 29.54 | 9.49 |
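The FPS protocol stated above the table can be read as: average the throughput within each trial, then report the best trial. A minimal sketch of that reading follows. The function names, iteration counts, and the dummy workload are assumptions for illustration; this is not the repo's actual benchmarking script. When timing a GPU model, the device would also need to be synchronized (for example with `torch.cuda.synchronize()` in PyTorch) before each clock read.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object], iterations: int = 100,
                warmup: int = 10) -> float:
    """Average frames per second over `iterations` calls in a single trial."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from timing
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def best_trial_fps(run_once: Callable[[], object], trials: int = 3,
                   iterations: int = 100, warmup: int = 10) -> float:
    """Run several trials and keep the best per-trial average."""
    return max(measure_fps(run_once, iterations, warmup) for _ in range(trials))


def dummy_forward() -> int:
    """Stand-in workload; a real benchmark would run one model forward pass."""
    return sum(range(10_000))


fps = best_trial_fps(dummy_forward, trials=3, iterations=50, warmup=5)
print(f"best trial-average FPS: {fps:.2f}")
```

Per-image latency follows directly from the reported FPS as `1000 / FPS` milliseconds.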
@inproceedings{feng2022rethinking,
title={Rethinking efficient lane detection via curve modeling},
author={Feng, Zhengyang and Guo, Shaohua and Tan, Xin and Xu, Ke and Wang, Min and Ma, Lizhuang},
booktitle={Computer Vision and Pattern Recognition},
year={2022}
}