Accurate, Low-Latency Visual Perception for Autonomous Racing: Challenges, Mechanisms, and Practical Solutions

This is the PyTorch-side code for the accurate, low-latency visual perception system introduced by Kieran Strobel, Sibo Zhu, Raphael Chang, and Skanda Koppula in "Accurate, Low-Latency Visual Perception for Autonomous Racing: Challenges, Mechanisms, and Practical Solutions". If you use the code, please cite the paper:

@misc{strobel2020accurate,
    title={Accurate, Low-Latency Visual Perception for Autonomous Racing: Challenges, Mechanisms, and Practical Solutions},
    author={Kieran Strobel and Sibo Zhu and Raphael Chang and Skanda Koppula},
    year={2020},
    eprint={2007.13971},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Abstract

Autonomous racing provides the opportunity to test safety-critical perception pipelines at their limit. This paper describes the practical challenges and solutions to applying state-of-the-art computer vision algorithms to build a low-latency, high-accuracy perception system for DUT18 Driverless (DUT18D), a 4WD electric race car with podium finishes at all Formula Driverless competitions for which it raced. The key components of DUT18D include YOLOv3-based object detection, pose estimation, and time synchronization on its dual stereovision/monovision camera setup. We highlight modifications required to adapt perception CNNs to racing domains, improvements to loss functions used for pose estimation, and methodologies for sub-microsecond camera synchronization, among other improvements. We perform an extensive experimental evaluation of the system, demonstrating its accuracy and low latency in real-world racing scenarios.

CVC-YOLOv3

CVC-YOLOv3 is the MIT Driverless custom implementation of YOLOv3.

One of our main contributions over vanilla YOLOv3 is a custom data loader:

Each set of training images from a specific sensor/lens/perspective combination is uniformly rescaled so that its landmark size distribution matches that of the camera system on the vehicle. Each training image is then padded if it is too small or split into multiple images if it is too large.
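The rescale, pad, and split steps described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual data loader: the function names, the use of the median as the matching statistic, and the 416-pixel tile size are all assumptions made for the example.

```python
import math
from statistics import median


def rescale_factor(box_heights_px, target_median_px):
    """Scale that maps a source set's median landmark height onto the target's.

    Assumption: the median is used as the matching statistic.
    """
    return target_median_px / median(box_heights_px)


def plan_tiles(width, height, tile):
    """Return (x, y, pad_right, pad_bottom) for each tile covering the image.

    An image smaller than `tile` yields a single padded tile; a larger one is
    split on a regular grid, and edge tiles are padded up to full size.
    """
    tiles = []
    for row in range(math.ceil(height / tile)):
        for col in range(math.ceil(width / tile)):
            x, y = col * tile, row * tile
            pad_right = max(0, x + tile - width)
            pad_bottom = max(0, y + tile - height)
            tiles.append((x, y, pad_right, pad_bottom))
    return tiles


# Hypothetical numbers: cones in this source set appear about twice as tall
# as cones seen by the on-vehicle cameras.
scale = rescale_factor([40, 60, 80], target_median_px=30)
scaled_w, scaled_h = round(1000 * scale), round(600 * scale)
tiles = plan_tiles(scaled_w, scaled_h, tile=416)
```

In a real loader the bounding-box labels would need the same scale and the same per-tile offsets applied, and boxes cut by a tile border would need to be clipped or dropped.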

Our final accuracy metrics for detecting traffic cones on the racing track:

mAP	Recall	Precision
89.35%	92.77%	86.94%
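For readers less familiar with these metrics, the sketch below shows the standard definitions of precision and recall from detection counts at a single confidence threshold. The counts are hypothetical and are not taken from the paper or from this repository's evaluation.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision and recall from detection counts at one confidence threshold."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical counts for illustration only.
p, r = precision_recall(true_pos=90, false_pos=10, false_neg=30)
```

mAP summarizes the same trade-off across confidence thresholds by averaging the area under the precision-recall curve over classes.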

CVC-YOLOv3 Dataset with Formula Student Standard is open-sourced here

RektNet

RektNet is the MIT Driverless custom key point detection network.

RektNet takes the bounding boxes output by CVC-YOLOv3 and predicts seven key points on each traffic cone. These key points are used to estimate the depth of each cone on the 3D map.

Figure: final depth estimation error vs. distance (monocular part).
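As a rough illustration of how the two networks connect, the sketch below maps key points predicted on a cropped detection back into full-image pixel coordinates, which is what a downstream depth estimate would need. It assumes the key points are expressed as fractions of the crop's width and height in [0, 1]; that output convention, the function name, and the sample values are assumptions for this example and may not match the repository.

```python
def keypoints_to_image(keypoints, box):
    """Map key points from crop-relative [0, 1] coordinates to image pixels.

    `box` is (x_min, y_min, x_max, y_max) in full-image pixels.
    """
    x_min, y_min, x_max, y_max = box
    width, height = x_max - x_min, y_max - y_min
    return [(x_min + u * width, y_min + v * height) for u, v in keypoints]


# Hypothetical detection box and seven hypothetical key points
# (apex first, then pairs down the cone's sides to the base corners).
box = (100, 50, 140, 110)
crop_keypoints = [
    (0.5, 0.0),
    (0.4, 0.3), (0.6, 0.3),
    (0.25, 0.65), (0.75, 0.65),
    (0.0, 1.0), (1.0, 1.0),
]
image_keypoints = keypoints_to_image(crop_keypoints, box)
```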

RektNet Dataset with Formula Student Driverless Standard is open-sourced here

License

This repository is released under the Apache-2.0 license. See LICENSE for additional details.
