This repository provides an implementation of "Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?", presented at NeurIPS 2022.
This codebase builds upon MG-GAN, AdaBins, Detectron2, Deep-Person-ReID, and TrackEval.
Recent developments in monocular multi-object tracking have been very successful in tracking visible objects and bridging short occlusion gaps. However, tracking objects over long time spans and through long-term occlusions is still challenging. We suggest that the missing key is reasoning about future trajectories over a longer time horizon. Intuitively, the longer the occlusion gap, the larger the search space for possible associations. In this paper, we show that even a small yet diverse set of trajectory predictions for moving agents significantly reduces this search space and thus improves long-term tracking robustness.
Our tracking-by-forecasting method consists of 4 steps.
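Conceptually, the forecasts shrink the association problem: an occluded track is kept alive by a small, diverse set of predicted future positions in the bird's-eye view, and a re-appearing detection only has to be matched against those predictions. Below is a minimal, hypothetical sketch of this matching step; it is not the repository's actual API, and all function and field names are illustrative.

```python
# Illustrative sketch (not the repository's actual API): associating unmatched
# detections with forecasted bird's-eye-view (BEV) positions of occluded tracks.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_with_forecasts(occluded_tracks, detections, max_dist=2.0):
    """Match detections to the closest forecasted BEV position of each track.

    occluded_tracks: list of dicts, each with a 'forecasts' array of shape (K, 2)
                     holding K candidate future positions (x, y) in metres.
    detections:      array of shape (M, 2) with BEV positions of new detections.
    max_dist:        illustrative gating threshold in metres.
    """
    if len(occluded_tracks) == 0 or len(detections) == 0:
        return []
    # Cost = distance between a detection and the nearest of the K forecasts,
    # so a small but diverse forecast set keeps plausible matches cheap.
    cost = np.stack([
        np.linalg.norm(detections[:, None, :] - t["forecasts"][None, :, :], axis=-1).min(axis=1)
        for t in occluded_tracks
    ], axis=1)  # shape (M, num_tracks)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_dist]
```

In the full pipeline, this geometric cue is complemented by appearance (Re-ID) features; see the pre-trained components listed further below.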
In docs/QUICKSTART we provide a minimal installation guide and a single-sequence dataset for running Quo Vadis.
See docs/INSTALL.md for detailed installation instructions.
python run/experiments/run_quovadis.py \
--config-file ./run/cfgs/base.yaml \
--dataset MOT17 \
--tracker-name CenterTrack \
--sequences MOT17-02 MOT17-04 \
--save
Arguments:
- `--config-file`: Path to the config file
- `--dataset`: Name of the dataset (MOT17 or MOT20)
- `--tracker-name`: Name of the tracker
- `--sequences`: List of sequences

Optional arguments:
- `--save`: Save the resulting tracking file
- `--vis-results`: Visualize results in image and bird's-eye-view space
- `--save-vis`: Save the visualized images
- `--make-video`: Make a video from the visualized images
- `--eval`: Run the evaluation script (requires `--save`)
- `--opts [config pairs]`: Update key-value pairs of the configs in `<config_file_path>`
You can find a list of configuration arguments in ./docs/CONFIGS.
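The `--opts` mechanism passes flat key-value pairs that override entries of the YAML config. Since the codebase builds on Detectron2, a yacs-style merge is a reasonable mental model; the snippet below is only an illustration with placeholder keys, not the actual Quo Vadis configuration (see ./docs/CONFIGS for the real options).

```python
# Illustration only: how flat KEY VALUE pairs (as passed via --opts) are merged
# into a yacs-style config. The option names below are placeholders and are NOT
# actual Quo Vadis configuration keys.
from yacs.config import CfgNode as CN

cfg = CN()
cfg.VIS_RESULTS = False        # placeholder option
cfg.PREDICTION_HORIZON = 50    # placeholder option

# A command line like `--opts VIS_RESULTS True PREDICTION_HORIZON 25`
# arrives as a flat list and overrides the corresponding entries:
cfg.merge_from_list(["VIS_RESULTS", "True", "PREDICTION_HORIZON", "25"])
assert cfg.VIS_RESULTS is True and cfg.PREDICTION_HORIZON == 25
```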
See docs/BEV_RECONSTRUCTION.md for detailed instructions on the bird's-eye-view reconstruction.
In the paper, we provide results on the MOT17 and MOT20 datasets. Here, we run Quo Vadis on 8 different state-of-the-art trackers on the MOTChallenge benchmark.
For the MOT17 dataset, we follow the evaluation protocol used in ByteTrack: we use the first half of the training sequences for training and the second half for evaluation. To re-run the evaluation, run `bash run_MOT17.sh`. The numbers in parentheses denote the change with respect to the input baseline tracker.
MOT17 | ByteTrack | CenterTrack | QDTrack | CSTrack | FairMOT | JDE | TraDeS | TransTrack |
---|---|---|---|---|---|---|---|---|
HOTA | 71.37 (+0.23) | 61.37 (+3.15) | 58.62 (+0.29) | 61.32 (+0.14) | 58.18 (-0.15) | 51.08 (+0.23) | 62.27 (+0.49) | 60.77 (-0.15) |
IDSW | 82 (-5) | 147 (-136) | 230 (-23) | 276 (-21) | 195 (-15) | 323 (-12) | 106 (-32) | 114 (-1) |
MOTA | 80.09 (+0.01) | 70.75 (+0.37) | 69.58 (+0.05) | 71.28 (+0.02) | 71.81 (+0.04) | 59.54 (+0.03) | 70.93 (+0.09) | 69.50 (+0.00) |
IDF1 | 83.06 (+0.56) | 73.75 (+6.41) | 69.96 (+0.32) | 73.96 (+0.75) | 73.25 (-0.09) | 64.22 (+0.48) | 76.13 (+0.98) | 71.33 (-0.11) |
In this experiment, we use the baseline trackers trained on MOT17 and evaluate their performance on the MOT20 training set. To re-run the evaluation, run `bash run_MOT20.sh`. The numbers in parentheses denote the change with respect to the input baseline tracker.
MOT20 | ByteTrack | CenterTrack |
---|---|---|
HOTA | 56.90 (+0.11) | 34.27 (+2.18) |
IDSW | 1791 (-102) | 5096 (-2844) |
MOTA | 73.38 (+0.01) | 47.58 (+0.25) |
IDF1 | 72.67 (+0.58) | 46.84 (+5.12) |
Our method is not yet end-to-end trainable, so we rely on pre-trained components. If you wish to re-train any of these components, please consult their respective repositories; a brief Re-ID feature-extraction example follows the list.
- Depth estimation: AdaBins
- Re-ID features: Deep-Person-ReID
- Trajectory prediction: MG-GAN
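For reference, appearance features can be extracted with Deep-Person-ReID's `FeatureExtractor`. This is a minimal sketch; the model name and checkpoint path are placeholders that you would replace with the weights used in your setup.

```python
# Minimal sketch: extracting appearance (Re-ID) features for person crops with
# Deep-Person-ReID (torchreid). Model name and checkpoint path are placeholders.
from torchreid.utils import FeatureExtractor

extractor = FeatureExtractor(
    model_name="osnet_x1_0",                        # any torchreid model name
    model_path="path/to/osnet_checkpoint.pth.tar",  # placeholder checkpoint
    device="cuda",
)

# Accepts a list of image paths (or numpy arrays) of person crops and returns
# a tensor of shape (num_images, feature_dim).
features = extractor(["crops/person_001.jpg", "crops/person_002.jpg"])
print(features.shape)
```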
QuoVadis is released under the MIT license.
If you use this codebase in your research, please cite our publication:
@inproceedings{dendorfer2022quovadis,
  title     = {Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?},
  author    = {Dendorfer, Patrick and Yugay, Vladimir and Ošep, Aljoša and Leal-Taixé, Laura},
  booktitle = {Conference on Neural Information Processing Systems},
  year      = {2022},
}