The purpose of an Unsupervised Domain Adaptation (UDA) task is to learn a generalized model or backbone on a labeled source domain that transfers well to an unlabeled target domain.
Different domains present inconsistent object-size distributions, as illustrated in the object-size statistics file. Thus, Statistical Normalization (SN) is used to rescale object sizes during source-domain training: both the bounding-box size and the point cloud inside each bounding box are rescaled. For Waymo-to-KITTI adaptation, we found that object-size variation is a major cause of the cross-domain detection accuracy drop.
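The SN idea can be sketched as follows. This is a simplified stand-in for the codebase's actual augmentor (the function name and the `size_offset` parameter are illustrative): each ground-truth box's dimensions are shifted toward the target-domain mean size, and the points inside the box are scaled about the box center accordingly.

```python
import numpy as np

def statistical_normalization(points, gt_boxes, size_offset):
    """Rescale each GT box (and the points inside it) so that the
    source-domain object-size statistics match the target domain's.

    points:      (N, 3) array of x, y, z
    gt_boxes:    (M, 7) array of cx, cy, cz, dx, dy, dz, heading
    size_offset: (3,) target mean size minus source mean size
    """
    points = points.copy()
    gt_boxes = gt_boxes.copy()
    for box in gt_boxes:
        center, dims, yaw = box[:3], box[3:6].copy(), box[6]
        # rotate points into the box's local frame to test membership
        c, s = np.cos(-yaw), np.sin(-yaw)
        rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        local = (points - center) @ rot.T
        inside = np.all(np.abs(local) <= dims / 2, axis=1)
        # scale points inside the box about the box center
        scale = (dims + size_offset) / dims
        local[inside] *= scale
        points[inside] = local[inside] @ np.linalg.inv(rot).T + center
        box[3:6] = dims + size_offset
    return points, gt_boxes
```

In practice the same offset is applied on the fly as a data augmentation during source-domain training, so the detector never sees the mismatched source sizes.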
The number of LiDAR beams also varies across AD manufacturers. For Waymo-to-nuScenes adaptation, we argue that LiDAR-beam variation is a major challenge, and we leverage the range map provided in the Waymo tfrecords to produce low-beam point clouds (such as 32-beam or 16-beam). Please refer to Results for more details.
- For datasets where a range map is not provided, such as the ONCE dataset, one can apply a clustering algorithm to the points' height (elevation) angles to obtain pseudo-labeled low-beam point clouds, which is also verified to be effective in our codebase.
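A minimal sketch of the pseudo low-beam idea, assuming uniform quantile binning of the elevation angle in place of the exact clustering used in the codebase (the function name and defaults are illustrative):

```python
import numpy as np

def pseudo_low_beam(points, num_beams=64, keep_every=4):
    """Approximate a low-beam LiDAR by grouping points by elevation angle.

    Each point gets a pseudo beam id by quantile-binning its elevation
    (height) angle into `num_beams` groups; then every `keep_every`-th
    beam is kept, e.g. 64 -> 16 beams. A simple stand-in for the
    height-angle clustering mentioned above.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    elev = np.arctan2(z, np.sqrt(x ** 2 + y ** 2))  # elevation angle per point
    # quantile bin edges, so each pseudo beam holds a similar number of points
    edges = np.quantile(elev, np.linspace(0.0, 1.0, num_beams + 1)[1:-1])
    beam_id = np.digitize(elev, edges)
    return points[beam_id % keep_every == 0]
```

Quantile binning is only an approximation: real scanners space their beams non-uniformly in angle, which is why a proper clustering (or the range map itself, when available) recovers the beam assignment more faithfully.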
Here, we take Waymo-to-KITTI adaptation as an example.
- Train FEAT=3 (X,Y,Z) with SN (statistical normalization) using multiple GPUs:
```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml
```
- Train FEAT=3 (X,Y,Z) with SN (statistical normalization) using multiple machines:
```shell
sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml
```
- Train FEAT=3 (X,Y,Z) without SN (statistical normalization) using multiple GPUs:
```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml
```
- Train FEAT=3 (X,Y,Z) without SN (statistical normalization) using multiple machines:
```shell
sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml
```
- Train other baseline detectors such as PV-RCNN++ using multiple GPUs:
```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pv_rcnn_plus_feat_3_vehi_full_train.yaml
```
- Train other baseline detectors such as Voxel-RCNN using multiple GPUs:
```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/voxel_rcnn_feat_3_vehi.yaml
```
Note that for cross-domain settings where the KITTI dataset is the target domain, please try `--set DATA_CONFIG_TAR.FOV_POINTS_ONLY True` to use front-view point clouds only. We report the best model across all epochs on the validation set.
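For intuition, the front-view filtering that `FOV_POINTS_ONLY` enables can be sketched as keeping only points ahead of the sensor within the camera's horizontal field of view. This is a simplified stand-in (function name and the 90-degree default are assumptions); the codebase's actual check projects points into the camera image:

```python
import numpy as np

def fov_mask(points, fov_deg=90.0):
    """Keep points in the forward field of view: x > 0 and within
    +/- fov_deg/2 of the x-axis (KITTI only annotates the camera view)."""
    azimuth = np.degrees(np.arctan2(points[:, 1], points[:, 0]))
    return (points[:, 0] > 0) & (np.abs(azimuth) < fov_deg / 2)
```

Filtering the target-domain clouds this way matters for fair KITTI evaluation, since objects behind or beside the ego vehicle have no KITTI labels.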
- Test the source-only models using multiple GPUs:
```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--ckpt ${CKPT}
```
- Test the source-only models using multiple machines:
```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--ckpt ${CKPT}
```
- Test the source-only models of all ckpts using multiple GPUs:
```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--eval_all
```
- Test the source-only models of all ckpts using multiple machines:
```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--eval_all
```
After finishing the pretraining stage above, you need to set `--pretrained_model ${PRETRAINED_MODEL}` for the adaptation stage.

If you trained the source-only model with SN (statistical normalization), e.g., with `pvrcnn_old_anchor_sn_kitti.yaml`, you should run the pre-SN script as follows, where pre-SN means that the SN (statistical normalization) operation is performed before the adaptation stage.
- Train FEAT=3 (X,Y,Z) with pre-SN (statistical normalization) using multiple machines:
```shell
sh scripts/UDA/slurm_train_uda.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} ${QUOTATYPE} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_pre_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
- Train FEAT=3 (X,Y,Z) with pre-SN (statistical normalization) using multiple GPUs:
```shell
sh scripts/UDA/dist_train_uda.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_pre_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
If you trained the source-only model without SN (statistical normalization), you should run the post-SN script as follows, where post-SN means that the SN (statistical normalization) operation is performed during the adaptation stage.
- Train FEAT=3 (X,Y,Z) with post-SN (statistical normalization) using multiple machines:
```shell
sh scripts/UDA/slurm_train_uda.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} ${QUOTATYPE} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_post_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
- Train FEAT=3 (X,Y,Z) with post-SN (statistical normalization) using multiple GPUs:
```shell
sh scripts/UDA/dist_train_uda.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_post_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
Note that for cross-domain settings where the KITTI dataset is the target domain, please try `--set DATA_CONFIG_TAR.FOV_POINTS_ONLY True` to use front-view point clouds only. We report the best model across all epochs on the validation set.
- Test with a ckpt file:
```shell
python test.py \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--ckpt ${CKPT}
```
- To test all the saved checkpoints of a specific training setting and draw the performance curve on Tensorboard, add the `--eval_all` argument:
```shell
python test.py \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--eval_all
```
- To test with multiple GPUs:
```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--ckpt ${CKPT}
```
- To test all ckpts with multiple GPUs:
```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--eval_all
```
- To test with multiple machines:
```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}
```
- To test all ckpts with multiple machines:
```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all
```
We report the cross-dataset adaptation results including Waymo-to-KITTI, nuScenes-to-KITTI, and Waymo-to-nuScenes.
- All LiDAR-based models are trained with 4 NVIDIA A100 GPUs and are available for download.
- The domain adaptation time is measured with 4 NVIDIA A100 GPUs and PyTorch 1.8.1.
- All results are reported using BEV/3D AP as the evaluation metric. We report the moderate difficulty case for the KITTI dataset.
- Pre-SN means that we perform the SN (statistical normalization) operation during the pre-training stage (SN for the source domain).
- Post-SN means that we perform the SN (statistical normalization) operation during the adaptation stage (SN for the target domain).
**Waymo-to-KITTI:**

| Detector | training time | Adaptation | Car@R40 (BEV / 3D) | download |
|---|---|---|---|---|
PointPillar | ~7.1 hours | Source-only with SN | 74.98 / 49.31 | - |
PointPillar | ~0.6 hours | Pre-SN | 81.71 / 57.11 | model-57M |
PV-RCNN | ~23 hours | Source-only with SN | 69.92 / 60.17 | - |
PV-RCNN | ~23 hours | Source-only | 74.42 / 40.35 | - |
PV-RCNN | ~3.5 hours | Pre-SN | 84.00 / 74.57 | model-156M |
PV-RCNN | ~1 hour | Post-SN | 84.94 / 75.20 | model-156M |
Voxel R-CNN | ~16 hours | Source-only with SN | 75.83 / 55.50 | - |
Voxel R-CNN | ~16 hours | Source-only | 64.88 / 19.90 | - |
Voxel R-CNN | ~2.5 hours | Pre-SN | 82.56 / 67.32 | model-201M |
Voxel R-CNN | ~2.2 hours | Post-SN | 85.44 / 76.78 | model-201M |
PV-RCNN++ | ~20 hours | Source-only with SN | 67.22 / 56.50 | - |
PV-RCNN++ | ~20 hours | Source-only | 67.68 / 20.82 | - |
PV-RCNN++ | ~2.2 hours | Post-SN | 86.86 / 79.86 | model-193M |
**nuScenes-to-KITTI:**

| Detector | training time | Adaptation | Car@R40 (BEV / 3D) | download |
|---|---|---|---|---|
PV-RCNN | ~15.7 hours | Source-only with SN | 60.16 / 49.63 | model-156M |
PV-RCNN | ~15.7 hours | Source-only | 64.58 / 27.12 | model-156M |
PV-RCNN | ~1.5 hours | Pre-SN | 86.07 / 74.72 | model-156M |
PV-RCNN | ~1 hour | Post-SN | 88.79 / 72.50 | model-156M |
Voxel R-CNN | ~8.5 hours | Source-only | 66.94 / 30.33 | model-201M |
Voxel R-CNN | ~2.2 hours | Post-SN | 87.11 / 66.02 | model-201M |
PV-RCNN++ | ~18 hours | Source-only with SN | 54.47 / 36.05 | model-193M |
PV-RCNN++ | ~18 hours | Source-only | 67.68 / 20.82 | model-193M |
PV-RCNN++ | ~1 hour | Post-SN | 85.50 / 67.85 | model-193M |
- [16-beam Waymo Train] denotes that we down-sample the Waymo point clouds from 64-beam to 16-beam, according to the range map of the corresponding point clouds, and then train the source-only model on the 16-beam Waymo data.
**Waymo-to-nuScenes:**

| Detector | training time | Adaptation | Car@R40 (BEV / 3D) | download |
|---|---|---|---|---|
PV-RCNN | ~23 hours | Source-only | 31.02 / 21.21 | - |
PV-RCNN | ~8 hours | Self-training | 33.29 / 22.15 | model-156M |
PV-RCNN | ~19 hours | 32-beam Waymo Train | 34.19 / 21.37 | model-156M |
PV-RCNN | ~15 hours | 16-beam Waymo Train | 40.23 / 23.33 | model-156M |
PV-RCNN | ~8 hours | 16-beam Waymo + Self-training | - | - |
Voxel R-CNN | ~16 hours | Source-only | 29.08 / 19.42 | - |
Voxel R-CNN | ~2.2 hours | Self-training | 32.48 / 20.87 | model-201M |
Voxel R-CNN | ~11 hours | 16-beam Waymo Train | 38.63 / 22.64 | model-201M |
PV-RCNN++ | ~20 hours | Source-only | 31.96 / 19.80 | - |
PV-RCNN++ | ~2.2 hours | Self-training | - | - |
PV-RCNN++ | ~15.5 hours | 16-beam Waymo Train | 42.62 / 25.02 | model-193M |
PV-RCNN++ | ~2.2 hours | 16-beam Waymo + Self-training | - | - |
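The range-map-based beam down-sampling used for the 16-beam / 32-beam Waymo variants above can be sketched as follows. This is a simplified illustration (the 64-row shape is an assumption matching Waymo's top LiDAR); each row of a LiDAR range image corresponds to one laser beam, so row subsampling removes whole beams rather than thinning points uniformly:

```python
import numpy as np

def downsample_beams(range_image, factor=4):
    """Keep every `factor`-th row of a (num_beams, width) LiDAR range image,
    e.g. 64-beam -> 16-beam when factor=4. Rows map one-to-one to beams,
    so this deletes entire scan lines, mimicking a lower-beam sensor."""
    return range_image[::factor]
```

The retained rows can then be converted back to a point cloud with the dataset's usual range-image-to-points projection before training the source-only model.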