Pull request "Rgb sptm" #28 (open): 14 commits into base `master`.
Binary files added: NSF_1.gif, NSF_2.gif, NSF_3.gif.
README.md: 98 changes (85 additions, 13 deletions)
# Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods
[Chao Chen](https://scholar.google.com/citations?hl=en&user=WOBQbwQAAAAJ), [Xinhao Liu](https://gaaaavin.github.io), [Xuchu Xu](https://www.xuchuxu.com), [Li Ding](https://www.hajim.rochester.edu/ece/lding6/), [Yiming Li](https://scholar.google.com/citations?user=i_aajNoAAAAJ), [Ruoyu Wang](https://github.com/ruoyuwangeel4930), [Chen Feng](https://scholar.google.com/citations?user=YeG8ZM0AAAAJ)

**"A Novel self-supervised VPR model capable of retrieving positives from various orientations."**

![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?logo=PyTorch&logoColor=white)
[![Linux](https://svgshare.com/i/Zhy.svg)](https://svgshare.com/i/Zhy.svg)
[![GitLab issues total](https://badgen.net/github/issues/ai4ce/V2X-Sim)](https://github.com/Joechencc/TF-VPR)
[![GitHub stars](https://img.shields.io/github/stars/ai4ce/V2X-Sim.svg?style=social&label=Star&maxAge=2592000)](https://github.com/Joechencc/TF-VPR/stargazers/)
<div align="center">
<img src="https://s2.loli.net/2022/07/30/ZldqmQGFhajCxRn.png" height="300">
</div>
<br>

## Abstract

Visual place recognition (VPR) using deep networks has achieved state-of-the-art performance. However, most related approaches require a training set with ground-truth sensor poses to obtain the positive and negative samples of each observation's spatial neighborhoods. When such knowledge is unavailable, the temporal neighborhoods of a sequentially collected data stream can be exploited for self-supervision, although with suboptimal performance. Inspired by noisy-label learning, we propose a novel self-supervised VPR framework that uses both temporal neighborhoods and learnable feature neighborhoods to discover the unknown spatial neighborhoods. Our method follows an iterative training paradigm that alternates between (1) representation learning with data augmentation, (2) positive set expansion to include the current feature-space neighbors, and (3) positive set contraction via geometric verification. We conduct comprehensive experiments on both simulated and real datasets, with both image and point cloud inputs. The results demonstrate that our method outperforms the baselines in recall rate, robustness, and a novel metric we propose for VPR: orientation diversity.
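
Concretely, the alternation can be sketched as follows. This is a minimal illustration of the paradigm described above, not the repository's actual training loop; `train_epoch`, `embed_all`, `feature_knn`, and `geometric_verify` are hypothetical callables the caller would supply.

```
# Minimal sketch of the iterative self-supervised paradigm (illustrative only).
def temporal_neighbors(i, window, n):
    # Frames within `window` steps of frame i in the sequential stream.
    return set(range(max(0, i - window), min(n, i + window + 1))) - {i}

def tf_vpr_iterate(model, stream, train_epoch, embed_all, feature_knn,
                   geometric_verify, rounds=5, window=3, k=10):
    n = len(stream)
    # Bootstrap noisy positives from temporal neighborhoods (self-supervision).
    positives = {i: temporal_neighbors(i, window, n) for i in range(n)}
    for _ in range(rounds):
        # (1) Representation learning with data augmentation.
        train_epoch(model, stream, positives)
        # (2) Expansion: add current feature-space neighbors.
        feats = embed_all(model, stream)
        for i in range(n):
            positives[i] |= feature_knn(feats, i, k)
        # (3) Contraction: keep only geometrically verified pairs.
        for i in range(n):
            positives[i] = {j for j in positives[i]
                            if geometric_verify(stream[i], stream[j])}
    return positives
```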

## Dataset

Download links:
- For point clouds: please refer to the DeepMapping paper, https://github.com/ai4ce/PointCloudSimulator
- For real-world panoramic RGB: https://drive.google.com/drive/u/0/folders/1ErXzIx0je5aGSRFbo5jP7oR8gPrdersO

You can find more detailed documentation on our [website](https://github.com/Joechencc/TF-VPR/edit/RGB_SPTM/README.md)!

TF-VPR follows the same file structure as the [PointNetVLAD](https://github.com/mikacuy/pointnetvlad):
```
TF-VPR
├── loss # loss function
├── models # network model
| ├── PointNetVlad.py # PointNetVLAD network model
| ├── ImageNetVlad.py # NetVLAD network model
| ...
├── generating_queries # Preprocess the data, initialize the labels, and generate pickle files
| ├── generate_test_RGB_sets.py # Generate the test pickle file
| ├── generate_training_tuples_RGB_baseline_batch.py # Generate the train pickle file
| ...
├── results # Results are saved here
├── config.py # Config file
├── evaluate.py # Evaluation script
├── loading_pointcloud.py # Point cloud loading script
├── train_pointnetvlad.py # Main script to train TF-VPR
| ...
```
Point cloud TF-VPR result:

![](NSF_1.gif)

RGB TF-VPR result:

![](NSF_2.gif)

Real-world RGB TF-VPR result:

![](NSF_3.gif)

# Note

I kept almost everything not related to TensorFlow the same as in the original implementation. The main differences are:
* Multi-GPU support
* Configuration file (config.py)
* Evaluation on the eval dataset after every epoch

This implementation achieved an average top-1% recall of 84.81% on the Oxford baseline.
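
For reference, top-1% recall counts a query as correct if any true neighbor appears among its top ⌈N/100⌉ retrievals, where N is the database size. A minimal sketch of the metric (illustrative, not the repository's evaluation code):

```
def top_one_percent_recall(retrieved, true_neighbors, database_size):
    # retrieved: per-query lists of ranked database indices.
    # true_neighbors: per-query sets of ground-truth neighbor indices.
    k = max(1, int(round(database_size / 100.0)))
    hits = sum(1 for ranks, truth in zip(retrieved, true_neighbors)
               if set(ranks[:k]) & truth)
    return hits / len(retrieved)
```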

### Pre-Requisites
- PyTorch 0.4.0
- tensorboardX
- open3d-python 0.4
- scipy
- matplotlib
- numpy

### Generate pickle files
```
cd generating_queries/

# For training tuples in our baseline network
python generate_training_tuples_RGB_baseline_batch.py

# For network evaluation
python generate_test_RGB_sets.py
```
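
The resulting pickles follow the PointNetVLAD convention: a dictionary keyed by query index, each entry holding the query's file path plus the indices of its positives and negatives. A quick way to inspect one (the exact keys are an assumption and may differ in this branch):

```
import pickle

with open("generating_queries/training_queries_baseline.pickle", "rb") as f:
    queries = pickle.load(f)

# Assumed PointNetVLAD-style layout:
# queries[i] == {"query": <file path>, "positives": [...], "negatives": [...]}
sample = queries[0]
print(sample["query"], len(sample["positives"]), len(sample["negatives"]))
```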

### Train
```
python train_pointnetvlad.py
```

### Evaluate
```
python evaluate.py
```

Take a look at train_pointnetvlad.py and evaluate.py for more parameters.

## Benchmark

We implement SPTM, TF-VPR, and a supervised version; please check the other branches for reference.

<!-- ## Citation

If you find TF-VPR useful in your research, please cite:

```bibtex
@article{Chen_2022_RAL,
  title = {Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods},
  author = {Chen, Chao and Liu, Xinhao and Xu, Xuchu and Ding, Li and Li, Yiming and Wang, Ruoyu and Feng, Chen},
  journal = {IEEE Robotics and Automation Letters},
  year = {2022}
}
``` -->
config.py: 16 changes (11 additions, 5 deletions)

```
# GLOBAL
NUM_POINTS = 256
GRID_X = 1080
GRID_Y = 1920
SIZED_GRID_X = 64*4
SIZED_GRID_Y = 64
FEATURE_OUTPUT_DIM = 512
RESULTS_FOLDER = "results/"
OUTPUT_FILE = "results/results.txt"
file_name = "Goffs"

LOG_DIR = 'log/'
MODEL_FILENAME = "model.ckpt"

DATASET_FOLDER = '/mnt/NAS/home/cc/data/habitat/Goffs'

# TRAIN
BATCH_NUM_QUERIES = 2
BASE_LEARNING_RATE = 0.000005
MOMENTUM = 0.9
OPTIMIZER = 'ADAM'
MAX_EPOCH = 50

MARGIN_1 = 0.5
MARGIN_2 = 0.2

TRAIN_FILE = 'generating_queries/training_queries_baseline.pickle'
TEST_FILE = 'generating_queries/test_queries_baseline.pickle'
scene_list = ['Goffs']#,'Nimmons','Reyno','Spotswood','Springhill','Stilwell']

# LOSS
LOSS_FUNCTION = 'triplet'
LOSS_LAZY = True
TRIPLET_USE_BEST_POSITIVES = False
LOSS_IGNORE_ZERO_BATCH = False

EVAL_DATABASE_FILE = 'generating_queries/evaluation_database.pickle'
EVAL_QUERY_FILE = 'generating_queries/evaluation_query.pickle'

INIT_TRUST = 3

def cfg_str():
    out_string = ""
```
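
As a point of reference for the LOSS_FUNCTION switch from 'quadruplet' to 'triplet' with LOSS_LAZY = True: a lazy triplet loss lets only the hardest (closest) negative contribute to the hinge. A rough PyTorch sketch of the general technique, not the repository's exact implementation (margin 0.5 mirrors MARGIN_1 above; the closest positive is used for simplicity):

```
import torch

def lazy_triplet_loss(query, positives, negatives, margin=0.5):
    # query: (D,) embedding; positives: (P, D); negatives: (N, D).
    pos_dist = ((query - positives) ** 2).sum(dim=1).min()  # closest positive
    neg_dist = ((query - negatives) ** 2).sum(dim=1).min()  # hardest negative
    # Hinge: only the hardest negative determines the loss ("lazy" variant).
    return torch.clamp(pos_dist - neg_dist + margin, min=0.0)
```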