Ray-Patch: An Efficient Querying for Light Field Transformers

Official implementation of the paper "Ray-Patch: An Efficient Querying for Light Field Transformers".

Querying comparison

Architecture

Results

MSN-Easy

  • $60\times 80$
| Run | PSNR | SSIM | LPIPS | Rendering Speed |
|-----|------|------|-------|-----------------|
| srt | 30.98 | 0.903 | 0.173 | 117 fps |
| RP-srt k=2 | 31.16 | 0.906 | 0.163 | 288 fps |
| RP-srt k=4 | 30.92 | 0.901 | 0.175 | 341 fps |
  • $120\times160$
| Run | PSNR | SSIM | LPIPS | Rendering Speed | Checkpoint |
|-----|------|------|-------|-----------------|------------|
| srt | 32.842 | 0.935 | 0.250 | 192 fps | Link |
| RP-srt k=4 | 32.818 | 0.935 | 0.254 | 275 fps | Link |
| RP-srt k=8 | 32.306 | 0.929 | 0.274 | 305 fps | Link |
| osrt | 30.95 | 0.916 | 0.287 | 21 fps | Link |
| RP-osrt k=8 | 31.03 | 0.915 | 0.303 | 278 fps | Link |

ScanNet

  • In $240\times320$/Out $480\times640$
| Run | PSNR | SSIM | LPIPS | RMSE | Abs.Rel. | Square Rel. | Rendering Speed | Download |
|-----|------|------|-------|------|----------|-------------|-----------------|----------|
| DeFiNe | 23.46 | 0.783 | 0.495 | 0.275 | 0.108 | 0.053 | 7 fps | Link |
| RP-DeFiNe k=16 | 24.54 | 0.801 | 0.453 | 0.263 | 0.103 | 0.050 | 208 fps | Link |
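The rendering speedups above follow from the query reduction: a per-pixel light field decoder issues one query per output pixel, while Ray-Patch issues one query per $k\times k$ patch, cutting the query count by $k^2$. A minimal sketch of the arithmetic (hypothetical helper, not repository code):

```python
# Hypothetical helper (not part of the repo): decoder query counts for
# per-pixel querying (k=1) vs. Ray-Patch querying with patch size k.
def num_queries(height, width, k=1):
    assert height % k == 0 and width % k == 0
    return (height // k) * (width // k)

print(num_queries(120, 160))        # per-pixel: 19200 queries
print(num_queries(120, 160, k=8))   # Ray-Patch k=8: 300 queries (64x fewer)
```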

Setup

The implementation uses PyTorch 2, PyTorch Lightning 2, and CUDA 11.7. To run the repository, we suggest using the provided conda environment:

  • Clone the repository
    git clone [email protected]:tberriel/RayPatchQuerying.git
    
  • Create a conda environment
    conda env create -n PT2 --file=pt2.yml 
    conda activate PT2 
    

Data

The models are evaluated on two datasets:

  • MultiShapeNet-Easy dataset, introduced by Stelzner et al.: Download from Link
  • ScanNet dataset, introduced by Dai et al.: Follow the original repository instructions to access the dataset. Then, to decode the NASDE stereo pairs used for training and evaluation by DeFiNe, follow these instructions:
    • After downloading ScanNet data, uncompress it with our modified scripts:
        cp /<path to RayPatch>/src/SensReader/* /<path to scannet>/ScanNet/SensReader/.
        cd /<path to scannet>/ScanNet/SensReader
        python decode.py --dataset_path /<path to scannet>/scans --output_path /<path to scannet>/data/val/ --split_file scannetv2_val.txt --frames_list frames_val.txt
        python decode.py --dataset_path /<path to scannet>/scans --output_path /<path to scannet>/data/train/ --split_file scannetv2_train.txt --frames_list frames_train.txt
      
    • Then run the following script to preprocess it:
        cd /<path to RayPatch>/
        python src/data/preproces_scannet.py /<path to scannet>/data/ /<path to RayPatch>/data/scannet/ --parallel --num-cores 12
        mv /<path to RayPatch>/data/stereo_pairs_* /<path to RayPatch>/data/scannet/.
      
      Preprocessing consists of resizing the RGB data to 480x640 resolution. Set --num-cores to the number of CPU cores available to process multiple scenes in parallel.
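The core of that preprocessing step can be sketched as follows. This is a hypothetical illustration assuming Pillow, not the repository's preproces_scannet.py; `resize_frame` and the 968x1296 dummy input (ScanNet's native RGB resolution) are ours:

```python
# Hypothetical sketch (not the repo's script): resize an RGB frame to
# 480x640 (height x width), as done during ScanNet preprocessing.
from PIL import Image

def resize_frame(img: Image.Image) -> Image.Image:
    # PIL uses (width, height) ordering, so 480x640 becomes (640, 480).
    return img.resize((640, 480), Image.BILINEAR)

# Example with a dummy frame at ScanNet's native 968x1296 RGB resolution.
frame = Image.new("RGB", (1296, 968))
print(resize_frame(frame).size)  # (640, 480)
```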

Ensure the data is placed in the respective folders:

|-- RayPatch
   |-- data
     |-- msn_easy
        |-- train
        |-- val
        |-- test
     |-- scannet
        |-- train
        |-- val

Experiments

Each training run should be stored inside the runs folder of the respective dataset, with its corresponding configuration file:

|-- RayPatch
  |-- runs
      |-- scannet
        |-- define_32_stereo_acc
            |-- config.yaml
            |-- model_best.ckpt
        |-- rpdefine_16_32_stereo_acc
            |-- config.yaml
            |-- model_best.ckpt

Test

To evaluate a model run:

  python test.py /<path to config file>/ --full-scale --eval-split <split>

To evaluate on MSN-Easy, use --eval-split test. For ScanNet use --eval-split val.

Add the --vis flag to render a batch of images. Use the --num_batches flag to set the number of batches to save.

By default, evaluation computes neither LPIPS nor SSIM. To compute them, add the respective flags:

  python test.py /<path to config file>/ --full-scale --lpips --ssim

SSIM computation has a large memory footprint. To evaluate define_stereo_32, we had to run evaluation on a CPU with 160 GB of RAM.

To evaluate profiling metrics for rendering a single image, run:

python test_profile.py <path to config folder> --batch --flops --time

The <path to config folder> should be of the form runs/scannet/define_32_stereo_acc/.

To execute on a GPU device, add the --cuda flag.
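The rendering-speed numbers reported above are frames per second. The measurement pattern can be sketched like this (a hypothetical timing helper, not the repository's test_profile.py; `render` stands in for a model's forward pass):

```python
# Hypothetical sketch: time a render callable over n repetitions and
# report frames per second, as in the rendering-speed columns above.
import time

def fps(render, n=100):
    t0 = time.perf_counter()
    for _ in range(n):
        render()
    return n / (time.perf_counter() - t0)

# Dummy workload standing in for a model's render step.
print(round(fps(lambda: sum(range(1000)))))
```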

Train

To train a model run:

  python train.py /<path to config file>/ 

To train a Ray-Patch querying model, add the --full-scale argument.

Training also has a large memory footprint. We trained both models using 4 Nvidia Tesla V100 GPUs with 32 GB of VRAM each. For the ScanNet experiments, we used a batch size of 16 with gradient accumulation of 2 to simulate a batch size of 32.
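Gradient accumulation averages the gradients of several micro-batches before applying one optimizer step, so the update matches that of a single larger batch. A toy, framework-free sketch of the idea (not the repository's training loop; names are ours):

```python
# Hypothetical illustration: stepping once every `accum` micro-batches,
# with gradients scaled by 1/accum, reproduces the SGD update of one
# batch that is `accum` times larger.
def sgd_with_accumulation(grads, accum, lr=0.1, w=0.0):
    buf = 0.0
    for i, g in enumerate(grads, start=1):
        buf += g / accum          # scale each micro-batch gradient
        if i % accum == 0:        # step once every `accum` micro-batches
            w -= lr * buf
            buf = 0.0
    return w

micro = [1.0, 3.0, 2.0, 4.0]      # per-micro-batch mean gradients
# Pairs of micro-batches with accum=2 match their averaged full batches:
assert sgd_with_accumulation(micro, accum=2) == sgd_with_accumulation([2.0, 3.0], accum=1)
```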
