Ray-Patch: An Efficient Querying for Light Field Transformers

Official implementation of the paper "Ray-Patch: An Efficient Querying for Light Field Transformers".

Querying comparison

Architecture

Results

MSN-Easy

  • $60\times 80$
| Run | PSNR | SSIM | LPIPS | Rendering Speed |
|-----|------|------|-------|-----------------|
| srt | 30.98 | 0.903 | 0.173 | 117 fps |
| RP-srt k=2 | 31.16 | 0.906 | 0.163 | 288 fps |
| RP-srt k=4 | 30.92 | 0.901 | 0.175 | 341 fps |
  • $120\times160$
| Run | PSNR | SSIM | LPIPS | Rendering Speed | Checkpoint |
|-----|------|------|-------|-----------------|------------|
| srt | 32.842 | 0.935 | 0.250 | 192 fps | Link |
| RP-srt k=4 | 32.818 | 0.935 | 0.254 | 275 fps | Link |
| RP-srt k=8 | 32.306 | 0.929 | 0.274 | 305 fps | Link |
| osrt | 30.95 | 0.916 | 0.287 | 21 fps | Link |
| RP-osrt k=8 | 31.03 | 0.915 | 0.303 | 278 fps | Link |

ScanNet

  • In $240\times320$/Out $480\times640$
| Run | PSNR | SSIM | LPIPS | RMSE | Abs.Rel. | Square Rel. | Rendering Speed | Download |
|-----|------|------|-------|------|----------|-------------|-----------------|----------|
| DeFiNe | 23.46 | 0.783 | 0.495 | 0.275 | 0.108 | 0.053 | 7 fps | Link |
| RP-DeFiNe k=16 | 24.54 | 0.801 | 0.453 | 0.263 | 0.103 | 0.050 | 208 fps | Link |
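The rendering speedups above follow from the query reduction: a per-pixel light field decoder issues one query per output pixel, while Ray-Patch issues one query per $k\times k$ patch, cutting the query count by $k^2$. A minimal sketch of the arithmetic (hypothetical helper, not repository code):

```python
# Hypothetical helper (not part of the repo): decoder query counts for
# per-pixel querying (k=1) vs. Ray-Patch querying with patch size k.
def num_queries(height, width, k=1):
    assert height % k == 0 and width % k == 0
    return (height // k) * (width // k)

print(num_queries(120, 160))        # per-pixel: 19200 queries
print(num_queries(120, 160, k=8))   # Ray-Patch k=8: 300 queries (64x fewer)
```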

Setup

The implementation uses PyTorch 2, PyTorch Lightning 2, and CUDA 11.7. To run the repository, we suggest using the provided conda environment:

  • Clone the repository
    git clone [email protected]:tberriel/RayPatchQuerying.git
    
  • Create a conda environment
    conda env create -n PT2 --file=pt2.yml 
    conda activate PT2 
    

Data

The models are evaluated on two datasets:

  • MultiShapeNet-Easy dataset, introduced by Stelzner et al.: Download from Link
  • ScanNet dataset, introduced by Dai et al.: Follow the original repository instructions to access the dataset. Then, to decode the NASDE stereo pairs used for training and evaluation by DeFiNe, follow these instructions:
    • After downloading ScanNet data, uncompress it with our modified scripts:
        cp /<path to RayPatch>/src/SensReader/* /<path to scannet>/ScanNet/SensReader/.
        cd /<path to scannet>/ScanNet/SensReader
        python decode.py --dataset_path /<path to scannet>/scans --output_path /<path to scannet>/data/val/ --split_file scannetv2_val.txt --frames_list frames_val.txt
        python decode.py --dataset_path /<path to scannet>/scans --output_path /<path to scannet>/data/train/ --split_file scannetv2_train.txt --frames_list frames_train.txt
      
    • Then run the following script to preprocess it:
        cd /<path to RayPatch>/
        python src/data/preproces_scannet.py /<path to scannet>/data/ /<path to RayPatch>/data/scannet/ --parallel --num-cores 12
        mv /<path to RayPatch>/data/stereo_pairs_* /<path to RayPatch>/data/scannet/.
      
      Preprocessing consists of resizing the RGB data to 480x640 resolution. Set --num-cores to the number of CPU cores available to process multiple scenes in parallel.
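The core of that preprocessing step can be sketched as follows. This is a hypothetical illustration assuming Pillow, not the repository's preproces_scannet.py; `resize_frame` and the 968x1296 dummy input (ScanNet's native RGB resolution) are ours:

```python
# Hypothetical sketch (not the repo's script): resize an RGB frame to
# 480x640 (height x width), as done during ScanNet preprocessing.
from PIL import Image

def resize_frame(img: Image.Image) -> Image.Image:
    # PIL uses (width, height) ordering, so 480x640 becomes (640, 480).
    return img.resize((640, 480), Image.BILINEAR)

# Example with a dummy frame at ScanNet's native 968x1296 RGB resolution.
frame = Image.new("RGB", (1296, 968))
print(resize_frame(frame).size)  # (640, 480)
```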

Ensure the data is placed in the respective folders:

|-- RayPatch
   |-- data
     |-- msn_easy
        |-- train
        |-- val
        |-- test
     |-- scannet
        |-- train
        |-- val

Experiments

Each training run should be stored inside the runs folder of the respective dataset, with its corresponding configuration file:

|-- RayPatch
  |-- runs
      |-- scannet
        |-- define_32_stereo_acc
            |-- config.yaml
            |-- model_best.ckpt
        |-- rpdefine_16_32_stereo_acc
            |-- config.yaml
            |-- model_best.ckpt

Test

To evaluate a model run:

  python test.py /<path to config file>/ --full-scale --eval-split <split>

To evaluate on MSN-Easy, use --eval-split test. For ScanNet use --eval-split val.

Add the --vis flag to render a batch of images. Use the --num_batches flag to set the number of batches to save.

By default, evaluation computes neither LPIPS nor SSIM. To compute them, add the respective flags:

  python test.py /<path to config file>/ --full-scale --lpips --ssim

SSIM computation has a large memory footprint. To evaluate define_stereo_32, we had to run evaluation on a CPU with 160 GB of RAM.

To evaluate profiling metrics for rendering a single image, run:

python test_profile.py <path to config folder> --batch --flops --time

The <path to config folder> should be of the form runs/scannet/define_32_stereo_acc/.

To execute on a GPU device, add the --cuda flag.
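The rendering-speed numbers reported above are frames per second. The measurement pattern can be sketched like this (a hypothetical timing helper, not the repository's test_profile.py; `render` stands in for a model's forward pass):

```python
# Hypothetical sketch: time a render callable over n repetitions and
# report frames per second, as in the rendering-speed columns above.
import time

def fps(render, n=100):
    t0 = time.perf_counter()
    for _ in range(n):
        render()
    return n / (time.perf_counter() - t0)

# Dummy workload standing in for a model's render step.
print(round(fps(lambda: sum(range(1000)))))
```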

Train

To train a model run:

  python train.py /<path to config file>/ 

To train a Ray-Patch querying model, add the --full-scale argument.

Training also has a large memory footprint. We trained both models using 4 Nvidia Tesla V100 GPUs with 32 GB of VRAM each. For the ScanNet experiments, we used a batch size of 16 with gradient accumulation of 2 to simulate a batch size of 32.
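Gradient accumulation averages the gradients of several micro-batches before applying one optimizer step, so the update matches that of a single larger batch. A toy, framework-free sketch of the idea (not the repository's training loop; names are ours):

```python
# Hypothetical illustration: stepping once every `accum` micro-batches,
# with gradients scaled by 1/accum, reproduces the SGD update of one
# batch that is `accum` times larger.
def sgd_with_accumulation(grads, accum, lr=0.1, w=0.0):
    buf = 0.0
    for i, g in enumerate(grads, start=1):
        buf += g / accum          # scale each micro-batch gradient
        if i % accum == 0:        # step once every `accum` micro-batches
            w -= lr * buf
            buf = 0.0
    return w

micro = [1.0, 3.0, 2.0, 4.0]      # per-micro-batch mean gradients
# Pairs of micro-batches with accum=2 match their averaged full batches:
assert sgd_with_accumulation(micro, accum=2) == sgd_with_accumulation([2.0, 3.0], accum=1)
```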
