LaRa: Efficient Large-Baseline Radiance Fields

Radiance field methods have achieved photorealistic novel view synthesis and geometry reconstruction, but they are mostly applied in per-scene optimization or small-baseline settings. While several recent works investigate feed-forward reconstruction with large baselines by utilizing transformers, they all operate with a standard global attention mechanism and hence ignore the local nature of 3D reconstruction. We propose a method that unifies local and global reasoning in transformer layers, resulting in improved quality and faster convergence. Our model represents scenes as Gaussian Volumes and combines this with an image encoder and Group Attention Layers for efficient feed-forward reconstruction. Experimental results show that our model, trained for two days on four GPUs, reconstructs 360° radiance fields with high fidelity and is robust in zero-shot and out-of-domain testing.
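To make the contrast between global attention and local reasoning concrete, the following is a minimal sketch of attention restricted to local groups of a feature volume. It is an illustration under stated assumptions, not the paper's Group Attention Layer: the abstract does not specify how groups are formed, so the non-overlapping cubic blocks, the function names `group_tokens` and `grouped_self_attention`, and the omission of learned query/key/value projections are all hypothetical choices made here for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_tokens(volume, g):
    """Split a (D, H, W, C) feature volume into non-overlapping g x g x g blocks.

    Returns an array of shape (num_groups, g**3, C).
    The cubic-block grouping is an assumption for illustration only.
    """
    D, H, W, C = volume.shape
    assert D % g == 0 and H % g == 0 and W % g == 0
    v = volume.reshape(D // g, g, H // g, g, W // g, g, C)
    # Bring the three block indices to the front, the within-block indices after.
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)
    return v.reshape(-1, g**3, C)


def grouped_self_attention(volume, g):
    """Scaled dot-product self-attention computed independently inside each block.

    No learned projections are used (hypothetical simplification): tokens act
    as their own queries, keys, and values.
    """
    tokens = group_tokens(volume, g)                      # (G, T, C)
    scale = 1.0 / np.sqrt(tokens.shape[-1])
    scores = tokens @ tokens.transpose(0, 2, 1) * scale   # (G, T, T)
    weights = softmax(scores, axis=-1)
    return weights @ tokens, weights
```

With a volume of N tokens split into groups of T tokens each, the score matrices hold N·T entries in total rather than the N² of global attention, which is the usual motivation for restricting attention to local neighborhoods.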

