This paper proposes a novel framework for large-scale scene reconstruction based on 3D Gaussian splatting (3DGS), aiming to address the scalability and accuracy challenges faced by existing methods. To address scalability, we partition the large scene into multiple cells and correlate the candidate point cloud and camera views of each cell through visibility-based camera selection and progressive point-cloud extension. To improve rendering quality, we introduce three improvements over vanilla 3DGS: a ray-Gaussian intersection strategy together with a novel Gaussian density control scheme for learning efficiency, an appearance decoupling module based on a ConvKAN network to handle the uneven lighting conditions of large-scale scenes, and a refined training loss that combines a color loss, a depth distortion loss, and a normal consistency loss. Finally, a seamless stitching procedure merges the per-cell Gaussian radiance fields to enable novel view synthesis across cells. Evaluations on the Mill19, Urban3D, and MatrixCity datasets show that our method consistently produces higher-fidelity renderings than state-of-the-art large-scale scene reconstruction methods. We further validate the generalizability of the proposed approach by rendering self-collected video clips recorded with a commercial drone.
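The abstract does not give the exact formulations of the refined loss, so the following is only a minimal sketch of how such a combined objective is commonly assembled in Gaussian-splatting pipelines, assuming an L1 color term, a pairwise depth distortion term in the style of Mip-NeRF 360 / 2DGS, and a normal consistency term between rendered normals and depth-derived normals. All function names and weighting coefficients here are hypothetical placeholders, not the paper's actual implementation.

```python
import torch

def color_loss(pred_rgb, gt_rgb):
    """L1 photometric term; 3DGS pipelines typically add a D-SSIM term as well (omitted here)."""
    return (pred_rgb - gt_rgb).abs().mean()

def depth_distortion_loss(weights, depths):
    """Encourages per-ray blending weights to concentrate along depth:
    sum_{i,j} w_i * w_j * |d_i - d_j|, averaged over rays.
    weights, depths: tensors of shape (num_rays, num_samples)."""
    diff = (depths.unsqueeze(-1) - depths.unsqueeze(-2)).abs()
    pairwise = weights.unsqueeze(-1) * weights.unsqueeze(-2) * diff
    return pairwise.sum(dim=(-1, -2)).mean()

def normal_consistency_loss(rendered_normal, depth_normal):
    """Aligns rendered normals with normals estimated from the depth map (both unit vectors,
    shape (num_pixels, 3)); penalizes 1 - cosine similarity."""
    return (1.0 - (rendered_normal * depth_normal).sum(dim=-1)).mean()

def refined_loss(pred_rgb, gt_rgb, weights, depths, n_render, n_depth,
                 w_dist=100.0, w_normal=0.05):
    # w_dist and w_normal are illustrative values, not the paper's reported coefficients.
    return (color_loss(pred_rgb, gt_rgb)
            + w_dist * depth_distortion_loss(weights, depths)
            + w_normal * normal_consistency_loss(n_render, n_depth))
```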