We propose a novel framework for removing transient objects from input videos for 3D scene reconstruction with Gaussian Splatting. Our framework proceeds in two main steps. In the first step, we propose an unsupervised training strategy for a classification network that distinguishes transient objects from static scene parts based on their different training behavior inside the 3D Gaussian Splatting reconstruction. In the second step, we improve the boundary quality and stability of the detected transients by combining the results of the first step with an off-the-shelf segmentation method. We also propose a simple and effective strategy for tracking objects in the input video forward and backward in time. Our results show an improvement over the current state of the art on existing sparsely captured datasets and significant improvements on a newly proposed densely captured (video) dataset.
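To make the first step concrete, below is a minimal sketch of how such an unsupervised transient classifier could be set up in PyTorch. This is our illustration, not the authors' code: the class `TransientClassifier`, the residual-based loss, and all hyperparameters are assumptions chosen for clarity. The key idea it demonstrates is the one stated in the abstract: transient pixels accumulate persistently high photometric error under the static 3DGS model, so a network penalized for unexplained static pixels (with a sparsity regularizer to avoid labeling everything transient) learns to flag them without supervision.

```python
# A minimal sketch (illustrative, not the paper's released code) of an
# unsupervised transient classifier driven by 3DGS training residuals.
import torch
import torch.nn as nn

class TransientClassifier(nn.Module):
    """Small CNN mapping a video frame plus its current 3DGS rendering to a
    per-pixel transient probability mask (architecture is an assumption)."""
    def __init__(self, in_channels: int = 6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # logits for "transient"
        )

    def forward(self, frame: torch.Tensor, rendered: torch.Tensor) -> torch.Tensor:
        # Concatenate the input frame with the 3DGS rendering so the network
        # can reason about where the two disagree.
        return torch.sigmoid(self.net(torch.cat([frame, rendered], dim=1)))

def masked_photometric_loss(frame, rendered, mask, reg_weight=0.01):
    """Unsupervised objective: pixels labeled static (low mask) must be
    explained by the 3DGS model, while a sparsity term keeps the mask from
    growing unboundedly. In joint training this term would also backpropagate
    into the Gaussians via `rendered`; here the rendering is treated as fixed
    for simplicity."""
    residual = (frame - rendered).abs().mean(dim=1, keepdim=True)
    static_term = ((1.0 - mask) * residual).mean()
    sparsity_term = reg_weight * mask.mean()
    return static_term + sparsity_term

# Toy usage with random tensors standing in for a frame and its rendering.
clf = TransientClassifier()
frame = torch.rand(1, 3, 64, 64)     # ground-truth video frame
rendered = torch.rand(1, 3, 64, 64)  # current 3DGS rendering of that view
mask = clf(frame, rendered)
loss = masked_photometric_loss(frame, rendered, mask)
loss.backward()
```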
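The forward/backward tracking strategy can likewise be illustrated with a toy sketch. The abstract does not specify the actual mechanism, so the carry-last-mask heuristic below (`sweep`, `track_bidirectional`) is purely an assumed stand-in: it shows only the general principle that sweeping detections both forward and backward in time, then taking their union, lets a transient detected in one frame cover neighboring frames where per-frame detection fails.

```python
import numpy as np

def sweep(masks):
    """Single-direction pass: whenever the per-frame detector returns an
    empty mask, carry the last non-empty mask forward so a briefly missed
    transient stays covered (a toy heuristic, not the paper's tracker)."""
    out, last = [], None
    for m in masks:
        if m.any():
            last = m
        out.append(last.copy() if last is not None else m.copy())
    return out

def track_bidirectional(masks):
    """Run the sweep forward and backward in time and take the per-frame
    union, so detections propagate in both directions."""
    forward = sweep(masks)
    backward = sweep(masks[::-1])[::-1]
    return [np.logical_or(f, b) for f, b in zip(forward, backward)]

# Toy usage: the object is detected only in frame 1 of 3; after the
# bidirectional pass, frames 0 and 2 are covered as well.
masks = [np.zeros((4, 4), dtype=bool) for _ in range(3)]
masks[1][1:3, 1:3] = True
filled = track_bidirectional(masks)
assert all(m.any() for m in filled)
```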