Interactive segmentation of 3D Gaussians opens a great opportunity for real-time manipulation of 3D scenes thanks to the real-time rendering capability of 3D Gaussian Splatting. However, the current methods suffer from time-consuming post-processing to deal with noisy segmentation output. Also, they struggle to provide detailed segmentation, which is important for fine-grained manipulation of 3D scenes. In this study, we propose Click-Gaussian, which learns distinguishable feature fields of two-level granularity, facilitating segmentation without time-consuming post-processing. We delve into challenges stemming from inconsistently learned feature fields resulting from 2D segmentation obtained independently from a 3D scene. 3D segmentation accuracy deteriorates when 2D segmentation results across the views, primary cues for 3D segmentation, are in conflict. To overcome these issues, we propose Global Feature-guided Learning (GFL). GFL constructs the clusters of global feature candidates from noisy 2D segments across the views, which smooths out noises when training the features of 3D Gaussians. Our method runs in 10 ms per click, 15 to 130 times as fast as the previous methods, while also significantly improving segmentation accuracy.
交互式3D高斯分割为实时操作3D场景提供了巨大的机会,这得益于3D高斯喷溅的实时渲染能力。然而,当前方法因为要处理噪声分割输出而遭受耗时的后处理问题。同时,它们难以提供详细的分割,这对于精细操作3D场景非常重要。在本研究中,我们提出了Click-Gaussian,该方法学习两级粒度的可区分特征场,促进了无需耗时后处理的分割。我们深入探讨了由独立于3D场景获得的2D分割产生的特征场学习不一致性带来的挑战。当2D分割结果(3D分割的主要线索)在不同视图中存在冲突时,3D分割精度会恶化。为克服这些问题,我们提出了全局特征引导学习(GFL)。GFL从各视图的噪声2D分割中构建全局特征候选者群,这在训练3D高斯的特征时平滑了噪声。我们的方法每次点击运行时间为10毫秒,比以前的方法快15到130倍,同时显著提高了分割精度。