3D Gaussians are a low-level scene representation, and a single scene typically involves thousands to millions of them. This makes it difficult to control the scene in ways that reflect its underlying dynamic structure, where the number of independent entities is usually much smaller. In particular, animating and moving objects in the scene is challenging because it requires coordination among many Gaussians. To address this issue, we develop a mutual information shaping technique that enforces movement resonance between correlated Gaussians in a motion network. Such correlations can be learned from putative 2D object masks in different views. By approximating the mutual information with the Jacobians of the motions, our method ensures that the Gaussians composing each object move consistently under various perturbations. Specifically, we develop an efficient contrastive training pipeline with lightweight optimization to shape the motion network, avoiding the need for re-shaping throughout the motion sequence. Notably, our training touches only a small fraction of the Gaussians in the scene, yet attains the desired compositional behavior according to the underlying dynamic structure. The proposed technique is evaluated on challenging scenes and demonstrates significant performance improvements in promoting consistent movements and in 3D object segmentation, while incurring low computation and memory requirements.
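To make the idea of shaping a motion network through the Jacobians of the motions more concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that the abstract does not confirm: the Jacobian is taken with respect to the network parameters, cosine similarity between flattened Jacobians stands in for the mutual information, the contrastive objective is an InfoNCE-style loss, and the Jacobian is computed by finite differences rather than automatic differentiation. The names `motion`, `jacobian`, `cosine`, and `contrastive_loss`, the hidden width `H`, and the temperature `tau` are all hypothetical.

```python
import math
import random

H = 4  # hidden width of the toy motion network (assumption)


def motion(theta, x):
    """Toy motion network: maps a 3D Gaussian center to a 3D displacement."""
    w1 = theta[:3 * H]   # first layer, H x 3, row-major
    w2 = theta[3 * H:]   # second layer, 3 x H, row-major
    h = [math.tanh(sum(w1[3 * j + k] * x[k] for k in range(3))) for j in range(H)]
    return [sum(w2[H * i + j] * h[j] for j in range(H)) for i in range(3)]


def jacobian(theta, x, eps=1e-5):
    """Flattened central-difference Jacobian of motion(theta, x) w.r.t. theta."""
    flat = []
    for p in range(len(theta)):
        up = theta[:]
        up[p] += eps
        down = theta[:]
        down[p] -= eps
        f_up, f_down = motion(up, x), motion(down, x)
        flat.extend((a - b) / (2 * eps) for a, b in zip(f_up, f_down))
    return flat


def cosine(u, v):
    """Cosine similarity, used here as an assumed proxy for motion correlation."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def contrastive_loss(theta, centers, labels, tau=0.1):
    """InfoNCE-style loss: Jacobians of Gaussians sharing a mask label should align."""
    jacs = [jacobian(theta, x) for x in centers]
    n = len(centers)
    total, count = 0.0, 0
    for i in range(n):
        logits = {j: cosine(jacs[i], jacs[j]) / tau for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        for j, v in logits.items():
            if labels[j] == labels[i]:  # positive pair: same putative object mask
                total += log_denom - v
                count += 1
    return total / max(count, 1)


# Tiny usage example: four sampled Gaussians, two per putative object.
random.seed(0)
theta = [random.uniform(-1.0, 1.0) for _ in range(6 * H)]
centers = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
labels = [0, 0, 1, 1]
loss = contrastive_loss(theta, centers, labels)
print("contrastive loss on sampled Gaussians:", loss)
```

Minimizing a loss of this kind over `theta` would pull the parameter sensitivities of same-object Gaussians together and push those of different objects apart, so that a perturbation of the network moves each object coherently. Only the sampled centers enter the loss, which mirrors the abstract's remark that training touches a small fraction of the Gaussians.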