Accurately and efficiently modeling dynamic scenes and motions remains a challenging task due to temporal dynamics and motion complexity. To address these challenges, we propose DynMF, a compact and efficient representation that decomposes a dynamic scene into a few neural trajectories. We argue that the per-point motions of a dynamic scene can be decomposed into a small set of explicit or learned trajectories. Our carefully designed neural framework, consisting of a tiny set of learned bases queried only in time, renders at speeds comparable to 3D Gaussian Splatting, surpassing 120 FPS, while requiring only twice the storage of a static scene. Our neural representation adequately constrains the inherently underconstrained motion field of a dynamic scene, leading to effective and fast optimization. This is achieved by binding each point to motion coefficients that enforce per-point sharing of the basis trajectories. By carefully applying a sparsity loss to the motion coefficients, we are able to disentangle the motions that comprise the scene, independently control them, and generate novel motion combinations that have never been seen before. We reach state-of-the-art rendering quality within just 5 minutes of training, and in less than half an hour we can synthesize novel views of dynamic scenes with superior photorealistic quality. Our representation is interpretable, efficient, and expressive enough to offer real-time view synthesis of complex dynamic scene motions, in both monocular and multi-view scenarios.
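To make the decomposition concrete, the following is a minimal sketch of the idea described above, assuming a PyTorch-style implementation: each point's displacement is a linear combination of a small set of shared basis trajectories queried only in time, with per-point motion coefficients regularized by a sparsity loss. All names here (`TrajectoryBasis`, `num_basis`, `motion_coeffs`, `sparsity_loss`) are illustrative assumptions, not the paper's actual API.

```python
# Illustrative sketch of trajectory decomposition; names and sizes are hypothetical.
import torch
import torch.nn as nn

class TrajectoryBasis(nn.Module):
    """A tiny MLP queried only in time t, producing K shared basis trajectories in R^3."""
    def __init__(self, num_basis: int = 16, hidden: int = 64):
        super().__init__()
        self.num_basis = num_basis
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_basis * 3),
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: scalar time -> (K, 3) basis displacements at time t
        return self.mlp(t.view(1, 1)).view(self.num_basis, 3)

# Each point keeps its static (canonical) position plus per-point motion coefficients.
num_points, num_basis = 100_000, 16
canonical_xyz = torch.randn(num_points, 3)                        # static positions
motion_coeffs = nn.Parameter(torch.zeros(num_points, num_basis))  # per-point coefficients
basis = TrajectoryBasis(num_basis)

def deformed_positions(t: torch.Tensor) -> torch.Tensor:
    # Per-point displacement is a linear combination of the shared basis trajectories:
    # x_i(t) = x_i + sum_k c_{i,k} * f_k(t)
    f_t = basis(t)                                 # (K, 3)
    return canonical_xyz + motion_coeffs @ f_t     # (N, 3)

def sparsity_loss(weight: float = 1e-3) -> torch.Tensor:
    # Encourages each point to follow only a few trajectories, which is what
    # enables disentangling and recombining the motions of the scene.
    return weight * motion_coeffs.abs().mean()
```

Because the basis is queried only in time (once per frame) rather than per point, the per-frame overhead on top of static 3D Gaussian Splatting is small, which is consistent with the rendering-speed and storage claims above.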