We consider the problem of novel view synthesis (NVS) for dynamic scenes. Recent neural approaches have accomplished exceptional NVS results for static 3D scenes, but extensions to 4D time-varying scenes remain non-trivial. Prior efforts often encode dynamics by learning a canonical space plus implicit or explicit deformation fields, which struggle in challenging scenarios such as sudden movements or high-fidelity rendering. In this paper, we introduce 4D Gaussian Splatting (4DGS), a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians, inspired by the success of 3D Gaussian Splatting in static scenes. We model dynamics at each timestamp by temporally slicing the 4D Gaussians, which naturally yields dynamic 3D Gaussians that can be seamlessly projected into images. As an explicit spatio-temporal representation, 4DGS demonstrates powerful capabilities for modeling complicated dynamics and fine details, especially for scenes with abrupt motions. We further implement our temporal slicing and splatting techniques in a highly optimized CUDA acceleration framework, achieving real-time rendering speeds of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU. Rigorous evaluations on scenes with diverse motions showcase the superior efficiency and effectiveness of 4DGS, which consistently outperforms existing methods both quantitatively and qualitatively.
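To make the temporal-slicing idea concrete, here is a minimal sketch in our own notation (not taken from the abstract, and the paper's exact parameterization may differ): if a 4D XYZT Gaussian is viewed as a joint Gaussian over space $x \in \mathbb{R}^3$ and time $t \in \mathbb{R}$, standard multivariate Gaussian conditioning produces a time-dependent 3D Gaussian together with a 1D temporal marginal,

$$
\mathcal{G}(x,t)=\mathcal{N}\!\left(\begin{bmatrix}\mu_x\\ \mu_t\end{bmatrix},\;\begin{bmatrix}\Sigma_{xx} & \Sigma_{xt}\\ \Sigma_{tx} & \Sigma_{tt}\end{bmatrix}\right),
$$

$$
x \mid t \;\sim\; \mathcal{N}\!\big(\mu_x+\Sigma_{xt}\Sigma_{tt}^{-1}(t-\mu_t),\;\Sigma_{xx}-\Sigma_{xt}\Sigma_{tt}^{-1}\Sigma_{tx}\big),
\qquad
p(t)=\mathcal{N}(\mu_t,\Sigma_{tt}).
$$

Under this formulation, an anisotropic 4D covariance with nonzero $\Sigma_{xt}$ makes the conditional mean drift linearly with $t$, so each sliced 3D Gaussian moves over time, while the marginal $p(t)$ can down-weight a Gaussian's contribution at timestamps far from $\mu_t$; the resulting 3D Gaussians can then be splatted as in static 3D Gaussian Splatting.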