Reconstructing dynamic urban scenes presents significant challenges due to their intrinsic geometric structures and spatiotemporal dynamics. Existing methods that attempt to model dynamic urban scenes without leveraging priors on potentially moving regions often produce suboptimal results. Meanwhile, approaches based on manual 3D annotations yield improved reconstruction quality but are impractical due to labor-intensive labeling. In this paper, we revisit the potential of 2D semantic maps for classifying dynamic and static Gaussians and integrating spatial and temporal dimensions for urban scene representation. We introduce Urban4D, a novel framework that employs a semantic-guided decomposition strategy inspired by advances in deep 2D semantic map generation. Our approach distinguishes potentially dynamic objects through reliable semantic Gaussians. To explicitly model dynamic objects, we propose an intuitive and effective 4D Gaussian splatting (4DGS) representation that aggregates temporal information through learnable time embeddings for each Gaussian, predicting their deformations at desired timestamps using a multilayer perceptron (MLP). For more accurate static reconstruction, we also design a k-nearest neighbor (KNN)-based consistency regularization to handle the ground surface due to its low-texture characteristic. Extensive experiments on real-world datasets demonstrate that Urban4D not only achieves comparable or better quality than previous state-of-the-art methods but also effectively captures dynamic objects while maintaining high visual fidelity for static elements.
重建动态城市场景因其固有的几何结构和时空动态性而充满挑战。现有不利用潜在动态区域先验的建模方法往往产生次优结果,而基于手动3D标注的方法尽管提高了重建质量,但由于标注过程劳动强度大而不实用。 本文重新探索了利用二维语义图对动态与静态高斯进行分类,并将空间和时间维度集成以表示城市场景的潜力。我们提出了Urban4D,一种基于语义引导分解策略的创新框架,受到深度二维语义图生成技术进展的启发。该方法通过语义高斯可靠地区分潜在的动态物体。为显式建模动态物体,我们提出了一种直观且高效的**四维高斯散点(4D Gaussian Splatting, 4DGS)**表示方法,通过可学习的时间嵌入为每个高斯聚合时间信息,使用多层感知机(MLP)预测其在目标时间戳下的变形。 针对静态场景的更精确重建,我们设计了一种基于k近邻(KNN)一致性正则化的策略,以处理地面等低纹理特性区域。大量真实数据集实验表明,Urban4D不仅在质量上与现有最先进方法相当或更优,还能够有效捕捉动态物体,同时保持静态元素的高视觉保真度。 Urban4D为动态城市场景的高效、精确重建提供了新思路,特别在处理复杂时空变化场景中展现了卓越的表现力。