4D head capture aims to generate dynamic topological meshes and corresponding texture maps from videos. It is widely used in movies and games because it can reproduce facial muscle movements and recover dynamic texture details such as pore squeezing. The industry typically adopts a pipeline based on multi-view stereo and non-rigid alignment. However, this approach is error-prone and relies heavily on time-consuming manual processing by artists. To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images. Specifically, we first represent the time-series faces as a set of dynamic 3D Gaussians with fixed topology, in which the Gaussian centers are bound to the mesh vertices. We then alternate between geometry and texture optimization frame by frame, which yields high-quality geometry and texture while keeping the topology temporally stable. Finally, we extract dynamic facial meshes with regular wiring and high-fidelity textures with pore-level details from the learned Gaussians. Extensive experiments show that our method outperforms current SOTA face reconstruction methods in both mesh and texture quality.
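The frame-by-frame alternating scheme described above can be illustrated with a minimal toy sketch. It is not the Topo4D implementation: the multi-view rendering loss is replaced here by simple per-vertex position and color targets, per-Gaussian colors stand in for the texture map, and the function name `track_sequence` and all of its parameters are hypothetical. The sketch only shows the control flow: Gaussian centers coincide with mesh vertices, the face list never changes, each frame starts from the previous frame's result, and geometry and texture are updated in alternating phases.

```python
import numpy as np


def track_sequence(template_vertices, faces, target_positions, target_colors,
                   n_rounds=20, steps=5, lr=0.2):
    """Toy alternating optimization over a sequence of frames.

    Assumption: per-frame targets (target_positions, target_colors) stand in
    for the photometric loss against calibrated multi-view images.
    """
    # Gaussian centers are bound to mesh vertices: one center per vertex.
    centers = np.asarray(template_vertices, dtype=float).copy()
    # Per-Gaussian color, a stand-in for the learned texture.
    colors = np.full((len(centers), 3), 0.5)
    results = []

    for tgt_pos, tgt_col in zip(target_positions, target_colors):
        # Each frame is initialized from the previous frame's solution.
        for _ in range(n_rounds):
            # Geometry phase: colors are frozen, only centers move.
            for _ in range(steps):
                grad = 2.0 * (centers - tgt_pos)  # gradient of squared error
                centers -= lr * grad
            # Texture phase: centers are frozen, only colors change.
            for _ in range(steps):
                grad = 2.0 * (colors - tgt_col)
                colors -= lr * grad
        # The face list is reused unchanged, so topology is fixed over time.
        results.append((centers.copy(), faces, colors.copy()))
    return results
```

Because every frame reuses the same `faces` array and the same vertex ordering, the output meshes are in dense correspondence by construction, which is the property the fixed-topology Gaussian representation is meant to provide.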