Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting
We propose a unified framework aimed at enhancing the diffusion priors for 3D generation tasks. Despite the critical importance of these tasks, existing methodologies often struggle to generate high-quality results. We begin by examining the inherent limitations of previous diffusion priors. We identify a divergence between the diffusion priors and the training procedures of diffusion models that substantially impairs the quality of 3D generation. To address this issue, we propose a novel, unified framework that iteratively optimizes both the 3D model and the diffusion prior. Leveraging the different learnable parameters of the diffusion prior, our approach offers multiple configurations, affording various trade-offs between performance and implementation complexity. Notably, our experimental results demonstrate that our method markedly surpasses existing techniques, establishing a new state of the art in text-to-3D generation. Furthermore, our approach exhibits impressive performance on both NeRF and the newly introduced 3D Gaussian Splatting backbones. Additionally, our framework yields insightful contributions to the understanding of recent score distillation methods, such as the VSD and DDS losses.
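The abstract describes an alternating scheme that updates the 3D representation and the learnable parameters of the diffusion prior in turns. The snippet below is a minimal, self-contained sketch of that alternating-optimization idea in PyTorch; every module and name in it (ToyRenderer, ToyDenoiser, the toy noising schedule, the score-distillation-style loss) is a hypothetical stand-in for illustration, not the authors' actual implementation or loss.

```python
# Hypothetical sketch: alternate between (a) adapting a learnable correction to a
# frozen denoiser so it matches the current renders, and (b) updating the 3D model
# with the adapted denoising score. All components are toy placeholders.
import torch
import torch.nn as nn

class ToyRenderer(nn.Module):
    """Stand-in for a NeRF / 3D Gaussian Splatting backbone (hypothetical)."""
    def __init__(self, image_size=32):
        super().__init__()
        self.canvas = nn.Parameter(torch.randn(1, 3, image_size, image_size) * 0.1)

    def forward(self):
        return torch.sigmoid(self.canvas)  # pretend this is a rendered view

class ToyDenoiser(nn.Module):
    """Frozen score network plus a small learnable residual (the prior's learnable part)."""
    def __init__(self):
        super().__init__()
        self.frozen = nn.Conv2d(3, 3, 3, padding=1)
        for p in self.frozen.parameters():
            p.requires_grad_(False)
        self.learnable = nn.Conv2d(3, 3, 3, padding=1)  # e.g., a LoRA-style correction

    def forward(self, x_noisy):
        # Timestep conditioning is omitted in this toy version.
        return self.frozen(x_noisy) + self.learnable(x_noisy)

renderer, denoiser = ToyRenderer(), ToyDenoiser()
opt_3d = torch.optim.Adam(renderer.parameters(), lr=1e-2)
opt_prior = torch.optim.Adam(denoiser.learnable.parameters(), lr=1e-3)

for step in range(100):
    t = torch.rand(())                      # toy noise level in [0, 1]
    noise = torch.randn(1, 3, 32, 32)

    # (a) adapt the prior to the current (detached) renders
    x = renderer().detach()
    x_noisy = (1 - t) * x + t * noise       # toy noising schedule
    loss_prior = ((denoiser(x_noisy) - noise) ** 2).mean()
    opt_prior.zero_grad(); loss_prior.backward(); opt_prior.step()

    # (b) update the 3D model with a score-distillation-style gradient
    x = renderer()
    x_noisy = (1 - t) * x + t * noise
    residual = (denoiser(x_noisy) - noise).detach()
    loss_3d = (residual * x).mean()         # gradient flows only through the renderer
    opt_3d.zero_grad(); loss_3d.backward(); opt_3d.step()
```

In this sketch, step (a) plays the role of refining the diffusion prior toward the current 3D model's renders, and step (b) plays the role of distilling the adapted score back into the 3D parameters; which parameters of the prior are made learnable is exactly the configuration choice the abstract refers to.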