The great NVIDIA: six years have gone by in a flash; NVIDIA planted the trees, and those who came later enjoy the shade #180

Open
stone100010 opened this issue Apr 25, 2024 · 0 comments

Comments

@stone100010

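VASA-1 is a cutting-edge framework, introduced by a team of researchers, that generates lifelike talking faces in real time from a single static image and an accompanying speech audio clip. The model excels at producing lip movements closely synchronized with the audio, while also capturing a wide range of facial expressions and natural head motion, which adds to the realism and liveliness of the generated faces. At the core of this work is a holistic model of facial dynamics and head movement that operates in a distinctive latent space built from video data.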

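Extensive experiments, including evaluation on new metrics, show that VASA-1 outperforms existing methods along several dimensions. Notably, VASA-1 delivers high-quality 512x512 video at up to 40 frames per second with very low latency, paving the way for engaging real-time interaction with avatars that convincingly mimic human conversational behavior.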
https://www.microsoft.com/en-us/research/project/vasa-1/
