
TellMeTalk: Multimodal-driven talking face video generation

Output type:
Journal article
Authors:
Li, Pengfei;Zhao, Huihuang;Liu, Qingyun;Tang, Peng;Zhang, Lin
Corresponding author:
Zhao, HH
Author affiliations:
[Tang, Peng; Zhao, Huihuang; Zhao, HH; Li, Pengfei; Zhang, Lin; Liu, Qingyun] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Hunan, Peoples R China.
[Zhao, Huihuang; Liu, Qingyun] Hunan Prov Key Lab Intelligent Informat Proc & App, Changsha 421008, Hunan, Peoples R China.
[Zhao, Huihuang] Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Hunan, Peoples R China.
Corresponding institution:
[Zhao, HH] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Hunan, Peoples R China.
Language:
English
Keywords:
Talking face generation;Lip sync;Face motion;Virtual reality;Multimodality
Journal:
Computers & Electrical Engineering
ISSN:
0045-7906
Year:
2024
Volume:
114
Pages:
109049
Funding:
National Natural Science Foundation of China [61772179]; Hunan Provincial Natural Science Foundation of China [2022JJ50016, 2023JJ50095]; Science and Technology Plan Project of Hunan Province [2016TP1020]; Scientific Research Fund of Hunan Provincial Education Department [21B0649]; Postgraduate Scientific Research Innovation Project of Hunan Province [CX20221285]; Science and Technology Innovation Project of Hengyang [202250045231]; Industry University Research Innovation Foundation of Ministry of Education Science and Technology Development Center [2020QT09]; The 14th Five-Year Plan Key Disciplines and Application-oriented Special Disciplines of Hunan Province (Xiangjiaotong) [[2022] 351]
Institutional attribution:
This university is both the first and the corresponding institution
Department:
College of Computer Science and Technology
Abstract:
In this paper, we present TellMeTalk, an innovative approach for generating expressive talking face videos based on multimodal inputs. Our approach demonstrates robustness across various identities, languages, expressions, and head movements. It overcomes four key limitations of existing talking face video generation methods: (1) reliance on single-modal learning from audio or text, lacking the complementary nature of multimodal inputs; (2) deployment of traditional convolutional neural network generation, leading to restricted capture of spatial features; (3) the absence of natural head move...
