This work was supported by the National Natural Science Foundation of China (61772179, 12442056), the Hunan Provincial Natural Science Foundation of China (2024JJ5059, 2023JJ50095), the Science and Technology Innovation Program of Hunan Province (2016TP1020), the Science and Technology Innovation Project of Hengyang (202250045231), the Industry-University-Research Innovation Foundation of the Ministry of Education Science and Technology Development Center (2020QT09), the "14th Five-Year Plan" Key Disciplines and Application-oriented Special Disciplines of Hunan Province (Xiangjiaotong [2022] 351), and the Open Research Fund of the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS-2023-09).