EmoDiffGes: Emotion-Aware Co-Speech Holistic Gesture Generation with Progressive Synergistic Diffusion

| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Li, Xinru | en_US |
| dc.contributor.author | Lin, Jingzhong | en_US |
| dc.contributor.author | Zhang, Bohao | en_US |
| dc.contributor.author | Qi, Yuanyuan | en_US |
| dc.contributor.author | Wang, Changbo | en_US |
| dc.contributor.author | He, Gaoqi | en_US |
| dc.contributor.editor | Christie, Marc | en_US |
| dc.contributor.editor | Pietroni, Nico | en_US |
| dc.contributor.editor | Wang, Yu-Shuen | en_US |
| dc.date.accessioned | 2025-10-07T05:03:04Z | |
| dc.date.available | 2025-10-07T05:03:04Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Co-speech gesture generation, driven by emotional expression and synergistic bodily movements, is essential for applications such as virtual avatars and human-robot interaction. Existing co-speech gesture generation methods face two fundamental limitations: (1) they produce inexpressive gestures because they ignore the temporal evolution of emotion; (2) they generate incoherent and unnatural motions as a result of either oversimplifying the whole body or modeling its parts independently. To address these limitations, we propose EmoDiffGes, a diffusion-based framework grounded in embodied emotion theory that unifies dynamic emotion conditioning and part-aware synergistic modeling. Specifically, a Dynamic Emotion-Alignment Module (DEAM) is first applied to extract dynamic emotional cues and inject emotion guidance into the generation process. Then, a Progressive Synergistic Gesture Generator (PSGG) iteratively refines region-specific latent codes while maintaining full-body coordination, leveraging a Body Region Prior for part-specific encoding and a Progressive Inter-Region Synergistic Flow for global motion coherence. Extensive experiments validate the effectiveness of our method, showcasing its potential for generating expressive, coordinated, and emotionally grounded human gestures. | en_US |
| dc.description.number | 7 | |
| dc.description.sectionheaders | Digital Human | |
| dc.description.seriesinformation | Computer Graphics Forum | |
| dc.description.volume | 44 | |
| dc.identifier.doi | 10.1111/cgf.70261 | |
| dc.identifier.issn | 1467-8659 | |
| dc.identifier.pages | 13 pages | |
| dc.identifier.uri | https://doi.org/10.1111/cgf.70261 | |
| dc.identifier.uri | https://diglib.eg.org/handle/10.1111/cgf70261 | |
| dc.publisher | The Eurographics Association and John Wiley & Sons Ltd. | en_US |
| dc.subject | CCS Concepts: Computing methodologies → Computer graphics; Animation; Motion processing | |
| dc.title | EmoDiffGes: Emotion-Aware Co-Speech Holistic Gesture Generation with Progressive Synergistic Diffusion | en_US |
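
The abstract above outlines a two-stage design: DEAM turns the speech signal into per-frame emotion guidance, and PSGG denoises region-specific latent codes one body region at a time while passing coordination information between regions. As a rough illustration only, here is a minimal PyTorch-style sketch of how such a pipeline could be wired together; the record links no code, and every class name, tensor dimension, and the body-region split below are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the pipeline described in the abstract:
# DEAM maps frame-level speech features to per-frame emotion embeddings,
# and PSGG refines per-region latent codes under that guidance.
# All names, dimensions, and the region split are illustrative guesses.
import torch
import torch.nn as nn

REGIONS = ["face", "upper_body", "hands", "lower_body"]  # assumed body-region split


class DynamicEmotionAlignment(nn.Module):
    """Maps frame-level speech features to a per-frame emotion embedding (assumed design)."""

    def __init__(self, audio_dim=128, emo_dim=64):
        super().__init__()
        self.rnn = nn.GRU(audio_dim, emo_dim, batch_first=True)

    def forward(self, audio_feats):          # (B, T, audio_dim)
        emo_seq, _ = self.rnn(audio_feats)   # (B, T, emo_dim): dynamic emotion cues
        return emo_seq


class RegionDenoiser(nn.Module):
    """Per-region denoiser conditioned on audio, emotion, and a cross-region context."""

    def __init__(self, latent_dim=32, audio_dim=128, emo_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + audio_dim + emo_dim + latent_dim, 256),
            nn.SiLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z, audio, emo, context):
        return self.net(torch.cat([z, audio, emo, context], dim=-1))


class ProgressiveSynergisticGenerator(nn.Module):
    """Refines region latents one region at a time, passing a running context
    between regions so full-body coordination is preserved (assumed reading
    of the Progressive Inter-Region Synergistic Flow)."""

    def __init__(self, latent_dim=32, audio_dim=128, emo_dim=64):
        super().__init__()
        self.denoisers = nn.ModuleDict(
            {r: RegionDenoiser(latent_dim, audio_dim, emo_dim) for r in REGIONS}
        )

    def forward(self, region_latents, audio, emo):
        context = torch.zeros_like(next(iter(region_latents.values())))
        refined = {}
        for r in REGIONS:                                    # progressive, region by region
            noise_pred = self.denoisers[r](region_latents[r], audio, emo, context)
            refined[r] = region_latents[r] - noise_pred      # simplified denoising update
            context = refined[r]                             # synergistic flow to next region
        return refined


if __name__ == "__main__":
    B, T = 2, 50
    audio = torch.randn(B, T, 128)
    deam = DynamicEmotionAlignment()
    psgg = ProgressiveSynergisticGenerator()
    emo = deam(audio)                                        # (B, T, 64)
    latents = {r: torch.randn(B, T, 32) for r in REGIONS}    # noisy region latents
    out = psgg(latents, audio, emo)
    print({r: tuple(v.shape) for r, v in out.items()})
```

In this sketch the synergistic flow is reduced to passing each refined region's latent as context to the next region's denoiser, and the diffusion schedule is collapsed into a single residual update; the paper's actual inter-region mechanism, Body Region Prior, and sampler are not specified in this record.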