EG2025
Browsing EG2025 by Subject "Artificial intelligence"
Now showing 1 - 2 of 2
Item
REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Almog, Gal; Shamir, Ariel; Fried, Ohad; Bousseau, Adrien; Day, Angela
While latent diffusion models achieve impressive image editing results, their application to iterative editing of the same image is severely restricted. When consecutive edit operations are applied with current models, artifacts and noise accumulate due to repeated transitions between pixel and latent spaces. Some methods have attempted to address this limitation by performing the entire edit chain within the latent space, sacrificing flexibility by supporting only a limited, predetermined set of diffusion editing operations. We present a re-encode decode (REED) training scheme for variational autoencoders (VAEs), which promotes image quality preservation even after many iterations. Our work enables multi-method iterative image editing: users can perform a variety of iterative edit operations, each building on the output of the previous one, using both diffusion-based operations and conventional editing techniques. We demonstrate the advantage of REED-VAE across a range of image editing scenarios, including text-based and mask-based editing frameworks. In addition, we show how REED-VAE enhances the overall editability of images, increasing the likelihood of successful and precise edit operations. We hope that this work will serve as a benchmark for the newly introduced task of multi-method image editing.

Item
Shape-Conditioned Human Motion Diffusion Model with Mesh Representation (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Xue, Kebing; Seo, Hyewon; Bobenrieth, Cédric; Luo, Guoliang; Bousseau, Adrien; Day, Angela
Human motion generation is a key task in computer graphics. While various conditioning signals such as text, action class, or audio have been used to harness the generation process, most existing methods neglect the case where a specific body is desired to perform the motion. Additionally, they rely on skeleton-based pose representations, necessitating additional steps to produce renderable meshes of the intended body shape. Given that human motion involves a complex interplay of bones, joints, and muscles, focusing solely on the skeleton during generation neglects the rich information carried by muscles and soft tissues, as well as their influence on movement, ultimately limiting the variability and precision of the generated motions. In this paper, we introduce the Shape-conditioned Motion Diffusion model (SMD), which enables the generation of human motion directly in the form of a mesh sequence, conditioned on both a text prompt and a body mesh. To fully exploit the mesh representation while minimizing resource costs, we employ a spectral representation using the graph Laplacian to encode body meshes into the learning process. Unlike retargeting methods, our model does not require source motion data and generates a variety of desired semantic motions that are inherently tailored to the given identity shape. Extensive experimental evaluations show that the SMD model not only keeps the body shape consistent with the conditioning input across motion frames but also achieves competitive performance in text-to-motion and action-to-motion tasks compared to state-of-the-art methods.
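
To make the failure mode addressed by the REED-VAE item concrete, the toy sketch below illustrates how round-trip error compounds when the same image repeatedly passes through a lossy pixel-to-latent-to-pixel transition. This is not the paper's code: a mild blur plus a small amount of noise stands in for a real VAE's encode/decode pass, purely to show the accumulation effect described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def lossy_roundtrip(img: np.ndarray) -> np.ndarray:
    # Stand-in for one encode->decode pass: a 3x3 box blur plus a small
    # amount of reconstruction noise, both of which compound over passes.
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    blurred = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return blurred + rng.normal(scale=0.005, size=img.shape)

original = rng.random((64, 64))
x = original.copy()
for step in range(1, 11):
    x = lossy_roundtrip(x)  # one more edit iteration's pixel <-> latent round trip
    print(f"round trip {step:2d}: mean abs drift = {np.abs(x - original).mean():.4f}")
```

The printed drift grows monotonically with each round trip; REED training targets exactly this kind of degradation so that many edit iterations remain possible.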
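For the spectral mesh encoding mentioned in the SMD item, the following minimal sketch (my own illustration on a hypothetical toy mesh, not the authors' implementation) projects vertex coordinates onto the low-frequency eigenvectors of the combinatorial graph Laplacian and reconstructs the shape from a handful of coefficients.

```python
import numpy as np

def graph_laplacian(edges: list[tuple[int, int]], n: int) -> np.ndarray:
    # Combinatorial Laplacian L = D - A of an undirected vertex graph.
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

# A tiny triangulated strip as a stand-in for a body mesh.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                     [1, 1, 0], [0, 2, 0], [1, 2, 0]], dtype=float)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)]

L = graph_laplacian(edges, len(vertices))
eigvals, eigvecs = np.linalg.eigh(L)   # eigenvectors act as a "mesh Fourier" basis

k = 4                                  # keep only k low-frequency modes
basis = eigvecs[:, :k]                 # (n_vertices, k)
coeffs = basis.T @ vertices            # compact spectral code: (k, 3)
recon = basis @ coeffs                 # smooth reconstruction of the shape

print("max reconstruction error:", np.abs(recon - vertices).max())
```

Truncating to the low-frequency modes is what keeps the representation compact; the abstract's remark about minimizing resource costs presumably corresponds to learning over a small set of spectral coefficients rather than full per-vertex positions.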