EG 2025 - Full Papers - CGF 44-Issue 2
Browsing by Issue Date, showing items 1-20 of 75.
Item: Learning Image Fractals Using Chaotic Differentiable Point Splatting (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Djeacoumar, Adarsh; Mujkanovic, Felix; Seidel, Hans-Peter; Leimkühler, Thomas; Bousseau, Adrien; Day, Angela
Fractal geometry, defined by self-similar patterns across scales, is crucial for understanding natural structures. This work addresses the fractal inverse problem, which involves extracting fractal codes from images to explain these patterns and synthesize them at arbitrarily finer scales. We introduce a novel algorithm that optimizes Iterated Function System parameters using a custom fractal generator combined with differentiable point splatting. By integrating both stochastic and gradient-based optimization techniques, our approach effectively navigates the complex energy landscapes typical of fractal inversion, ensuring robust performance and the ability to escape local minima. We demonstrate the method's effectiveness through comparisons with various fractal inversion techniques, highlighting its ability to recover high-quality fractal codes and perform extensive zoom-ins to reveal intricate patterns from just a single image.
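For readers unfamiliar with the forward process being inverted here, the sketch below renders an Iterated Function System with the classic chaos game in plain numpy. It is a minimal illustration only: the affine maps encode a Sierpinski triangle as a placeholder, not a fractal code recovered by the paper, and the final histogram stands in for the paper's differentiable point splatting.

    import numpy as np

    # Three affine maps x -> A[k] @ x + b[k]; equal selection probabilities.
    A = np.array([[[0.5, 0.0], [0.0, 0.5]]] * 3)
    b = np.array([[0.0, 0.0], [0.5, 0.0], [0.25, 0.5]])

    def chaos_game(n_points=100_000, burn_in=20, seed=0):
        rng = np.random.default_rng(seed)
        x = np.zeros(2)
        pts = np.empty((n_points, 2))
        for i in range(burn_in + n_points):
            k = rng.integers(len(A))     # pick a random map
            x = A[k] @ x + b[k]          # apply it; the orbit fills the attractor
            if i >= burn_in:
                pts[i - burn_in] = x
        return pts

    pts = chaos_game()
    # A nearest-pixel histogram stands in for differentiable point splatting.
    img, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=256,
                               range=[[0.0, 1.0], [0.0, 1.0]])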
Item: Multi-Modal Instrument Performances (MMIP): A Musical Database (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kyriakou, Theodoros; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
Musical instrument performances are multimodal creative art forms that integrate audiovisual elements, resulting from musicians' interactions with instruments through body movements, finger actions, and facial expressions. Digitizing such performances for archiving, streaming, analysis, or synthesis requires capturing every element that shapes the overall experience, which is crucial for preserving the performance's essence. In this work, following current trends in large-scale dataset development for deep learning analysis and generative models, we introduce the Multi-Modal Instrument Performances (MMIP) database (https://mmip.cs.ucy.ac.cy). This is the first dataset to incorporate synchronized high-quality 3D motion capture data for the body, fingers, facial expressions, and instruments, along with audio, multi-angle videos, and MIDI data. The database currently includes 3.5 hours of performances featuring three instruments: guitar, piano, and drums. Additionally, we discuss the challenges of acquiring such multi-modal data, detailing our approach to data collection, signal synchronization, annotation, and metadata management. Our data formats align with industry standards for ease of use, and we have developed an open-access online repository that offers a user-friendly environment for data exploration, supporting data organization, search capabilities, and custom visualization tools. Notable features include a MIDI-to-instrument animation project for visualizing the instruments and a script for playing back FBX files with synchronized audio in a web environment.

Item: Learning Metric Fields for Fast Low-Distortion Mesh Parameterizations (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Fargion, Guy; Weber, Ofir; Bousseau, Adrien; Day, Angela
We present a fast and robust method for computing an injective parameterization with low isometric distortion for disk-like triangular meshes. Harmonic function-based methods, with their rich mathematical foundation, are widely used: harmonic maps are particularly valuable for ensuring injectivity under certain boundary conditions, and they offer computational efficiency by forming a linear subspace [FW22]. However, this restricted subspace often leads to significant isometric distortion, especially for highly curved surfaces. Conversely, methods that operate in the full space of piecewise linear maps [SPSH∗17] achieve lower isometric distortion, but at a higher computational cost. Aigerman et al. [AGK∗22] pioneered a parameterization method that uses deep neural networks to predict the Jacobians of the map at mesh triangles and integrates them into an explicit map by solving a Poisson equation. However, this approach often results in significant Poisson reconstruction errors because the integrability of the predicted neural Jacobian field cannot be ensured, leading to unbounded distortion and a lack of local injectivity. We propose a hybrid method that combines the speed and robustness of harmonic maps with the generality of deep neural networks to produce injective maps with low isometric distortion much faster than state-of-the-art methods. The core concept is simple but powerful: instead of learning Jacobian fields, we learn metric tensor fields over the input mesh, resulting in a customized Laplacian matrix that defines a harmonic map in a modified metric [WGS23]. Our approach ensures injectivity, offers great computational efficiency, and produces significantly lower isometric distortion than straightforward harmonic maps.
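As background for the harmonic-map machinery this paper customizes, the following sketch computes a Tutte-style harmonic disk parameterization by pinning the boundary to a circle and solving one Laplacian system. It assumes uniform graph-Laplacian weights in place of the learned metric-dependent weights, and hypothetical inputs n_vertices, faces, and an ordered boundary_loop.

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def harmonic_param(n_vertices, faces, boundary_loop):
        # Uniform (graph) Laplacian built from mesh edges.
        i = np.concatenate([faces[:, 0], faces[:, 1], faces[:, 2]])
        j = np.concatenate([faces[:, 1], faces[:, 2], faces[:, 0]])
        W = sp.coo_matrix((np.ones(len(i)), (i, j)), shape=(n_vertices, n_vertices))
        W = ((W + W.T) > 0).astype(float)
        L = (sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W).tocsr()

        # Pin the ordered boundary loop to the unit circle; with a convex
        # boundary this Tutte-style map is guaranteed injective.
        t = np.linspace(0.0, 2.0 * np.pi, len(boundary_loop), endpoint=False)
        uv = np.zeros((n_vertices, 2))
        uv[boundary_loop] = np.column_stack([np.cos(t), np.sin(t)])

        # Harmonic condition: L @ uv = 0 at all interior vertices.
        interior = np.setdiff1d(np.arange(n_vertices), boundary_loop)
        A = L[interior][:, interior]
        rhs = -L[interior][:, boundary_loop] @ uv[boundary_loop]
        uv[interior, 0] = spla.spsolve(A, rhs[:, 0])
        uv[interior, 1] = spla.spsolve(A, rhs[:, 1])
        return uv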
Item: "Wild West" of Evaluating Speech-Driven 3D Facial Animation Synthesis: A Benchmark Study (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Haque, Kazi Injamamul; Pavlou, Alkiviadis; Yumak, Zerrin; Bousseau, Adrien; Day, Angela
Recent advancements in the field of audio-driven 3D facial animation have accelerated rapidly, with numerous papers being published in a short span of time. This surge in research has garnered significant attention from both academia and industry with its potential applications on digital humans. Various approaches, both deterministic and non-deterministic, have been explored based on foundational advancements in deep learning algorithms. However, there remains no consensus among researchers on standardized methods for evaluating these techniques. Additionally, rather than converging on a common set of datasets and objective metrics suited for specific methods, recent works exhibit considerable variation in experimental setups. This inconsistency complicates the research landscape, making it difficult to establish a streamlined evaluation process and rendering many cross-paper comparisons challenging. Moreover, the common practice of A/B testing in perceptual studies focuses on only two common metrics and is not sufficient for non-deterministic and emotion-enabled approaches. The lack of correlation between subjective and objective metrics points to a need for critical analysis in this space. In this study, we address these issues by benchmarking state-of-the-art deterministic and non-deterministic models, utilizing a consistent experimental setup across a carefully curated set of objective metrics and datasets. We also conduct a perceptual user study to assess whether subjective perceptual metrics align with the objective metrics. Our findings indicate that model rankings do not necessarily generalize across datasets, and subjective metric ratings are not always consistent with their corresponding objective metrics. The supplementary video, edited code scripts for training on different datasets, and documentation related to this benchmark study are made publicly available at https://galib360.github.io/face-benchmark-project/.

Item: A Semi-Implicit SPH Method for Compressible and Incompressible Flows with Improved Convergence (The Eurographics Association and John Wiley & Sons Ltd., 2025)
He, Xiaowei; Liu, Shusen; Guo, Yuzhong; Shi, Jian; Qiao, Ying; Bousseau, Adrien; Day, Angela
In simulating fluids using position-based dynamics, accuracy and robustness depend on numerous numerical parameters, including the time step size, iteration count, and particle size, among others. This complexity can lead to unpredictable control of simulation behaviors. In this paper, we first reformulate the problem of enforcing fluid compressibility/incompressibility as a nonlinear optimization problem, and then introduce a semi-implicit successive substitution method (SISSM) that solves it by adjusting particle positions in parallel. In contrast to calculating an intermediate variable, such as pressure, to enforce fluid incompressibility within the position-based dynamics (PBD) framework, the proposed semi-implicit approach eliminates the necessity of such calculations. Instead, it directly employs successive substitution of particle positions to correct density errors. This method exhibits reduced sensitivity to numerical parameters, such as particle size and time step variations, and improves consistency and stability in simulating fluids ranging from highly compressible to nearly incompressible. We validate the effectiveness of a variety of techniques for accelerating the convergence rate.
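The successive-substitution idea, re-inserting corrected particle positions until density errors vanish, can be illustrated with a short fixed-point loop. This is a schematic sketch built on a standard PBD-style density projection, not the paper's SISSM formulation; the kernel choice, constants, and relaxation factor are illustrative assumptions.

    import numpy as np

    H, MASS, RHO0 = 0.1, 1.0, 1000.0   # smoothing radius, particle mass, rest density

    def kernel(r):                      # poly6-style kernel (2D normalization)
        q = np.clip(1.0 - (r / H) ** 2, 0.0, None)
        return (4.0 / (np.pi * H ** 2)) * q ** 3

    def grad_kernel(dx):                # gradient of kernel(|dx|) w.r.t. dx
        r = np.linalg.norm(dx, axis=-1, keepdims=True)
        q = np.clip(1.0 - (r / H) ** 2, 0.0, None)
        return (-24.0 / (np.pi * H ** 4)) * q ** 2 * dx

    def correct_positions(x, iters=20, tol=0.01):
        for _ in range(iters):
            dx = x[:, None, :] - x[None, :, :]             # all pairs, O(n^2)
            rho = MASS * kernel(np.linalg.norm(dx, axis=-1)).sum(axis=1)
            c = np.maximum(rho / RHO0 - 1.0, 0.0)          # compression error only
            if c.max() < tol:
                break
            # Jacobi-style projection of the density constraint; the corrected
            # positions are substituted back in and the loop repeats.
            grad_c = (MASS / RHO0) * grad_kernel(dx).sum(axis=1)
            denom = (grad_c ** 2).sum(axis=1, keepdims=True) + 1e-12
            x = x - 0.5 * c[:, None] * grad_c / denom      # 0.5: relaxation factor
        return x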
Item: Neural Face Skinning for Mesh-agnostic Facial Expression Cloning (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Cha, Sihun; Yoon, Serin; Seo, Kwanggyoon; Noh, Junyong; Bousseau, Adrien; Day, Angela
Accurately retargeting facial expressions to a face mesh while enabling manipulation is a key challenge in facial animation retargeting. Recent deep-learning methods address this by encoding facial expressions into a global latent code, but they often fail to capture fine-grained details in local regions. While some methods improve local accuracy by transferring deformations locally, this often complicates overall control of the facial expression. To address this, we propose a method that combines the strengths of both global and local deformation models. Our approach enables intuitive control and detailed expression cloning across diverse face meshes, regardless of their underlying structures. The core idea is to localize the influence of the global latent code on the target mesh. Our model learns to predict skinning weights for each vertex of the target face mesh through indirect supervision from predefined segmentation labels. These predicted weights localize the global latent code, enabling precise and region-specific deformations even for meshes with unseen shapes. We supervise the latent code using Facial Action Coding System (FACS)-based blendshapes to ensure interpretability and allow straightforward editing of the generated animation. Through extensive experiments, we demonstrate improved performance over state-of-the-art methods in terms of expression fidelity, deformation transfer accuracy, and adaptability across diverse mesh structures.

Item: Linearly Transformed Spherical Distributions for Interactive Single Scattering with Area Lights (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kt, Aakash; Shah, Ishaan; Narayanan, P. J.; Bousseau, Adrien; Day, Angela
Single scattering in scenes with participating media is challenging, especially in the presence of area lights; considerable variance remains despite good importance sampling strategies. Analytic methods that render unshadowed surface illumination have recently gained interest, since they achieve biased but noise-free plausible renderings while being computationally efficient. In this work, we extend the theory of Linearly Transformed Spherical Distributions (LTSDs), a well-known analytic method for surface illumination, to work with phase functions. We show that this is non-trivial, and arrive at a solution through in-depth analysis. This enables us to analytically compute in-scattered radiance, on which we build to semi-analytically render unshadowed single scattering. We ground our derivations and formulations in the Volume Rendering Equation (VRE), which paves the way for realistic renderings despite the biased nature of our method. We also formulate ratio estimators for the VRE that work in conjunction with our formulation, enabling the rendering of shadows. We extensively validate our method, analyze its characteristics, and demonstrate better performance than Monte Carlo single scattering.
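A ratio estimator in this setting multiplies a noise-free but unshadowed analytic term by a Monte Carlo estimate of the shadowed-to-unshadowed ratio, so that correlated noise in the two estimates largely cancels. Below is a minimal sketch of that general pattern, not the paper's LTSD derivation; sample_f and sample_visibility are hypothetical callables for the unshadowed integrand and a shadow-ray query at the same sample.

    import numpy as np

    def shade_with_shadows(analytic_unshadowed, sample_f, sample_visibility,
                           n=16, rng=None):
        rng = rng or np.random.default_rng()
        us = rng.random(n)                        # shared samples for both estimates
        f = np.array([sample_f(u) for u in us])   # unshadowed integrand values
        v = np.array([sample_visibility(u) for u in us])  # 0/1 shadow-ray results
        denom = f.sum()
        ratio = (f * v).sum() / denom if denom > 0 else 1.0
        return analytic_unshadowed * ratio        # correlated noise largely cancels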
Item: Synchronized Multi-Frame Diffusion for Temporally Consistent Video Stylization (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Xie, Minshan; Liu, Hanyuan; Li, Chengze; Wong, Tien-Tsin; Bousseau, Adrien; Day, Angela
Text-guided video-to-video stylization transforms the visual appearance of a source video into a different appearance guided by textual prompts. Existing text-guided image diffusion models can be extended for stylized video synthesis; however, they struggle to generate videos with both highly detailed appearance and temporal consistency. In this paper, we propose a synchronized multi-frame diffusion framework that maintains both visual details and temporal consistency. Frames are denoised in a synchronous fashion and, more importantly, information is shared among frames from the beginning of the denoising process. Such information sharing ensures that frames reach a consensus on overall structure and color distribution early in the denoising process, before it is too late. The optical flow from the original video serves as the connection, and hence the venue for information sharing, among frames. We demonstrate the effectiveness of our method in generating high-quality and diverse results in extensive experiments, showing superior qualitative and quantitative results compared to state-of-the-art video editing methods.

Item: Differential Diffusion: Giving Each Pixel Its Strength (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Levin, Eran; Fried, Ohad; Bousseau, Adrien; Day, Angela
Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit, the controllability is limited to global changes over an entire edited region. This paper introduces a novel framework that enables customization of the amount of change per pixel or per image region. Our framework can be integrated into any existing diffusion model, enhancing it with this capability. Such granular control opens up a diverse array of new editing capabilities, such as control over the extent to which individual objects are modified, or the ability to introduce gradual spatial changes. Furthermore, we showcase the framework's effectiveness in soft-inpainting: the completion of portions of an image while subtly adjusting the surrounding areas to ensure seamless integration. Additionally, we introduce a new tool for exploring the effects of different change quantities. Our framework operates solely during inference, requiring no model training or fine-tuning. We demonstrate our method with the current open state-of-the-art models, validating it via quantitative and qualitative comparisons and a user study. Our code is published and integrated into several platforms.
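The per-pixel strength idea can be sketched as a time-thresholded mask inside the denoising loop: at each step, pixels whose requested change is below the fraction of noise still remaining are reset to a re-noised copy of the original, so low-strength pixels rejoin the input early and change little. The sketch below assumes hypothetical denoise_step and add_noise functions from some diffusion model; it illustrates the mechanism, not the authors' released code.

    import numpy as np

    def differential_edit(x0, change_map, denoise_step, add_noise,
                          n_steps=50, seed=0):
        """x0: original image/latent; change_map in [0, 1], same shape as x0,
        gives the per-pixel amount of allowed change (1 = fully regenerated)."""
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(x0.shape)            # start from pure noise
        for t in range(n_steps, 0, -1):
            x = denoise_step(x, t)                   # one reverse-diffusion step
            frac = (t - 1) / n_steps                 # fraction of noise remaining
            keep = change_map > frac                 # pixels still free to change
            x = np.where(keep, x, add_noise(x0, t - 1))  # others track the original
        return x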
Item: Screentone-Preserved Manga Retargeting (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Xie, Minshan; Xia, Menghan; Li, Chengze; Liu, Xueting; Wong, Tien-Tsin; Bousseau, Adrien; Day, Angela
As a popular comic style, manga offers a unique impression by utilizing a rich set of bitonal patterns, or screentones, for illustration. However, screentones can easily be degraded when manga is resized in terms of aspect ratio and resolution for manga re-layout and e-manga migration applications. To tackle this problem, we propose the first automatic manga retargeting method that synthesizes a retargeted manga image while preserving the prominent structure and fine screentones intended by the manga artist. While modern natural-photo retargeting methods can achieve prominent structure preservation, preserving screentones within arbitrarily shaped regions is very challenging due to two properties of manga: (i) pattern constancy under translation, and (ii) non-compatibility with interpolation. To circumvent this barrier, we propose learning a quantized representation of screentones that is translation-invariant and pointwise representable, through a tailored manga reconstruction network with a screentone-anchored codebook. Thanks to these merits, we can perform the re-synthesis operation using existing photo retargeting methods and achieve the desired manga retargeting results. We conducted extensive qualitative and quantitative experiments to validate the effectiveness of our method, achieving notably compelling results compared to alternative methods.

Item: Mesh Compression with Quantized Neural Displacement Fields (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Pentapati, Sai Karthikey; Phillips, Gregoire; Bovik, Alan C.; Bousseau, Adrien; Day, Angela
Implicit neural representations (INRs) have been successfully used to compress a variety of 3D surface representations such as Signed Distance Functions (SDFs) and voxel grids, as well as other forms of structured data such as images, videos, and audio. However, these methods have been limited in their application to unstructured data such as 3D meshes and point clouds. This work presents a simple yet effective method that extends the use of INRs to compress 3D triangle meshes. Our method encodes a displacement field that refines a coarse version of the 3D mesh surface to be compressed using a small neural network. Once trained, the neural network weights occupy much less memory than the displacement field or the original surface. We show that our method is capable of preserving intricate geometric textures and demonstrates state-of-the-art performance for compression ratios ranging from 4x to 380x (see Figure 1 for an example).

Item: BlendSim: Simulation on Parametric Blendshapes using Spacetime Projective Dynamics (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wu, Yuhan; Umetani, Nobuyuki; Bousseau, Adrien; Day, Angela
We propose BlendSim, a novel framework for editable simulation using spacetime optimization on a lightweight animation representation. Traditional spacetime control methods suffer from high computational complexity, which limits their use in interactive animation. The proposed approach effectively reduces the dimensionality of the problem by representing the motion trajectory of each vertex as a continuous parametric Bézier spline with variable keyframe times. Because this mesh animation representation is continuous and fully differentiable, it can be optimized to follow the laws of physics under various constraints. The proposed method also integrates constraints such as collisions and cyclic motion, making it suitable for real-world applications where seamless looping and physical interactions are required. Leveraging projective dynamics, we further enhance computational efficiency by decoupling the optimization into local parallelizable steps and a global quadratic step, enabling fast and stable simulation. In addition, BlendSim is compatible with modern animation workflows and file formats, such as glTF, making it a practical way to author and transfer mesh animations.

Item: Neural Two-Level Monte Carlo Real-Time Rendering (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Dereviannykh, Mikhail; Klepikov, Dmitrii; Hanika, Johannes; Dachsbacher, Carsten; Bousseau, Adrien; Day, Angela
We introduce an efficient Two-Level Monte Carlo (a subset of Multi-Level Monte Carlo, MLMC) estimator for real-time rendering of scenes with global illumination. Using MLMC, we split the shading integral into two parts: the radiance cache integral and a residual error integral that compensates for the bias of the first. For the first part, we developed the Neural Incident Radiance Cache (NIRC), leveraging tiny neural networks [MRNK21] as a building block, which is trained on the fly. The cache is designed to provide a fast and reasonable approximation of the incident radiance: an evaluation takes 2-25x less compute time than a path-tracing sample. This enables us to estimate the radiance cache integral with a high number of samples and thereby achieve faster convergence. For the residual error integral, we compute the difference between the NIRC predictions and the unbiased path-tracing simulation. Our method makes no assumptions about the geometry, materials, or lighting of a scene and has only a few intuitive hyper-parameters. We provide a comprehensive comparative analysis in different experimental scenarios. Since the algorithm is trained in an online fashion, it demonstrates significant noise reduction even for dynamic scenes and can easily be combined with other noise reduction techniques.
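The two-level split is easy to state in code: many cheap samples estimate the cache integral, and a few expensive samples estimate the residual that removes the cache's bias. A minimal sketch, with sample_cache and sample_path as hypothetical stand-ins for a neural-cache evaluation and a one-sample path-tracing estimate over the same sample domain:

    import numpy as np

    def two_level_estimate(sample_cache, sample_path, n_cache=64, n_res=2,
                           rng=None):
        rng = rng or np.random.default_rng()
        # Level 1: many cheap, low-variance samples of the radiance cache.
        cache = np.mean([sample_cache(rng.random(2)) for _ in range(n_cache)])
        # Level 2: a few expensive residual samples compensating the cache bias.
        res = 0.0
        for _ in range(n_res):
            u = rng.random(2)                 # same sample drives both terms
            res += sample_path(u) - sample_cache(u)
        return cache + res / n_res            # unbiased in expectation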
Item: Physically Based Real-Time Rendering of Eclipses (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Schneegans, Simon; Gilg, Jonas; Ahlers, Volker; Zachmann, Gabriel; Gerndt, Andreas; Bousseau, Adrien; Day, Angela
We present a novel approach for simulating eclipses, incorporating the effects of light scattering and refraction in the occluder's atmosphere. Our approach not only simulates the eclipse shadow, but also allows for watching the Sun being eclipsed by the occluder. The latter is a spectacular sight that has never been seen by human eyes: for an observer on the lunar surface, the atmosphere around Earth turns into a glowing red ring as sunlight is refracted around the planet. To simulate this, we add three key contributions. First, we extend the Bruneton atmosphere model to simulate refraction; this allows light rays to be bent into the shadow cone, and refraction also adds realism to the atmosphere as it deforms and displaces the Sun during sunrise and sunset. Second, we show how to precompute the eclipse shadow using this extended atmosphere model. Third, we show how to efficiently visualize the glowing atmosphere ring around the occluder. Our approach produces visually accurate results suited for scientific visualization, science communication, and video games. It is not limited to the Earth-Moon system, but can also be used to simulate the shadow of Mars and potentially other bodies. We demonstrate the physical soundness of our approach by comparing the results to reference data. Because no data is available for eclipses beyond the Earth-Moon system, we predict what an eclipse on a Martian moon will look like. Our implementation is available under the terms of the MIT license.

Item: Many-Light Rendering Using ReSTIR-Sampled Shadow Maps (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Zhang, Song; Lin, Daqi; Wyman, Chris; Yuksel, Cem; Bousseau, Adrien; Day, Angela
We present a practical method targeting dynamic shadow maps for many light sources in real-time rendering. We compute full-resolution shadow maps for a subset of lights, which we select with spatiotemporal reservoir resampling (ReSTIR). Our selection strategy automatically regenerates shadow maps for the lights with the strongest contributions to pixels in the current camera view. The remaining lights are handled using imperfect shadow maps, which provide a low-resolution shadow approximation. We significantly reduce computation and storage compared to using full-resolution shadow maps for all lights, and substantially improve shadow quality compared to handling all lights with imperfect shadow maps.
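The selection step rests on weighted reservoir sampling, the streaming primitive underlying ReSTIR: each candidate replaces the kept one with probability proportional to its weight, so one pass over the lights yields a sample distributed according to the (unnormalized) weights. A single-reservoir sketch; the paper's spatiotemporal reuse and shadow-map management are omitted.

    import numpy as np

    class Reservoir:
        def __init__(self, rng):
            self.rng, self.sample, self.w_sum = rng, None, 0.0

        def update(self, candidate, weight):
            self.w_sum += weight
            if weight > 0.0 and self.rng.random() < weight / self.w_sum:
                self.sample = candidate   # kept with probability weight / w_sum

    def pick_light(light_weights, rng=None):
        """Return one light index, chosen with probability proportional to its
        estimated contribution weight, in a single streaming pass."""
        rng = rng or np.random.default_rng()
        r = Reservoir(rng)
        for idx, w in enumerate(light_weights):
            r.update(idx, w)
        return r.sample, r.w_sum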
Item: Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Bemana, Mojtaba; Leimkühler, Thomas; Myszkowski, Karol; Seidel, Hans-Peter; Ritschel, Tobias; Bousseau, Adrien; Day, Angela
We demonstrate generating HDR images using the concerted action of multiple black-box, pre-trained LDR image diffusion models. Common diffusion models are not HDR because, first, no sufficiently large HDR image dataset is available to re-train them, and, second, even if one were, re-training such models is impossible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature, which traditionally fuses sets of LDR images, called "exposure brackets", to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce a bracket consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional, conditional, and restoration-type (LDR2HDR) generative modeling.

Item: Fast Sphere Tracing of Procedural Volumetric Noise for very Large and Detailed Scenes (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Moinet, Mathéo; Neyret, Fabrice; Bousseau, Adrien; Day, Angela
Real-time walkthrough of very large and detailed scenes is a challenge for content design, data management, and rendering, and requires LOD to handle the scale range. In the case of partly stochastic content (clouds, cosmic dust, fire, terrains, etc.), proceduralism allows arbitrarily large and detailed scenes with little or no storage and offers embedded LOD, but rendering gets even costlier. In this paper, we propose to boost the performance of Fractional Brownian Motion (FBM)-based noise rendering (e.g., 3D Perlin noise, hypertextures) in two ways: improving the stepping efficiency of sphere tracing of general Signed Distance Functions (SDFs) by considering the first and second derivatives, and treating cascaded sums such as FBM as nested bounding volumes. We illustrate this on various scenes made of either opaque material, constant semi-transparent material, or non-constant (i.e., fully volumetric inside) material, including animated content, thanks to on-the-fly proceduralism. We obtain real-time performance with speedups of up to 12x on opaque or constant semi-transparent scenes compared to classical sphere tracing, and up to 2x (through empty-space skipping optimization) on non-constant-density volumetric scenes.
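For context, the classical sphere-tracing baseline the paper accelerates steps along the ray by the SDF value until the surface is reached. In the sketch below, a crude piecewise-constant hash stands in for Perlin noise, and the under-relaxed step is an illustrative safeguard, since FBM displacement breaks the unit-Lipschitz bound; none of the paper's derivative-aware stepping or nested FBM bounding is reproduced.

    import numpy as np

    def hash_noise(p):                     # crude deterministic value "noise"
        return np.modf(np.sin(p @ np.array([12.9898, 78.233, 37.719]))
                       * 43758.5453)[0] - 0.5

    def fbm(p, octaves=5, gain=0.5, lacunarity=2.0):
        total, amp = 0.0, 1.0
        for _ in range(octaves):           # cascaded sum of noise octaves
            total += amp * hash_noise(np.floor(p))
            p, amp = p * lacunarity, amp * gain
        return total

    def sdf(p):                            # unit sphere displaced by FBM
        return np.linalg.norm(p) - 1.0 + 0.2 * fbm(p * 4.0)

    def sphere_trace(origin, direction, t_max=10.0, eps=1e-3, max_steps=256):
        t = 0.0
        for _ in range(max_steps):
            d = sdf(origin + t * direction)
            if d < eps:
                return t                   # hit
            t += 0.8 * d                   # under-relaxed Lipschitz step
            if t > t_max:
                break
        return None                        # miss

    hit_t = sphere_trace(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))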
Item: FastAtlas: Real-Time Compact Atlases for Texture Space Shading (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Vining, Nicholas; Majercik, Zander; Gu, Floria; Takikawa, Towaki; Trusty, Ty; Lalonde, Paul; McGuire, Morgan; Sheffer, Alla; Bousseau, Adrien; Day, Angela
Texture-space shading (TSS) methods decouple shading and rasterization, allowing shading to be performed at a different framerate and spatial resolution than rasterization. TSS has many potential applications, including streaming shading across networks, and reducing rendering cost via shading reuse across consecutive frames and/or shading at reduced resolutions relative to display resolution. Real-time TSS requires texture atlases small enough to be easily stored in GPU memory. Using static atlases leads to significant space wastage, motivating real-time per-frame atlasing strategies that pack only the content visible in each frame. We propose FastAtlas, a novel atlasing method that runs entirely on the GPU and is fast enough to be performed at interactive rates per frame. Our method combines new per-frame chart computation and parametrization strategies with an efficient general chart-packing algorithm. Our chartification strategy removes visible seams in output renders, and our parameterization ensures a constant texel-to-pixel ratio, avoiding undesirable undersampling artifacts. Our packing method is more general and produces more tightly packed atlases than previous work. Jointly, these innovations enable us to produce shading outputs of significantly higher visual quality than those produced using alternative atlasing strategies. We validate FastAtlas by shading and rendering challenging scenes using different atlasing settings, reflecting the needs of different TSS applications (temporal reuse, streaming, reduced or elevated shading rates). We extensively compare FastAtlas to prior alternatives and demonstrate that it achieves better shading quality and reduces texture stretch compared to prior approaches using the same settings.

Item: REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Almog, Gal; Shamir, Ariel; Fried, Ohad; Bousseau, Adrien; Day, Angela
While latent diffusion models achieve impressive image editing results, their application to iterative editing of the same image is severely restricted. When trying to apply consecutive edit operations using current models, they accumulate artifacts and noise due to repeated transitions between pixel and latent spaces. Some methods have attempted to address this limitation by performing the entire edit chain within the latent space, sacrificing flexibility by supporting only a limited, predetermined set of diffusion editing operations. We present a re-encode decode (REED) training scheme for variational autoencoders (VAEs), which promotes image quality preservation even after many iterations. Our work enables multi-method iterative image editing: users can perform a variety of iterative edit operations, with each operation building on the output of the previous one, using both diffusion-based operations and conventional editing techniques. We demonstrate the advantage of REED-VAE across a range of image editing scenarios, including text-based and mask-based editing frameworks. In addition, we show how REED-VAE enhances the overall editability of images, increasing the likelihood of successful and precise edit operations. We hope that this work will serve as a benchmark for the newly introduced task of multi-method image editing.

Item: A Unified Discrete Collision Framework for Triangle Primitives (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kikuchi, Tomoyo; Kanai, Takashi; Bousseau, Adrien; Day, Angela
We present a unified, primitive-first framework with discrete collision detection (DCD) for collision response in physics-based simulations. Previous methods do not provide a framework that resolves edge-triangle and edge-edge collisions while handling self-collisions and inter-object collisions in a unified manner. We define a scalar function and its gradient, representing the distance between two triangles and the movement direction for collision response, respectively. The resulting method offers an effective solution for collisions with minor computational overhead and robustness for any type of deformable object, such as solids or cloth. The algorithm is conceptually simple and easy to implement. When using PBD/XPBD, it is straightforward to incorporate our method as a collision constraint.
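The constraint-projection pattern the last abstract refers to can be illustrated with its simplest instance: a point-triangle distance constraint whose violation is corrected by moving the point along the distance gradient (the paper's unified triangle-triangle scalar function is more general). The closest-point routine follows Ericson's standard algorithm; the rest thickness is an illustrative parameter.

    import numpy as np

    def closest_point_on_triangle(p, a, b, c):
        # Ericson, "Real-Time Collision Detection", Sec. 5.1.5.
        ab, ac, ap = b - a, c - a, p - a
        d1, d2 = ab @ ap, ac @ ap
        if d1 <= 0.0 and d2 <= 0.0:
            return a                                  # vertex region A
        bp = p - b
        d3, d4 = ab @ bp, ac @ bp
        if d3 >= 0.0 and d4 <= d3:
            return b                                  # vertex region B
        vc = d1 * d4 - d3 * d2
        if vc <= 0.0 and d1 >= 0.0 and d3 <= 0.0:
            return a + (d1 / (d1 - d3)) * ab          # edge region AB
        cp = p - c
        d5, d6 = ab @ cp, ac @ cp
        if d6 >= 0.0 and d5 <= d6:
            return c                                  # vertex region C
        vb = d5 * d2 - d1 * d6
        if vb <= 0.0 and d2 >= 0.0 and d6 <= 0.0:
            return a + (d2 / (d2 - d6)) * ac          # edge region AC
        va = d3 * d6 - d5 * d4
        if va <= 0.0 and d4 - d3 >= 0.0 and d5 - d6 >= 0.0:
            return b + ((d4 - d3) / ((d4 - d3) + (d5 - d6))) * (c - b)  # edge BC
        denom = 1.0 / (va + vb + vc)                  # interior: barycentrics
        return a + (vb * denom) * ab + (vc * denom) * ac

    def project_collision(p, tri, rest=0.01):
        # Constraint C(p) = dist(p, triangle) - rest >= 0; when violated, move
        # p along the distance gradient until the rest distance is restored.
        q = closest_point_on_triangle(p, *tri)
        d = float(np.linalg.norm(p - q))
        if d >= rest or d == 0.0:
            return p
        return p + ((rest - d) / d) * (p - q)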