44-Issue 2
Item Learning Image Fractals Using Chaotic Differentiable Point Splatting (The Eurographics Association and John Wiley & Sons Ltd., 2025) Djeacoumar, Adarsh; Mujkanovic, Felix; Seidel, Hans-Peter; Leimkühler, Thomas; Bousseau, Adrien; Day, Angela
Fractal geometry, defined by self-similar patterns across scales, is crucial for understanding natural structures. This work addresses the fractal inverse problem, which involves extracting fractal codes from images to explain these patterns and synthesize them at arbitrary finer scales. We introduce a novel algorithm that optimizes Iterated Function System parameters using a custom fractal generator combined with differentiable point splatting. By integrating both stochastic and gradient-based optimization techniques, our approach effectively navigates the complex energy landscapes typical of fractal inversion, ensuring robust performance and the ability to escape local minima. We demonstrate the method's effectiveness through comparisons with various fractal inversion techniques, highlighting its ability to recover high-quality fractal codes and perform extensive zoom-ins to reveal intricate patterns from just a single image.

Item Multi-Modal Instrument Performances (MMIP): A Musical Database (The Eurographics Association and John Wiley & Sons Ltd., 2025) Kyriakou, Theodoros; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
Musical instrument performances are multimodal creative art forms that integrate audiovisual elements, resulting from musicians' interactions with instruments through body movements, finger actions, and facial expressions. Digitizing such performances for archiving, streaming, analysis, or synthesis requires capturing every element that shapes the overall experience, which is crucial for preserving the performance's essence.
In this work, following current trends in large-scale dataset development for deep learning analysis and generative models, we introduce the Multi-Modal Instrument Performances (MMIP) database (https://mmip.cs.ucy.ac.cy). This is the first dataset to incorporate synchronized high-quality 3D motion capture data for the body, fingers, facial expressions, and instruments, along with audio, multi-angle videos, and MIDI data. The database currently includes 3.5 hours of performances featuring three instruments: guitar, piano, and drums. Additionally, we discuss the challenges of acquiring these multi-modal data, detailing our approach to data collection, signal synchronization, annotation, and metadata management. Our data formats align with industry standards for ease of use, and we have developed an open-access online repository that offers a user-friendly environment for data exploration, supporting data organization, search capabilities, and custom visualization tools. Notable features include a MIDI-to-instrument animation project for visualizing the instruments and a script for playing back FBX files with synchronized audio in a web environment.

Item Learning Metric Fields for Fast Low-Distortion Mesh Parameterizations (The Eurographics Association and John Wiley & Sons Ltd., 2025) Fargion, Guy; Weber, Ofir; Bousseau, Adrien; Day, Angela
We present a fast and robust method for computing an injective parameterization with low isometric distortion for disk-like triangular meshes. Harmonic function-based methods, with their rich mathematical foundation, are widely used. Harmonic maps are particularly valuable for ensuring injectivity under certain boundary conditions. In addition, they offer computational efficiency by forming a linear subspace [FW22]. However, this restricted subspace often leads to significant isometric distortion, especially for highly curved surfaces.
Conversely, methods that operate in the full space of piecewise linear maps [SPSH∗17] achieve lower isometric distortion, but at a higher computational cost. Aigerman et al. [AGK∗22] pioneered a parameterization method that uses deep neural networks to predict the Jacobians of the map at mesh triangles, and integrates them into an explicit map by solving a Poisson equation. However, this approach often results in significant Poisson reconstruction errors due to the inability to ensure the integrability of the predicted neural Jacobian field, leading to unbounded distortion and lack of local injectivity. We propose a hybrid method that combines the speed and robustness of harmonic maps with the generality of deep neural networks to produce injective maps with low isometric distortion much faster than state-of-the-art methods. The core concept is simple but powerful. Instead of learning Jacobian fields, we learn metric tensor fields over the input mesh, resulting in a customized Laplacian matrix that defines a harmonic map in a modified metric [WGS23]. Our approach ensures injectivity, offers great computational efficiency, and produces significantly lower isometric distortion compared to straightforward harmonic maps.

Item "Wild West" of Evaluating Speech-Driven 3D Facial Animation Synthesis: A Benchmark Study (The Eurographics Association and John Wiley & Sons Ltd., 2025) Haque, Kazi Injamamul; Pavlou, Alkiviadis; Yumak, Zerrin; Bousseau, Adrien; Day, Angela
Recent advancements in the field of audio-driven 3D facial animation have accelerated rapidly, with numerous papers being published in a short span of time. This surge in research has garnered significant attention from both academia and industry for its potential applications to digital humans. Various approaches, both deterministic and non-deterministic, have been explored based on foundational advancements in deep learning algorithms.
However, there remains no consensus among researchers on standardized methods for evaluating these techniques. Additionally, rather than converging on a common set of datasets and objective metrics suited for specific methods, recent works exhibit considerable variation in experimental setups. This inconsistency complicates the research landscape, making it difficult to establish a streamlined evaluation process and rendering many cross-paper comparisons challenging. Moreover, the common practice of A/B testing in perceptual studies focuses only on two common metrics and is not sufficient for non-deterministic and emotion-enabled approaches. The lack of correlation between subjective and objective metrics indicates a need for critical analysis in this space. In this study, we address these issues by benchmarking state-of-the-art deterministic and non-deterministic models, utilizing a consistent experimental setup across a carefully curated set of objective metrics and datasets. We also conduct a perceptual user study to assess whether subjective perceptual metrics align with the objective metrics. Our findings indicate that model rankings do not necessarily generalize across datasets, and subjective metric ratings are not always consistent with their corresponding objective metrics. The supplementary video, edited code scripts for training on different datasets, and documentation related to this benchmark study are made publicly available: https://galib360.github.io/face-benchmark-project/.

Item Many-Light Rendering Using ReSTIR-Sampled Shadow Maps (The Eurographics Association and John Wiley & Sons Ltd., 2025) Zhang, Song; Lin, Daqi; Wyman, Chris; Yuksel, Cem; Bousseau, Adrien; Day, Angela
We present a practical method targeting dynamic shadow maps for many light sources in real-time rendering. We compute full-resolution shadow maps for a subset of lights, which we select with spatiotemporal reservoir resampling (ReSTIR).
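As a rough illustration of the reservoir idea that ReSTIR builds on, the sketch below implements plain single-sample weighted reservoir sampling over a stream of candidate lights. It is not the authors' method: the spatial and temporal reuse steps are omitted, and the names `Reservoir`, `pick_light`, and the example weights are hypothetical.

```python
import random


class Reservoir:
    """Single-sample weighted reservoir fed by a stream of candidates."""

    def __init__(self):
        self.sample = None   # currently selected candidate
        self.w_sum = 0.0     # running sum of the weights seen so far
        self.count = 0       # number of candidates streamed in

    def update(self, candidate, weight, rng):
        self.count += 1
        if weight <= 0.0:
            return  # candidates with zero contribution are never selected
        self.w_sum += weight
        # Replace the kept sample with probability weight / w_sum, so each
        # candidate ends up selected with probability proportional to its weight.
        if rng.random() < weight / self.w_sum:
            self.sample = candidate


def pick_light(weights, rng):
    """Return the index of one light, chosen proportionally to its weight."""
    reservoir = Reservoir()
    for index, weight in enumerate(weights):
        reservoir.update(index, weight, rng)
    return reservoir.sample


# Hypothetical per-light contribution estimates for a single pixel.
rng = random.Random(0)
weights = [0.0, 1.0, 3.0]
counts = [0, 0, 0]
for _ in range(20000):
    counts[pick_light(weights, rng)] += 1
```

In this toy setup the third light should be picked roughly three times as often as the second, and the first (zero weight) never.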
Our selection strategy automatically regenerates shadow maps for lights with the strongest contributions to pixels in the current camera view. The remaining lights are handled using imperfect shadow maps, which provide low-resolution shadow approximation. We significantly reduce the computation and storage compared to using all full-resolution shadow maps and substantially improve shadow quality compared to handling all lights with imperfect shadow maps.

Item Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising (The Eurographics Association and John Wiley & Sons Ltd., 2025) Bemana, Mojtaba; Leimkühler, Thomas; Myszkowski, Karol; Seidel, Hans-Peter; Ritschel, Tobias; Bousseau, Adrien; Day, Angela
We demonstrate generating HDR images using the concerted action of multiple black-box, pre-trained LDR image diffusion models. Common diffusion models are not HDR as, first, there is no sufficiently large HDR image dataset available to re-train them, and, second, even if there were, re-training such models is impossible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature that traditionally fuses sets of LDR images, called "exposure brackets", to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce a bracket consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share.
We demonstrate HDR versions of state-of-the-art unconditional and conditional as well as restoration-type (LDR2HDR) generative modeling.

Item Fast Sphere Tracing of Procedural Volumetric Noise for very Large and Detailed Scenes (The Eurographics Association and John Wiley & Sons Ltd., 2025) Moinet, Mathéo; Neyret, Fabrice; Bousseau, Adrien; Day, Angela
Real-time walk-through of very large and detailed scenes is a challenge for content design, data management, and rendering, and requires LOD to handle the scale range. In the case of partly stochastic content (clouds, cosmic dust, fire, terrains, etc.), proceduralism allows arbitrarily large and detailed scenes with little or no storage and offers embedded LOD, but the rendering gets even costlier. In this paper, we propose to boost the performance of Fractional Brownian Motion (FBM)-based noise rendering (e.g., 3D Perlin noise, hypertextures) in two ways: improving the stepping efficiency of Sphere Tracing of general Signed Distance Functions (SDF) considering the first and second derivatives, and treating cascaded sums such as FBM as nested bounding volumes. We illustrate this on various scenes made of either opaque material, constant semi-transparent material, or non-constant (i.e., full volumetric inside) material, including animated content, thanks to on-the-fly proceduralism.
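For orientation, the classical Sphere Tracing baseline that this abstract compares against can be sketched as follows. This is only the textbook stepping rule (advance by the SDF value at the current point); it does not include the derivative-based stepping or nested FBM bounds proposed in the paper, and `sphere_sdf`, the scene, and all parameter defaults are assumptions made for illustration.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere (hypothetical test scene)."""
    return math.dist(p, center) - radius


def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-4, t_max=100.0):
    """Classical sphere tracing along a ray with a unit-length direction.

    Returns the ray parameter t of the first hit, or None if no surface
    is found within max_steps or before t_max.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t      # close enough to the surface: report a hit
        t += dist         # the SDF value is a safe step size
        if t > t_max:
            break         # the ray left the region of interest
    return None


hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

The step is safe only when the function never overestimates the true distance, which is why noise-based fields need conservative bounds before they can be traced this way.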
We obtain real-time performance with speedups of up to 12-fold on opaque or constant semi-transparent scenes compared to classical Sphere Tracing, and up to 2-fold (through empty space skipping optimization) on non-constant density volumetric scenes.

Item FastAtlas: Real-Time Compact Atlases for Texture Space Shading (The Eurographics Association and John Wiley & Sons Ltd., 2025) Vining, Nicholas; Majercik, Zander; Gu, Floria; Takikawa, Towaki; Trusty, Ty; Lalonde, Paul; McGuire, Morgan; Sheffer, Alla; Bousseau, Adrien; Day, Angela
Texture-space shading (TSS) methods decouple shading and rasterization, allowing shading to be performed at a different framerate and spatial resolution than rasterization. TSS has many potential applications, including streaming shading across networks, and reducing rendering cost via shading reuse across consecutive frames and/or shading at reduced resolutions relative to display resolution. Real-time TSS requires texture atlases small enough to be easily stored in GPU memory. Using static atlases leads to significant space wastage, motivating real-time per-frame atlasing strategies that pack only the content visible in each frame. We propose FastAtlas, a novel atlasing method that runs entirely on the GPU and is fast enough to be performed at interactive rates per frame. Our method combines new per-frame chart computation and parameterization strategies and an efficient general chart packing algorithm. Our chartification strategy removes visible seams in output renders, and our parameterization ensures a constant texel-to-pixel ratio, avoiding undesirable undersampling artifacts. Our packing method is more general, and produces more tightly packed atlases, than previous work. Jointly, these innovations enable us to produce shading outputs of significantly higher visual quality than those produced using alternative atlasing strategies.
We validate FastAtlas by shading and rendering challenging scenes using different atlasing settings, reflecting the needs of different TSS applications (temporal reuse, streaming, reduced or elevated shading rates). We extensively compare FastAtlas to prior alternatives and demonstrate that it achieves better shading quality and reduces texture stretch compared to prior approaches using the same settings.

Item REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2025) Almog, Gal; Shamir, Ariel; Fried, Ohad; Bousseau, Adrien; Day, Angela
While latent diffusion models achieve impressive image editing results, their application to iterative editing of the same image is severely restricted. When trying to apply consecutive edit operations using current models, they accumulate artifacts and noise due to repeated transitions between pixel and latent spaces. Some methods have attempted to address this limitation by performing the entire edit chain within the latent space, sacrificing flexibility by supporting only a limited, predetermined set of diffusion editing operations. We present a re-encode decode (REED) training scheme for variational autoencoders (VAEs), which promotes image quality preservation even after many iterations. Our work enables multi-method iterative image editing: users can perform a variety of iterative edit operations, with each operation building on the output of the previous one using both diffusion-based operations and conventional editing techniques. We demonstrate the advantage of REED-VAE across a range of image editing scenarios, including text-based and mask-based editing frameworks. In addition, we show how REED-VAE enhances the overall editability of images, increasing the likelihood of successful and precise edit operations.
We hope that this work will serve as a benchmark for the newly introduced task of multi-method image editing.

Item A Unified Discrete Collision Framework for Triangle Primitives (The Eurographics Association and John Wiley & Sons Ltd., 2025) Kikuchi, Tomoyo; Kanai, Takashi; Bousseau, Adrien; Day, Angela
We present a unified, primitive-first framework with DCD for collision response in physics-based simulations. Previous methods do not provide a satisfactory framework that resolves edge-triangle and edge-edge collisions while handling self-collisions and inter-object collisions in a unified manner. We define a scalar function and its gradient, representing the distance between two triangles and the movement direction for collision response, respectively. The resulting method offers an effective solution for collisions with minor computational overhead and robustness for any type of deformable object, such as solids or cloth. The algorithm is conceptually simple and easy to implement. When using PBD/XPBD, it is straightforward to incorporate our method into a collision constraint.

Item Infusion: Internal Diffusion for Inpainting of Dynamic Textures and Complex Motion (The Eurographics Association and John Wiley & Sons Ltd., 2025) Cherel, Nicolas; Almansa, Andrés; Gousseau, Yann; Newson, Alasdair; Bousseau, Adrien; Day, Angela
Video inpainting is the task of filling a region in a video in a visually convincing manner. It is very challenging due to the high dimensionality of the data and the temporal consistency required for obtaining convincing results. Recently, diffusion models have shown impressive results in modeling complex data distributions, including images and videos. Such models nonetheless remain very expensive to train and to perform inference with, which strongly reduces their applicability to videos and yields unreasonable computational loads.
We show that in the case of video inpainting, thanks to the highly auto-similar nature of videos, the training data of a diffusion model can be restricted to the input video and still produce very satisfying results. With this internal learning approach, where the training data is limited to a single video, our lightweight models perform very well with only half a million parameters, in contrast to the very large networks with billions of parameters typically found in the literature. We also introduce a new method for efficient training and inference of diffusion models in the context of internal learning, by splitting the diffusion process into different learning intervals corresponding to different noise levels of the diffusion process. We show qualitative and quantitative results, demonstrating that our method reaches or exceeds state-of-the-art performance in the case of dynamic textures and complex dynamic backgrounds.

Item 2D Neural Fields with Learned Discontinuities (The Eurographics Association and John Wiley & Sons Ltd., 2025) Liu, Chenxi; Wang, Siqi; Fisher, Matthew; Aneja, Deepali; Jacobson, Alec; Bousseau, Adrien; Day, Angela
Effective representation of 2D images is fundamental in digital image processing, where traditional methods like raster and vector graphics struggle with sharpness and textural complexity, respectively. Current neural fields offer high fidelity and resolution independence but require predefined meshes with known discontinuities, restricting their utility. We observe that by treating all mesh edges as potential discontinuities, we can represent the discontinuity magnitudes as continuous variables and optimize them. We further introduce a novel discontinuous neural field model that jointly approximates the target image and recovers discontinuities.
Through systematic evaluations, our neural field outperforms other methods that fit unknown discontinuities with discontinuous representations, exceeding Field of Junction and Boundary Attention by over 11dB in both denoising and super-resolution tasks and achieving 3.5× smaller Chamfer distances than Mumford-Shah-based methods. It also surpasses InstantNGP with improvements of more than 5dB (denoising) and 10dB (super-resolution). Additionally, our approach shows remarkable capability in approximating complex artistic and natural images and cleaning up diffusion-generated depth maps.

Item Optimizing Free-Form Grid Shells with Reclaimed Elements under Inventory Constraints (The Eurographics Association and John Wiley & Sons Ltd., 2025) Favilli, Andrea; Laccone, Francesco; Cignoni, Paolo; Malomo, Luigi; Giorgi, Daniela; Bousseau, Adrien; Day, Angela
We propose a method for designing 3D architectural free-form surfaces, represented as grid shells with beams sourced from inventories of reclaimed elements from dismantled buildings. In inventory-constrained design, the reused elements must be paired with elements in the target design. Traditional solutions to this assignment problem often result in cuts and material waste or geometric distortions that affect the surface aesthetics and buildability. Our method for inventory-constrained assisted design blends the traditional assignment problem with differentiable geometry optimization to reduce cut-off waste while preserving the design intent. Additionally, we extend our approach to incorporate strain energy minimization for structural efficiency. We design differentiable losses that account for inventory, geometry, and structural constraints, and streamline them into a complete pipeline, demonstrated through several case studies. Our approach enables the reuse of existing elements for new designs, reducing the need for sourcing new materials and disposing of waste.
Consequently, it can serve as an initial step towards mitigating the significant environmental impact of the construction sector.

Item Preconditioned Single-step Transforms for Non-rigid ICP (The Eurographics Association and John Wiley & Sons Ltd., 2025) Jung, Yucheol; Kim, Hyomin; Yoon, Hyejeong; Lee, Seungyong; Bousseau, Adrien; Day, Angela
Non-rigid iterative closest point (ICP) is a popular framework for shape alignment, typically formulated as alternating iteration of correspondence search and shape transformation. A common approach in the shape transformation stage is to solve a linear least squares problem to find a smoothness-regularized transform that fits the target shape. However, completely solving the linear least squares problem to obtain a transform is wasteful because the correspondences used for constructing the problem are imperfect, especially at early iterations. In this work, we design a novel framework to compute a transform in a single step without the exact linear solve. Our key idea is to use only a single step of an iterative linear system solver, conjugate gradient, at each shape transformation stage. For this single-step scheme to be effective, appropriate preconditioning of the linear system is required. We design a novel adaptive Sobolev-Jacobi preconditioning method for our single-step transform to produce a large and regularized shape update suitable for correspondence search in the next iteration.
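To make the single-step idea concrete, the sketch below performs one iteration of preconditioned conjugate gradient on a small symmetric positive definite system instead of solving it exactly. It uses a plain Jacobi (diagonal) preconditioner as a stand-in; the adaptive Sobolev-Jacobi preconditioner from the paper is not reproduced here, and the matrix, right-hand side, and function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def energy(A, b, x):
    """Quadratic energy 0.5 * x^T A x - b^T x that CG minimizes for SPD A."""
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)


def single_pcg_step(A, b, x):
    """One preconditioned CG step from x, using a Jacobi preconditioner.

    The first CG iteration reduces to a preconditioned steepest-descent
    step with an exact line search along z = M^-1 r.
    """
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]   # residual
    z = [ri / A[i][i] for i, ri in enumerate(r)]          # M^-1 r, M = diag(A)
    denom = dot(z, matvec(A, z))
    if denom == 0.0:
        return list(x)                                    # already converged
    alpha = dot(r, z) / denom                             # exact line search
    return [xi + alpha * zi for xi, zi in zip(x, z)]


# Hypothetical 2x2 SPD system standing in for the least squares normal equations.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x0 = [0.0, 0.0]
x1 = single_pcg_step(A, b, x0)
```

In an ICP loop, such a step would presumably be warm-started from the previous transform and followed by a fresh correspondence search, so the linear system is never solved to convergence against stale correspondences.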
We demonstrate that our preconditioned single-step transform stably accelerates challenging 3D surface registration tasks.

Item Axis-Normalized Ray-Box Intersection (The Eurographics Association and John Wiley & Sons Ltd., 2025) Friederichs, Fabian; Benthin, Carsten; Grogorick, Steve; Eisemann, Elmar; Magnor, Marcus; Eisemann, Martin; Bousseau, Adrien; Day, Angela
Ray-axis aligned bounding box intersection tests play a crucial role in the runtime performance of many rendering applications, driven not by complexity but mainly by the volume of tests required. While existing solutions were believed to be pretty much optimal in terms of runtime on current hardware, our paper introduces a new intersection test requiring fewer arithmetic operations compared to all previous methods. By transforming the ray we eliminate the need for one third of the traditional bounding-slab tests and achieve a speed enhancement of approximately 13.8% or 10.9%, depending on the compiler. We present detailed runtime analyses in various scenarios.

Item From Words to Wood: Text-to-Procedurally Generated Wood Materials (The Eurographics Association and John Wiley & Sons Ltd., 2025) Hafidi, Mohcen; Wilkie, Alexander; Bousseau, Adrien; Day, Angela
In the domain of wood modeling, we present a new complex appearance model, coupled with a user-friendly NLP-based frontend for intuitive interactivity. First, we present a procedurally generated wood model that is capable of accurately simulating intricate wood characteristics, including growth rings, vessels/pores, rays, knots, and figure. Furthermore, newly developed features were introduced, including brushiness distortion, influence points, and individual feature control. These novel enhancements facilitate a more precise matching between procedurally generated wood and ground truth images.
Second, we present a text-based user interface that relies on a trained natural language processing model that is designed to map user plain English requests into the parameter space of our procedurally generated wood model. This significantly reduces the complexity of the authoring process, thereby enabling any user, regardless of their level of woodworking expertise or familiarity with procedurally generated materials, to utilize it to its fullest potential.

Item Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency (The Eurographics Association and John Wiley & Sons Ltd., 2025) Hahlbohm, Florian; Friederichs, Fabian; Weyrich, Tim; Franke, Linus; Kappel, Moritz; Castillo, Susana; Stamminger, Marc; Eisemann, Martin; Magnor, Marcus; Bousseau, Adrien; Day, Angela
3D Gaussian Splats (3DGS) have proven a versatile rendering primitive, both for inverse rendering as well as real-time exploration of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a scene reconstruction or for artifact-free fly-throughs. Recent work started mitigating artifacts that break multi-view coherence, including popping artifacts due to inconsistent transparency sorting and perspective-correct outlines of (2D) splats. At the same time, real-time requirements forced such implementations to accept compromises in how transparency of large assemblies of 3D Gaussians is resolved, in turn breaking coherence in other ways. In our work, we aim at achieving maximum coherence, by rendering fully perspective-correct 3D Gaussians while using a high-quality approximation of accurate blending, hybrid transparency, on a per-pixel level, in order to retain real-time frame rates.
Our fast and perspectively accurate approach for evaluation of 3D Gaussians does not require matrix inversions, thereby ensuring numerical stability and eliminating the need for special handling of degenerate splats, and the hybrid transparency formulation for blending maintains similar quality as fully resolved per-pixel transparencies at a fraction of the rendering costs. We further show that each of these two components can be independently integrated into Gaussian splatting systems. In combination, they achieve up to 2× higher frame rates, 2× faster optimization, and equal or better image quality with fewer rendering artifacts compared to traditional 3DGS on common benchmarks.

Item Differentiable Rendering based Part-Aware Occlusion Proxy Generation (The Eurographics Association and John Wiley & Sons Ltd., 2025) Tan, Zhipeng; Zhang, Yongxiang; Xia, Fei; Ling, Fei; Bousseau, Adrien; Day, Angela
Software occlusion culling has become a prevalent method in modern game engines. It can significantly reduce the rendering cost by using an approximate coarse mesh (occluder) to cull hidden objects. An ideal occluder should use as few faces as possible to represent the original mesh with high culling accuracy. In contrast to mesh simplification, the process of generating a high-quality occlusion proxy is not well-established. Existing methods, which simply treat the mesh as a single entity, fall short in addressing complex models with interior structures. By leveraging advanced neural segmentation techniques and the optimization capabilities of differentiable rendering, in combination with a thoughtfully designed part-aware shape fitting and camera placement strategy, our approach can generate high-quality occlusion proxy meshes applicable across a diverse range of models with satisfactory precision, recall and very few faces.
Moreover, extensive experiments compellingly demonstrate that our method substantially outperforms both state-of-the-art methodologies and commercial tools in terms of occlusion quality and effectiveness.

Item Shape-Conditioned Human Motion Diffusion Model with Mesh Representation (The Eurographics Association and John Wiley & Sons Ltd., 2025) Xue, Kebing; Seo, Hyewon; Bobenrieth, Cédric; Luo, Guoliang; Bousseau, Adrien; Day, Angela
Human motion generation is a key task in computer graphics. While various conditioning signals such as text, action class, or audio have been used to harness the generation process, most existing methods neglect the case where a specific body is desired to perform the motion. Additionally, they rely on skeleton-based pose representations, necessitating additional steps to produce renderable meshes of the intended body shape. Given that human motion involves a complex interplay of bones, joints, and muscles, focusing solely on the skeleton during generation neglects the rich information carried by muscles and soft tissues, as well as their influence on movement, ultimately limiting the variability and precision of the generated motions. In this paper, we introduce the Shape-conditioned Motion Diffusion model (SMD), which enables the generation of human motion directly in the form of a mesh sequence, conditioned on both a text prompt and a body mesh. To fully exploit the mesh representation while minimizing resource costs, we employ a spectral representation using the graph Laplacian to encode body meshes into the learning process. Unlike retargeting methods, our model does not require source motion data and generates a variety of desired semantic motions that are inherently tailored to the given identity shape.
Extensive experimental evaluations show that the SMD model not only maintains the body shape consistently with the conditioning input across motion frames but also achieves competitive performance in text-to-motion and action-to-motion tasks compared to state-of-the-art methods.

Item InterFaceRays: Interaction-Oriented Furniture Surface Representation for Human Pose Retargeting (The Eurographics Association and John Wiley & Sons Ltd., 2025) Jin, Taeil; Lee, Yewon; Lee, Sung-Hee; Bousseau, Adrien; Day, Angela
Motion retargeting is a well-established technique in computer animation that adapts source motion to fit characters with different sizes, morphologies, or environments. Recent deep learning methods have shown promising results in retargeting character motion. However, retargeting human-object interactions to new environments, especially when furniture shapes differ significantly, remains a challenging problem. In this work, we propose a novel retargeting framework to address this challenge by combining motion generative models with optimization-based pose adaptation. Our framework operates in two stages: first, a key pose generator generates the pose of key joints that preserves the interaction state relative to the new furniture; second, the final whole-body pose is determined by accommodating the key joints' poses through optimization. A crucial step in our framework is generating key poses that maintain the interaction state of the source motion. To achieve this, we introduce the Interaction Intensity Weight (IIW) and structural rays, called InterFaceRays, which together capture the interaction intensity between body parts and furniture surfaces. The IIW generator, a trained MoE-based decoder from the conditional variational autoencoder (cVAE) model, infers IIWs for the target furniture based on the source motion's interaction state.
Extensive experiments demonstrate that our framework effectively retargets continuous character motion across diverse furniture configurations, with the IIW generator significantly enhancing key pose consistency. This hybrid approach offers a robust solution for motion retargeting across dissimilar furniture environments.