44-Issue 2
Browsing 44-Issue 2 by Title
Now showing 1 - 20 of 75
Item: 2D Neural Fields with Learned Discontinuities
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Liu, Chenxi; Wang, Siqi; Fisher, Matthew; Aneja, Deepali; Jacobson, Alec; Bousseau, Adrien; Day, Angela
Effective representation of 2D images is fundamental in digital image processing, where traditional methods like raster and vector graphics struggle with sharpness and textural complexity, respectively. Current neural fields offer high fidelity and resolution independence but require predefined meshes with known discontinuities, restricting their utility. We observe that by treating all mesh edges as potential discontinuities, we can represent the discontinuity magnitudes as continuous variables and optimize them. We further introduce a novel discontinuous neural field model that jointly approximates the target image and recovers discontinuities. Through systematic evaluations, our neural field outperforms other methods that fit unknown discontinuities with discontinuous representations, exceeding Field of Junctions and Boundary Attention by over 11 dB in both denoising and super-resolution tasks and achieving 3.5× smaller Chamfer distances than Mumford-Shah-based methods. It also surpasses InstantNGP with improvements of more than 5 dB (denoising) and 10 dB (super-resolution). Additionally, our approach shows remarkable capability in approximating complex artistic and natural images and in cleaning up diffusion-generated depth maps.

Item: 4-LEGS: 4D Language Embedded Gaussian Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Fiebelman, Gal; Cohen, Tamir; Morgenstern, Ayellet; Hedman, Peter; Averbuch-Elor, Hadar; Bousseau, Adrien; Day, Angela
The emergence of neural representations has revolutionized our means for digitally viewing a wide range of 3D scenes, enabling the synthesis of photorealistic images rendered from novel views. Recently, several techniques have been proposed for connecting these low-level representations with the high-level semantic understanding embodied within the scene. These methods elevate the rich semantic understanding from 2D imagery to 3D representations, distilling high-dimensional spatial features onto 3D space. In our work, we are interested in connecting language with a dynamic modeling of the world. We show how to lift spatio-temporal features to a 4D representation based on 3D Gaussian Splatting. This enables an interactive interface where the user can spatiotemporally localize events in the video from text prompts. We demonstrate our system on public 3D video datasets of people and animals performing various actions.

Item: Adaptive Multi-view Radiance Caching for Heterogeneous Participating Media
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Stadlbauer, Pascal; Tatzgern, Wolfgang; Mueller, Joerg H.; Winter, Martin; Stojanovic, Robert; Weinrauch, Alexander; Steinberger, Markus; Bousseau, Adrien; Day, Angela
Achieving lifelike atmospheric effects, such as fog, is essential in creating immersive environments and poses a formidable challenge in real-time rendering. Highly realistic rendering of complex lighting interacting with dynamic fog can be very resource-intensive, due to light bouncing through complex participating media multiple times. We propose an approach that uses a multi-layered spherical harmonics probe grid to share computations temporally. In addition, this world-space storage enables the sharing of radiance data between multiple viewers. In the context of cloud rendering, this means faster rendering and a significant enhancement in overall rendering quality with efficient resource utilization.

Item: All-frequency Full-body Human Image Relighting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Tajima, Daichi; Kanamori, Yoshihiro; Endo, Yuki; Bousseau, Adrien; Day, Angela
Relighting of human images enables post-photography editing of lighting effects in portraits. The current mainstream approach uses neural networks to approximate lighting effects without explicitly accounting for the principle of physical shading. As a result, it often has difficulty representing high-frequency shadows and shading. In this paper, we propose a two-stage relighting method that can reproduce physically-based shadows and shading from low to high frequencies. The key idea is to approximate an environment light source with a fixed number of area light sources. The first stage employs supervised inverse rendering from a single image using neural networks and calculates physically-based shading. The second stage then calculates the shadow for each area light and sums the contributions to render the final image. We propose to make soft shadow mapping differentiable for the area-light approximation of environment lighting. We demonstrate that our method can plausibly reproduce all-frequency shadows and shading caused by environment illumination, which have been difficult to reproduce using existing methods.
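In equation form, the two-stage decomposition described above amounts to the following (a hedged summary of the abstract; the symbols are ours, not the paper's notation):

$$L(\mathbf{p}) \;\approx\; \sum_{i=1}^{N} V_i(\mathbf{p})\, S_i(\mathbf{p}),$$

where $S_i$ is the unshadowed, physically-based shading due to the $i$-th area light (stage one), $V_i$ is its differentiable soft-shadow visibility (stage two), and $N$ is the fixed number of area lights approximating the environment lighting.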
Item: Approximating Procedural Models of 3D Shapes with Neural Networks
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hossain, Ishtiaque; Shen, I-Chao; Kaick, Oliver van; Bousseau, Adrien; Day, Angela
Procedural modeling is a popular technique for 3D content creation and offers a number of advantages over alternative techniques for modeling 3D shapes. However, given a procedural model, predicting the procedural parameters of existing data provided in different modalities can be challenging. This is because the data may be in a different representation than the one generated by the procedural model, and procedural models are usually neither invertible nor differentiable. In this paper, we address these limitations and introduce an invertible and differentiable representation for procedural models. We approximate parameterized procedures with a neural network architecture, NNProc, that learns both the forward and inverse mapping of the procedural model by aligning the latent spaces of shape parameters and shapes. The network is trained in a manner that is agnostic to the inner workings of the procedural model, implying that models implemented in different languages or systems can be used. We demonstrate how the proposed representation can be used for both forward and inverse procedural modeling. Moreover, we show how NNProc can be used in conjunction with optimization for applications such as shape reconstruction from an image or from 3D Gaussian Splatting.

Item: ASMR: Adaptive Skeleton-Mesh Rigging and Skinning via 2D Generative Prior
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hong, Seokhyeon; Choi, Soojin; Kim, Chaelin; Cha, Sihun; Noh, Junyong; Bousseau, Adrien; Day, Angela
Despite the growing accessibility of skeletal motion data, integrating it for animating character meshes remains challenging due to diverse configurations of both skeletons and meshes. Specifically, the body scale and bone lengths of the skeleton should be adjusted in accordance with the size and proportions of the mesh, ensuring that all joints are accurately positioned within the character mesh. Furthermore, defining skinning weights is complicated by variations in skeletal configurations, such as the number of joints and their hierarchy, as well as differences in mesh configurations, including their connectivity and shapes. While existing approaches have made efforts to automate this process, they rarely address variations in both skeletal and mesh configurations. In this paper, we present a novel method for the automatic rigging and skinning of character meshes using skeletal motion data, accommodating arbitrary configurations of both meshes and skeletons. The proposed method predicts the optimal skeleton aligned with the size and proportions of the mesh and defines skinning weights for various mesh-skeleton configurations, without requiring explicit supervision tailored to each of them. By incorporating Diffusion 3D Features (Diff3F) as semantic descriptors of character meshes, our method achieves robust generalization across different configurations. To assess the performance of our method in comparison to existing approaches, we conducted comprehensive evaluations encompassing both quantitative and qualitative analyses, specifically examining the predicted skeletons, skinning weights, and deformation quality.

Item: Axis-Normalized Ray-Box Intersection
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Friederichs, Fabian; Benthin, Carsten; Grogorick, Steve; Eisemann, Elmar; Magnor, Marcus; Eisemann, Martin; Bousseau, Adrien; Day, Angela
Ray/axis-aligned bounding box intersection tests play a crucial role in the runtime performance of many rendering applications, driven not by their complexity but mainly by the sheer volume of tests required. While existing solutions were widely considered optimal in terms of runtime on current hardware, our paper introduces a new intersection test requiring fewer arithmetic operations than all previous methods. By transforming the ray, we eliminate the need for one third of the traditional bounding-slab tests and achieve a speed-up of approximately 13.8% or 10.9%, depending on the compiler. We present detailed runtime analyses in various scenarios.
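For context, here is a minimal NumPy sketch of the conventional three-slab test that such routines build on; the paper's actual contribution, a ray transformation that removes one of the three slab tests, is not reproduced here:

```python
import numpy as np

def ray_aabb_hit(origin, direction, box_min, box_max):
    """Classic slab test for ray vs. axis-aligned bounding box.

    Assumes all components of `direction` are nonzero; IEEE infinities
    handle near-axis rays, but exact zeros would need special-casing.
    """
    inv_d = 1.0 / direction                  # reciprocal direction, precomputable
    t0 = (box_min - origin) * inv_d          # distances to the three min-planes
    t1 = (box_max - origin) * inv_d          # distances to the three max-planes
    t_near = np.max(np.minimum(t0, t1))      # latest entry across all slabs
    t_far = np.min(np.maximum(t0, t1))       # earliest exit across all slabs
    return t_near <= t_far and t_far >= 0.0  # overlap in front of the origin

# Example: unit box at the origin, ray along +x
print(ray_aabb_hit(np.array([-2.0, 0.1, 0.1]), np.array([1.0, 1e-9, 1e-9]),
                   np.array([-0.5, -0.5, -0.5]), np.array([0.5, 0.5, 0.5])))
```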
Item: BlendSim: Simulation on Parametric Blendshapes using Spacetime Projective Dynamics
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Wu, Yuhan; Umetani, Nobuyuki; Bousseau, Adrien; Day, Angela
We propose BlendSim, a novel framework for editable simulation using spacetime optimization on a lightweight animation representation. Traditional spacetime control methods suffer from high computational complexity, which limits their use in interactive animation. The proposed approach effectively reduces the dimensionality of the problem by representing the motion trajectory of each vertex using continuous parametric Bézier splines with variable keyframe times. Because this mesh animation representation is continuous and fully differentiable, it can be optimized such that it follows the laws of physics under various constraints. The proposed method also integrates constraints, such as collisions and cyclic motion, making it suitable for real-world applications where seamless looping and physical interactions are required. Leveraging projective dynamics, we further enhance computational efficiency by decoupling the optimization into local parallelizable steps and a global quadratic step, enabling fast and stable simulation. In addition, BlendSim is compatible with modern animation workflows and file formats, such as glTF, making it a practical way to author and transfer mesh animations.

Item: Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Bemana, Mojtaba; Leimkühler, Thomas; Myszkowski, Karol; Seidel, Hans-Peter; Ritschel, Tobias; Bousseau, Adrien; Day, Angela
We demonstrate generating HDR images using the concerted action of multiple black-box, pre-trained LDR image diffusion models. Common diffusion models are not HDR because, first, no sufficiently large HDR image dataset is available to re-train them, and, second, even if one were, re-training such models is beyond most compute budgets. Instead, we seek inspiration from the HDR image capture literature, which traditionally fuses sets of LDR images, called "exposure brackets", to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce a bracket consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional, conditional, and restoration-type (LDR2HDR) generative models.
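As background for the capture-side idea this abstract borrows, here is a minimal sketch of classic weighted exposure-bracket fusion; the hat weighting and function names are our illustrative assumptions, not the paper's method, which instead couples the diffusion denoising processes themselves:

```python
import numpy as np

def fuse_brackets(ldr_images, exposure_times):
    """Fuse linearized LDR exposures (values in [0, 1]) into one HDR image."""
    radiance = np.zeros_like(ldr_images[0], dtype=np.float64)
    weight_sum = np.zeros_like(radiance)
    for img, t in zip(ldr_images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)  # hat weight: trust mid-tones most
        radiance += w * (img / t)          # per-bracket radiance estimate
        weight_sum += w
    return radiance / np.maximum(weight_sum, 1e-8)
```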
Item: CEDRL: Simulating Diverse Crowds with Example-Driven Deep Reinforcement Learning
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Panayiotou, Andreas; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
The level of realism in virtual crowds is strongly affected by the presence of diverse crowd behaviors. In real life, we can observe various scenarios, ranging from pedestrians moving along a shopping street to people talking in static groups or wandering around in a public park. Most existing systems optimize for specific behaviors such as goal-seeking and collision avoidance, neglecting other complex behaviors that are usually challenging to capture or define. Departing from the conventional use of supervised learning, which requires vast amounts of labeled data and often lacks controllability, we introduce Crowds using Example-driven Deep Reinforcement Learning (CEDRL), a framework that simultaneously leverages multiple crowd datasets to model a broad spectrum of human behaviors. This approach enables agents to adaptively learn and exhibit diverse behaviors, enhancing their ability to generalize decisions across unseen states. The model can be applied to populate novel virtual environments while providing real-time controllability over the agents' behaviors. We achieve this through the design of a reward function aligned with real-world observations and by employing curriculum learning that gradually diminishes the agents' observation space. A complexity characterization metric defines each agent's high-level crowd behavior, linking it to the agent's state and serving as an input to the policy network. Additionally, a parametric reward function, influenced by the type of crowd task, facilitates the learning of a diverse and abstract behavior "skill" set. We evaluate our model on both training and unseen real-world data, comparing against other simulators, and show its ability to generalize across scenarios and accurately reflect the observed complexity of behaviors. We also examine our system's controllability by adjusting the complexity weight, finding that higher values lead to more complex behaviors such as wandering, static interactions, and group dynamics like joining or leaving. Finally, we demonstrate our model's capabilities in novel synthetic scenarios.

Item: Cloth Animation with Time-dependent Persistent Wrinkles
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Gong, Deshan; Yang, Yin; Shao, Tianjia; Wang, He; Bousseau, Adrien; Day, Angela
Persistent wrinkles are often observed on crumpled garments, e.g., the wrinkles around the knees after sitting for a while. Such wrinkles recover easily if the deformation is brief, and otherwise persist. Since they are vital to the visual realism of cloth animation, we aim to simulate realistic-looking persistent wrinkles. To this end, we present a physics-inspired fine-grained wrinkle model. Different from existing methods, we recognize the importance of the interplay between internal friction and plasticity during wrinkle formation. Furthermore, we model their time dependence for persistent wrinkles. Our model is capable of simulating not only realistic wrinkle patterns, but also their time-dependent changes according to how long the deformation is maintained. Through extensive experiments, we show that our model is effective in simulating realistic spatially and temporally varying wrinkles, versatile in simulating different materials, and capable of generating more fine-grained wrinkles than the state of the art.

Item: Corotational Hinge-based Thin Plates/Shells
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Liang, Qixin; Bousseau, Adrien; Day, Angela
We present six thin plate/shell models, derived from three distinct types of curvature operators formulated within the corotational frame, for simulating both rest-flat and rest-curved triangular meshes. Each curvature operator yields a curvature expression corresponding to both a plate model and a shell model. The corotational edge-based hinge model uses an edge-based stencil to compute directional curvature, while the corotational FVM hinge model uses a triangle-centered stencil, applying the finite volume method (FVM) to superpose directional curvatures across edges, yielding a generalized curvature. The corotational smoothed hinge model also employs a triangle-centered stencil but transforms directional curvatures into a generalized curvature based on a quadratic surface fit. All models assume small strain and small curvature, leading to constant bending energy Hessians, which benefit implicit integrators. Through quantitative benchmarks and qualitative elastodynamic simulations with large time steps, we demonstrate the accuracy, efficiency, and stability of these models. Our contributions enhance the thin plate/shell library available for both computer graphics and engineering applications.
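For readers unfamiliar with hinge-based bending, the classic edge-hinge energy that such discretizations build on (discrete-shell-style background, not the paper's corotational formulation) is

$$E_{\text{bend}} \;=\; \sum_{e} k_e \,\big(\theta_e - \bar{\theta}_e\big)^2 \,\frac{\lVert \bar{e} \rVert}{\bar{h}_e},$$

where $\theta_e$ is the dihedral angle at edge $e$, $\bar{\theta}_e$ its rest value, $\lVert \bar{e} \rVert$ the rest edge length, and $\bar{h}_e$ a rest height associated with the two incident triangles. Roughly speaking, a constant bending Hessian results when the curvature term is linear in the vertex positions, as in the corotational models above.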
Item: D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Kappel, Moritz; Hahlbohm, Florian; Scholz, Timon; Castillo, Susana; Theobalt, Christian; Eisemann, Martin; Golyanik, Vladislav; Magnor, Marcus; Bousseau, Adrien; Day, Angela
Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes have recently gained increased attention. While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures. Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids for static and dynamic regions. By sampling a discrete point cloud from our model, we can efficiently render high-quality novel views using a fast differentiable rasterizer and neural rendering network. Similar to recent work, we leverage advances in neural scene analysis by incorporating data-driven priors like monocular depth estimation and object segmentation to resolve motion and depth ambiguities originating from the monocular captures. In addition to guiding the optimization process, we show that these priors can be exploited to explicitly initialize our scene representation, drastically improving optimization speed and final image quality. As evidenced by our experimental evaluation, our dynamic point cloud model not only enables fast optimization and real-time frame rates for interactive applications, but also achieves competitive image quality on monocular benchmark sequences. Our code and data are available online at https://moritzkappel.github.io/projects/dnpc/.

Item: Deformed Tiling and Blending: Application to the Correction of Distortions Implied by Texture Mapping
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Wendling, Quentin; Ravaglia, Joris; Sauvage, Basile; Bousseau, Adrien; Day, Angela
The prevailing model in virtual 3D scenes is a 3D surface onto which a texture is mapped through a parameterization from the texture plane. We focus on accounting for the parameterization during the texture creation process, to control the deformations and remove the cuts induced by the mapping. We rely on tiling and blending, a real-time and parallel algorithm that generates an arbitrarily large texture from a small input example. Our first contribution is to enhance tiling and blending with a deformation field, which controls smooth spatial variations in the texture plane. Our second contribution is to derive, from a parameterized triangle mesh, a deformation field that compensates for texture distortions and controls the texture orientation. Our third contribution is a technique to enforce texture continuity across the cuts, thanks to a proper tile selection. This opens the door to interactive sessions with artistic control, and to real-time rendering with improved visual quality.

Item: Differentiable Rendering based Part-Aware Occlusion Proxy Generation
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Tan, Zhipeng; Zhang, Yongxiang; Xia, Fei; Ling, Fei; Bousseau, Adrien; Day, Angela
Software occlusion culling has become a prevalent method in modern game engines. It can significantly reduce rendering cost by using an approximate coarse mesh (occluder) to cull hidden objects. An ideal occluder should use as few faces as possible to represent the original mesh with high culling accuracy. In contrast to mesh simplification, the process of generating a high-quality occlusion proxy is not well established. Existing methods, which simply treat the mesh as a single entity, fall short in addressing complex models with interior structures. By leveraging advanced neural segmentation techniques and the optimization capabilities of differentiable rendering, in combination with a thoughtfully designed part-aware shape fitting and camera placement strategy, our approach generates high-quality occlusion proxy meshes applicable across a diverse range of models, with satisfactory precision and recall and very few faces. Moreover, extensive experiments demonstrate that our method substantially outperforms both state-of-the-art methodologies and commercial tools in terms of occlusion quality and effectiveness.

Item: Differential Diffusion: Giving Each Pixel Its Strength
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Levin, Eran; Fried, Ohad; Bousseau, Adrien; Day, Angela
Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit, the controllability is limited to global changes over an entire edited region. This paper introduces a novel framework that enables customization of the amount of change per pixel or per image region. Our framework can be integrated into any existing diffusion model, enhancing it with this capability. Such granular control opens up a diverse array of new editing capabilities, such as control of the extent to which individual objects are modified, or the ability to introduce gradual spatial changes. Furthermore, we showcase the framework's effectiveness in soft-inpainting: the completion of portions of an image while subtly adjusting the surrounding areas to ensure seamless integration. Additionally, we introduce a new tool for exploring the effects of different change quantities. Our framework operates solely during inference, requiring no model training or fine-tuning. We demonstrate our method with current open state-of-the-art models and validate it via quantitative and qualitative comparisons and a user study. Our code is published and integrated into several platforms.
Item: Does 3D Gaussian Splatting Need Accurate Volumetric Rendering?
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Celarek, Adam; Kopanas, Georgios; Drettakis, George; Wimmer, Michael; Kerbl, Bernhard; Bousseau, Adrien; Day, Angela
Since its introduction, 3D Gaussian Splatting (3DGS) has become an important reference method for learning 3D representations of a captured scene, allowing real-time novel-view synthesis with high visual quality and fast training times. Neural Radiance Fields (NeRFs), which preceded 3DGS, are based on a principled ray-marching approach for volumetric rendering. In contrast, while sharing a similar image formation model with NeRF, 3DGS uses a hybrid rendering solution that builds on the strengths of volume rendering and primitive rasterization. A crucial benefit of 3DGS is its performance, achieved through a set of approximations, in many cases with respect to volumetric rendering theory. A naturally arising question is whether replacing these approximations with more principled volumetric rendering solutions can improve the quality of 3DGS. In this paper, we present an in-depth analysis of the various approximations and assumptions used by the original 3DGS solution. We demonstrate that, while more accurate volumetric rendering can help for low numbers of primitives, the power of efficient optimization and the large number of Gaussians allows 3DGS to outperform volumetric rendering despite its approximations.

Item: DragPoser: Motion Reconstruction from Variable Sparse Tracking Signals via Latent Space Optimization
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Ponton, Jose Luis; Pujol, Eduard; Aristidou, Andreas; Andujar, Carlos; Pelechano, Nuria; Bousseau, Adrien; Day, Angela
High-quality motion reconstruction that follows the user's movements can be achieved by high-end mocap systems with many sensors. However, obtaining such animation quality with fewer input devices is gaining popularity as it brings mocap closer to the general public. The main challenges include the loss of end-effector accuracy in learning-based approaches and the lack of naturalness and smoothness in IK-based solutions. In addition, such systems are often finely tuned to a specific number of trackers and are highly sensitive to missing data, e.g., in scenarios where a sensor is occluded or malfunctions. In response to these challenges, we introduce DragPoser, a novel deep-learning-based motion reconstruction system that accurately represents hard and dynamic constraints, attaining high end-effector position accuracy in real time. This is achieved through a pose optimization process within a structured latent space. Our system requires only one-time training on a large human motion dataset; constraints can then be dynamically defined as losses, while the pose is iteratively refined by computing the gradients of these losses within the latent space. To further enhance our approach, we incorporate a Temporal Predictor network, which employs a Transformer architecture to directly encode temporality within the latent space. This network ensures that the pose optimization is confined to the manifold of valid poses and also leverages past pose data to predict temporally coherent poses. Results demonstrate that DragPoser surpasses both IK-based and the latest data-driven methods in achieving precise end-effector positioning, while producing natural poses and temporally coherent motion. In addition, our system showcases robustness against on-the-fly constraint modifications and exhibits adaptability to various input configurations and changes. The complete source code, trained model, animation databases, and supplementary material used in this paper can be found at https://upc-virvig.github.io/DragPoser
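A minimal sketch of the latent-space constraint optimization the abstract describes, using a toy stand-in decoder; the real system uses a trained pose decoder plus the Temporal Predictor, and all names and dimensions here are illustrative assumptions:

```python
import torch

# Toy stand-in for a trained latent-to-pose decoder (illustrative only).
decoder = torch.nn.Linear(32, 24 * 3)  # 32-D latent -> 24 joints x 3 coords

z = torch.zeros(1, 32, requires_grad=True)     # latent pose code to optimize
target = torch.tensor([0.3, 1.5, 0.2])         # desired end-effector position
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    joints = decoder(z).view(24, 3)            # decode latent into joint positions
    loss = (joints[15] - target).pow(2).sum()  # end-effector constraint as a loss
    optimizer.zero_grad()
    loss.backward()                            # gradients flow into the latent
    optimizer.step()                           # refine the pose in latent space
```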
Item: Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hahlbohm, Florian; Friederichs, Fabian; Weyrich, Tim; Franke, Linus; Kappel, Moritz; Castillo, Susana; Stamminger, Marc; Eisemann, Martin; Magnor, Marcus; Bousseau, Adrien; Day, Angela
3D Gaussian Splats (3DGS) have proven to be a versatile rendering primitive, both for inverse rendering and for real-time exploration of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a scene reconstruction or for artifact-free fly-throughs. Recent work has started mitigating artifacts that break multi-view coherence, including popping artifacts due to inconsistent transparency sorting and perspective-correct outlines of (2D) splats. At the same time, real-time requirements have forced such implementations to accept compromises in how the transparency of large assemblies of 3D Gaussians is resolved, in turn breaking coherence in other ways. In our work, we aim at maximum coherence by rendering fully perspective-correct 3D Gaussians while using a high-quality approximation of accurate blending, hybrid transparency, on a per-pixel level, in order to retain real-time frame rates. Our fast and perspectively accurate approach for evaluating 3D Gaussians does not require matrix inversions, thereby ensuring numerical stability and eliminating the need for special handling of degenerate splats, and the hybrid transparency formulation for blending maintains similar quality to fully resolved per-pixel transparencies at a fraction of the rendering cost. We further show that each of these two components can be independently integrated into Gaussian splatting systems. In combination, they achieve up to 2× higher frame rates, 2× faster optimization, and equal or better image quality with fewer rendering artifacts compared to traditional 3DGS on common benchmarks.

Item: Eigenvalue Blending for Projected Newton
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Cheng, Yuan-Yuan; Liu, Ligang; Fu, Xiao-Ming; Bousseau, Adrien; Day, Angela
We propose a novel method to filter eigenvalues for projected Newton. Central to our method is blending the clamped and absolute eigenvalues to adaptively compute the modified Hessian matrix. To determine the blending coefficients, we rely on (1) a key observation and (2) an objective-function descent constraint. The observation is that if the quadratic form defined by the Hessian matrix maps the descent direction to a negative real number, the decrease in the objective function is limited. The constraint is that our eigenvalue filtering leads to more reduction in the objective function than absolute eigenvalue filtering [CLL∗24] under a second-order Taylor approximation. Our eigenvalue blending is easy to implement and leads to fewer optimization iterations than state-of-the-art eigenvalue filtering methods.
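A minimal NumPy sketch of the two eigenvalue filters being blended; the adaptive choice of the blending coefficient is the paper's contribution and is not reproduced here, so `beta` is a hypothetical fixed parameter:

```python
import numpy as np

def blended_projection(H, beta):
    """Blend clamped and absolute eigenvalue filtering of a symmetric Hessian.

    beta = 0.0 recovers classic eigenvalue clamping (projected Newton);
    beta = 1.0 recovers absolute eigenvalue filtering [CLL*24].
    """
    eigvals, eigvecs = np.linalg.eigh(H)           # H assumed symmetric
    clamped = np.maximum(eigvals, 0.0)             # drop negative curvature
    absolute = np.abs(eigvals)                     # flip negative curvature
    blended = (1.0 - beta) * clamped + beta * absolute
    return eigvecs @ np.diag(blended) @ eigvecs.T  # modified Hessian

# Example: an indefinite 2x2 Hessian
H = np.array([[2.0, 0.0], [0.0, -1.0]])
print(blended_projection(H, 0.5))  # the -1 eigenvalue becomes 0.5 * |-1| = 0.5
```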