44-Issue 7
Browsing 44-Issue 7 by Title
Now showing 1 - 20 of 49
Item
Accelerating Signed Distance Functions
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hubert-Brierre, Pierre; Guérin, Eric; Peytavie, Adrien; Galin, Eric; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Processing, and particularly visualizing, implicit surfaces remains computationally intensive when dealing with complex objects built from construction trees. We introduce optimization nodes that reduce the computational cost of field function evaluation for hierarchical construction trees, while preserving the Lipschitz or conservative properties of the function. Our goal is to propose acceleration nodes embedded directly in the construction tree, avoiding external accompanying data structures such as octrees. We present proxy and continuous level-of-detail nodes that reduce the overall evaluation cost, along with a normal-warping technique that enhances surface details with negligible computational overhead. Our approach is compatible with existing algorithms that aim to reduce the number of function calls. We validate our methods by measuring timings as well as the average cost of traversing the tree and evaluating the signed distance field at a given point in space. Our method speeds up signed distance field evaluation by up to three orders of magnitude, and applies both to ray-surface intersection computation in Sphere Tracing applications and to polygonization algorithms.
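The speedups above are measured inside sphere tracing, where every step costs one field evaluation. As background, here is a minimal sketch of the standard sphere-tracing loop for a 1-Lipschitz SDF, in Python with NumPy; it is the baseline that the paper's optimization nodes accelerate, not the paper's method itself, and the function names are our own:

```python
import numpy as np

def sphere_trace(sdf, origin, direction, t_max=100.0, eps=1e-4, max_steps=256):
    """Standard sphere tracing: each step advances by the SDF value,
    which is a safe step size when the field is 1-Lipschitz."""
    t = 0.0
    for _ in range(max_steps):
        p = origin + t * direction
        d = sdf(p)
        if d < eps:          # close enough: report a hit
            return t
        t += d               # Lipschitz bound guarantees no overshoot
        if t > t_max:
            break
    return None              # ray missed the surface

# Example: a unit sphere as the SDF, one ray shot toward it.
unit_sphere = lambda p: np.linalg.norm(p) - 1.0
hit = sphere_trace(unit_sphere, np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
print(hit)  # ~2.0
```

Every iteration calls `sdf` once, which is why cheaper tree evaluation translates directly into faster rendering.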
Item
Automatic Reconstruction of Woven Cloth from a Single Close-up Image
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Wu, Chenghao; Khattar, Apoorv; Zhu, Junqiu; Pettifer, Steve; Yan, Lingqi; Montazeri, Zahra; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Digital replication of woven fabrics presents significant challenges across a variety of sectors, from online retail to the entertainment industry. To address this, we introduce an inverse rendering pipeline designed to estimate the pattern, geometry, and appearance parameters of woven fabrics given a single close-up image as input. Our work simultaneously optimizes both discrete and continuous parameters without manual intervention. It recovers discrete elements, such as the weave pattern and ply and fiber counts, using Simulated Annealing, and continuous parameters, such as reflection and transmission components, aligning them with the target appearance through differentiable rendering. For irregularities caused by deformation and flyaways, we use 2D Gaussians to approximate them as a post-processing step. Our work does not pursue a perfect match of every fine detail; rather, it targets an automatic, end-to-end reconstruction pipeline that is robust to slight camera rotations and room lighting conditions and runs in acceptable time (15 minutes on CPU), unlike previous works that are expensive, require manual intervention, assume a given pattern, geometry, or appearance, or strictly control camera and lighting conditions.

Item
BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Lan, Yuqing; Zhu, Chenyang; Gao, Zhirui; Zhang, Jiazhao; Cao, Yihan; Yi, Renjiao; Wang, Yijie; Xu, Kai; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Open-vocabulary 3D object detection has gained significant interest due to its critical applications in autonomous driving and embodied AI. Existing detection methods, whether offline or online, typically rely on dense point cloud reconstruction, which imposes substantial computational overhead and memory constraints, hindering real-time deployment in downstream tasks. To address this, we propose a novel reconstruction-free online framework tailored for memory-efficient and real-time 3D detection. Specifically, given streaming posed RGB-D video input, we leverage Cubify Anything as a pre-trained visual foundation model (VFM) for single-view 3D object detection, coupled with CLIP to capture open-vocabulary semantics of detected objects. To fuse all detected bounding boxes across different views into a unified one, we employ an association module that establishes multi-view correspondences and an optimization module that fuses the 3D bounding boxes of the same instance. The association module utilizes 3D Non-Maximum Suppression (NMS) and a box correspondence matching module. The optimization module uses an IoU-guided, efficient random optimization technique based on particle filtering to enforce multi-view consistency of the 3D bounding boxes while minimizing computational complexity. Extensive experiments on the CA-1M and ScanNetV2 datasets demonstrate that our method achieves state-of-the-art performance among online methods. Benefiting from this reconstruction-free paradigm, our method exhibits strong generalization across a variety of scenarios, enabling real-time perception even in environments exceeding 1000 square meters.
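BoxFusion's association module builds on 3D Non-Maximum Suppression over detected boxes. A minimal sketch of greedy 3D NMS with axis-aligned IoU follows; the paper fuses oriented boxes across views via particle-filter optimization, so this shows only the generic building block, with function names of our own choosing:

```python
import numpy as np

def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes, each (xmin, ymin, zmin, xmax, ymax, zmax)."""
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))   # overlap volume, 0 if disjoint
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return inter / (vol_a + vol_b - inter + 1e-9)

def nms_3d(boxes, scores, iou_thresh=0.25):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order) > 0:
        i = int(order[0])
        keep.append(i)
        rest = order[1:]
        order = np.array([j for j in rest if iou_3d(boxes[i], boxes[j]) < iou_thresh])
    return keep

boxes = np.array([[0, 0, 0, 1, 1, 1],
                  [0.1, 0, 0, 1.1, 1, 1],
                  [2, 2, 2, 3, 3, 3]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms_3d(boxes, scores))  # [0, 2]: the overlapping duplicate is suppressed
```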
Item
ClothingTwin: Reconstructing Inner and Outer Layers of Clothing Using 3D Gaussian Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Jung, Munkyung; Lee, Dohae; Lee, In-Kwon; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
We introduce ClothingTwin, a novel end-to-end framework for reconstructing 3D digital twins of clothing that capture both the outer and inner fabric, without the need for manual mannequin removal. Traditional 2D ''ghost mannequin'' photography techniques remove the mannequin and composite partial inner textures to create images in which the garment appears as if worn by a transparent model. However, extending this technique to photorealistic 3D Gaussian Splatting (3DGS) is far more challenging: achieving consistent inner-layer compositing across the large image sets used for 3DGS optimization quickly becomes impractical if done manually. To address these issues, ClothingTwin introduces three key innovations. First, a specialized image acquisition protocol captures two sets of images for each garment: one worn normally on the mannequin (outer layer exposed) and one worn inside-out (inner layer exposed). This eliminates the need to painstakingly edit out mannequins in thousands of images and provides full coverage of all fabric surfaces. Second, we employ a mesh-guided 3DGS reconstruction for each layer and leverage Non-Rigid Iterative Closest Point (ICP) to align the outer and inner point clouds despite their distinct geometries. Third, our enhanced rendering pipeline, featuring mesh-guided back-face culling, back-to-front alpha blending, and recalculated spherical harmonic angles, ensures photorealistic visualization of the combined outer and inner layers without inter-layer artifacts. Experimental evaluations on various garments show that ClothingTwin outperforms conventional 3DGS-based methods, and our ablation study validates the effectiveness of each proposed component.

Item
Computational Design of Body-Supporting Assemblies
(The Eurographics Association and John Wiley & Sons Ltd., 2025) He, Yixuan; Chen, Rulin; Deng, Bailin; Song, Peng; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
A body-supporting assembly is an assembly of parts that physically supports a human body during activities such as sitting, lying, or leaning. A body-supporting assembly has a complex global shape to support a specific human body posture, yet each component part has a relatively simple geometry to facilitate fabrication, storage, and maintenance. In this paper, we aim to model and design a personalized body-supporting assembly that fits a given human body posture and is comfortable to use. We choose to model a body-supporting assembly from scratch to offer high flexibility for fitting a given body posture, which, however, makes it challenging to determine the assembly's topology and geometry. To address this problem, we classify parts in the assembly into two categories according to their functionality: supporting parts that fit different portions of the body, and connecting parts that join all the supporting parts into a stable structure. We also propose a geometric representation of supporting parts such that they can take a variety of shapes controlled by a few parameters. Given a body posture as input, we present a computational approach for designing a body-supporting assembly that fits the posture, in which the supporting parts are initialized and optimized to minimize a discomfort measure, and the connecting parts are then generated using a procedural approach. We demonstrate the effectiveness of our approach by designing body-supporting assemblies that accommodate a variety of body postures, and we 3D print two of them for physical validation.
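The supporting parts above are optimized to minimize a discomfort measure. As a heavily simplified toy illustration (the paper's part parameterization and discomfort measure are richer than this), the sketch below fits a single flat supporting part, parameterized by height and tilt, to body sample points, using squared point-to-plane distance as a stand-in discomfort proxy:

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in: one flat supporting part, parameterized by height h and
# tilt angle theta, fitted to (x, z) body samples. Squared point-to-line
# distance serves as a hypothetical discomfort proxy; the paper's actual
# measure and multi-part setup differ.
body_pts = np.array([[0.0, 0.42], [0.2, 0.45], [0.4, 0.50]])

def discomfort(params):
    h, theta = params
    # residual of each sample against the tilted support z = h + x * tan(theta)
    residual = body_pts[:, 1] - (h + body_pts[:, 0] * np.tan(theta))
    return np.sum(residual ** 2)

res = minimize(discomfort, x0=[0.3, 0.0])
print(res.x)  # fitted height and tilt of the supporting part
```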
Item
DAATSim: Depth-Aware Atmospheric Turbulence Simulation for Fast Image Rendering
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Saha, Ripon Kumar; Zhang, Yufan; Ye, Jinwei; Jayasuriya, Suren; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Simulating the effects of atmospheric turbulence for imaging systems operating over long distances is a significant challenge for optical and computer graphics models. Physically-based ray tracing over kilometers of distance is difficult due to the need to define a spatio-temporal volume of varying refractive index. Even if such a volume could be defined, Monte Carlo rendering approximations for light refraction through the environment would not yield the real-time solutions needed for video game engines or online dataset augmentation for machine learning. While existing simulators based on procedurally-generated noise or textures have been proposed in these settings, they often neglect the significant impact of scene depth, leading to unrealistic degradations for scenes with substantial foreground-background separation. This paper introduces a novel, physically-based atmospheric turbulence simulator that explicitly models depth-dependent effects while rendering frames at interactive, near real-time (>10 FPS) rates for image resolutions up to 1024×1024 (real-time: 35 FPS at 256×256 with depth, or 33 FPS at 512×512 without depth). Our hybrid approach combines spatially-varying wavefront aberrations using Zernike polynomials with pixel-wise depth modulation of both blur (via Point Spread Function interpolation) and geometric distortion or tilt. Our approach includes a novel fusion technique that integrates the complementary strengths of leading monocular depth estimators to generate metrically accurate depth maps with enhanced edge fidelity. DAATSim is implemented on GPUs using PyTorch, incorporating optimizations such as mixed-precision computation and caching to achieve efficient performance. We present quantitative and qualitative validation demonstrating the simulator's physical plausibility for generating turbulent video. DAATSim is made publicly available and open-source to the community: https://github.com/Riponcs/DAATSim.
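A minimal sketch of the PSF-interpolation idea mentioned above: blur strength varies per pixel with depth, approximated by blending a small bank of pre-blurred layers. This illustrates the general technique under our own assumptions (a Gaussian PSF bank, linear blending between layers), not DAATSim's implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def depth_dependent_blur(img, depth, sigmas=(0.5, 1.5, 3.0)):
    """Blend a small bank of pre-blurred layers per pixel, with the blend
    index driven by normalized depth: a simple form of PSF interpolation."""
    layers = np.stack([gaussian_filter(img, s) for s in sigmas])
    d = (depth - depth.min()) / (np.ptp(depth) + 1e-9)   # normalize depth to [0, 1]
    idx = d * (len(sigmas) - 1)                          # fractional layer index
    lo = np.floor(idx).astype(int)
    hi = np.minimum(lo + 1, len(sigmas) - 1)
    w = idx - lo
    rows = np.arange(img.shape[0])[:, None]
    cols = np.arange(img.shape[1])[None, :]
    return (1 - w) * layers[lo, rows, cols] + w * layers[hi, rows, cols]

# Example: a random image whose lower rows lie "deeper" and so get blurrier.
img = np.random.default_rng(0).random((128, 128))
depth = np.linspace(0.0, 1.0, 128)[:, None] * np.ones((1, 128))
out = depth_dependent_blur(img, depth)
```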
Item
EmoDiffGes: Emotion-Aware Co-Speech Holistic Gesture Generation with Progressive Synergistic Diffusion
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Li, Xinru; Lin, Jingzhong; Zhang, Bohao; Qi, Yuanyuan; Wang, Changbo; He, Gaoqi; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Co-speech gesture generation, driven by emotional expression and synergistic bodily movements, is essential for applications such as virtual avatars and human-robot interaction. Existing co-speech gesture generation methods face two fundamental limitations: (1) they produce inexpressive gestures because they ignore the temporal evolution of emotion; and (2) they generate incoherent, unnatural motions as a result of either oversimplifying the holistic body or modeling its parts independently. To address these limitations, we propose EmoDiffGes, a diffusion-based framework grounded in embodied emotion theory that unifies dynamic emotion conditioning and part-aware synergistic modeling. Specifically, a Dynamic Emotion-Alignment Module (DEAM) first extracts dynamic emotional cues and injects emotion guidance into the generation process. Then, a Progressive Synergistic Gesture Generator (PSGG) iteratively refines region-specific latent codes while maintaining full-body coordination, leveraging a Body Region Prior for part-specific encoding and a Progressive Inter-Region Synergistic Flow for global motion coherence. Extensive experiments validate the effectiveness of our method, showcasing its potential for generating expressive, coordinated, and emotionally grounded human gestures.

Item
FAHNet: Accurate and Robust Normal Estimation for Point Clouds via Frequency-Aware Hierarchical Geometry
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Wang, Chengwei; Wu, Wenming; Fei, Yue; Zhang, Gaofeng; Zheng, Liping; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Point cloud normal estimation underpins many 3D vision and graphics applications. Precise normal estimation in regions of sharp curvature and high-frequency variation remains a major bottleneck; existing learning-based methods still struggle to isolate fine geometric details under noise and uneven sampling. We present FAHNet, a novel frequency-aware hierarchical network that directly tackles these challenges. Our Frequency-Aware Hierarchical Geometry (FAHG) feature extraction module selectively amplifies and merges cross-scale cues, ensuring that both fine-grained local features and sharp structures are faithfully represented. Crucially, a dedicated frequency-aware geometry enhancement (FA) branch heightens sensitivity to abrupt normal transitions and sharp features, preventing the common over-smoothing limitation. Extensive experiments on synthetic benchmarks (PCPNet, FamousShape) and real-world scans (SceneNN) demonstrate that FAHNet outperforms state-of-the-art approaches in normal estimation accuracy. Ablation studies further quantify the contribution of each component, and downstream surface reconstruction results validate the practical impact of our design.

Item
Feature Disentanglement in GANs for Photorealistic Multi-view Hair Transfer
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Xu, Jiayi; Wu, Zhengyang; Zhang, Chenming; Jin, Xiaogang; Ji, Yaohua; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Fast and highly realistic multi-view hair transfer plays a crucial role in evaluating the effectiveness of virtual hair try-on systems. However, GAN-based generation and editing methods face persistent challenges in feature disentanglement. Achieving pixel-level, attribute-specific modifications, such as changing hairstyle or hair color without affecting other facial features, remains a long-standing problem. To address this limitation, we propose a novel multi-view hair transfer framework that leverages a hair-only intermediate facial representation and a 3D-guided masking mechanism. Our approach disentangles triplane facial features into spatial geometric components and global style descriptors, enabling independent and precise control over hairstyle and hair color. By introducing a dedicated intermediate representation focused solely on hair and incorporating a two-stage feature fusion strategy guided by the generated 3D mask, our framework achieves fine-grained local editing across multiple viewpoints while preserving facial integrity and improving background consistency. Extensive experiments demonstrate that our method produces visually compelling and natural results in side-to-front view hair transfer tasks, offering a robust and flexible solution for high-fidelity hair reconstruction and manipulation.

Item
FlatCAD: Fast Curvature Regularization of Neural SDFs for CAD Models
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Yin, Haotian; Plocharski, Aleksander; Wlodarczyk, Michal Jan; Kida, Mikolaj; Musialski, Przemyslaw; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Neural signed-distance fields (SDFs) are a versatile backbone for neural geometry representation, but enforcing CAD-style developability usually requires Gaussian-curvature penalties with full Hessian evaluation and second-order differentiation, which are costly in memory and time. We introduce an off-diagonal Weingarten loss that regularizes only the mixed shape-operator term that represents the gap between principal curvatures and flattens the surface. We present two variants: a finite-difference version using six SDF evaluations plus one gradient, and an auto-diff version using a single Hessian-vector product. Both converge to the exact mixed term and preserve the intended geometric properties without assembling the full Hessian. On the ABC benchmarks, the losses match or exceed Hessian-based baselines while cutting GPU memory and training time by roughly a factor of two. The method is drop-in and framework-agnostic, enabling scalable curvature-aware SDF learning for engineering-grade shape reconstruction. Our code is available at https://flatcad.github.io/.
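FlatCAD's finite-difference variant estimates the mixed (off-diagonal) second-derivative term from a handful of SDF evaluations; the paper's formulation uses six evaluations plus one gradient along surface tangents. As a generic illustration only, here is the standard four-point stencil for the mixed second derivative of a scalar field along two directions:

```python
import numpy as np

def mixed_second_derivative(f, p, u, v, h=1e-3):
    """Central-difference estimate of d^2 f / (du dv) at point p, along
    orthonormal directions u and v (four-point stencil). The paper's loss
    penalizes the analogous off-diagonal shape-operator entry."""
    return (f(p + h*u + h*v) - f(p + h*u - h*v)
            - f(p - h*u + h*v) + f(p - h*u - h*v)) / (4.0 * h * h)

# Example on the SDF of a unit sphere, at a surface point with tangents
# u, v: for a sphere the mixed term vanishes.
sdf = lambda q: np.linalg.norm(q) - 1.0
p = np.array([0.0, 0.0, 1.0])
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
print(mixed_second_derivative(sdf, p, u, v))  # ~0 for a sphere
```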
Item
FlowCapX: Physics-Grounded Flow Capture with Long-Term Consistency
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Tao, Ningxiao; Zhang, Liru; Ni, Xingyu; Chu, Mengyu; Chen, Baoquan; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
We present FlowCapX, a physics-enhanced framework for flow reconstruction from sparse video inputs, addressing the challenge of jointly optimizing complex physical constraints and sparse observational data over long time horizons. Existing methods often struggle to capture turbulent motion while maintaining physical consistency, limiting reconstruction quality and downstream tasks. Focusing on velocity inference, our approach introduces a hybrid framework that strategically separates representation and supervision across spatial scales. At the coarse level, we resolve sparse-view ambiguities via a novel optimization strategy that aligns long-term observations with physics-grounded velocity fields. By emphasizing vorticity-based physical constraints, our method enhances physical fidelity and improves optimization stability. At the fine level, we prioritize observational fidelity to preserve critical turbulent structures. Extensive experiments demonstrate state-of-the-art velocity reconstruction, enabling velocity-aware downstream tasks such as accurate flow analysis, scene augmentation with tracer visualization, and re-simulation. Our implementation is released at https://github.com/taoningxiao/FlowCapX.git.
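The vorticity-based constraints above penalize deviations in the curl of the velocity field. A minimal sketch of computing vorticity on a regular 2D grid with central differences follows; the paper operates on 3D fields inside an optimization loop, so this only shows the quantity itself:

```python
import numpy as np

def vorticity_2d(u, v, dx=1.0, dy=1.0):
    """Vorticity of a 2D velocity field (u, v) on a regular grid:
    omega = dv/dx - du/dy, via central finite differences."""
    dv_dx = np.gradient(v, dx, axis=1)
    du_dy = np.gradient(u, dy, axis=0)
    return dv_dx - du_dy

# Example: rigid rotation (u, v) = (-y, x) has constant vorticity 2.
ys, xs = np.mgrid[-1:1:64j, -1:1:64j]
omega = vorticity_2d(-ys, xs, dx=2/63, dy=2/63)
print(omega.mean())  # ~2.0
```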
Item
G-SplatGAN: Disentangled 3D Gaussian Generation for Complex Shapes via Multi-Scale Patch Discriminators
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Li, Jiaqi; Dang, Haochuan; Zhou, Zhi; Zhu, Junke; Huang, Zhangjin; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Generating 3D objects with complex topologies from monocular images remains a challenge in computer graphics, due to the difficulty of modeling varying 3D shapes with disentangled, steerable geometry and visual attributes. NeRF-based methods suffer from slow volumetric rendering and limited structural controllability; recent advances in 3D Gaussian Splatting provide a more efficient alternative, but generative modeling with separate control over structure and appearance remains underexplored. In this paper, we propose G-SplatGAN, a novel 3D-aware generation framework that combines the rendering efficiency of 3D Gaussian Splatting with disentangled latent modeling. Starting from a shared Gaussian template, our method uses dual modulation branches to shape geometry and appearance from independent latent codes, enabling precise shape manipulation and controllable generation. We adopt a progressive adversarial training scheme with multi-scale and patch-based discriminators to capture both global structure and local detail. Our model requires no 3D supervision and is trained on monocular images with known camera poses, reducing data reliance while supporting real-image inversion through a geometry-aware encoder. Experiments show that G-SplatGAN achieves superior performance in rendering speed, controllability, and image fidelity, offering a compelling solution for controllable 3D generation using Gaussian representations.

Item
Gaussian Splatting for Large-Scale Aerial Scene Reconstruction From Ultra-High-Resolution Images
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Sun, Qiulin; Lai, Wei; Li, Yixian; Zhang, Yanci; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Using 3D Gaussian splatting to reconstruct large-scale aerial scenes from ultra-high-resolution images is still a challenging problem because of two memory bottlenecks: excessive Gaussian primitives and the tensor sizes required for ultra-high-resolution images. In this paper, we propose a task partitioning algorithm that operates in both object and image space to generate a set of small-scale subtasks. Each subtask's memory footprint is strictly limited, enabling training on a single high-end consumer-grade GPU. More specifically, Gaussian primitives are clustered into blocks in object space, and the input images are partitioned into sub-images according to the projected footprints of these blocks. This dual-space partitioning significantly reduces training memory requirements. During subtask training, we propose a depth comparison method to generate a mask map for each sub-image. This mask map isolates the pixels primarily contributed by the Gaussian primitives of the current subtask, excluding all other pixels from training. Experimental results demonstrate that our method successfully achieves large-scale aerial scene reconstruction using 9K-resolution images on a single RTX 4090 GPU. The novel views synthesized by our method retain significantly more detail than those from current state-of-the-art methods.

Item
Gaussians on their Way: Wasserstein-Constrained 4D Gaussian Splatting with State-Space Modeling
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Deng, Junli; Shi, Ping; Luo, Yihao; Li, Qipei; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Dynamic scene rendering has taken a leap forward with the rise of 4D Gaussian Splatting, but one elusive challenge remains: how to make 3D Gaussians move through time as naturally as they would in the real world, all while keeping the motion smooth and consistent. In this paper, we present an approach that blends state-space modeling with Wasserstein geometry, enabling a more fluid and coherent representation of dynamic scenes. We introduce a State Consistency Filter that merges prior predictions with current observations, enabling Gaussians to maintain coherent trajectories over time. We also employ a Wasserstein Consistency Constraint to ensure smooth, consistent updates of Gaussian parameters, reducing motion artifacts. Lastly, we leverage Wasserstein geometry to capture both translational motion and shape deformations, creating a more geometrically consistent model for dynamic scenes. Our approach models the evolution of Gaussians along geodesics on the manifold of Gaussian distributions, achieving smoother, more realistic motion and stronger temporal coherence. Experimental results show consistent improvements in rendering quality and efficiency.
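The Wasserstein Consistency Constraint above operates in the geometry of Gaussian distributions, where the 2-Wasserstein distance has a well-known closed form. A sketch of that standard formula follows; how the paper applies it as a constraint is not reproduced here:

```python
import numpy as np
from scipy.linalg import sqrtm

def wasserstein2_gaussian(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between N(m1, S1) and
    N(m2, S2): ||m1 - m2||^2 + Tr(S1 + S2 - 2 (S2^1/2 S1 S2^1/2)^1/2)."""
    rS2 = sqrtm(S2)
    cross = sqrtm(rS2 @ S1 @ rS2)
    return np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * np.real(cross))

# Example: two anisotropic Gaussians of identical shape, shifted means.
m1, S1 = np.zeros(3), np.diag([1.0, 0.25, 0.04])
m2, S2 = np.ones(3), np.diag([1.0, 0.25, 0.04])
print(wasserstein2_gaussian(m1, S1, m2, S2))  # 3.0: only the mean term remains
```

Because the formula depends on both means and covariances, a constraint built on it penalizes translation and shape deformation jointly, which matches the motivation stated in the abstract.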
Item
Geometric Integration for Neural Control Variates
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Meister, Daniel; Harada, Takahiro; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Control variates are a variance-reduction technique for Monte Carlo integration. The principle is to approximate the integrand by a function that can be integrated analytically, and to apply Monte Carlo integration only to the residual difference between the integrand and the approximation, yielding an unbiased estimate. Neural networks are universal approximators that could potentially be used as control variates. However, the challenge lies in the analytic integration, which is not possible in general. In this manuscript, we study one of the simplest neural network models, the multilayer perceptron (MLP) with continuous piecewise-linear activation functions, and its analytic integration. We propose an integration method based on subdividing the integration domain, employing techniques from computational geometry to solve this problem in 2D. We demonstrate that an MLP can be used as a control variate in combination with our integration method, showing applications in light transport simulation.
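A minimal numerical illustration of the control-variate principle described above, with a hand-picked analytically integrable approximation g standing in for the paper's analytically integrated MLP:

```python
import numpy as np

rng = np.random.default_rng(0)

# Integrand f on [0, 1] and an analytically integrable approximation g.
f = lambda x: np.exp(x)     # true integral: e - 1 ~ 1.71828
g = lambda x: 1.0 + x       # control variate; its integral over [0, 1] is 1.5
G = 1.5                     # known analytic integral of g

x = rng.random(10_000)
plain = f(x).mean()                  # standard Monte Carlo estimate
cv = G + (f(x) - g(x)).mean()        # control-variate estimate, still unbiased
print(plain, cv)                     # cv fluctuates far less across seeds
```

The residual f - g is much flatter than f, so its Monte Carlo estimate has lower variance; the better the approximation, the larger the gain, which is what motivates using an expressive MLP as g.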
Item
GNF: Gaussian Neural Fields for Multidimensional Signal Representation and Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Bouzidi, Abelaziz; Laga, Hamid; Wannous, Hazem; Sohel, Ferdous; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Neural fields have emerged as a powerful framework for representing continuous multidimensional signals such as images and videos, 3D and 4D objects and scenes, and radiance fields. However, achieving high-quality representations requires wide and deep neural networks, which are slow to train and evaluate. Although several acceleration techniques have been proposed, they either trade memory for faster training and/or inference, rely on thousands of fitted primitives with considerable optimization time, or compromise the smooth, continuous nature of neural fields. In this paper, we introduce Gaussian Neural Fields (GNF), a novel compact neural decoder that maps learned feature grids into continuous non-linear signals, such as RGB images, Signed Distance Functions (SDFs), and radiance fields, using a single compact layer of Gaussian kernels defined in a high-dimensional feature space. Our key observation is that neurons in traditional MLPs perform simple computations, usually a dot product followed by an activation function, necessitating wide and deep MLPs or high-resolution feature grids to model complex functions. We show that replacing MLP-based decoders with Gaussian kernels whose centers are learned features yields highly accurate representations of 2D (RGB), 3D (geometry), and 5D (radiance field) signals with just a single layer of such kernels. This representation is highly parallelizable, operates on low-resolution grids, and trains in under 15 seconds for 3D geometry and under 11 minutes for view synthesis. GNF matches the accuracy of deep MLP-based decoders with far fewer parameters and significantly higher inference throughput. The source code is publicly available at https://grbfnet.github.io/.
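A toy sketch of the single-layer Gaussian-kernel decoder idea: features sampled from a grid are decoded by a bank of Gaussian kernels in feature space instead of a deep MLP. Class and parameter names are our own, and initialization and training are omitted; this is not the released GNF code:

```python
import numpy as np

class GaussianDecoder:
    """Toy single-layer Gaussian-kernel decoder in the spirit of GNF:
    a feature vector is decoded by K Gaussian kernels whose centers
    live in the same feature space. Shapes and details are assumptions,
    not the paper's exact formulation."""
    def __init__(self, feat_dim=16, n_kernels=32, out_dim=3, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.normal(size=(n_kernels, feat_dim))  # learned in practice
        self.log_sigma = np.zeros(n_kernels)                   # per-kernel bandwidth
        self.weights = rng.normal(size=(n_kernels, out_dim))   # kernel-to-output map

    def __call__(self, feats):
        # feats: (N, feat_dim) features sampled from a low-resolution grid
        d2 = ((feats[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        k = np.exp(-0.5 * d2 / np.exp(2 * self.log_sigma))     # (N, K) responses
        return k @ self.weights                                # (N, out_dim), e.g. RGB

dec = GaussianDecoder()
print(dec(np.random.default_rng(1).normal(size=(4, 16))).shape)  # (4, 3)
```

The whole decoder is one distance computation, one exponential, and one matrix product, which is why it parallelizes well compared with a deep MLP.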
Item
GS-Share: Enabling High-fidelity Map Sharing with Incremental Gaussian Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Zhang, Xinran; Zhu, Hanqi; Duan, Yifan; Zhang, Yanyong; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Constructing and sharing 3D maps is essential for many applications, including autonomous driving and augmented reality. Recently, 3D Gaussian splatting has emerged as a promising approach for accurate 3D reconstruction. However, a practical map-sharing system that features high fidelity, continuous updates, and network efficiency remains elusive. To address these challenges, we introduce GS-Share, a photorealistic map-sharing system with a compact representation. The core of GS-Share includes anchor-based global map construction, virtual-image-based map enhancement, and incremental map update. We evaluate GS-Share against state-of-the-art methods, demonstrating that our system achieves higher fidelity, particularly for extrapolated views, with improvements of 11%, 22%, and 74% in PSNR, LPIPS, and Depth L1, respectively. Furthermore, GS-Share is significantly more compact, reducing map transmission overhead by 36%.

Item
High-Performance Elliptical Cone Tracing
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Emre, Umut; Kanak, Aryan; Steinberg, Shlomi; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
In this work, we discuss elliptical cone traversal in scenes that employ typical triangular meshes. We derive accurate and numerically stable intersection tests of an elliptical conic frustum against an AABB, a plane, an edge, and a triangle, and analyze the performance of elliptical cone tracing with different acceleration data structures (SAH-based k-d trees, BVHs, and a modern 8-wide BVH variant adapted for cone tracing), comparing against ray tracing. In addition, we analyze several cone traversal algorithms and develop novel heuristics and optimizations that outperform previous traversal approaches. The results highlight the difference in performance characteristics between rays and cones, and serve to guide the design of acceleration data structures for applications that employ cone tracing.

Item
Hybrid Sparse Transformer and Feature Alignment for Efficient Image Completion
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Chen, L.; Sun, Hao; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
In this paper, we propose an efficient single-stage hybrid architecture for image completion. Existing transformer-based image completion methods often struggle with accurate content restoration, largely due to their ineffective modeling of corrupted channel information and the attention noise introduced by softmax-based mechanisms, which results in blurry textures and distorted structures. Additionally, these methods frequently fail to maintain texture consistency, either relying on imprecise mask sampling or incurring substantial computational costs from complex similarity calculations. To address these limitations, we present two key contributions: a Hybrid Sparse Self-Attention (HSA) module and a Feature Alignment Module (FAM). The HSA module enhances structural recovery by decoupling spatial and channel attention with sparse activation, while the FAM enforces texture consistency by aligning encoder and decoder features via a mask-free, energy-gated mechanism without additional inference cost. Our method achieves state-of-the-art image completion results with the fastest inference speed among single-stage networks, as measured by PSNR, SSIM, FID, and LPIPS on the CelebA-HQ, Places2, and Paris datasets.

Item
Introducing Unbiased Depth into 2D Gaussian Splatting for High-accuracy Surface Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Yang, Yixin; Zhou, Yang; Huang, Hui; Christie, Marc; Pietroni, Nico; Wang, Yu-Shuen
Recently, 2D Gaussian Splatting (2DGS) has demonstrated superior geometry reconstruction quality to the popular 3DGS by using 2D surfels to approximate thin surfaces. However, it falls short when dealing with glossy surfaces, resulting in visible holes in these areas. We find that reflection discontinuity causes the issue: to fit the jump from diffuse to specular reflection at different viewing angles, a depth bias is introduced into the optimized Gaussian primitives. To address this, we first replace the depth distortion loss in 2DGS with a novel depth convergence loss, which imposes a strong constraint on depth continuity. We then rectify the depth criterion used to determine the actual surface, fully accounting for all the Gaussians intersecting the ray. Qualitative and quantitative evaluations across various datasets reveal that our method significantly improves reconstruction quality, with more complete and accurate surfaces than 2DGS. Code is available at https://github.com/XiaoXinyyx/Unbiased_Surfel.
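For context on the depth criterion discussed in the last item: in splatting-based renderers, a per-ray depth is commonly obtained by front-to-back alpha compositing over all intersected primitives. A minimal sketch of that standard composited depth follows; the paper's rectified criterion differs, so treat this only as the baseline notion:

```python
import numpy as np

def composited_depth(depths, alphas):
    """Front-to-back alpha compositing of per-primitive depths along one ray:
    each intersected Gaussian contributes its depth weighted by
    alpha_i * prod_{j<i} (1 - alpha_j). A standard composited-depth
    baseline, not the paper's rectified criterion."""
    order = np.argsort(depths)                    # sort intersections front to back
    d, a = depths[order], alphas[order]
    trans = np.concatenate(([1.0], np.cumprod(1.0 - a)[:-1]))  # transmittance reaching each
    w = a * trans
    return (w * d).sum() / (w.sum() + 1e-9)

# Example: three intersections along a ray, given in arbitrary order.
print(composited_depth(np.array([2.0, 1.0, 3.0]), np.array([0.5, 0.4, 0.9])))
```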