43-Issue 7

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/3607047


Seamless and Aligned Texture Optimization for 3D Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Wang, Lei; Ge, Linlin; Zhang, Qitong; Feng, Jieqing; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Restoring the appearance of the model is a crucial step for achieving realistic 3D reconstruction. High-fidelity textures can also conceal some geometric defects. Since the estimated camera parameters and reconstructed geometry usually contain errors, subsequent texture mapping often suffers from undesirable visual artifacts such as blurring, ghosting, and visual seams. In particular, significant misalignment between the reconstructed model and the registered images will lead to texturing the mesh with inconsistent image regions. However, eliminating various artifacts to generate high-quality textures remains a challenge. In this paper, we address this issue by designing a texture optimization method to generate seamless and aligned textures for 3D reconstruction. The main idea is to detect misalignment regions between images and geometry and exclude them from texture mapping. To handle the texture holes caused by these excluded regions, a cross-patch texture hole-filling method is proposed, which can also synthesize plausible textures for invisible faces. Moreover, for better stitching of the textures from different views, an improved camera pose optimization is presented by introducing color adjustment and boundary point sampling. Experimental results show that the proposed method can robustly eliminate the artifacts caused by inaccurate input data and produce high-quality texture results compared with state-of-the-art methods.
G-Style: Stylized Gaussian Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Kovács, Áron Samuel; Hermosilla, Pedro; Raidou, Renata Georgia; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We introduce G-Style, a novel algorithm designed to transfer the style of an image onto a 3D scene represented using Gaussian Splatting. Gaussian Splatting is a powerful 3D representation for novel view synthesis, as, compared to other approaches based on Neural Radiance Fields, it provides fast scene renderings and user control over the scene. Recent pre-prints have demonstrated that the style of Gaussian Splatting scenes can be modified using an image exemplar. However, since the scene geometry remains fixed during the stylization process, current solutions fall short of producing satisfactory results. Our algorithm aims to address these limitations by following a three-step process: In a pre-processing step, we remove undesirable Gaussians with large projection areas or highly elongated shapes. Subsequently, we combine several losses carefully designed to preserve different scales of the style in the image, while maintaining as much as possible the integrity of the original scene content. During the stylization process and following the original design of Gaussian Splatting, we split Gaussians where additional detail is necessary within our scene by tracking the gradient of the stylized color. Our experiments demonstrate that G-Style generates high-quality stylizations within just a few minutes, outperforming existing methods both qualitatively and quantitatively.
A Hybrid Parametrization Method for B-Spline Curve Interpolation via Supervised Learning
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Song, Tianyu; Shen, Tong; Ge, Linlin; Feng, Jieqing; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
B-spline curve interpolation is a fundamental algorithm in computer-aided geometric design. Determining suitable parameters based on the distribution of the data points has always been an important issue for generating high-quality interpolation curves. Various parameterization methods have been proposed. However, there is no universally satisfactory method that is applicable to data points with diverse distributions. In this work, a hybrid parameterization method is proposed to overcome the problem. For a given set of data points, a classifier trained via supervised learning identifies an optimal local parameterization method based on the local geometric distribution of four adjacent data points, and the optimal local parameters are computed using the selected local parameterization method for the four adjacent data points. Then a merging method is employed to calculate global parameters which align closely with the local parameters. Experiments demonstrate statistically that the proposed hybrid parameterization method adapts well to different distributions of data points. The proposed method has a flexible and scalable framework, which can include current and potential new parameterization methods as its components.
FastFlow: GPU Acceleration of Flow and Depression Routing for Landscape Simulation
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Jain, Aryamaan; Kerbl, Bernhard; Gain, James; Finley, Brandon; Cordonnier, Guillaume; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Terrain analysis plays an important role in computer graphics, hydrology and geomorphology. In particular, analyzing the path of material flow over a terrain with consideration of local depressions is a precursor to many further tasks in erosion, river formation, and plant ecosystem simulation. For example, fluvial erosion simulation used in terrain modeling computes water discharge to repeatedly locate erosion channels for soil removal and transport. Despite its significance, traditional methods face performance constraints, limiting their broader applicability. In this paper, we propose a novel GPU flow routing algorithm that computes the water discharge in O(log n) iterations for a terrain with n vertices (assuming n processors). We also provide a depression routing algorithm to route the water out of local minima formed by depressions in the terrain, which converges in O(log² n) iterations. Our implementation of these algorithms leads to a 5× speedup for flow routing and 34× to 52× speedup for depression routing compared to previous work on a 1024² terrain, enabling interactive control of terrain simulation.
Spatially and Temporally Optimized Audio-Driven Talking Face Generation
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Dong, Biao; Ma, Bo-Yao; Zhang, Lei; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Audio-driven talking face generation is essentially a cross-modal mapping from audio to video frames. The main challenge lies in the intricate one-to-many mapping, which affects lip sync accuracy. In addition, the loss of facial details during image reconstruction often results in visual artifacts in the generated video. To overcome these challenges, this paper proposes to enhance the quality of generated talking faces with a new spatio-temporal consistency. Specifically, the temporal consistency is achieved through consecutive frames of each phoneme, which form temporal modules that exhibit similar lip appearance changes. This allows for adaptive adjustment in the lip movement for accurate sync. The spatial consistency pertains to the uniform distribution of textures within local regions, which form spatial modules and regulate the texture distribution in the generator. This yields fine details in the reconstructed facial images. Extensive experiments show that our method can generate more natural talking faces than previous state-of-the-art methods in both accurate lip sync and realistic facial details.
GLTScene: Global-to-Local Transformers for Indoor Scene Synthesis with General Room Boundaries
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Li, Yijie; Xu, Pengfei; Ren, Junquan; Shao, Zefan; Huang, Hui; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We present GLTScene, a novel data-driven method for high-quality furniture layout synthesis with general room boundaries as conditions. This task is challenging since the existing indoor scene datasets do not cover the variety of general room boundaries. We incorporate the interior design principles with learning techniques and adopt a global-to-local strategy for this task. Globally, we learn the placement of furniture objects from the datasets without considering their alignment. Locally, we learn the alignment of furniture objects relative to their nearest walls, according to the alignment principle in interior design. The global placement and local alignment of furniture objects are achieved by two transformers respectively. We compare our method with several baselines in the task of furniture layout synthesis with general room boundaries as conditions. Our method outperforms these baselines both quantitatively and qualitatively. We also demonstrate that our method can achieve other conditional layout synthesis tasks, including object-level conditional generation and attribute-level conditional generation. The code is publicly available at https://github.com/WWalter-Lee/GLTScene.
Robust Diffusion-based Motion In-betweening
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Qin, Jia; Yan, Peng; An, Bo; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
The emergence of learning-based motion in-betweening techniques offers animators a more efficient way to animate characters. However, existing non-generative methods either struggle to support long transition generation or produce results that lack diversity. Meanwhile, diffusion models have shown promising results in synthesizing diverse and high-quality motions driven by text and keyframes. However, in these methods, keyframes often serve as a guide rather than a strict constraint and can sometimes be ignored when keyframes are sparse. To address these issues, we propose a lightweight yet effective diffusion-based motion in-betweening framework that generates animations conforming to keyframe constraints. We incorporate keyframe constraints into the training phase to enhance robustness in handling various constraint densities. Moreover, we employ relative positional encoding to improve the model's generalization on long-range in-betweening tasks. This approach enables the model to learn from short animations while generating realistic in-betweening motions spanning thousands of frames. We conduct extensive experiments to validate our framework using the newly proposed metrics K-FID, K-Diversity, and K-Error, designed to evaluate generative in-betweening methods. Results demonstrate that our method outperforms existing diffusion-based methods across various lengths and keyframe densities. We also show that our method can be applied to text-driven motion synthesis, offering fine-grained control over the generated results.
P-Hologen: An End-to-End Generative Framework for Phase-Only Holograms
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Park, JooHyun; Jeon, YuJin; Kim, HuiYong; Baek, SeungHwan; Kang, HyeongYeop; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Holography stands at the forefront of visual technology, offering immersive, three-dimensional visualizations through the manipulation of light wave amplitude and phase. Although generative models have been extensively explored in the image domain, their application to holograms remains relatively underexplored due to the inherent complexity of phase learning. Exploiting generative models for holograms offers exciting opportunities for advancing innovation and creativity, such as semantic-aware hologram generation and editing. Currently, the most viable approach for utilizing generative models in the hologram domain involves integrating an image-based generative model with an image-to-hologram conversion model, which comes at the cost of increased computational complexity and inefficiency. To tackle this problem, we introduce P-Hologen, the first end-to-end generative framework designed for phase-only holograms (POHs). P-Hologen employs vector quantized variational autoencoders to capture the complex distributions of POHs. It also integrates the angular spectrum method into the training process, constructing latent spaces for complex phase data using strategies from the image processing domain. Extensive experiments demonstrate that P-Hologen achieves superior quality and computational efficiency compared to the existing methods. Furthermore, our model generates high-quality, unseen, diverse holographic content from its learned latent space without requiring pre-existing images. Our work paves the way for new applications and methodologies in holographic content creation, opening a new era in the exploration of generative holographic content. The code for our paper is publicly available at https://github.com/james0223/P-Hologen.
A TransISP Based Image Enhancement Method for Visual Disbalance in Low-light Images
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Wu, Jiaqi; Guo, Jing; Jing, Rui; Zhang, Shihao; Tian, Zijian; Chen, Wei; Wang, Zehua; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Existing image enhancement algorithms often fail to effectively address issues of visual disbalance, such as brightness unevenness and color distortion, in low-light images. To overcome these challenges, we propose a TransISP-based image enhancement method specifically designed for low-light images. To mitigate color distortion, we design dual encoders based on decoupled representation learning, which enable complete decoupling of the reflection and illumination components, thereby preventing mutual interference during the image enhancement process. To address brightness unevenness, we introduce CNNformer, a hybrid model combining CNN and Transformer. This model efficiently captures local details and long-distance dependencies between pixels, contributing to the enhancement of brightness features across various local regions. Additionally, we integrate traditional image signal processing algorithms to achieve efficient color correction and denoising of the reflection component. Furthermore, we employ a generative adversarial network (GAN) as the overarching framework to facilitate unsupervised learning. The experimental results show that, compared with six SOTA image enhancement algorithms, our method obtains significant improvement in evaluation indexes (e.g., on LOL, PSNR: 15.59%, SSIM: 9.77%, VIF: 9.65%), and it can improve visual disbalance defects in low-light images captured from real-world coal mine underground scenarios.
PCLC-Net: Point Cloud Completion in Arbitrary Poses using Learnable Canonical Space
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Xu, Hanmo; Shuai, Qingyao; Chen, Xuejin; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Recovering the complete structure from partial point clouds in arbitrary poses is challenging. Recently, many efforts have been made to address this problem by developing SO(3)-equivariant completion networks or aligning the partial point clouds with a predefined canonical space before completion. However, these approaches are limited to random rotations only or demand costly pose annotation for model training. In this paper, we present a novel Network for Point cloud Completion with Learnable Canonical space (PCLC-Net) to reduce the need for pose annotations and extract SE(3)-invariant geometry features to improve the completion quality in arbitrary poses. Without pose annotations, our PCLC-Net utilizes self-supervised pose estimation to align the input partial point clouds to a canonical space that is learnable for an object category and subsequently performs shape completion in the learned canonical space. Our PCLC-Net can complete partial point clouds with arbitrary SE(3) poses without requiring pose annotations for supervision. Our PCLC-Net achieves state-of-the-art results on shape completion with arbitrary SE(3) poses on both synthetic and real scanned data. To the best of our knowledge, our method is the first to achieve shape completion in arbitrary poses without pose annotations during network training.
Controllable Anime Image Editing Based on the Probability of Attribute Tags
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Song, Zhenghao; Mo, Haoran; Gao, Chengying; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Editing anime images via probabilities of attribute tags allows controlling the degree of the manipulation in an intuitive and convenient manner. Existing methods fall short in the progressive modification and preservation of unintended regions in the input image. We propose a controllable anime image editing framework based on adjusting the tag probabilities, in which a probability encoding network (PEN) is developed to encode the probabilities into features that capture the continuous characteristics of the probabilities. Thus, the encoded features are able to direct the generative process of a pre-trained diffusion model and facilitate the linear manipulation. We also introduce a local editing module that automatically identifies the intended regions and constrains the edits to be applied to those regions only, which leaves the others unchanged. Comprehensive comparisons with existing methods indicate the effectiveness of our framework in both one-shot and linear editing modes. Results in additional applications further demonstrate the generalization ability of our approach.
Multiscale Spectral Manifold Wavelet Regularizer for Unsupervised Deep Functional Maps
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Wang, Haibo; Meng, Jing; Li, Qinsong; Hu, Ling; Guo, Yueyu; Liu, Xinru; Yang, Xiaoxia; Liu, Shengjun; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
In deep functional maps, the regularizer computing the functional map is especially crucial for ensuring the global consistency of the computed pointwise map. As the regularizers integrated into deep learning should be differentiable, it is not trivial to incorporate informative axiomatic structural constraints into the deep functional map, such as the orientation-preserving term. Although commonly used regularizers include the Laplacian-commutativity term and the resolvent Laplacian commutativity term, these are limited to single-scale analysis for capturing geometric information. To this end, we propose a novel and theoretically well-justified regularizer commuting the functional map with the multiscale spectral manifold wavelet operator. This regularizer enhances the isometric constraints of the functional map and is conducive to providing it with better structural properties with multiscale analysis. Furthermore, we design an unsupervised deep functional map with the regularizer in a fully differentiable way. The quantitative and qualitative comparisons with several existing techniques on the (near-)isometric and non-isometric datasets show our method's superior accuracy and generalization capabilities. Additionally, we illustrate that our regularizer can be easily inserted into other functional map methods and improve their accuracy.
Ray Tracing Animated Displaced Micro-Meshes
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Gruen, Holger; Benthin, Carsten; Kensler, Andrew; Barczak, Joshua; McAllister, David; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We present a new method that allows efficient ray tracing of virtually artefact-free animated displaced micro-meshes (DMMs) [MMT23] and preserves their low memory footprint and low BVH build and update cost. DMMs allow for compact representation of micro-triangle geometry through hierarchical encoding of displacements. Displacements are computed with respect to a coarse base mesh and are used to displace new vertices introduced during 1:4 subdivision of the base mesh. Applying non-rigid transformation to the base mesh can result in silhouette and normal artefacts (see Figure 1) during animation. We propose an approach which prevents these artefacts by interpolating transformation matrices before applying them to the DMM representation. Our interpolation-based algorithm does not change DMM data structures and it allows for efficient bounding of animated micro-triangle geometry, which is essential for fast tessellation-free ray tracing of animated DMMs.
Pacific Graphics 2024 - CGF 43-7: Frontmatter
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Exploring Fast and Flexible Zero-Shot Low-Light Image/Video Enhancement
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Han, Xianjun; Bao, Taoli; Yang, Hongyu; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Low-light image/video enhancement is a challenging task when images or video are captured under harsh lighting conditions. Existing methods mostly formulate this task as an image-to-image conversion task via supervised or unsupervised learning. However, such conversion methods require an extremely large amount of data for training, whether paired or unpaired. In addition, these methods are restricted to specific training data, making it difficult for the trained model to enhance other types of images or video. In this paper, we explore a novel, fast and flexible, zero-shot, low-light image or video enhancement framework. Without relying on prior training or relationships among neighboring frames, we are committed to estimating the illumination of the input image/frame by a well-designed network. The proposed zero-shot, low-light image/video enhancement architecture includes illumination estimation and residual correction modules. The network architecture is very concise and does not require any paired or unpaired data during training, which allows low-light enhancement to be performed with several simple iterations. Despite its simplicity, we show that the method is fast and generalizes well to diverse lighting conditions. Many experiments on various images and videos qualitatively and quantitatively demonstrate the advantages of our method over state-of-the-art methods.
Symmetric Piecewise Developable Approximations
(The Eurographics Association and John Wiley & Sons Ltd., 2024) He, Ying; Fang, Qing; Zhang, Zheng; Dai, Tielin; Wu, Kang; Liu, Ligang; Fu, Xiao-Ming; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We propose a novel method for generating symmetric piecewise developable approximations for shapes with approximate global reflectional or rotational symmetry. Given a shape and its symmetry constraint, the algorithm contains two crucial steps: (i) a symmetric deformation to achieve a nearly developable model and (ii) a symmetric segmentation aided by the deformed shape. The key to the deformation step is the use of the symmetric implicit neural representations of the shape and the deformation field. A new mesh extraction from the implicit function is introduced to construct a strictly symmetric mesh for the subsequent segmentation. The symmetry constraint is carefully integrated into the partition to achieve the symmetric piecewise developable approximation. We demonstrate the effectiveness of our algorithm on various meshes.
Anisotropic Specular Image-Based Lighting Based on BRDF Major Axis Sampling
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Cocco, Giovanni; Zanni, Cédric; Chermain, Xavier; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Anisotropic specular appearances are ubiquitous in the environment: brushed stainless steel pans, kettles, elevator walls, fur, or scratched plastics. Real-time rendering of these materials with image-based lighting is challenging due to the complex shape of the bidirectional reflectance distribution function (BRDF). We propose an anisotropic specular image-based lighting method that can serve as a drop-in replacement for the standard bent normal technique [Rev11]. Our method yields more realistic results at the cost of a 50% increase in computation time over the previous technique, using the same high dynamic range (HDR) preintegrated environment image. We use several environment samples positioned along the major axis of the specular microfacet BRDF. We derive an analytic formula to determine the two closest and two farthest points from the reflected direction on an approximation of the BRDF confidence region boundary. The two farthest points define the BRDF major axis, while the two closest points are used to approximate the BRDF width. The environment level of detail is derived from the BRDF width and the distance between the samples. We extensively compare our method with the bent normal technique and the ground truth using the GGX specular BRDF.
Adversarial Unsupervised Domain Adaptation for 3D Semantic Segmentation with 2D Image Fusion of Dense Depth
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Zhang, Xindan; Li, Ying; Sheng, Huankun; Zhang, Xinnian; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Unsupervised domain adaptation (UDA) is increasingly used for 3D point cloud semantic segmentation tasks due to its ability to address the issue of missing labels for new domains. However, most existing unsupervised domain adaptation methods focus only on uni-modal data and are rarely applied to multi-modal data. Therefore, we propose a cross-modal UDA on multi-modal datasets that contain 3D point clouds and 2D images for 3D semantic segmentation. Specifically, we first propose a Dual discriminator-based Domain Adaptation (Dd-bDA) module to enhance the adaptability of different domains. Second, given that the robustness of depth information to domain shifts can provide more details for semantic segmentation, we further employ a Dense depth Feature Fusion (DdFF) module to extract image features with rich depth cues. We evaluate our model in four unsupervised domain adaptation scenarios, i.e., dataset-to-dataset (A2D2→SemanticKITTI), Day-to-Night, country-to-country (USA→Singapore), and synthetic-to-real (VirtualKITTI→SemanticKITTI). In all settings, the experimental results achieve significant improvements and surpass state-of-the-art models.
SCARF: Scalable Continual Learning Framework for Memory-efficiency Multiple Neural Radiance Fields
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Wang, Yuze; Wang, Junyi; Wang, Chen; Duan, Wantong; Bao, Yongtang; Qi, Yue; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
This paper introduces a novel continual learning framework for synthesising novel views of multiple scenes, learning multiple 3D scenes incrementally, and updating the network parameters only with the training data of the upcoming new scene. We build on Neural Radiance Fields (NeRF), which uses a multi-layer perceptron to model the density and radiance field of a scene as an implicit function. While NeRF and its extensions have shown a powerful capability of rendering photo-realistic novel views in a single 3D scene, managing these growing 3D NeRF assets efficiently is a new scientific problem. Very few works focus on the efficient representation or continual learning capability of multiple scenes, which is crucial for the practical applications of NeRF. To achieve these goals, our key idea is to represent multiple scenes as the linear combination of a cross-scene weight matrix and a set of scene-specific weight matrices generated from a global parameter generator. Furthermore, we propose an uncertain surface knowledge distillation strategy to transfer the radiance field knowledge of previous scenes to the new model. Representing multiple 3D scenes with such weight matrices significantly reduces memory requirements. At the same time, the uncertain surface distillation strategy largely overcomes the catastrophic forgetting problem and maintains the photo-realistic rendering quality of previous scenes. Experiments show that the proposed approach achieves state-of-the-art rendering quality of continual learning NeRF on NeRF-Synthetic, LLFF, and TanksAndTemples datasets while maintaining an extremely low storage cost.
Distinguishing Structures from Textures by Patch-based Contrasts around Pixels for High-quality and Efficient Texture filtering
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Wang, Shengchun; Xu, Panpan; Hou, Fei; Wang, Wencheng; Zhao, Chong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Existing methods still struggle to distinguish structures from texture details, which hampers texture filtering. Considering that the textures on the two sides of a structural edge always differ substantially in appearance, we determine whether a pixel is on a structural edge by exploiting the appearance contrast between patches around the pixel, and further propose an efficient implementation method. We demonstrate that our proposed method is more effective than existing methods at distinguishing structures from texture details, and that the patches we require for texture measurement can be at least half the size of the patches used in existing methods. Thus, we can improve texture filtering in both quality and efficiency, as shown by the experimental results, e.g., we can handle textured images with a resolution of 800 × 600 pixels in real time. (The code is available at https://github.com/hefengxiyulu/MLPC)
