Eurographics Digital Library

This is the DSpace 7 platform of the Eurographics Digital Library.
  • The contents of the Eurographics Digital Library Archive are freely accessible. Only access to the full-text documents of the journal Computer Graphics Forum (joint property of Wiley and Eurographics) is restricted to Eurographics members, members of institutions that hold an Institutional Membership at Eurographics, and users of the TIB Hannover. On the item pages you will find purchase links to the TIB Hannover.
  • As a Eurographics member, you can log in with your email address and password from https://services.eg.org. If you belong to an institutional member and are using a computer within an IP range registered with Eurographics, you can access the documents immediately.
  • From 2022 onward, all new publications by Eurographics are licensed under Creative Commons. Publishing with Eurographics is Plan-S compliant. Please visit the Eurographics Licensing and Open Access Policy page for more details.
 

Recent Submissions

Item
Situated Visualization in Motion
(Université Paris-Saclay, 2023-12-18) Yao, Lijie
In my thesis, I define visualization in motion and contribute methods for visualizing and designing situated visualizations in motion. In situated data visualization, the data is visualized directly near its data referent, i.e., the physical space, object, or person it refers to. Situated visualizations are often useful in contexts where the data referent or the viewer does not remain stationary but is in relative motion. For example, a runner may look at visualizations from their fitness band while running, or on a public display while passing it by. Reading visualizations in such scenarios can be impacted by motion factors, so understanding how to best design visualizations for dynamic contexts is important: effective and visually stable situated data encodings need to be defined and studied when motion factors are involved. I therefore first define visualization in motion as visual data representations used in contexts that exhibit relative motion between a viewer and an entire visualization. I classify visualization in motion into three categories: (a) moving viewer & stationary visualization, (b) moving visualization & stationary viewer, and (c) moving viewer & moving visualization. To analyze the opportunities and challenges of designing visualization in motion, I propose a research agenda. To explore to what extent viewers can accurately read visualization in motion, I conduct a series of empirical perception studies on magnitude proportion estimation. My results show that people can extract reliable information from visualization in motion, even at high speeds and under irregular trajectories. Based on these perception results, I move toward answering the question of how to design and embed visualization in motion in real contexts. I choose swimming as an application scenario because it offers rich, dynamic data. I implement a technology probe that allows users to embed visualizations in motion in a live swimming video. Users can adjust, in real time, the visual encoding parameters, the movement status, and the situatedness of the visualization. The visualizations encode real swimming race-related data. My evaluation with designers confirms that designing visualizations in motion requires more than what traditional visualization toolkits provide: the visualization needs to be placed in context (e.g., with its data referent and background) and also needs to be previewed under its real movement, since the full context with motion effects can affect design decisions. I then continue my work to understand the impact of the context on the design of visualizations in motion and on its user experience. I select video games as my test platform, in which visualizations in motion are placed against a busy, dynamic background but need to help players make quick decisions to win. My study shows that there are trade-offs between a visualization's readability under motion and its aesthetics. Participants seek a balance between the readability of the visualization, its aesthetic fit to the context, the sense of immersion it brings, the support it provides toward winning, and the harmony between the visualization and its context.
Item
Learning Digital Humans from Vision and Language
(ETH Zurich, 2024-10-10) Yao Feng
The study of realistic digital humans has gained significant attention within the research communities of computer vision, computer graphics, and machine learning. This growing interest is driven by the importance of understanding ourselves and the pivotal role digital humans play in diverse applications, including virtual presence in AR/VR, digital fashion, entertainment, robotics, and healthcare. However, two major challenges hinder the widespread use of digital humans across disciplines: the difficulty of capture, as current methods rely on complex systems that are time-consuming, labor-intensive, and costly; and the lack of understanding, where even after creating digital humans, gaps in understanding their 3D representations and integrating them with broader world knowledge limit their effective utilization. Overcoming these challenges is crucial to unlocking the full potential of digital humans in interdisciplinary research and practical applications. To address them, this thesis combines insights from computer vision, computer graphics, and machine learning to develop scalable methods for capturing and modeling digital humans. These methods include capturing faces, bodies, hands, hair, and clothing using accessible data such as images, videos, and text descriptions. More importantly, we go beyond capturing and shift the research paradigm toward understanding and reasoning by leveraging large language models (LLMs). For instance, we develop the first foundation model that not only captures 3D human poses from a single image but also reasons about a person's potential next actions in 3D by incorporating world knowledge. This thesis unifies scalable capturing and understanding of digital humans from vision and language data, just as humans observe and interpret the world through visual and linguistic information. Our research begins by developing a framework to capture detailed 3D faces from in-the-wild images. This framework, capable of generating highly realistic and animatable 3D faces from single images, is trained without paired 3D supervision and achieves state-of-the-art accuracy in shape reconstruction. It effectively disentangles identity and expression details, thereby allowing the estimated faces to be animated with various expressions. Since humans are not just faces, we then develop PIXIE, a method for estimating animatable, whole-body 3D avatars with realistic facial details from a single image. By incorporating an attention mechanism, PIXIE surpasses previous methods in accuracy and enables the creation of expressive, high-quality 3D humans. Expanding beyond human bodies, we propose SCARF and DELTA to capture body, clothing, face, and hair separately from monocular videos using a hybrid representation. While clothing and hair are better modeled with implicit representations such as neural radiance fields (NeRFs) due to their complex topologies, human bodies are better represented with meshes. SCARF combines the strengths of both by integrating mesh-based bodies with NeRFs for clothing and hair. To enable learning directly from monocular videos, we introduce mesh-integrated volume rendering, which allows optimizing the model directly from 2D image data without requiring 3D supervision. Thanks to the disentangled modeling, the captured avatar's clothing can be transferred to arbitrary body shapes, making it especially valuable for applications such as virtual try-on.
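To make the idea of mesh-integrated volume rendering more concrete, below is a minimal sketch of one plausible per-ray compositing rule (our assumption for illustration, not the thesis implementation): NeRF samples along a camera ray are alpha-composited only up to the depth where the ray hits the body mesh, and the remaining transmittance is assigned to the shaded mesh color. All function and parameter names are illustrative.

    import numpy as np

    def composite_ray(sample_depths, sample_colors, sample_densities,
                      mesh_depth, mesh_color, delta=None):
        """Alpha-composite NeRF samples along one ray, terminating at the body mesh.

        sample_depths:    (N,) depths of NeRF samples, sorted front to back
        sample_colors:    (N, 3) radiance of each sample (clothing/hair)
        sample_densities: (N,) volume densities sigma
        mesh_depth:       depth where the ray intersects the body mesh
                          (np.inf if the ray misses the mesh)
        mesh_color:       (3,) shaded mesh color at the intersection
        """
        if delta is None:
            delta = np.diff(sample_depths, append=sample_depths[-1] + 1e-3)

        # Keep only volume samples in front of the (opaque) mesh surface.
        in_front = sample_depths < mesh_depth
        alphas = 1.0 - np.exp(-sample_densities[in_front] * delta[in_front])

        # Transmittance before each sample (standard volume rendering).
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas]))[:-1]
        weights = trans * alphas
        color = (weights[:, None] * sample_colors[in_front]).sum(axis=0)

        # Whatever light is not absorbed by the volume comes from the mesh.
        t_remaining = trans[-1] * (1.0 - alphas[-1]) if alphas.size else 1.0
        if np.isfinite(mesh_depth):
            color += t_remaining * mesh_color
        return color

Applied per pixel and written with differentiable operations, a compositing rule of this kind would let image-space losses propagate gradients to both the NeRF and the mesh, which is consistent with the 2D-supervised optimization described above.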
Building on SCARF's hybrid representation, we introduce TECA, which uses text-to-image generation models to create realistic and editable 3D avatars. TECA produces more realistic avatars than recent methods while allowing edits thanks to its compositional design. For instance, users can input descriptions like "a slim woman with dreadlocks" to generate a 3D head mesh with texture and a NeRF model for the hair. It also enables transferring NeRF-based hairstyles, scarves, and other accessories between avatars. While these methods make capturing humans more accessible, broader applications require understanding the context of human behavior. Traditional pose estimation methods often isolate subjects by cropping images, which limits their ability to interpret the full scene or reason about actions. To address this, we develop ChatPose, the first model for understanding and reasoning about 3D human poses. ChatPose leverages a multimodal large language model (LLM) with a finetuned projection layer that decodes embeddings into 3D pose parameters, which are further decoded into 3D body meshes using the SMPL body model. By finetuning on both text-to-3D-pose and image-to-3D-pose data, ChatPose demonstrates, for the first time, that an LLM can directly reason about 3D human poses. This capability allows ChatPose to describe human behavior, generate 3D poses, and reason about potential next actions in 3D, combining perception with reasoning. We believe the contributions of this thesis, in scaling up digital human capture and advancing the understanding of humans in 3D, have the potential to shape the future of human-centered research and to enable broader applications across diverse fields.
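As a rough illustration of the projection layer mentioned above, the sketch below maps an LLM embedding for a dedicated pose token to SMPL parameters. The dimensions, layer sizes, and 6D rotation output are our assumptions for illustration, not the actual ChatPose architecture.

    import torch
    import torch.nn as nn

    class PoseProjectionHead(nn.Module):
        """Illustrative projection layer: LLM embedding -> SMPL pose parameters.

        Assumed dimensions: 4096-d LLM hidden state, 24 joints in a 6D rotation
        representation (24 * 6 = 144 values) plus 10 shape coefficients (betas).
        """
        def __init__(self, llm_dim=4096, n_joints=24, n_betas=10):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(llm_dim, 1024),
                nn.GELU(),
                nn.Linear(1024, n_joints * 6 + n_betas),
            )
            self.n_joints = n_joints

        def forward(self, pose_token_embedding):
            out = self.mlp(pose_token_embedding)      # (B, 144 + 10)
            pose6d = out[:, : self.n_joints * 6]      # per-joint rotations
            betas = out[:, self.n_joints * 6 :]       # body shape
            return pose6d.view(-1, self.n_joints, 6), betas

    # The predicted pose and shape parameters would then be passed to an SMPL
    # layer (e.g. from the smplx package) to obtain a posed 3D body mesh.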
Item
Interaction in Virtual Reality Simulations
(Politecnico di Torino, 2023-07-18) Calandra, Davide
Virtual Reality (VR) has emerged as a powerful technology for creating immersive and engaging simulations that enable users to interact with computer-generated environments in a natural and intuitive way. However, the design and implementation of effective interaction methods in VR remain challenging. The lack of proper haptic feedback and the need to rely on input devices such as controllers or gestures, for example, can result in awkward or unnatural interactions, reducing the perceived level of realism and the immersion of the VR experience. At the same time, poorly designed interaction paradigms may impair usability, reduce the sense of presence, and even cause the unpleasant effects known as cybersickness. This doctoral thesis, which covers a subset of the research work performed during the three-year Ph.D. period, aims to address these challenges by investigating the role of interaction in VR simulations. The investigated topics range from the study of locomotion interfaces in VR, to the use of haptic interfaces for simulating passive and haptic tools in real-life training use cases, to the exploration of further forms of Human-Computer Interaction (HCI) and Human-Human Interaction (HHI) through voice and body gestures, also in the context of multi-user shared simulations. Results obtained in the considered case studies cover a wide range of relevant aspects, such as the realism, usability, and engagement of VR simulations, ultimately leading to a validation of the proposed approaches and methodologies. In this way, the thesis contributes to the understanding of how to design and evaluate interaction paradigms in VR simulations in order to enhance aspects related to User eXperience (UX), with the goal of letting users successfully achieve the intended simulation objectives.
Item
Discrete Laplacians for General Polygonal and Polyhedral Meshes
(TU Dortmund University, 2024) Astrid Pontzen (née Bunge)
This thesis presents several approaches that generalize the Laplace-Beltrami operator and its closely related gradient and divergence operators to arbitrary polygonal and polyhedral meshes. We start by introducing the linear virtual refinement method, which provides a simple yet effective discretization of the Laplacian with the help of the Galerkin method from a Finite Element perspective. Its flexibility allows us to explore alternative numerical schemes in this setting and to derive a second Laplacian, called the Diamond Laplacian, with a similar approach, this time combined with the Discrete Duality Finite Volume method. It offers enhanced accuracy but comes at the cost of denser matrices and slightly longer solving times. In the second part of the thesis, we extend the linear virtual refinement to higher-order discretizations. This method, called the quadratic virtual refinement method, introduces variational quadratic shape functions for arbitrary polygons and polyhedra. We also present a custom multigrid approach to address the computational challenges of higher-order discretizations, making the faster convergence rates and higher accuracy of these polygonal shape functions more affordable for the user. The final part of this thesis focuses on the open degrees of freedom of the linear virtual refinement method. By uncovering connections between our operator and the underlying tessellations, we can enhance the accuracy and stability of our initial method and improve its overall performance. These connections also allow us to define what a "good" polygon is in the context of our Laplacian. We present a smoothing approach that alters the shape of the polygons (while retaining the original surface as much as possible) to allow for even better performance.
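As a compact sketch of the linear virtual refinement construction described above (our notation, meant as an illustration rather than the thesis' exact formulation): each polygon f is refined by inserting a virtual vertex expressed as an affine combination of its corners, the standard linear FEM stiffness and mass matrices are assembled on the resulting triangle fans via the Galerkin weak form, and the virtual degrees of freedom are then removed with a prolongation matrix P that is the identity on the original vertices and carries the affine weights on the virtual ones.

    % virtual vertex per polygon f with corners x_1, ..., x_{n_f}
    \[
      x_f \;=\; \sum_{i=1}^{n_f} w_i\, x_i , \qquad \sum_{i=1}^{n_f} w_i = 1 .
    \]
    % Galerkin stiffness matrix on the refined (triangulated) mesh
    \[
      (S_\triangle)_{ij} \;=\; \int_{\mathcal{M}} \nabla \phi_i \cdot \nabla \phi_j \,\mathrm{d}A .
    \]
    % coarsening back to the polygon mesh with the prolongation matrix P
    \[
      S \;=\; P^{\top} S_\triangle\, P , \qquad M \;=\; P^{\top} M_\triangle\, P .
    \]

The open degrees of freedom mentioned in the final part of the abstract correspond, in this sketch, to the choice of the weights w_i.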
Item
Perception-Based Techniques to Enhance User Experience in Virtual Reality
(2024-07-26) Colin Groth
Virtual reality (VR) ushered in a new era of immersive content viewing with vast potential for entertainment, design, medicine, and other fields. However, the willingness of users to practically apply the technology is bound to the quality of the virtual experience. In this dissertation, we describe the development and investigation of novel techniques to reduce negative influences on the user experience in VR applications. Our methods not only include substantial technical improvements but also consider important characteristics of human perception that are exploited to make the applications more effective and subtle. Mostly, we are focused on visual perception, since we deal with visual stimuli, but we also consider the vestibular sense which is a key component for the occurrence of negative symptoms in VR, referred to as cybersickness. In this dissertation, our techniques are designed for three groups of VR applications, characterized by the degree of freedom to apply adjustments. The first set of techniques addresses the extension of VR systems with stimulation hardware. By adjusting common techniques from the medical field, we artificially induce human body signals to create immersive experiences that reduce common mismatches between perceptual information. The second group focuses on applications that use common hardware and allow adjustments of the full render pipeline. Here, especially immersive video content is notable, where the frame rates and quality of the presentations are often not in line with the high requirements of VR systems to satisfy a decent user experience. To address the display problems, we present a novel video codec based on wavelet compression and perceptual features of the visual system. Finally, the third group of applications is the most restrictive and does not allow modifications of the rendering pipeline. Here, our techniques consist of post-processing manipulations in screen space after rendering the image, without knowledge of the 3D scene. To allow techniques in this group to be subtle, we exploit fundamental properties of human peripheral vision and apply spatial masking as well as gaze-contingent motion scaling in our methods.