4-LEGS: 4D Language Embedded Gaussian Splatting
Date
2025
Publisher
The Eurographics Association and John Wiley & Sons Ltd.
Abstract
The emergence of neural representations has revolutionized our means for digitally viewing a wide range of 3D scenes, enabling the synthesis of photorealistic images rendered from novel views. Recently, several techniques have been proposed for connecting these low-level representations with the high-level semantic understanding embodied within the scene. These methods elevate the rich semantic understanding from 2D imagery to 3D representations, distilling high-dimensional spatial features onto 3D space. In our work, we are interested in connecting language with a dynamic modeling of the world. We show how to lift spatiotemporal features to a 4D representation based on 3D Gaussian Splatting. This enables an interactive interface where the user can spatiotemporally localize events in the video from text prompts. We demonstrate our system on public 3D video datasets of people and animals performing various actions.
Description
CCS Concepts: Computing methodologies → 3D imaging; Rendering; Activity recognition and understanding
@article{10.1111:cgf.70085,
  journal = {Computer Graphics Forum},
  title = {{4-LEGS: 4D Language Embedded Gaussian Splatting}},
  author = {Fiebelman, Gal and Cohen, Tamir and Morgenstern, Ayellet and Hedman, Peter and Averbuch-Elor, Hadar},
  year = {2025},
  publisher = {The Eurographics Association and John Wiley \& Sons Ltd.},
  ISSN = {1467-8659},
  DOI = {10.1111/cgf.70085}
}
