A Design Space for the Critical Validation of LLM-Generated Tabular Data

Abstract
LLM-generated tabular data is creating new opportunities for data-driven applications in academia, business, and society. To leverage benefits like missing value imputation, labeling, and enrichment with context-aware attributes, LLM-generated data needs a critical validation process. The number of pioneering approaches is increasing fast, opening a promising validation space that, so far, remains unstructured. We present a design space for the critical validation of LLM-generated tabular data with two dimensions: First, the Analysis Granularity dimension, from within-attribute (single-item and multi-item) to across-attribute perspectives (1×1, 1×m, and n×n). Second, the Data Source dimension, differentiating between LLM-generated values, ground truth values, explanations, and their combinations. We discuss analysis tasks for each dimension cross-cut, map 19 existing validation approaches, and discuss the characteristics of two approaches in detail, demonstrating descriptive power.
@inproceedings{10.2312:eurova.20251101,
  booktitle = {EuroVis Workshop on Visual Analytics (EuroVA)},
  editor    = {Schulz, Hans-Jörg and Villanova, Anna},
  title     = {{A Design Space for the Critical Validation of LLM-Generated Tabular Data}},
  author    = {Sachdeva, Madhav and Narayanan, Christopher and Wiedenkeller, Marvin and Sedlakova, Jana and Bernard, Jürgen},
  year      = {2025},
  publisher = {The Eurographics Association},
  ISSN      = {2664-4487},
  ISBN      = {978-3-03868-283-7},
  DOI       = {10.2312/eurova.20251101}
}