Skeletons Matter: Dynamic Data Augmentation for Text-to-Query

2511.18934v1 cs.CL, cs.AI, cs.DB 2025-11-26
Authors:

Yuchen Ji, Bo Xu, Jie Shi, Jiaqing Liang, Deqing Yang, Yu Mao, Hai Chen, Yanghua Xiao

Abstract

The task of translating natural language questions into query languages has long been a central focus in semantic parsing. Recent advancements in Large Language Models (LLMs) have significantly accelerated progress in this field. However, existing studies typically focus on a single query language, resulting in methods with limited generalizability across different languages. In this paper, we formally define the Text-to-Query task paradigm, unifying semantic parsing tasks across various query languages. We identify query skeletons as a shared optimization target of Text-to-Query tasks, and propose a general dynamic data augmentation framework that explicitly diagnoses model-specific weaknesses in handling these skeletons to synthesize targeted training data. Experiments on four Text-to-Query benchmarks demonstrate that our method achieves state-of-the-art performance using only a small amount of synthesized data, highlighting the efficiency and generality of our approach and laying a solid foundation for unified research on Text-to-Query tasks. We release our code at https://github.com/jjjycaptain/Skeletron.
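To make the idea of a query skeleton and of skeleton-level weakness diagnosis more concrete, here is a minimal Python sketch. It rests on several assumptions that are not taken from the paper: a skeleton is taken to be a query with schema identifiers and literal values masked out, SQL is used as the example query language, the keyword list is deliberately tiny, and the function names `skeleton` and `weakest_skeletons` are hypothetical. The released code at the URL above may define and use skeletons differently.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative keyword list; a real system would use a proper parser.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "as", "and", "or", "not", "in", "count", "sum", "avg",
    "min", "max", "distinct", "desc", "asc",
}

# Tokens: quoted strings, numbers, identifiers, two-char operators, any other single symbol.
TOKEN_RE = re.compile(
    r"'[^']*'|\d+(?:\.\d+)?|[A-Za-z_][\w.]*|<=|>=|<>|!=|[^\sA-Za-z0-9_']"
)


def skeleton(query: str) -> str:
    """Mask identifiers as '_' and literals as '<val>', keeping keywords and operators."""
    out = []
    for tok in TOKEN_RE.findall(query):
        if tok.lower() in SQL_KEYWORDS:
            out.append(tok.upper())
        elif tok[0] == "'" or tok[0].isdigit():
            out.append("<val>")
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("_")
        else:
            out.append(tok)
    return " ".join(out)


def weakest_skeletons(results, top_k=1):
    """Rank skeletons by error rate, given (gold_query, is_correct) pairs."""
    stats = defaultdict(lambda: [0, 0])  # skeleton -> [errors, total]
    for gold_query, is_correct in results:
        entry = stats[skeleton(gold_query)]
        entry[0] += 0 if is_correct else 1
        entry[1] += 1
    ranked = sorted(
        stats.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True
    )
    return [(sk, errors / total) for sk, (errors, total) in ranked[:top_k]]
```

Under these assumptions, two questions over different tables can share one skeleton, and the skeletons with the highest error rate on a validation set would be the natural candidates for targeted data synthesis.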
