ZeroCard: Cardinality Estimation with Zero Dependence on Target Databases -- No Data, No Query, No Retraining
2510.07983v1
cs.DB, cs.AI
2025-10-11
Authors:
Xianghong Xu, Rong Kang, Xiao He, Lei Zhang, Jianjun Chen, Tieying Zhang
Abstract
Cardinality estimation is a fundamental task in database systems and plays a
critical role in query optimization. Despite significant advances in
learning-based cardinality estimation methods, most existing approaches remain
difficult to generalize to new datasets due to their strong dependence on raw
data or queries, which limits their practicality in real scenarios. To
overcome these challenges, we argue that schema semantics can benefit
cardinality estimation, and that leveraging them can reduce these
dependencies. To this end, we introduce ZeroCard, the first semantics-driven
cardinality estimation method that can be applied without any dependence on raw
data access, query logs, or retraining on the target database. Specifically, we
propose to predict data distributions using schema semantics, thereby avoiding
raw data dependence. Then, we introduce a query template-agnostic
representation method to alleviate query dependence. Finally, we construct a
large-scale query dataset derived from real-world tables and pretrain ZeroCard
on it, enabling it to learn cardinality from schema semantics and predicate
representations. After pretraining, ZeroCard's parameters can be frozen and
applied in an off-the-shelf manner. We conduct extensive experiments to
demonstrate the distinct advantages of ZeroCard and show its practical
applications in query optimization. Its zero-dependence property significantly
facilitates deployment in real-world scenarios.