A Visual Diagnostics Framework for District Heating Data: Enhancing Data Quality for AI-Driven Heat Consumption Prediction
2510.00872v1
cs.LG, cs.HC
2025-10-04
Authors:
Kristoffer Christensen, Bo Nørregaard Jørgensen, Zheng Grace Ma
Abstract
High-quality data is a prerequisite for training reliable Artificial
Intelligence (AI) models in the energy domain. In district heating networks,
sensor and metering data often suffer from noise, missing values, and temporal
inconsistencies, which can significantly degrade model performance. This paper
presents a systematic approach for evaluating and improving data quality using
visual diagnostics, implemented through an interactive web-based dashboard. The
dashboard employs Python-based visualization techniques, including time series
plots, heatmaps, box plots, histograms, correlation matrices, and
anomaly-sensitive KPIs such as skewness and anomaly detection based on
modified z-scores. These tools allow human experts to inspect and interpret
data anomalies, enabling a human-in-the-loop strategy for data quality
assessment. The methodology is demonstrated on a real-world dataset from a
Danish district heating provider, covering over four years of hourly data from
nearly 7000 meters. The findings show how visual analytics can uncover systemic
data issues and, in the future, guide data cleaning strategies that enhance the
accuracy, stability, and generalizability of Long Short-Term Memory and Gated
Recurrent Unit models for heat demand forecasting. The study contributes to a
scalable, generalizable framework for visual data inspection and underlines the
critical role of data quality in AI-driven energy management systems.
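
As a rough illustration of the two anomaly-sensitive KPIs named in the abstract, the sketch below computes modified z-scores and sample skewness for a short series of meter readings. It is not the authors' implementation. The scaling constant 0.6745 and the cutoff 3.5 are the commonly cited defaults for the modified z-score and are assumed here, since the abstract does not state them. The function names and the sample readings are hypothetical.

```python
from statistics import median


def modified_z_scores(values, scale=0.6745):
    """Modified z-score: scale * (x - median) / MAD.

    The constant 0.6745 is the commonly used default (an assumption here).
    """
    med = median(values)
    # MAD: median of absolute deviations from the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case (at least half the values equal the median):
        # this sketch returns zeros rather than dividing by zero.
        return [0.0 for _ in values]
    return [scale * (v - med) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Flag readings whose absolute modified z-score exceeds the threshold.

    The cutoff 3.5 is a conventional choice, assumed for illustration.
    """
    return [abs(z) > threshold for z in modified_z_scores(values)]


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


# Hypothetical hourly readings for one meter, with one obvious spike.
hourly_kwh = [2.1, 2.3, 2.0, 2.2, 2.4, 2.1, 25.0, 2.2]
flags = flag_anomalies(hourly_kwh)
skew = skewness(hourly_kwh)
```

Because the score is built on the median and the median absolute deviation rather than the mean and standard deviation, a single large spike does not inflate the scale it is measured against. In a human-in-the-loop workflow, flagged readings would be presented for expert inspection rather than removed automatically.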