Semantic-aware Graph-guided Behavior Sequences Generation with Large Language Models for Smart Homes
2508.03484v1
cs.AI
2025-08-06
Authors:
Zhiyao Xu, Dan Zhao, Qingsong Zou, Qing Li, Yong Jiang, Yuhang Wang, Jingyu Xiao
Summary
Problem: smart home models trained on static data degrade quickly under seasonal or lifestyle shifts in behavior, while collecting new real-world data is costly and raises privacy concerns.
Solution: SmartGen, a framework in which an LLM synthesizes realistic behavior sequences. It splits long logs into semantically coherent chunks, compresses them by clustering in latent space, builds a transition graph and passes it to the LLM as context, and then removes anomalous samples with a two-stage filter.
Experiments on three datasets: under behavioral drift, anomaly detection accuracy improved by 85.4% and behavior prediction accuracy by 70.5% relative to baseline models without retraining.
Abstract
As smart homes become increasingly prevalent, intelligent models are widely
used for tasks such as anomaly detection and behavior prediction. These models
are typically trained on static datasets, making them brittle to behavioral
drift caused by seasonal changes, lifestyle shifts, or evolving routines.
However, collecting new behavior data for retraining is often impractical due
to its slow pace, high cost, and privacy concerns. In this paper, we propose
SmartGen, an LLM-based framework that synthesizes context-aware user behavior
data to support continual adaptation of downstream smart home models. SmartGen
consists of four key components. First, we design a Time and Semantic-aware
Split module to divide long behavior sequences into manageable, semantically
coherent subsequences under dual time-span constraints. Second, we propose
Semantic-aware Sequence Compression to reduce input length while preserving
representative semantics by clustering behavior mapping in latent space. Third,
we introduce Graph-guided Sequence Synthesis, which constructs a behavior
relationship graph and encodes frequent transitions into prompts, guiding the
LLM to generate data aligned with contextual changes while retaining core
behavior patterns. Finally, we design a Two-stage Outlier Filter to identify
and remove implausible or semantically inconsistent outputs, aiming to improve
the factual coherence and behavioral validity of the generated sequences.
Experiments on three real-world datasets demonstrate that SmartGen
significantly enhances model performance on anomaly detection and behavior
prediction tasks under behavioral drift, with anomaly detection improving by
85.43% and behavior prediction by 70.51% on average. The code is available at
https://github.com/horizonsinzqs/SmartGen.
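To make the splitting and graph-guided steps more concrete, here is a minimal Python sketch. It is not the SmartGen implementation. It assumes events are (timestamp in seconds, behavior label) pairs, reads the "dual time-span constraints" as a maximum gap between consecutive events plus a maximum total span per subsequence, and leaves out the semantic part of the split entirely. The function names, parameters, and prompt wording are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical event format: (timestamp in seconds, behavior label).
Event = Tuple[int, str]


def split_by_time(events: List[Event], max_gap: int, max_span: int) -> List[List[Event]]:
    """Split a time-ordered log into subsequences.

    A new subsequence starts when the gap to the previous event exceeds
    max_gap, or when adding the event would stretch the current
    subsequence beyond max_span seconds (assumed reading of the two
    time-span constraints; the semantic criterion is omitted).
    """
    chunks: List[List[Event]] = []
    for event in events:
        if chunks:
            current = chunks[-1]
            gap_ok = event[0] - current[-1][0] <= max_gap
            span_ok = event[0] - current[0][0] <= max_span
            if gap_ok and span_ok:
                current.append(event)
                continue
        chunks.append([event])
    return chunks


def transition_counts(chunks: List[List[Event]]) -> Counter:
    """Count directed transitions between consecutive behaviors in each chunk."""
    counts: Counter = Counter()
    for chunk in chunks:
        labels = [label for _, label in chunk]
        counts.update(zip(labels, labels[1:]))
    return counts


def frequent_transitions_prompt(counts: Counter, min_count: int) -> str:
    """Render transitions seen at least min_count times as prompt text."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    lines = [f"{src} -> {dst} (seen {n} times)" for (src, dst), n in ranked if n >= min_count]
    return "Frequent behavior transitions:\n" + "\n".join(lines)


if __name__ == "__main__":
    log = [(0, "light_on"), (60, "tv_on"), (120, "tv_off"),
           (10000, "light_on"), (10060, "tv_on")]
    parts = split_by_time(log, max_gap=600, max_span=3600)
    print(frequent_transitions_prompt(transition_counts(parts), min_count=2))
```

The transition counts act as a weighted directed graph over behaviors. Keeping only edges above a threshold is one simple way to put "frequent transitions" into a prompt, and the threshold value here is arbitrary.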