Advanced Day-Ahead Scheduling of HVAC Demand Response Control Using Novel Strategy of Q-Learning, Model Predictive Control, and Input Convex Neural Networks
Document Type
Article
Publication Date
5-2025
Publisher
Elsevier
Source Publication
Energy and AI
Source ISSN
2666-5468
Original Item ID
DOI: 10.1016/j.egyai.2025.100509
Abstract
In this paper, we present a Q-Learning optimization algorithm for smart home HVAC systems. The proposed algorithm combines new convex deep neural network models with model predictive control (MPC) techniques. More specifically, new input convex long short-term memory (ICLSTM) models are employed to predict dynamic states in an MPC optimal control technique integrated within a Q-Learning reinforcement learning (RL) algorithm to further improve the learned temporal behaviors of nonlinear HVAC systems. As a novel RL approach, the proposed algorithm generates day-ahead HVAC demand response (DR) signals in smart homes that optimally reduce and/or shift peak energy usage, reduce electricity costs, minimize user discomfort, and honor, on a best-effort basis, the recommendations from the utility/aggregator, which in turn affects the overall well-being of the distribution network controlled by the aggregator. The proposed Q-Learning optimization algorithm, based on epsilon-model predictive control (ε-MPC), can be implemented as a control agent executed by the smart house energy management (SHEM) system that we assume exists in the smart home and that can interact with the energy provider of the distribution network, i.e., the utility/aggregator, via the smart meter. The output generated by the proposed control agent represents day-ahead local DR signals in the form of temperature setpoints for the HVAC system that the optimization process finds to lead to desired trade-offs between electricity cost and user discomfort. The proposed algorithm can be used in smart homes with passive HVAC controllers, which react solely to end-user setpoints, to transform them into smart homes with active HVAC controllers. Such systems not only respond to the preferences of the end-user but also incorporate an external control signal provided by the utility or aggregator.
Simulation experiments conducted with a custom simulation tool demonstrate that the proposed optimization framework can offer significant benefits. It achieves an 87% higher success rate in optimizing setpoints within the desired range, resulting in up to 15% energy savings and zero temperature discomfort.
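As a rough illustration of the cost-versus-discomfort trade-off described in the abstract, the following is a minimal sketch of plain tabular Q-learning with epsilon-greedy exploration that picks one temperature setpoint per hour. It is not the authors' algorithm: the ICLSTM predictor and the MPC step are omitted, random exploration stands in for the ε-MPC mechanism, the state is only the hour of day (no thermal dynamics), and every constant and name (the tariff, candidate setpoints, outdoor temperature, discomfort weight, `reward`, `train`, `day_ahead_schedule`) is a hypothetical value chosen for the example.

```python
import random

HOURS = 24
SETPOINTS = [20.0, 21.0, 22.0, 23.0, 24.0]      # assumed cooling setpoints (deg C)
PREFERRED = 22.0                                 # hypothetical user-preferred temperature
OUTDOOR = 30.0                                   # hypothetical constant outdoor temperature
PRICE = [0.10] * 14 + [0.40] * 6 + [0.10] * 4    # assumed tariff, peak from 14:00 to 19:59
DISCOMFORT_WEIGHT = 0.1                          # assumed trade-off weight


def reward(hour, setpoint):
    """Negative of (electricity cost + weighted discomfort) for one hour."""
    energy = 0.5 * max(0.0, OUTDOOR - setpoint)  # toy cooling load, not a real HVAC model
    cost = PRICE[hour] * energy
    discomfort = DISCOMFORT_WEIGHT * abs(setpoint - PREFERRED)
    return -(cost + discomfort)


def greedy_action(q_row):
    """Index of the highest-valued action in one row of the Q-table."""
    return max(range(len(q_row)), key=lambda i: q_row[i])


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning where the state is the hour and the action is a setpoint."""
    rng = random.Random(seed)
    q = [[0.0] * len(SETPOINTS) for _ in range(HOURS)]
    for _ in range(episodes):
        for hour in range(HOURS):
            # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.randrange(len(SETPOINTS))
            else:
                action = greedy_action(q[hour])
            r = reward(hour, SETPOINTS[action])
            # The episode ends after the last hour, so there is no future value then.
            future = max(q[hour + 1]) if hour + 1 < HOURS else 0.0
            # Standard Q-learning temporal-difference update.
            q[hour][action] += alpha * (r + gamma * future - q[hour][action])
    return q


def day_ahead_schedule(q):
    """Read the greedy setpoint for each hour out of the learned Q-table."""
    return [SETPOINTS[greedy_action(row)] for row in q]


if __name__ == "__main__":
    print(day_ahead_schedule(train()))
```

With these assumed numbers, raising the setpoint pays off only when the tariff is high, so the learned schedule should relax the setpoint during the peak window and hold the preferred temperature otherwise. A more faithful version would replace the toy `reward` with predictions from a learned thermal model and let an MPC solve guide action selection.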
Recommended Citation
Heidarykiany, Rahman and Ababei, Cristinel, "Advanced Day-Ahead Scheduling of HVAC Demand Response Control Using Novel Strategy of Q-Learning, Model Predictive Control, and Input Convex Neural Networks" (2025). Electrical and Computer Engineering Faculty Research and Publications. 779.
https://epublications.marquette.edu/electric_fac/779
Comments
Energy and AI, Vol. 20 (May 2025). DOI: 10.1016/j.egyai.2025.100509