
Practical deployment of reinforcement learning for building controls using an imitation learning approach

handle: 11583/2998194
This paper addresses the critical need for more efficient and adaptive building control systems that maximise occupant comfort while reducing energy consumption. Our objective is to explore the practical application of model-free Deep Reinforcement Learning (DRL) in real-world building environments by developing a system that learns and adapts to changing conditions, beginning its operation by imitating an existing Rule-Based Control (RBC) system. This approach ensures initial reliability and performance while setting the stage for advanced learning capabilities. The methodology involves two distinct phases. Initially, the DRL controller mimics the behaviour of the RBC system, using imitation learning with behavioural cloning as a safe and efficient strategy to achieve baseline operational efficiency. Subsequently, the controller is implemented within a real building in an online learning setting. In this phase, the controller utilises real-time data to continuously refine its control policy, responding adaptively to occupant behaviours and external environmental conditions. To validate our approach, we conducted a comprehensive analysis, comparing the performance of our DRL controller against the baseline RBC controller, another RBC, and a PI (Proportional-Integral) controller implemented in a digital twin model of the real office environment. The metrics are energy consumption and temperature violations relative to a temperature acceptability range, providing a robust framework for assessing the effectiveness of our system. The results indicate that our DRL controller, supported by imitation learning, outperforms the two RBCs, reducing energy consumption by 40 % and the cumulative sum of temperature violations by 43 % and 13 %, respectively. Although the PI controller performs better than DRL in terms of temperature violations, it requires 45 % more energy than the proposed DRL controller due to its inherent inability to deal with multi-objective control problems. In conclusion, this paper demonstrates the feasibility and advantages of implementing advanced DRL techniques in real-world building control scenarios. Integrating imitation learning with a DRL controller offers a novel and effective way to enhance the scalability of DRL systems, expanding their application in buildings and driving significant improvements in energy efficiency.
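The abstract names the cumulative sum of temperature violations relative to an acceptability range as one of the two evaluation metrics but does not give its exact definition. The following is a minimal sketch of one plausible reading, in which each sample contributes the distance by which it falls outside the band. The function name, the band limits, and the sample values are all hypothetical and are not taken from the paper.

```python
def cumulative_temperature_violation(temps_c, lower_c, upper_c):
    """Sum, over all samples, of how far each temperature lies outside
    the acceptability band [lower_c, upper_c] (all values in deg C).

    This definition is an assumption made for illustration; the paper
    may weight or filter samples differently (e.g. by occupancy).
    """
    total = 0.0
    for t in temps_c:
        if t < lower_c:
            total += lower_c - t   # too cold: distance below the band
        elif t > upper_c:
            total += t - upper_c   # too warm: distance above the band
        # samples inside the band contribute nothing
    return total


# Hypothetical indoor temperature samples and a hypothetical 21-24 deg C band.
samples = [20.0, 21.5, 25.0, 19.0]
violation = cumulative_temperature_violation(samples, 21.0, 24.0)
print(violation)
```

Under this reading, two controllers can be compared by evaluating the same function on their respective temperature traces over the same period, alongside their energy consumption.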
Energy and Buildings, 335
ISSN:0378-7788
ISSN:1872-6178
- ETH Zurich Switzerland
- Polytechnic University of Turin Italy
Deep reinforcement learning, Imitation learning, Behavioural cloning, Building HVAC control, Energy efficiency, Real implementation
