
Employing federated learning for training autonomous HVAC systems

Buildings account for 40% of global energy consumption. A considerable portion of building energy consumption stems from heating, ventilation, and air conditioning (HVAC), and thus implementing smart, energy-efficient HVAC systems has the potential to significantly impact the course of climate change. In recent years, model-free reinforcement learning algorithms have been increasingly assessed for this purpose due to their ability to learn and adapt purely from experience. They have been shown to outperform classical controllers in terms of energy cost and consumption, as well as thermal comfort. However, their weakness lies in their relatively poor data efficiency: they require long periods of training to reach acceptable policies, which makes them impractical to deploy directly on real-world controllers. In this paper, we demonstrate that using federated learning to train the reinforcement learning controller of HVAC systems can improve both the learning speed and the controller's ability to generalize, which in turn facilitates transfer learning to unseen building environments. In our setting, a global control policy is learned by aggregating local policies trained on multiple data centers located in different climate zones. The goal of the policy is to minimize energy consumption and maximize thermal comfort. We perform experiments evaluating three different optimizers for local policy training, as well as three different federated learning algorithms against two alternative baselines. Our experiments show that the federated policy learns faster and generalizes better than any individually trained policy. Furthermore, the learning stability is significantly improved, with the learning process and performance of the federated policy being less sensitive to the choice of parameters and the inherent randomness of reinforcement learning.
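The aggregation step described in the abstract can be illustrated with a minimal sketch of federated averaging, one common aggregation rule. The abstract does not name the three federated learning algorithms that were evaluated, so this is an illustrative assumption, not the authors' implementation. The flat `Params` format, the function name `federated_average`, and the sample-count weighting are all hypothetical choices made for this example.

```python
from typing import Dict, List

# Hypothetical format: each policy is a dict mapping a parameter name
# to a flat list of floats (an assumption made for this sketch).
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params], num_samples: List[int]) -> Params:
    """Combine local policy parameters into one global set.

    Each client's parameters are weighted by its share of the total
    number of training samples (federated averaging).
    """
    if not local_params or len(local_params) != len(num_samples):
        raise ValueError("need one sample count per local parameter set")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    global_params: Params = {}
    for name in local_params[0]:
        averaged = [0.0] * len(local_params[0][name])
        for params, n in zip(local_params, num_samples):
            weight = n / total  # this client's contribution
            for i, value in enumerate(params[name]):
                averaged[i] += weight * value
        global_params[name] = averaged
    return global_params


# Two hypothetical data centers in different climate zones.
zone_a = {"actor.w": [1.0, 2.0]}
zone_b = {"actor.w": [3.0, 4.0]}
global_policy = federated_average([zone_a, zone_b], num_samples=[1, 3])
```

In a full training loop, the resulting global parameters would be sent back to every data center as the starting point for the next round of local reinforcement learning.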
- University of Zurich Switzerland
- Aalto University Finland
- Aalto University Finland
FOS: Computer and information sciences, Computer Science - Machine Learning, Federated learning, Systems and Control (eess.SY), Thermal comfort, Electrical Engineering and Systems Science - Systems and Control, Machine Learning (cs.LG), Energy consumption, Soft actor-critic, Optimization and Control (math.OC), Reinforcement learning, FOS: Mathematics, FOS: Electrical engineering, electronic engineering, information engineering, Mathematics - Optimization and Control, HVAC control
