
A Two-Stage Multi-Agent Deep Reinforcement Learning Method for Urban Distribution Network Reconfiguration Considering Switch Contribution

Publisher Copyright: IEEE

With the ever-escalating scale of urban distribution networks (UDNs), traditional model-based reconfiguration methods are becoming inadequate for smart system control. Data-driven deep reinforcement learning, by contrast, enables swift decision-making, but a large action space adversely affects the learning performance of its agents. Consequently, this paper presents a novel multi-agent deep reinforcement learning method for UDN reconfiguration by introducing the concept of 'switch contribution'. First, a quantification method based on the mathematical UDN reconfiguration model is proposed, which effectively quantifies the contributions of controllable switches. By excluding the controllable switches with low contributions during network reconfiguration, the dimensionality of the action space can be significantly reduced. Then, an improved QMIX algorithm is introduced to improve the policy of multiple agents by assigning weights. In addition, a novel two-stage learning structure based on a reward-sharing mechanism is presented to further decompose tasks and enhance the learning efficiency of multiple agents: in the first stage, agents control the switches with higher contributions, while the switches with lower contributions are controlled in the second stage. Throughout the two-stage process, the proposed reward-sharing mechanism guarantees a reliable UDN reconfiguration and the convergence of the learning method. Finally, numerical results on a practical 297-node system validate the method's effectiveness.

Peer reviewed
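The action-space reduction and two-stage split described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the contribution scores, the pruning threshold, and the stage split point are all hypothetical stand-ins, since the paper derives contributions from its mathematical UDN reconfiguration model.

```python
def split_switches(contributions, prune_threshold=0.05, stage_fraction=0.5):
    """Illustrative helper (hypothetical): prune low-contribution switches,
    then split the remainder into two learning stages.

    contributions: dict mapping switch id -> contribution score.
    Returns (stage1_switches, stage2_switches).
    """
    # Exclude switches with low contributions entirely
    # (this is the action-space dimensionality reduction).
    kept = {s: c for s, c in contributions.items() if c >= prune_threshold}

    # Rank the remaining switches by contribution, highest first.
    ranked = sorted(kept, key=kept.get, reverse=True)

    # Higher-contribution switches are controlled in stage 1,
    # lower-contribution ones in stage 2.
    cut = max(1, int(len(ranked) * stage_fraction))
    return ranked[:cut], ranked[cut:]

# Hypothetical contribution scores for five controllable switches.
scores = {"S1": 0.92, "S2": 0.40, "S3": 0.03, "S4": 0.65, "S5": 0.10}
stage1, stage2 = split_switches(scores)
print(stage1)  # -> ['S1', 'S4']  (high-contribution, stage 1)
print(stage2)  # -> ['S2', 'S5']  (low-contribution, stage 2; S3 pruned)
```

Each stage's agents would then learn over only their own, much smaller, switch set, with the reward-sharing mechanism coupling the two stages.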
- Sichuan University China (People's Republic of)
- Aalto University Finland
Deep reinforcement learning, Control systems, enhanced QMIX algorithm, Urban distribution network (UDN), two-stage learning structure, reconfiguration, Distribution networks, Substations, switch contribution, multi-agent deep reinforcement learning (MADRL), Aerospace electronics, Voltage, Switches
