
Vrije Universiteit Brussel Research Portal
Conference object . 2015
Data sources: Vrije Universiteit Brussel Research Portal
Proceedings of the AAAI Conference on Artificial Intelligence
Article . 2015 . Peer-reviewed
Data sources: Crossref
Expressing Arbitrary Reward Functions as Potential-Based Advice

Authors: Harutyunyan, Anna; Devlin, Sam; Vrancx, Peter; Nowé, Ann
Abstract
Effectively incorporating external advice is an important problem in reinforcement learning, especially as it moves into the real world. Potential-based reward shaping is a way to provide the agent with a specific form of additional reward, with the guarantee of policy invariance. In this work we give a novel way to incorporate an arbitrary reward function with the same guarantee, by implicitly translating it into the specific form of dynamic advice potentials, which are maintained as an auxiliary value function learnt at the same time. We show that advice provided in this way captures the input reward function in expectation, and demonstrate its efficacy empirically.
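The policy-invariance guarantee mentioned in the abstract rests on a telescoping property of potential-based shaping, which a small numerical check can make concrete. The sketch below is not the paper's method: it uses a fixed (static) potential over states rather than the dynamic advice potentials learnt as an auxiliary value function, and the state names, the reward values, the potential table `phi` and the discount `gamma` are all made up for illustration. It shows only that adding F(s, s') = gamma * phi(s') - phi(s) to every reward shifts the discounted return of a trajectory by an amount that depends on its first and last states alone.

```python
from typing import Dict, List, Sequence


def shaping_reward(phi: Dict[str, float], s: str, s_next: str, gamma: float) -> float:
    """Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi[s_next] - phi[s]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma^t * r_t over one trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def shaped_rewards(states: Sequence[str], rewards: Sequence[float],
                   phi: Dict[str, float], gamma: float) -> List[float]:
    """Add the shaping term to each environment reward along the trajectory.

    states has one more entry than rewards: rewards[t] is received on the
    transition states[t] -> states[t + 1].
    """
    return [r + shaping_reward(phi, states[t], states[t + 1], gamma)
            for t, r in enumerate(rewards)]


# Hypothetical trajectory and potential, chosen only for illustration.
states = ["a", "b", "c", "goal"]
rewards = [0.0, 0.0, 1.0]
phi = {"a": 0.2, "b": 0.5, "c": 0.9, "goal": 0.0}
gamma = 0.9

plain = discounted_return(rewards, gamma)
shaped = discounted_return(shaped_rewards(states, rewards, phi, gamma), gamma)

# Telescoping: the intermediate potentials cancel, leaving only the endpoints.
offset = gamma ** len(rewards) * phi[states[-1]] - phi[states[0]]
assert abs(shaped - (plain + offset)) < 1e-9
```

Because the offset is the same for every trajectory that shares the same start state and ends in a state with zero potential, the ranking of policies by expected return is unchanged, which is the sense in which a shaping reward of this form preserves the optimal policy.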
Country
Belgium
Related Organizations
- Vrije Universiteit Brussel Belgium
Keywords
Reward Shaping