Policy gradient methods

2010

Article

ei

Policy gradient methods are a class of reinforcement learning techniques that optimize parametrized policies with respect to the expected return (long-term cumulative reward) by gradient ascent. They avoid many of the problems that have marred traditional reinforcement learning approaches, such as the lack of guarantees of a value function, the intractability resulting from uncertain state information, and the complexity arising from continuous states and actions.
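
The idea in the abstract can be illustrated with a minimal sketch. It is not taken from the article: it assumes a two-armed bandit with Gaussian rewards, a softmax policy over one preference parameter per arm, and a running-average baseline. The names `softmax` and `reinforce_bandit`, and all parameter values, are hypothetical choices made for illustration. Each step follows a sampled, likelihood-ratio estimate of the gradient of the expected return.

```python
import math
import random


def softmax(prefs):
    """Turn preference parameters into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(mean_rewards, steps=5000, lr=0.1, seed=0):
    """Likelihood-ratio policy gradient on a toy bandit (illustrative assumption).

    mean_rewards: assumed mean reward of each arm (unit-variance Gaussian noise).
    Returns the learned action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)  # policy parameters, one per arm
    baseline = 0.0                     # running average of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(mean_rewards[action], 1.0)
        # For a softmax policy: d/d theta_k log pi(action) = 1[k == action] - probs[k]
        for k in range(len(theta)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            # Step in the direction that increases expected reward.
            theta[k] += lr * (reward - baseline) * grad_log
        # Update the baseline after the gradient step.
        baseline += (reward - baseline) / t
    return softmax(theta)


probs = reinforce_bandit([0.0, 1.0])
print(probs)
```

Under these assumptions the policy should shift most of its probability onto the arm with the higher mean reward. The baseline only reduces the variance of the gradient estimate; it is one common choice among several.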

Author(s):	Peters, J.
Journal:	Scholarpedia
Volume:	5
Number (issue):	11
Pages:	3698
Year:	2010
Month:	November

Department(s):	Empirical Inference
Bibtex Type:	Article (article)

DOI:	10.4249/scholarpedia.3698
Language:	en
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

BibTex:
@article{6940,
  title = {Policy gradient methods},
  author = {Peters, J.},
  journal = {Scholarpedia},
  volume = {5},
  number = {11},
  pages = {3698},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = nov,
  year = {2010},
  doi = {10.4249/scholarpedia.3698},
  month_numeric = {11}
}

People

Jan Peters

Research Group Leader

Alumni
