Article
DOI: 10.5555/2981345.2981456

Distributed optimization in adaptive networks

Published: 09 December 2003

Abstract

We develop a protocol for optimizing the dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of communication switches. The protocol requires only local communication and simple computations that are distributed among the devices, and it scales to large networks. As a motivating example, we discuss a problem involving the joint optimization of power consumption, delay, and buffer overflow in a sensor network.
Our approach builds on policy gradient methods for optimization of Markov decision processes. The protocol can be viewed as an extension of policy gradient methods to a context involving a team of agents optimizing aggregate performance through asynchronous distributed communication and computation. We establish that the dynamics of the protocol approximate the solution to an ordinary differential equation that follows the gradient of the performance objective.
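To make the last statement concrete, here is a minimal sketch of the limiting dynamics under assumed notation (the abstract itself introduces no symbols): let \theta \in \mathbb{R}^n collect the tunable parameters of the devices' local policies and let J(\theta) denote the aggregate performance objective. The claim is that the asynchronous, stochastic updates performed by the protocol track, in the stochastic-approximation sense, the gradient-following ordinary differential equation

    \dot{\theta}(t) = \nabla_\theta J(\theta(t)),

so the protocol's long-run behavior is governed by gradient ascent on the team's shared objective.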

Published In

NIPS'03: Proceedings of the 16th International Conference on Neural Information Processing Systems
December 2003
1610 pages

Publisher

MIT Press

Cambridge, MA, United States
