Learning agents for uncertain environments (extended abstract) release_bnv35r7dzfdzfdhch3qea6amdy

by Stuart Russell

References

[b0]

via grobid
Astrom, K. J. (1965). Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Applic., 10, 174-205.
[b1]

via grobid
Bertsekas, D. P., & Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Athena Scientific, Belmont, Mass.
[b2]

via fuzzy
Adaptive Probabilistic Networks with Hidden Variables
John Binder, Daphne Koller, Stuart J. Russell, Keiji Kanazawa
1997   Machine Learning
doi:10.1023/a:1007421730016  dblp:journals/ml/BinderKRK97 
[b3]

via fuzzy
Space-Efficient Inference in Dynamic Probabilistic Networks
John Binder, Kevin P. Murphy, Stuart J. Russell
1997   International Joint Conference on Artificial Intelligence
dblp:conf/ijcai/BinderMR97 
[b4]

via fuzzy
Tractable Inference for Complex Stochastic Processes
Xavier Boyen, Daphne Koller
1998   Conference on Uncertainty in Artificial Intelligence
dblp:conf/uai/BoyenK98 
[b5]

via grobid
Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(3), 142-150.
[b6]

via fuzzy
A Novel Reinforcement Model of Birdsong Vocalization Learning
Kenji Doya, Terrence J. Sejnowski
1994   Neural Information Processing Systems
dblp:conf/nips/DoyaS94 
[b7]

via grobid
Farley, C. T., & Taylor, C. R. (1991). A mechanical trigger for the trot-gallop transition in horses. Science, 253(5017), 306-308.
[b8]

via fuzzy
Learning the Structure of Dynamic Probabilistic Networks
Nir Friedman, Kevin P. Murphy, Stuart J. Russell
1998   Conference on Uncertainty in Artificial Intelligence
dblp:conf/uai/FriedmanMR98 
[b9]

via grobid
Hoyt, D., & Taylor, C. (1981). Gait and the energetics of locomotion in horses. Nature, 292, 239-240.
[b10]

via grobid
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.
[b11]

via fuzzy
Stochastic Simulation Algorithms for Dynamic Probabilistic Networks [report]
Keiji Kanazawa, Daphne Koller, Stuart Russell
2013    pre-print
version:v1  number:UAI-P-1995-PG-346-351  arXiv:1302.4965v1 
[b12]

via grobid
Keeney, R. L., & Raiffa, H. (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Wiley, New York.
[b13]

via grobid
McCallum, A. R. (1993). Overcoming incomplete perception with utile distinction memory. In Proceedings of the Tenth International Conference on Machine Learning, pp. 190-196, Amherst, Massachusetts. Morgan Kaufmann.
[b15]

via fuzzy
Bee foraging in uncertain environments using predictive hebbian learning
P. Read Montague, Peter Dayan, Christophe Person, Terrence J. Sejnowski
1995   Nature
doi:10.1038/377725a0  pmid:7477260 
[b16]

via fuzzy
Approximating Optimal Policies for Partially Observable Stochastic Domains
Ronald Parr, Stuart J. Russell
1995   International Joint Conference on Artificial Intelligence
dblp:conf/ijcai/ParrR95 
[b17]

via fuzzy
Reinforcement Learning with Hierarchies of Machines
Ronald Parr, Stuart J. Russell
1997   Neural Information Processing Systems
dblp:conf/nips/ParrR97 
[b18]

via grobid
Russell, S. J., & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, New Jersey.
[b19]

via grobid
Rust, J. (1994). Do people behave according to Bellman's principle of optimality? Submitted to Journal of Economic Perspectives.
[b20]

via grobid
Sargent, T. J. (1978). Estimation of dynamic labor demand schedules under rational expectations. Journal of Political Economy, 86(6), 1009-1044.
[b21]

via fuzzy
Escape, Avoidance, and Imitation: A Neural Network Approach
Nestor A. Schmajuk, B. Silvano Zanutto
1997   Adaptive Behavior
doi:10.1177/105971239700600103 
[b22]

via grobid
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
[b23]

via fuzzy
Operant Conditioning in Skinnerbots
David S. Touretzky, Lisa M. Saksida
1997   Adaptive Behavior
doi:10.1177/105971239700500302 