Proppo: a message passing framework for customizable and composable learning algorithms
Article No.: 2114, Pages 29152 - 29165
Abstract
While existing automatic differentiation (AD) frameworks allow flexibly composing model architectures, they do not provide the same flexibility for composing learning algorithms: everything has to be implemented in terms of back-propagation. To address this gap, we invent Automatic Propagation (AP) software, which generalizes AD and allows custom and composable construction of complex learning algorithms. The framework allows packaging custom learning algorithms into propagators that automatically implement the necessary computations and can be reused across different computation graphs. We implement Proppo, a prototype AP software package built on top of the PyTorch AD framework. To demonstrate the utility of Proppo, we use it to implement Monte Carlo gradient estimation techniques, such as reparameterization and likelihood ratio gradients, as well as the total propagation algorithm and Gaussian shaping gradients, which were previously used in model-based reinforcement learning but do not have any publicly available implementation. Finally, in minimalistic experiments, we show that these methods allow increasing the gradient estimation accuracy by orders of magnitude, particularly when the machine learning system is at the edge of chaos.
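To make the estimators named in the abstract concrete, the sketch below contrasts the two basic Monte Carlo gradient estimators (reparameterization and likelihood ratio) on a toy Gaussian problem in plain PyTorch, and then combines them by inverse-variance weighting in the spirit of total propagation. This is an illustrative sketch only, not Proppo's API: the objective f, the sample size n, and the helper inverse_variance_combine are hypothetical choices made for this example.

```python
# Minimal sketch (not Proppo's API): estimate d/dtheta E_{x ~ N(theta, sigma^2)}[f(x)]
# with the reparameterization (RP) and likelihood ratio (LR) estimators, then
# combine them by inverse-variance weighting in the spirit of total propagation.
import torch

torch.manual_seed(0)

def f(x):
    # Toy objective chosen for the example; any differentiable function works.
    return torch.sin(3.0 * x)

theta = torch.tensor(0.5, requires_grad=True)
sigma = 0.2
n = 10_000

# Reparameterization (pathwise): x = theta + sigma * eps, so the per-sample
# gradient is f'(x_i) * dx_i/dtheta = f'(x_i) here (dx_i/dtheta = 1).
eps = torch.randn(n)
x = theta + sigma * eps
rp_samples = torch.autograd.grad(f(x).sum(), x)[0]   # f'(x_i) for each sample

# Likelihood ratio (REINFORCE): per-sample gradient is
# f(x_i) * d/dtheta log N(x_i; theta, sigma^2) = f(x_i) * (x_i - theta) / sigma^2.
with torch.no_grad():
    x_lr = theta + sigma * torch.randn(n)
    score = (x_lr - theta) / sigma**2
    lr_samples = f(x_lr) * score

def inverse_variance_combine(a, b):
    # Hypothetical helper: weight each estimator by the inverse of its
    # estimated variance of the mean (a simplified stand-in for total propagation).
    ga, gb = a.mean(), b.mean()
    va, vb = a.var() / a.numel(), b.var() / b.numel()
    return (ga / va + gb / vb) / (1.0 / va + 1.0 / vb)

print("RP estimate:      ", rp_samples.mean().item())
print("LR estimate:      ", lr_samples.mean().item())
print("Combined estimate:", inverse_variance_combine(rp_samples, lr_samples).item())
# Closed form for reference: 3 * cos(3 * theta) * exp(-9 * sigma**2 / 2) ≈ 0.177
```

The combination step only gestures at the idea behind total propagation; the paper's actual algorithms, including the Gaussian shaping gradients, operate over full computation graphs rather than this single-step example.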
Supplementary Material
Supplemental material is available for download (1.57 MB).
Published In
November 2022
39114 pages
ISBN: 9781713871088
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Copyright © 2022 Neural Information Processing Systems Foundation, Inc.
Publisher
Curran Associates Inc.
Red Hook, NY, United States
Publication History
Published: 03 April 2024