Showing 1–50 of 50 results for author: E, W

Searching in archive math.
  1. arXiv:2405.20763  [pdf, other]

    cs.LG math.OC stat.ML

    Improving Generalization and Convergence by Enhancing Implicit Regularization

    Authors: Mingze Wang, Haotian He, Jinbo Wang, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, Weinan E, Lei Wu

    Abstract: In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence. Specifically, IRE decouples the dynamics of flat and sharp directions, which boosts the sharpness reduction along flat directions while maintaining the training stability in sharp directions. We show that I…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 35 pages

  2. arXiv:2401.01220  [pdf, other]

    math.NA

    Solving multiscale dynamical systems by deep learning

    Authors: Zhi-Qin John Xu, Junjie Yao, Yuxiao Yi, Liangkai Hang, Weinan E, Yaoyu Zhang, Tianhan Zhang

    Abstract: Multiscale dynamical systems, modeled by high-dimensional stiff ordinary differential equations (ODEs) with wide-ranging characteristic timescales, arise across diverse fields of science and engineering, but their numerical solvers often encounter severe efficiency bottlenecks. This paper introduces a novel DeePODE method, which consists of a global multiscale sampling method and a fitting by deep…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 7 pages, 6 figures

  3. arXiv:2311.17749  [pdf, other]

    math.OC cs.RO

    Learning Free Terminal Time Optimal Closed-loop Control of Manipulators

    Authors: Wei Hu, Yue Zhao, Weinan E, Jiequn Han, Jihao Long

    Abstract: This paper presents a novel approach to learning free terminal time closed-loop control for robotic manipulation tasks, enabling dynamic adjustment of task duration and control inputs to enhance performance. We extend the supervised learning approach, namely solving selected optimal open-loop problems and utilizing them as training data for a policy network, to the free terminal time scenario. Thr…

    Submitted 29 November, 2023; originally announced November 2023.

  4. arXiv:2304.06913  [pdf, other]

    math.NA physics.comp-ph

    The Random Feature Method for Time-dependent Problems

    Authors: Jingrun Chen, Weinan E, Yixin Luo

    Abstract: We present a framework for solving time-dependent partial differential equations (PDEs) in the spirit of the random feature method. The numerical solution is constructed using a space-time partition of unity and random feature functions. Two different ways of constructing the random feature functions are investigated: feature functions that treat the spatial and temporal variables (STC) on the sam…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 26 pages, 12 figures

    MSC Class: 65M20; 65M55; 65M70

  5. arXiv:2209.04078  [pdf, other]

    math.OC

    Initial Value Problem Enhanced Sampling for Closed-Loop Optimal Control Design with Deep Neural Networks

    Authors: Xuanxi Zhang, Jihao Long, Wei Hu, Weinan E, Jiequn Han

    Abstract: Closed-loop optimal control design for high-dimensional nonlinear systems has been a long-standing challenge. Traditional methods, such as solving the associated Hamilton-Jacobi-Bellman equation, suffer from the curse of dimensionality. Recent literature proposed a new promising approach based on supervised learning, by leveraging powerful open-loop optimal control solvers to generate training dat…

    Submitted 9 July, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

  6. arXiv:2207.13380  [pdf, other]

    math.NA physics.comp-ph

    Bridging Traditional and Machine Learning-based Algorithms for Solving PDEs: The Random Feature Method

    Authors: Jingrun Chen, Xurong Chi, Weinan E, Zhouwang Yang

    Abstract: One of the oldest and most studied subjects in scientific computing is algorithms for solving partial differential equations (PDEs). A long list of numerical methods have been proposed and successfully used for various applications. In recent years, deep learning methods have shown their superiority for high-dimensional PDEs where traditional methods fail. However, for low dimensional problems, it…

    Submitted 27 July, 2022; originally announced July 2022.

  7. arXiv:2205.08622  [pdf, other]

    math.OC

    Solving optimal control of rigid-body dynamics with collisions using the hybrid minimum principle

    Authors: Wei Hu, Jihao Long, Yaohua Zang, Weinan E, Jiequn Han

    Abstract: Collisions are common in many dynamical systems with real applications. They can be formulated as hybrid dynamical systems with discontinuities automatically triggered when states transverse certain manifolds. We present an algorithm for the optimal control problem of such hybrid dynamical systems based on solving the equations derived from the hybrid minimum principle (HMP). The algorithm is an i…

    Submitted 10 May, 2023; v1 submitted 17 May, 2022; originally announced May 2022.

    MSC Class: 49Mxx

  8. arXiv:2205.07990  [pdf, other]

    math.OC

    Empowering Optimal Control with Machine Learning: A Perspective from Model Predictive Control

    Authors: Weinan E, Jiequn Han, Jihao Long

    Abstract: Solving complex optimal control problems has confronted computational challenges for a long time. Recent advances in machine learning have provided us with new opportunities to address these challenges. This paper takes model predictive control, a popular optimal control method, as the primary example to survey recent progress that leverages machine learning techniques to empower optimal control…

    Submitted 20 July, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

  9. arXiv:2203.06753  [pdf, other]

    math.OC

    A Machine Learning Enhanced Algorithm for the Optimal Landing Problem

    Authors: Yaohua Zang, Jihao Long, Xuanxi Zhang, Wei Hu, Weinan E, Jiequn Han

    Abstract: We propose a machine learning enhanced algorithm for solving the optimal landing problem. Using Pontryagin's minimum principle, we derive a two-point boundary value problem for the landing problem. The proposed algorithm uses deep learning to predict the optimal landing time and a space-marching technique to provide good initial guesses for the boundary value problem solver. The performance of the…

    Submitted 13 March, 2022; originally announced March 2022.

  10. arXiv:2201.03549  [pdf, other]

    physics.chem-ph cs.LG math.NA physics.comp-ph physics.flu-dyn

    A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics

    Authors: Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu

    Abstract: Machine learning has long been considered as a black box for predicting combustion chemical kinetics due to the extremely large number of parameters and the lack of evaluation standards and reproducibility. The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be. Sampling and preprocessing…

    Submitted 12 August, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

  11. arXiv:2201.02025  [pdf, other]

    cs.LG math.OC

    A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics

    Authors: Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang

    Abstract: A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics is proposed and validated using high-temperature auto-ignitions, perfectly stirred reactors (PSR), and one-dimensional freely propagating flames of n-heptane/air mixtures. The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a specie…

    Submitted 8 September, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

  12. MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

    Authors: Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

    Abstract: In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs. A MOD-Net is driven by a model to solve PDEs based on operator representation with regularization from data. For linear PDEs, we use a DNN to parameterize the Green's function and obtain the neural operator to approximate the solution according to the Green's method. To train the DNN…

    Submitted 28 December, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

  13. arXiv:2012.12654  [pdf]

    physics.chem-ph cs.LG math.NA

    A deep learning-based ODE solver for chemical kinetics

    Authors: Tianhan Zhang, Yaoyu Zhang, Weinan E, Yiguang Ju

    Abstract: Developing efficient and accurate algorithms for chemistry integration is a challenging task due to its strong stiffness and high dimensionality. The current work presents a deep learning-based numerical method called DeepCombustion0.0 to solve stiff ordinary differential equation systems. The homogeneous autoignition of DME/air mixture, including 54 species, is adopted as an example to illustrate…

    Submitted 23 November, 2020; originally announced December 2020.

  14. arXiv:2012.01484  [pdf, ps, other]

    math.AP cs.LG

    Some observations on high-dimensional partial differential equations with Barron data

    Authors: Weinan E, Stephan Wojtowytsch

    Abstract: We use explicit representation formulas to show that solutions to certain partial differential equations lie in Barron spaces or multilayer spaces if the PDE data lie in such function spaces. Consequently, these solutions can be represented efficiently using artificial neural networks, even in high dimension. Conversely, we present examples in which the solution fails to lie in the function space…

    Submitted 4 June, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    MSC Class: 68T07; 35C15; 65M80

  15. arXiv:2010.05627  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning

    Authors: Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E

    Abstract: It is not clear yet why ADAM-alike adaptive gradient algorithms suffer from worse generalization performance than SGD despite their faster training speed. This work aims to provide understandings on this generalization gap by analyzing their local convergence behaviors. Specifically, we observe the heavy tails of gradient noise in these algorithms. This motivates us to analyze these algorithms thr…

    Submitted 28 November, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020

  16. arXiv:2009.14596  [pdf, other]

    math.NA cs.LG stat.ML

    Machine Learning and Computational Mathematics

    Authors: Weinan E

    Abstract: Neural network-based machine learning is capable of approximating functions in very high dimension with unprecedented efficiency and accuracy. This has opened up many exciting new possibilities, not just in traditional areas of artificial intelligence, but also in scientific computing and computational science. At the same time, machine learning has also acquired the reputation of being a set of "…

    Submitted 23 September, 2020; originally announced September 2020.

    MSC Class: 68T07; 46E15; 26B35; 26B40

  17. arXiv:2009.13500  [pdf, ps, other]

    stat.ML cs.LG math.NA

    A priori estimates for classification problems using neural networks

    Authors: Weinan E, Stephan Wojtowytsch

    Abstract: We consider binary and multi-class classification problems using hypothesis classes of neural networks. For a given hypothesis class, we use Rademacher complexity estimates and direct approximation theorems to obtain a priori error estimates for regularized loss functionals.

    Submitted 28 September, 2020; originally announced September 2020.

    MSC Class: 68T07; 60-08

  18. arXiv:2009.10713  [pdf, other]

    cs.LG math.NA stat.ML

    Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't

    Authors: Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu

    Abstract: The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning. In the tradition of good old applied mathematics, we will not only give attention to rigorous mathematical results, but also the insight we have gained from careful numerical experiments as well as…

    Submitted 7 December, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: Review article. Feedback welcome

    MSC Class: 68T07 (primary); 26B40; 41A30; 35Q68

  19. arXiv:2009.07799  [pdf, other]

    cs.LG math.OC stat.ML

    On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis

    Authors: Zhong Li, Jiequn Han, Weinan E, Qianxiao Li

    Abstract: We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. Mathematically, the latter can be understood as a sequence of linear functionals. We prove a…

    Submitted 15 May, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Published version

    MSC Class: 68W25; 68T07; 37M10

    ACM Class: I.2.6

  20. arXiv:2009.02327  [pdf, other]

    math.DS cs.LG physics.comp-ph

    OnsagerNet: Learning Stable and Interpretable Dynamics using a Generalized Onsager Principle

    Authors: Haijun Yu, Xinyuan Tian, Weinan E, Qianxiao Li

    Abstract: We propose a systematic method for learning stable and physically interpretable dynamical models using sampled trajectory data from physical processes based on a generalized Onsager principle. The learned dynamics are autonomous ordinary differential equations parameterized by neural networks that retain clear physical structure information, such as free energy, diffusion, conservative motion and…

    Submitted 17 October, 2021; v1 submitted 6 September, 2020; originally announced September 2020.

    Comments: 29 pages, 19 figures

    MSC Class: 76E30; 34D20; 68T05/07; 82C35

    Journal ref: Phys. Rev. Fluids 6(11):114402, 2021

  21. Algorithms for Solving High Dimensional PDEs: From Nonlinear Monte Carlo to Machine Learning

    Authors: Weinan E, Jiequn Han, Arnulf Jentzen

    Abstract: In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in the case of some nonlinear Monte Carlo methods for…

    Submitted 11 September, 2020; v1 submitted 30 August, 2020; originally announced August 2020.

    MSC Class: 65C05; 65K10; 65M75; 90C06

    Journal ref: Nonlinearity 35 (2022) 278-310

  22. arXiv:2007.15623  [pdf, ps, other]

    stat.ML cs.LG math.FA

    On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics

    Authors: Weinan E, Stephan Wojtowytsch

    Abstract: We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable generalization properties. Functions in these spaces can…

    Submitted 30 July, 2020; originally announced July 2020.

    MSC Class: 68T07; 46E15; 26B35; 26B40

  23. arXiv:2006.14450  [pdf, other]

    cs.LG math.OC stat.ML

    The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

    Authors: Chao Ma, Lei Wu, Weinan E

    Abstract: A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons. It is found that for Xavier-like initialization, there are two distinctive phases in the dynamic behavior of GD in the under-parametr…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: 23 pages

  24. arXiv:2006.05982  [pdf, ps, other]

    stat.ML cs.LG math.AP math.FA

    Representation formulas and pointwise properties for Barron functions

    Authors: Weinan E, Stephan Wojtowytsch

    Abstract: We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions whose singular set is fractal or curved (for examp…

    Submitted 4 June, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    MSC Class: 68T07; 46E15; 26B35; 26B40

  25. arXiv:2006.02619  [pdf, other]

    physics.comp-ph cs.LG math.NA

    Integrating Machine Learning with Physics-Based Modeling

    Authors: Weinan E, Jiequn Han, Linfeng Zhang

    Abstract: Machine learning is poised as a very powerful tool that can drastically improve our ability to carry out scientific research. However, many issues need to be addressed before this becomes a reality. This article focuses on one particular issue of broad interest: How can we integrate machine learning with physics-based modeling to develop new interpretable and truly reliable physical models? After…

    Submitted 3 June, 2020; originally announced June 2020.

  26. arXiv:2005.10815  [pdf, other]

    cs.LG math.AP stat.ML

    Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective

    Authors: Stephan Wojtowytsch, Weinan E

    Abstract: We prove that the gradient descent training of a two-layer neural network on empirical or population risk may not decrease population risk at an order faster than $t^{-4/(d-2)}$ under mean field scaling. Thus gradient descent training for fitting reasonably smooth, but truly high-dimensional data may be subject to the curse of dimensionality. We present numerical evidence that gradient descent tra…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: 5 figures

    MSC Class: 68T07; 49Q22; 68W25

  27. arXiv:2005.10807  [pdf, ps, other]

    math.FA cs.LG stat.ML

    Kolmogorov Width Decay and Poor Approximators in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels

    Authors: Weinan E, Stephan Wojtowytsch

    Abstract: We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor $L^2$-approximators for the class of two-layer neural networks in high dimension, and that multi-layer networ…

    Submitted 2 October, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

    MSC Class: 68T07; 41A30; 41A65; 46E15; 46E22

  28. arXiv:2003.03672  [pdf, other]

    physics.comp-ph math.NA physics.flu-dyn stat.ML

    Machine learning based non-Newtonian fluid model with molecular fidelity

    Authors: Huan Lei, Lei Wu, Weinan E

    Abstract: We introduce a machine-learning-based framework for constructing continuum non-Newtonian fluid dynamics model directly from a micro-scale description. Dumbbell polymer solutions are used as examples to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the micro-scale polymer configurations and their macro-s…

    Submitted 23 October, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

    Journal ref: Phys. Rev. E 102, 043309 (2020)

  29. arXiv:1912.12777  [pdf, ps, other]

    math.NA math.OC stat.ML

    Machine Learning from a Continuous Viewpoint

    Authors: Weinan E, Chao Ma, Lei Wu

    Abstract: We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, in the spirit of classical numerical analysis. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled f…

    Submitted 26 September, 2020; v1 submitted 29 December, 2019; originally announced December 2019.

    Comments: published version

    Journal ref: Science China Mathematics (2020)

  30. arXiv:1912.06987  [pdf, ps, other]

    stat.ML cs.LG math.ST

    The Generalization Error of the Minimum-norm Solutions for Over-parameterized Neural Networks

    Authors: Weinan E, Chao Ma, Lei Wu

    Abstract: We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model. We proved that for all three models, the generalization error for the minimum-norm solution is comparable to the Monte Carlo rate, up to some logarithmic terms, as long as the mode…

    Submitted 28 January, 2021; v1 submitted 15 December, 2019; originally announced December 2019.

    Comments: Published version

    Journal ref: Pure and Applied Functional Analysis, Volume 5, Number 6, 1145-1460, 2020

  31. arXiv:1906.08039  [pdf, ps, other]

    cs.LG math.PR stat.ML

    The Barron Space and the Flow-induced Function Spaces for Neural Network Models

    Authors: Weinan E, Chao Ma, Lei Wu

    Abstract: One of the key issues in the analysis of machine learning models is to identify the appropriate function space and norm for the model. This is the set of functions endowed with a quantity which can control the approximation and estimation errors by a particular machine learning model. In this paper, we address this issue for two representative neural network models: the two-layer networks and the…

    Submitted 27 March, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

  32. arXiv:1904.05263  [pdf, other]

    cs.LG math.OC stat.ML

    Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections

    Authors: Weinan E, Chao Ma, Qingcan Wang, Lei Wu

    Abstract: The behavior of the gradient descent (GD) algorithm is analyzed for a deep neural network model with skip-connections. It is proved that in the over-parametrized regime, for a suitable initialization, with high probability GD can find a global minimum exponentially fast. Generalization error estimates along the GD path are also established. As a consequence, it is shown that when the target functi…

    Submitted 14 April, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: 29 pages, 4 figures

  33. arXiv:1904.04326  [pdf, other]

    cs.LG math.OC stat.ML

    A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

    Authors: Weinan E, Chao Ma, Lei Wu

    Abstract: A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero…

    Submitted 20 February, 2020; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: Published version

    MSC Class: 41A99; 49M99

    Journal ref: Science China Mathematics (2020)

  34. arXiv:1810.06397  [pdf, other]

    stat.ML cs.LG math.ST

    A Priori Estimates of the Population Risk for Two-layer Neural Networks

    Authors: Weinan E, Chao Ma, Lei Wu

    Abstract: New estimates for the population risk are established for two-layer neural networks. These estimates are nearly optimal in the sense that the error rates scale in the same way as the Monte Carlo error rates. They are equally effective in the over-parametrized regime when the network size is much larger than the size of the dataset. These new estimates are a priori in nature in the sense that the b…

    Submitted 20 February, 2020; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: Published version

    MSC Class: 41A46; 41A63; 62J02; 65D05

    Journal ref: Communications in Mathematical Sciences, Volume 17(2019)

  35. arXiv:1809.10188  [pdf, other]

    cs.LG cond-mat.stat-mech math.DS stat.ML

    Monge-Ampère Flow for Generative Modeling

    Authors: Linfeng Zhang, Weinan E, Lei Wang

    Abstract: We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory. The generative map from the latent space to the data space follows a dynamical system, where a learnable potential function guides a compressible fluid to flow towards the target density distribution. Training of the model am…

    Submitted 26 September, 2018; originally announced September 2018.

  36. A Mean-Field Optimal Control Formulation of Deep Learning

    Authors: Weinan E, Jiequn Han, Qianxiao Li

    Abstract: Recent work linking deep neural networks and dynamical systems opened up new avenues to analyze deep learning. In particular, it is observed that new insights can be obtained by recasting deep learning as an optimal control problem on difference or differential equations. However, the mathematical aspects of such a formulation have not been systematically explored. This paper introduces the mathem…

    Submitted 3 July, 2018; originally announced July 2018.

    Comments: 44 pages

    Journal ref: Research in the Mathematical Sciences, 6:10 (2019)

  37. arXiv:1709.05963  [pdf, other]

    math.NA cs.LG cs.NE math.PR stat.ML

    Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations

    Authors: Christian Beck, Weinan E, Arnulf Jentzen

    Abstract: High-dimensional partial differential equations (PDE) appear in a number of models from the financial industry, such as in derivative pricing models, credit valuation adjustment (CVA) models, or portfolio optimization models. The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio. Moreover, such PDEs are often fully nonlinea…

    Submitted 18 September, 2017; originally announced September 2017.

    Comments: 56 pages, 12 figures

    MSC Class: 65C99; 65M99; 60H30; 65-05

    Journal ref: J. Nonlinear Sci. 29, 1563-1619 (2019)

  38. On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations

    Authors: Weinan E, Martin Hutzenthaler, Arnulf Jentzen, Thomas Kruse

    Abstract: Parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) are key ingredients in a number of models in physics and financial engineering. In particular, parabolic PDEs and BSDEs are fundamental tools in the state-of-the-art pricing and hedging of financial derivatives. The PDEs and BSDEs appearing in such applications are often high-dimensional and nonl…

    Submitted 10 August, 2017; originally announced August 2017.

    Journal ref: J. Sci. Comput. 79, 1534-1571 (2019)

  39. arXiv:1707.02568  [pdf, other]

    math.NA cs.LG math.OC math.PR

    Solving high-dimensional partial differential equations using deep learning

    Authors: Jiequn Han, Arnulf Jentzen, Weinan E

    Abstract: Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic…

    Submitted 3 July, 2018; v1 submitted 9 July, 2017; originally announced July 2017.

    Comments: 13 pages, 6 figures

    Journal ref: Proceedings of the National Academy of Sciences, 115(34), 8505-8510 (2018)

  40. arXiv:1706.04702  [pdf, other]

    math.NA cs.LG cs.NE math.PR stat.ML

    Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations

    Authors: Weinan E, Jiequn Han, Arnulf Jentzen

    Abstract: We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution…

    Submitted 14 June, 2017; originally announced June 2017.

    Comments: 39 pages, 15 figures

    MSC Class: 65M75; 60H35; 65C30

    Journal ref: Commun. Math. Stat. 5, 349-380 (2017)

  41. arXiv:1611.07422  [pdf, other]

    cs.LG cs.AI cs.NE math.OC stat.ML

    Deep Learning Approximation for Stochastic Control Problems

    Authors: Jiequn Han, Weinan E

    Abstract: Many real world stochastic control problems suffer from the "curse of dimensionality". To overcome this difficulty, we develop a deep learning approach that directly solves high-dimensional stochastic control problems based on Monte-Carlo sampling. We approximate the time-dependent controls as feedforward neural networks and stack these networks together through model dynamics. The objective funct…

    Submitted 1 November, 2016; originally announced November 2016.

  42. Multilevel Picard iterations for solving smooth semilinear parabolic heat equations

    Authors: Weinan E, Martin Hutzenthaler, Arnulf Jentzen, Thomas Kruse

    Abstract: We introduce a new family of numerical algorithms for approximating solutions of general high-dimensional semilinear parabolic partial differential equations at single space-time points. The algorithm is obtained through a delicate combination of the Feynman-Kac and the Bismut-Elworthy-Li formulas, and an approximate decomposition of the Picard fixed-point iteration with multilevel accuracy. The a…

    Submitted 22 February, 2019; v1 submitted 12 July, 2016; originally announced July 2016.

    Journal ref: Partial Differential Equations and Applications 2 (2021), no. 80

  43. arXiv:1511.02975  [pdf, other]

    math.OC

    Noisy Hegselmann-Krause Systems: Phase Transition and the 2R-Conjecture

    Authors: Chu Wang, Qianxiao Li, Weinan E, Bernard Chazelle

    Abstract: The classic Hegselmann-Krause (HK) model for opinion dynamics consists of a set of agents on the real line, each one instructed to move, at every time step, to the mass center of all the agents within a fixed distance R. In this work, we investigate the effects of noise in the continuous-time version of the model as described by its mean-field limiting Fokker-Planck equation. In the presence of…

    Submitted 24 November, 2015; v1 submitted 9 November, 2015; originally announced November 2015.

  44. arXiv:1211.1446  [pdf, ps, other]

    physics.comp-ph math.NA

    Efficient iterative method for solving the Dirac-Kohn-Sham density functional theory

    Authors: Lin Lin, Sihong Shao, Weinan E

    Abstract: We present for the first time an efficient iterative method to directly solve the four-component Dirac-Kohn-Sham (DKS) density functional theory. Due to the existence of the negative energy continuum in the DKS operator, the existing iterative techniques for solving the Kohn-Sham systems cannot be efficiently applied to solve the DKS systems. The key component of our method is a novel filtering st…

    Submitted 21 March, 2013; v1 submitted 6 November, 2012; originally announced November 2012.

    Comments: 31 pages, 5 figures

    Journal ref: Journal of Computational Physics 245 (2013) 205-217

  45. arXiv:1102.5545  [pdf, ps, other]

    math-ph math.AP

    Cauchy-Born rule and spin density wave for the spin-polarized Thomas-Fermi-Dirac-von Weizsacker model

    Authors: Weinan E, Jianfeng Lu

    Abstract: The electronic structure (electron charges and spins) of a perfect crystal under external magnetic field is analyzed using the spin-polarized Thomas-Fermi-Dirac-von Weizsacker model. An extension of the classical Cauchy-Born rule for crystal lattices is established for the electronic structure under sharp stability conditions on charge density wave and spin density wave. A Landau-Lifschitz type mi…

    Submitted 27 February, 2011; originally announced February 2011.

    Comments: 24 pages; dated June 17, 2010

  46. The Gentlest Ascent Dynamics

    Authors: Weinan E, Xiang Zhou

    Abstract: Dynamical systems that describe the escape from the basins of attraction of stable invariant sets are presented and analyzed. It is shown that the stable fixed points of such dynamical systems are the index-1 saddle points. Generalizations to high index saddle points are discussed. Both gradient and non-gradient systems are considered. Preliminary results on the nature of the dynamical behavior ar…

    Submitted 4 February, 2011; v1 submitted 29 October, 2010; originally announced November 2010.

  47. arXiv:0812.4352  [pdf, ps, other]

    cond-mat.mtrl-sci math.NA

    Multipole Representation of the Fermi Operator with Application to the Electronic Structure Analysis of Metallic Systems

    Authors: Lin Lin, Jianfeng Lu, Roberto Car, Weinan E

    Abstract: We propose a multipole representation of the Fermi-Dirac function and the Fermi operator, and use this representation to develop algorithms for electronic structure analysis of metallic systems. The new algorithm is quite simple and efficient. Its computational cost scales logarithmically with $\beta\Delta\epsilon$ where $\beta$ is the inverse temperature, and $\Delta\epsilon$ is the width of the spectrum of the discreti…

    Submitted 23 December, 2008; originally announced December 2008.

    Comments: 10 pages, 3 figures, 3 tables

    Journal ref: Phys. Rev. B, 79, 115133, 2009

  48. arXiv:0806.1621  [pdf, ps, other]

    math.NA

    Some Critical Issues for the "Equation-Free" Approach to Multiscale Modeling

    Authors: Weinan E, Eric Vanden-Eijnden

    Abstract: The "equation-free" approach has been proposed in recent years as a general framework for developing multiscale methods to efficiently capture the macroscale behavior of a system using only the microscale models. In this paper, we take a close look at some of the algorithms proposed under the "equation-free" umbrella, the projective integrators and the patch dynamics. We discuss some very simp…

    Submitted 10 June, 2008; originally announced June 2008.

    MSC Class: 65L99; 65M99

  49. arXiv:math/0212415  [pdf, ps, other]

    math.NA

    Energy landscapes and rare events

    Authors: Weinan E, Weiqing Ren, Eric Vanden-Eijnden

    Abstract: Many problems in physics, material sciences, chemistry and biology can be abstractly formulated as a system that navigates over a complex energy landscape of high or infinite dimensions. Well-known examples include phase transitions of condensed matter, conformational changes of biopolymers, and chemical reactions. The energy landscape typically exhibits multiscale features, giving rise to the m…

    Submitted 30 November, 2002; originally announced December 2002.

    Report number: ICM-2002

    MSC Class: 60-08; 60F10; 65C

    Journal ref: Proceedings of the ICM, Beijing 2002, vol. 1, 621--630

  50. arXiv:math/0005306  [pdf, ps, other]

    math.AP

    Invariant measures for Burgers equation with stochastic forcing

    Authors: Weinan E, K. M. Khanin, A. E. Mazel, Ya. G. Sinai

    Abstract: In this paper we study the following Burgers equation du/dt + d/dx (u^2/2) = epsilon d^2u/dx^2 + f(x,t) where f(x,t)=dF/dx(x,t) is a random forcing function, which is periodic in x and white noise in t. We prove the existence and uniqueness of an invariant measure by establishing a "one force, one solution" principle, namely that for almost every realization of the force, there is a uniqu…

    Submitted 30 April, 2000; originally announced May 2000.

    Comments: 84 pages, published version, abstract added in migration

    Report number: Annals migration 4-2001

    Journal ref: Ann. of Math. (2) 151 (2000), no. 3, 877-960