
    David Saad

    The choice of training parameters and training rules is of great significance in on-line training of neural networks. We employ a variational method for determining globally optimal learning parameters and learning rules for on-line gradient descent training of multi-layer neural networks. The approach is based on maximizing the total decrease in generalization error over a fixed time-window, using a statistical mechanics description of the learning process. The method is employed for obtaining optimal learning rates in both ...
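    The on-line scenario described above can be made concrete with a small numerical sketch. The Python code below (an illustration only, not the paper's variational calculation) trains a two-layer soft committee machine "student" on examples from a matching "teacher" by on-line gradient descent, using an assumed learning-rate schedule that is constant at first and then decays as 1/t; all sizes, constants and the schedule itself are arbitrary choices made for the example.

        import numpy as np

        rng = np.random.default_rng(0)
        N, K = 100, 3                                        # input dimension and hidden units (illustrative)
        g = np.tanh                                          # hidden-unit activation

        B = rng.standard_normal((K, N)) / np.sqrt(N)         # "teacher" weights
        W = 0.5 * rng.standard_normal((K, N)) / np.sqrt(N)   # "student" weights, random start

        def net(weights, X):
            # soft committee machine: sum of hidden activations, unit hidden-to-output weights
            return g(X @ weights.T).sum(axis=-1)

        def gen_error(W, B, n_test=2000):
            X = rng.standard_normal((n_test, N))
            return 0.5 * np.mean((net(W, X) - net(B, X)) ** 2)

        eta0, t_switch, steps = 1.0, 2000, 20000             # assumed schedule: constant, then 1/t decay
        for t in range(1, steps + 1):
            x = rng.standard_normal(N)
            h = W @ x
            delta = g(h).sum() - net(B, x)                   # student output minus teacher output
            grad = np.outer(delta * (1 - g(h) ** 2), x)      # gradient of 0.5 * delta**2 w.r.t. W
            eta = eta0 if t < t_switch else eta0 * t_switch / t
            W -= (eta / N) * grad
            if t % 5000 == 0:
                print(t, gen_error(W, B))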
    Using a statistical mechanical formalism we calculate the evidence, generalisation error and consistency measure for a linear perceptron trained and tested on a set of examples generated by a non-linear teacher. The teacher is said to be unrealisable ...
    We show that in supervised learning from a particular data set Bayesian model selection, based on the evidence, does not optimise generalization performance even for a learnable linear problem. This is achieved by examining the finite size effects in hyperparameter ...
    Many natural, technological and social systems are inherently not in equilibrium. We show, by detailed analysis of exemplar models, the emergence of equilibrium-like behavior in localized or nonlocalized domains within non-equilibrium systems as conjectured in some real systems. Equilibrium domains are shown to emerge either abruptly or gradually depending on the system parameters and disappear, becoming indistinguishable from the remainder of the system for other parameter values. The models studied, defined on densely and sparsely connected networks, provide a useful representation of many real systems.
    Neural networks are the subject of much current research regarding their ability to learn nontrivial mappings from examples (see, for example, [1]). Specifically, we will consider a learning scenario whereby a feed-forward neural network model, the "student," emulates an unknown mapping, the "teacher," given examples of the teacher mapping (in this case another feed-forward neural network) which may be corrupted by noise. This provides a rather general learning scenario since both the student and teacher can represent a very ...
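    A minimal sketch of this student-teacher setup, assuming a two-layer teacher whose outputs are corrupted by additive Gaussian noise and a student with a different number of hidden units; the generalization error is estimated by Monte Carlo against the clean teacher. The sizes and noise level are illustrative, and the on-line training loop from the sketch further above could be reused to train the student.

        import numpy as np

        rng = np.random.default_rng(1)
        N, M, K, sigma = 50, 2, 4, 0.1                  # input dim, teacher/student hidden units, noise level

        B = rng.standard_normal((M, N)) / np.sqrt(N)    # "teacher" weights (unknown to the student)
        W = rng.standard_normal((K, N)) / np.sqrt(N)    # "student" weights (untrained here)

        def net(weights, X):
            # two-layer soft committee output for a batch of inputs
            return np.tanh(X @ weights.T).sum(axis=-1)

        # training examples: teacher outputs corrupted by additive Gaussian noise
        X_train = rng.standard_normal((1000, N))
        y_train = net(B, X_train) + sigma * rng.standard_normal(1000)
        print("example noisy targets:", y_train[:3])

        # generalization error: squared deviation from the clean teacher on fresh inputs
        X_fresh = rng.standard_normal((5000, N))
        e_g = 0.5 * np.mean((net(W, X_fresh) - net(B, X_fresh)) ** 2)
        print("generalization error of the untrained student:", e_g)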
    Natural gradient descent is an on-line variable-metric optimization algorithm which utilizes an underlying Riemannian parameter space. We analyze the dynamics of natural gradient descent beyond the asymptotic regime by employing an exact statistical mechanics description of learning in two-layer feed-forward neural networks. For a realizable learning scenario we find significant improvements over standard gradient descent for both the transient and asymptotic stages of learning, with a slower power law increase in learning ...
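    The natural gradient update itself is easy to state in code. The sketch below applies it to plain logistic regression, where the Fisher information matrix has the closed form X^T diag(p(1-p)) X, rather than to the two-layer networks analysed in the paper; the data, the small ridge term and the step size are illustrative.

        import numpy as np

        rng = np.random.default_rng(2)
        n, d = 500, 10
        X = rng.standard_normal((n, d))
        w_true = rng.standard_normal(d)
        y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

        def sigmoid(z):
            return 1 / (1 + np.exp(-z))

        w = np.zeros(d)
        eta, ridge = 1.0, 1e-6                           # small ridge term keeps the Fisher matrix invertible
        for step in range(20):
            p = sigmoid(X @ w)
            grad = X.T @ (p - y) / n                     # gradient of the mean log-loss
            fisher = (X.T * (p * (1 - p))) @ X / n       # Fisher information: X^T diag(p(1-p)) X / n
            w -= eta * np.linalg.solve(fisher + ridge * np.eye(d), grad)   # natural gradient step

        p = sigmoid(X @ w)
        print("training accuracy:", np.mean((p > 0.5) == (y > 0.5)))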
    Many natural, technological and social systems are inherently not in equilibrium. We show, by detailed analysis of exemplar models, the emergence of equilibriumlike behavior in localized or nonlocalized domains within nonequilibrium Ising spin systems. Equilibrium domains are shown to emerge either abruptly or gradually depending on the system parameters and disappear, becoming indistinguishable from the remainder of the system for other parameter values.
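    A generic and much simpler illustration of a nonequilibrium Ising system with coexisting temperature domains is a ring of spins whose two halves are updated by heat-bath (Glauber) dynamics at different bath temperatures; the nearest-neighbour correlation measured deep inside each half can then be compared with the equilibrium value tanh(J/T) of an isolated chain at that temperature. This is not one of the exemplar models of the paper, and all parameters are illustrative.

        import numpy as np

        rng = np.random.default_rng(3)
        L, J = 200, 1.0
        T_cold, T_hot = 1.0, 3.0
        T = np.where(np.arange(L) < L // 2, T_cold, T_hot)   # each half of the ring has its own heat bath
        s = rng.choice([-1, 1], size=L)

        def heat_bath_sweep(s):
            for _ in range(L):
                i = rng.integers(L)
                h = J * (s[(i - 1) % L] + s[(i + 1) % L])    # local field from the two neighbours
                s[i] = 1 if rng.random() < 1 / (1 + np.exp(-2 * h / T[i])) else -1

        burn, meas = 1000, 3000
        acc = np.zeros(L)
        for sweep in range(burn + meas):
            heat_bath_sweep(s)
            if sweep >= burn:
                acc += s * np.roll(s, -1)                    # nearest-neighbour bond products
        corr = acc / meas

        interior_cold = slice(20, L // 2 - 20)               # stay away from the two domain boundaries
        interior_hot = slice(L // 2 + 20, L - 20)
        print("cold half:", corr[interior_cold].mean(), "vs tanh(J/T) =", np.tanh(J / T_cold))
        print("hot half: ", corr[interior_hot].mean(), "vs tanh(J/T) =", np.tanh(J / T_hot))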
    We examine the fluctuations in the test error induced by random, finite, training and test sets for the linear perceptron of input dimension n with a spherically constrained weight vector. This variance enables us to address such issues as the partitioning of a data set into a test and training set. We find that the optimal assignment of the test set size scales with n^{2/3}.
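    The partitioning question can be explored numerically. The sketch below draws a fixed data set from a noisy linear teacher, repeatedly splits it at several test-set sizes (one of them chosen of order n^{2/3}, echoing the scaling quoted above), refits a norm-constrained linear perceptron on each training part, and reports the mean and spread of the resulting test errors; the constraint handling, the constants and the estimator are illustrative guesses rather than the paper's construction.

        import numpy as np

        rng = np.random.default_rng(4)
        n, P, noise = 30, 300, 0.2                            # input dimension, total examples, label noise
        w_teacher = rng.standard_normal(n)
        w_teacher *= np.sqrt(n) / np.linalg.norm(w_teacher)   # spherical constraint |w|^2 = n

        X = rng.standard_normal((P, n)) / np.sqrt(n)
        y = X @ w_teacher + noise * rng.standard_normal(P)

        def fit_spherical(X_tr, y_tr):
            w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
            return w * np.sqrt(n) / np.linalg.norm(w)         # project the estimate back onto the sphere

        def split_errors(test_size, n_splits=500):
            errs = []
            for _ in range(n_splits):
                idx = rng.permutation(P)
                test, train = idx[:test_size], idx[test_size:]
                w = fit_spherical(X[train], y[train])
                errs.append(np.mean((X[test] @ w - y[test]) ** 2))
            return np.mean(errs), np.std(errs)

        for test_size in (int(n ** (2 / 3)), 30, 100):        # first choice is of order n^(2/3)
            print(test_size, split_errors(test_size))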
    Excerpt (David Barber and David Saad), Figure 1: A sphere of radius A. The shaded region represents the version space, Θ = {θ ∈ [0.4, 0.6], φ ∈ [0, 2π]}. Making Θ smaller by pushing the inner boundary toward the outer boundary does not result in a reduction in generalization error ...
    Within a Bayesian framework we consider a system that learns from examples. In particular, using a statistical mechanical formalism, we calculate the evidence and two performance measures, namely the generalization error and the consistency measure, for a linear perceptron trained and tested on a set of examples generated by a nonlinear teacher. The teacher is said to be unrealizable because the student can never model it without error. In fact, our model allows us to interpolate between the known linear case and an unrealizable, nonlinear, case. A comparison of the hyperparameters which maximize the evidence with those that optimize the performance measures reveals that, when the student and teacher are fundamentally mismatched, the evidence procedure is a misleading guide to optimizing the performance measures considered.
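    The comparison between evidence maximization and generalization can be reproduced in a small experiment, assuming a Bayesian linear model with a Gaussian prior (precision alpha) and Gaussian noise (precision beta) trained on data from a nonlinear teacher; the tanh teacher, the fixed beta and the grid of alpha values are illustrative choices, not the paper's analytical treatment.

        import numpy as np

        rng = np.random.default_rng(5)
        n, P, beta = 20, 100, 25.0                        # input dim, training examples, fixed noise precision
        w0 = rng.standard_normal(n) / np.sqrt(n)

        def teacher(X):                                   # nonlinear ("unrealizable") teacher
            return np.tanh(X @ w0)

        X = rng.standard_normal((P, n))
        y = teacher(X) + rng.standard_normal(P) / np.sqrt(beta)
        X_test = rng.standard_normal((5000, n))
        y_test = teacher(X_test)                          # generalization measured against the clean teacher

        def evidence_and_gen_error(alpha):
            A = alpha * np.eye(n) + beta * X.T @ X
            m = beta * np.linalg.solve(A, X.T @ y)        # posterior mean of the weights
            E = 0.5 * beta * np.sum((y - X @ m) ** 2) + 0.5 * alpha * m @ m
            _, logdet = np.linalg.slogdet(A)
            log_ev = (0.5 * n * np.log(alpha) + 0.5 * P * np.log(beta)
                      - E - 0.5 * logdet - 0.5 * P * np.log(2 * np.pi))
            gen = np.mean((X_test @ m - y_test) ** 2)
            return log_ev, gen

        alphas = np.logspace(-2, 3, 60)
        results = [evidence_and_gen_error(a) for a in alphas]
        print("alpha maximizing the evidence:  ", alphas[np.argmax([r[0] for r in results])])
        print("alpha minimizing the test error:", alphas[np.argmin([r[1] for r in results])])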
    We show that in supervised learning from a supplied data set Bayesian model selection, based on the evidence, does not optimize generalization performance even for a learnable linear problem. This is demonstrated by examining the finite size effects in ...

    And 58 more