6 days ago · Results show that signGD-based neuron approximates noticeably faster than the subgradient-based neuron. Although the normalization helps, it is auxiliary for ...
9 hours ago · Specifically, we review papers that attempt to answer “how the neural network trained via gradient-based methods finds the solution that can generalize well on ...
6 days ago · In this paper, we study the effect of normalization layers on the performance of DPSGD. ... In particular, we propose a novel method for integrating batch ...
5 days ago · Finally, we wanted to study how many optimal filters (i.e., units in the first hidden layer) are necessary to achieve acceptable classification accuracy.
3 days ago · In order to improve the recognition effect of rolling bearing faults, this paper uses an improved genetic algorithm to optimize the BP neural network. In order ...
5 days ago · the model optimized by some randomized algorithm, e.g., stochastic gradient descent (SGD). Then, the generalization gap ...
1 day ago · In the gradient descent algorithm, the learning rate controls the step size of gradient updates and adapts automatically throughout the learning process. In ...
7 days ago · the study of forgetting in the Continual Pre-Training scenario by disentangling the effect of 3 main components: the input modality, the model architecture and ...
6 days ago · ... gradient descent based method to solve it. We evaluate Mudjacking on both vision and language foundation models, eleven benchmark datasets, five existing ...
6 days ago · Experimental results demonstrate that the proposed method overcomes the gradient conflict issue of the conventional MTL methods with constant weights (CW) and ...