Existing adaptive gradient descent optimization algorithms, such as adaptive gradient (Adagrad), adaptive moment estimation (Adam), and root mean square prop (RMSprop), increase convergence speed by dynamically adjusting the learning rate. However, in some application scenarios the generalization ability of these adaptive gradient descent algorithms is inferior to that of stochastic gradient descent (SGD). To address this problem, several improved algorithms have recently been proposed, including adaptive mean square gradient (AMSGrad) and AdaBound. In this paper, we present new variants of AdaBound and AMSBound, called GWDC (Adam with weighted gradient and dynamic bound of learning rate) and AMSGWDC (AMSGrad with weighted gradient and dynamic bound of learning rate), respectively. The proposed algorithms are built on a dynamic decay rate method that assigns more weight to recent gradients in the first moment estimation. A theoretical proof of the convergence of the proposed algorithms is also presented. To verify the performance of GWDC and AMSGWDC, we compare them with other popular optimization algorithms on three well-known machine learning models, i.e., a feedforward neural network, a convolutional neural network, and a gated recurrent unit network. Experimental results show that the proposed algorithms generalize better than the other optimization algorithms on test data; in addition, they converge faster.
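The abstract does not give the exact update rules, so the following is a minimal sketch of the two ideas it names: a step-dependent (dynamic) decay rate that lets the first moment emphasize recent gradients, and an AdaBound-style dynamic bound that clips the per-parameter learning rate toward a final SGD-like rate. The decay schedule `beta1_t` and the function name `gwdc_like_step` are illustrative assumptions, not the authors' formulation; the bound functions follow the published AdaBound paper.

```python
import numpy as np


def gwdc_like_step(param, grad, m, v, t,
                   lr=1e-3, beta1=0.9, beta2=0.999,
                   final_lr=0.1, gamma=1e-3, eps=1e-8):
    """One illustrative optimizer step on a NumPy array `param`.

    m, v : running first/second moment estimates (same shape as param)
    t    : 1-based step counter
    Bias correction is omitted to keep the sketch short.
    """
    # Hypothetical dynamic decay rate: shrinks as t grows, so the first
    # moment forgets old gradients faster and weights recent ones more.
    beta1_t = beta1 / (1.0 + gamma * t)

    # Exponential moving averages of the gradient and squared gradient.
    m = beta1_t * m + (1.0 - beta1_t) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # AdaBound-style dynamic bounds: both converge to final_lr, so the
    # update gradually behaves like SGD with learning rate final_lr.
    lower_t = final_lr * (1.0 - 1.0 / (gamma * t + 1.0))
    upper_t = final_lr * (1.0 + 1.0 / (gamma * t))

    # Clip the per-parameter adaptive learning rate into the dynamic bound.
    effective_lr = np.clip(lr / (np.sqrt(v) + eps), lower_t, upper_t)

    param = param - effective_lr * m
    return param, m, v


# Usage sketch: minimize f(x) = ||x||^2 from a random start.
x = np.random.randn(5)
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 2001):
    grad = 2.0 * x  # gradient of ||x||^2
    x, m, v = gwdc_like_step(x, grad, m, v, t)
print("final squared norm:", float(np.dot(x, x)))
```

Early in training the bound interval is wide and the behavior is Adam-like; as `t` grows the interval tightens around `final_lr`, which is the mechanism AdaBound-family methods use to recover SGD-like generalization.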