Abstract
Two types of zeroth-order stochastic algorithms have recently been designed for nonconvex optimization respectively based on the first-order techniques SVRG and SARAH/SPIDER. This paper addresses several important issues that are still open in these methods. First, all existing SVRGtype zeroth-order algorithms suffer from worse function query complexities than either zerothorder gradient descent (ZO-GD) or stochastic gradient descent (ZO-SGD). In this paper, we propose a new algorithm ZO-SVRG-Coord-Rand and develop a new analysis for an existing ZO-SVRGCoord algorithm proposed in Liu et al. 2018b, and show that both ZO-SVRG-Coord-Rand and ZOSVRG-Coord (under our new analysis) outperform other exiting SVRG-type zeroth-order methods as well as ZO-GD and ZO-SGD. Second, the existing SPIDER-type algorithm SPIDER-SZO (Fang et al., 2018) has superior theoretical performance, but suffers from the generation of a large number of Gaussian random variables as well as a √ɛ-level stepsize in practice. In this paper, we develop a new algorithm ZO-SPIDER-Coord, which is free from Gaussian variable generation and allows a large constant stepsize while maintaining the same convergence rate and query complexity, and we further show that ZO-SPIDER-Coord automatically achieves a linear convergence rate as the iterate enters into a local PL region without restart and algorithmic modification.
| Original language | English |
|---|---|
| Pages (from-to) | 3100-3109 |
| Number of pages | 10 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 97 |
| State | Published - 2019 |
| Event | 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States Duration: Jun 9 2019 → Jun 15 2019 |
Fingerprint
Dive into the research topics of 'Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver