Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization

  • Kaiyi Ji
  • Zhe Wang
  • Yi Zhou
  • Yingbin Liang
  • Ohio State University
  • Duke University

Research output: Contribution to journal › Conference article › peer-review

48 Scopus citations

Abstract

Two types of zeroth-order stochastic algorithms have recently been designed for nonconvex optimization, based respectively on the first-order techniques SVRG and SARAH/SPIDER. This paper addresses several important issues that are still open in these methods. First, all existing SVRG-type zeroth-order algorithms suffer from worse function query complexities than either zeroth-order gradient descent (ZO-GD) or stochastic gradient descent (ZO-SGD). In this paper, we propose a new algorithm ZO-SVRG-Coord-Rand and develop a new analysis for an existing ZO-SVRG-Coord algorithm proposed in Liu et al. (2018b), and show that both ZO-SVRG-Coord-Rand and ZO-SVRG-Coord (under our new analysis) outperform other existing SVRG-type zeroth-order methods as well as ZO-GD and ZO-SGD. Second, the existing SPIDER-type algorithm SPIDER-SZO (Fang et al., 2018) has superior theoretical performance, but suffers from the generation of a large number of Gaussian random variables as well as a √ɛ-level stepsize in practice. In this paper, we develop a new algorithm ZO-SPIDER-Coord, which is free from Gaussian variable generation and allows a large constant stepsize while maintaining the same convergence rate and query complexity, and we further show that ZO-SPIDER-Coord automatically achieves a linear convergence rate as the iterate enters a local PL region, without restart or algorithmic modification.
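
To illustrate what "free from Gaussian variable generation" can look like, the following is a minimal sketch of a generic coordinate-wise zeroth-order gradient estimator, which uses only function queries along the coordinate directions. It is not the paper's ZO-SPIDER-Coord or ZO-SVRG-Coord algorithm; the function names, the smoothing parameter `delta`, and the quadratic test objective are assumptions chosen for illustration.

```python
import numpy as np


def coord_grad_estimate(f, x, delta=1e-5):
    """Estimate grad f(x) by central differences along each coordinate.

    Uses 2 * d function queries for a d-dimensional input and draws no
    random variables. `delta` is a hypothetical smoothing parameter.
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0  # i-th standard basis vector
        grad[i] = (f(x + delta * e) - f(x - delta * e)) / (2.0 * delta)
    return grad


def quadratic(x):
    """Illustrative objective f(x) = 0.5 * ||x||^2, whose gradient is x."""
    return 0.5 * float(np.dot(x, x))


x0 = np.array([1.0, -2.0, 3.0])
g = coord_grad_estimate(quadratic, x0)
```

For this quadratic the central difference recovers the gradient up to floating-point error, so `g` should be close to `x0`. The price of the deterministic estimate is a query count that grows linearly with the dimension.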

Original language: English
Pages (from-to): 3100-3109
Number of pages: 10
Journal: Proceedings of Machine Learning Research
Volume: 97
State: Published - 2019
Event: 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States
Duration: Jun 9 2019 → Jun 15 2019
