CYCLICAL STOCHASTIC GRADIENT MCMC FOR BAYESIAN DEEP LEARNING

  • Ruqi Zhang
  • Chunyuan Li
  • Jianyi Zhang
  • Changyou Chen
  • Andrew Gordon Wilson

  • Cornell University
  • Microsoft USA
  • Duke University
  • New York University

Research output: Contribution to conference › Paper › peer-review

117 Scopus citations

Abstract

The posteriors over neural network weights are high-dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We prove non-asymptotic convergence of our proposed algorithm. Moreover, we provide extensive experimental results, including ImageNet, to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
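
To make the idea of a cyclical stepsize schedule concrete, here is a minimal sketch in Python. The abstract does not give the schedule's formula, so the cosine shape used here is an assumption, as are the function names, the 1-indexed iteration counter, and the `beta` threshold that splits each cycle into a large-step exploration phase and a small-step sampling phase. It illustrates the general pattern only and is not the authors' implementation.

```python
import math


def cyclical_stepsize(k, total_iters, num_cycles, alpha0):
    """Stepsize at iteration k (1-indexed) under an assumed cosine cycle.

    Each cycle restarts at alpha0 (large steps, intended to reach new modes)
    and decays towards zero (small steps, intended to characterize a mode).
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    pos = (k - 1) % cycle_len  # position inside the current cycle
    return alpha0 / 2.0 * (math.cos(math.pi * pos / cycle_len) + 1.0)


def in_sampling_stage(k, total_iters, num_cycles, beta=0.8):
    """True once the completed fraction of the current cycle reaches beta.

    beta is a hypothetical threshold chosen for illustration.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    return ((k - 1) % cycle_len) / cycle_len >= beta


# Example: 100 iterations split into 4 cycles of 25 iterations each.
schedule = [cyclical_stepsize(k, 100, 4, 0.1) for k in range(1, 101)]
```

In a sampler built on this sketch, the stepsize would be fed to an SG-MCMC update at every iteration, and samples would be collected only while `in_sampling_stage` returns true, that is, near the end of each cycle when the steps are small.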

Original language: English
State: Published - 2020
Event: 8th International Conference on Learning Representations, ICLR 2020 - Addis Ababa, Ethiopia
Duration: Apr 30 2020 → …

Conference

Conference: 8th International Conference on Learning Representations, ICLR 2020
Country/Territory: Ethiopia
City: Addis Ababa
Period: 04/30/20 → …
