Abstract
The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We prove non-asymptotic convergence of our proposed algorithm. Moreover, we provide extensive experimental results, including ImageNet, to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
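The cyclical stepsize idea in the abstract can be illustrated with a short sketch. The abstract does not give the functional form of the schedule, so the cosine shape used here is an assumption, as are the names `cyclical_stepsize`, `stage`, and `exploration_fraction` and all numeric values. Each cycle restarts at a large stepsize (to move between modes) and decays toward zero (to sample within a mode).

```python
import math


def cyclical_stepsize(k, total_iters, num_cycles, initial_stepsize):
    """Stepsize at 1-indexed iteration k under an assumed cosine cyclical schedule."""
    cycle_len = math.ceil(total_iters / num_cycles)
    # Fractional position within the current cycle, in [0, 1).
    r = ((k - 1) % cycle_len) / cycle_len
    # Starts at initial_stepsize when r = 0 and decays toward 0 as r approaches 1.
    return 0.5 * initial_stepsize * (math.cos(math.pi * r) + 1.0)


def stage(k, total_iters, num_cycles, exploration_fraction=0.8):
    """Label an iteration as 'exploration' (large steps) or 'sampling' (small steps).

    exploration_fraction is a hypothetical threshold chosen for illustration.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    r = ((k - 1) % cycle_len) / cycle_len
    return "exploration" if r < exploration_fraction else "sampling"


# Illustrative values only: 1000 iterations split into 4 cycles.
schedule = [cyclical_stepsize(k, 1000, 4, 0.1) for k in range(1, 1001)]
```

In a sampler, one would plausibly collect posterior samples only during the "sampling" stage of each cycle and treat the "exploration" stage as a warm restart; that split is likewise an assumption of this sketch.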
| Original language | English |
|---|---|
| State | Published - 2020 |
| Event | 8th International Conference on Learning Representations, ICLR 2020 - Addis Ababa, Ethiopia |
| Duration | Apr 30 2020 → … |
Conference
| Conference | 8th International Conference on Learning Representations, ICLR 2020 |
|---|---|
| Country/Territory | Ethiopia |
| City | Addis Ababa |
| Period | 04/30/20 → … |