VOCBENCH: A NEURAL VOCODER BENCHMARK FOR SPEECH SYNTHESIS

  • Ehab A. AlBadawy
  • Andrew Gibiansky
  • Qing He
  • Jilong Wu
  • Ming Ching Chang
  • Siwei Lyu
  • SUNY Albany
  • Meta

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

12 Scopus citations

Abstract

Neural vocoders, which convert spectral representations of an audio signal into waveforms, are a commonly used component in speech synthesis pipelines. They focus on synthesizing waveforms from low-dimensional representations, such as Mel-Spectrograms. In recent years, different approaches have been introduced to develop such vocoders. However, it has become more challenging to assess these new vocoders and compare their performance to previous ones. To address this problem, we present VocBench, a framework that benchmarks the performance of state-of-the-art neural vocoders. VocBench uses a systematic study to evaluate different neural vocoders in a shared environment that enables a fair comparison between them. In our experiments, we use the same setup for datasets, training pipeline, and evaluation metrics for all neural vocoders. We perform subjective and objective evaluations to compare the performance of each vocoder along different axes. Our results demonstrate that the framework can show competitive efficacy and quality of the synthesized samples for each vocoder. The VocBench framework is available at https://github.com/facebookresearch/vocoder-benchmark.

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 881-885
Number of pages: 5
ISBN (Electronic): 9781665405409
DOIs
State: Published - 2022
Event: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022 - Hybrid, Singapore
Duration: May 22 2022 to May 27 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Hybrid
Period: 05/22/22 to 05/27/22

Keywords

  • GAN
  • Mel-Spectrograms
  • VocBench
  • benchmark
  • evaluation
  • speech synthesis
  • vocoders
