In recent years, deep-learning-based machine learning systems have demonstrated remarkable success on a wide range of learning tasks in domains such as computer vision, speech recognition, and other pattern-recognition applications. This article contributes a timely review of, and introduction to, state-of-the-art deep learning techniques and their effectiveness in speech/acoustic signal processing. We investigate various deep learning architectures thoroughly, grouped into discriminative and generative algorithms, including the recent Generative Adversarial Networks (GANs) as an integrated model. We also give a comprehensive overview of applications in audio generation. Drawing on these approaches, we discuss how deep learning methods can benefit speech/acoustic signal synthesis and which issues must be addressed before they can be used in real-world scenarios. We hope this survey provides a valuable reference for practitioners seeking to innovate in the use of deep learning for speech/acoustic signal generation.
|Number of pages|20|
|Journal|IEEE Circuits and Systems Magazine|
|Publication status|Published - 1 Oct 2019|