TR2021-144

Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation

- Wang, Z.-Q., Wichern, G., Le Roux, J., "Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation", IEEE/ACM Transactions on Audio, Speech, and Language Processing, DOI: 10.1109/TASLP.2021.3129363, Vol. 29, pp. 3476-3490, December 2021.
  @article{Wang2021dec,
    author = {Wang, Zhong-Qiu and Wichern, Gordon and {Le Roux}, Jonathan},
    title = {{Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation}},
    journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    year = 2021,
    volume = 29,
    pages = {3476--3490},
    month = dec,
    doi = {10.1109/TASLP.2021.3129363},
    url = {https://www.merl.com/publications/TR2021-144}
  }
MERL Contacts:
- Gordon Wichern
- Jonathan Le Roux
Research Areas:

Artificial Intelligence, Machine Learning, Speech & Audio

Abstract:

We propose to exploit the linear-filter structure of reverberation within a supervised deep learning based monaural speech dereverberation framework. The key idea is to first estimate the direct-path signal of the target speaker using a DNN and then identify signals that are decayed and delayed copies of the estimated direct-path signal, as these can be reliably considered as reverberation. We then modify the proposed algorithm for speaker separation in reverberant and noisy-reverberant conditions. State-of-the-art speech dereverberation and speaker separation results are obtained on the REVERB, SMS-WSJ, and WHAMR! datasets.
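The idea in the abstract, treating reverberation as delayed and decayed copies of the direct-path signal, can be illustrated with a small sketch. This is not the authors' implementation: it is an unweighted least-squares simplification for a single STFT frequency bin, the "DNN estimate" is replaced by a synthetic direct-path sequence, and all names (`estimate_filter`, `predict_reverb`, `solve_linear`, `taps`) are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve a small square complex system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / m[i][i]
    return x


def estimate_filter(direct, mixture, taps):
    """Fit g so that sum_k g[k] * direct[t - k] (k = 1..taps) approximates mixture[t] - direct[t]."""
    num_frames = len(direct)
    residual = [y - s for y, s in zip(mixture, direct)]

    def past(t, k):
        # Delayed copy of the direct-path estimate; zero before the first frame.
        return direct[t - k] if t - k >= 0 else 0j

    lags = range(1, taps + 1)
    # Normal equations (A^H A) g = A^H r, where A[t][k] = direct[t - k].
    gram = [[sum(past(t, i).conjugate() * past(t, j) for t in range(num_frames))
             for j in lags] for i in lags]
    rhs = [sum(past(t, i).conjugate() * residual[t] for t in range(num_frames))
           for i in lags]
    return solve_linear(gram, rhs)


def predict_reverb(direct, g):
    """Filter the direct-path estimate with g to predict the reverberant part."""
    return [sum(g[k - 1] * direct[t - k] for k in range(1, len(g) + 1) if t - k >= 0)
            for t in range(len(direct))]


# Synthetic single-bin example (assumed data, noise-free).
random.seed(0)
direct = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
true_g = [0.5 + 0.1j, -0.3j, 0.1 + 0j]  # hypothetical decaying filter taps
mixture = [d + r for d, r in zip(direct, predict_reverb(direct, true_g))]

est_g = estimate_filter(direct, mixture, taps=3)
dereverbed = [y - r for y, r in zip(mixture, predict_reverb(direct, est_g))]
max_filter_error = max(abs(a - b) for a, b in zip(est_g, true_g))
max_signal_error = max(abs(a - b) for a, b in zip(dereverbed, direct))
```

In this noise-free toy case the fitted taps match the synthetic filter, so subtracting the predicted reverberation from the mixture recovers the direct-path sequence. With a real DNN estimate the direct-path input is imperfect, and the filter would be fitted per frequency bin over an actual STFT.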

Related News & Events

NEWS Jonathan Le Roux gives invited talk at CMU's Language Technologies Institute Colloquium
Date: December 9, 2022
Where: Pittsburgh, PA
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- MERL Senior Principal Research Scientist and Speech and Audio Senior Team Leader, Jonathan Le Roux, was invited by Carnegie Mellon University's Language Technologies Institute (LTI) to give a talk as part of the LTI Colloquium Series. The LTI Colloquium is a prestigious series of talks given by experts from across the country on different areas of language technologies. Jonathan's talk, entitled "Towards general and flexible audio source separation", presented an overview of techniques developed at MERL towards the goal of robustly and flexibly decomposing and analyzing an acoustic scene. In particular, it described the Speech and Audio Team's efforts to extend MERL's early speech separation and enhancement methods to more challenging environments and to more general and less supervised scenarios.

Related Publication

Wang, Z.-Q., Wichern, G., Le Roux, J., "Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation", arXiv, August 2021.

@article{Wang2021aug,
author = {Wang, Zhong-Qiu and Wichern, Gordon and {Le Roux}, Jonathan},
title = {{Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation}},
journal = {arXiv},
year = 2021,
month = aug,
url = {https://arxiv.org/abs/2108.07376}
}

MERL Contacts:

Gordon Wichern

Jonathan Le Roux
