TR2024-124
PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation
- "PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation", Interspeech, DOI: 10.21437/Interspeech.2024-1066, September 2024, pp. 582-586.
- @inproceedings{Pan2024sep,
- author = {Pan, Zexu and Wichern, Gordon and Germain, François G and Saijo, Kohei and Le Roux, Jonathan},
- title = {PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation},
- booktitle = {Interspeech},
- year = 2024,
- pages = {582--586},
- month = sep,
- doi = {10.21437/Interspeech.2024-1066},
- issn = {2958-1796},
- url = {https://www.merl.com/publications/TR2024-124}
- }
Abstract:
While offline speech separation models have made significant advances, the streaming regime remains less explored and is typically limited to causal modifications of existing offline networks. This study focuses on empowering a streaming speech separation model with autoregressive capability, in which the current-step separation is conditioned on separated samples from past steps. To do so, we introduce pseudo-autoregressive Siamese (PARIS) training: with only two forward passes through a Siamese-style network for each batch, PARIS avoids the training-inference mismatch in teacher forcing and the need for numerous autoregressive steps during training. The proposed PARIS training improves the recent online SkiM model by 1.5 dB in SI-SNR on the WSJ0-2mix dataset, with minimal change to the network architecture and inference time.
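The two-pass idea described in the abstract can be illustrated with a toy sketch. This is only an assumption-laden reading of the training scheme, not the paper's implementation: the `separator` function, the chunk sizes, and the zero-context choice for the first pass are all hypothetical stand-ins. The key structure it shows is that both passes share the same ("Siamese") weights, the first pass cheaply produces pseudo-autoregressive context, and the second pass conditions step t on the first pass's estimate for step t-1, mirroring inference-time autoregression without teacher forcing.

```python
import numpy as np

rng = np.random.default_rng(0)

def separator(mixture_chunk, past_estimate, weights):
    # Toy stand-in for an online separation network: a linear map over the
    # current mixture chunk concatenated with past separated samples.
    # (Hypothetical; the actual model in the paper is the SkiM network.)
    x = np.concatenate([mixture_chunk, past_estimate])
    return weights @ x

# Illustrative shapes: chunks of 4 samples, conditioning on the previous chunk.
chunk_len = 4
weights = rng.standard_normal((chunk_len, 2 * chunk_len)) * 0.1
mixture = rng.standard_normal(3 * chunk_len)  # a mixture of three chunks
chunks = mixture.reshape(3, chunk_len)

# Pass 1: condition every step on zeros, yielding pseudo-autoregressive
# context for all steps in a single forward pass (no gradient would flow
# through these estimates in a real framework).
zeros = np.zeros(chunk_len)
pseudo_context = [separator(c, zeros, weights) for c in chunks]

# Pass 2: condition step t on the pass-1 estimate of step t-1, so the model
# sees its own (imperfect) past outputs rather than ground-truth targets,
# avoiding the train-inference mismatch of teacher forcing.
second_pass = []
for t, c in enumerate(chunks):
    ctx = pseudo_context[t - 1] if t > 0 else zeros
    second_pass.append(separator(c, ctx, weights))

# The separation loss would be computed on the second-pass outputs; both
# passes use the identical weights, hence the "Siamese" naming.
```

At inference time only a single conditioned pass is needed, so the autoregressive conditioning adds little to the per-step cost, which is consistent with the abstract's claim of minimal change to inference time.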