Software & Data Downloads — TS-SEP

TS-SEP (Target-Speaker SEParation): joint diarization and separation conditioned on estimated speaker embeddings.

Minimal PyTorch code for testing the network architectures proposed in our IEEE TASLP paper "TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings". We include both target-speaker voice activity detection (TS-VAD), which serves as the first training stage, and target-speaker separation (TS-SEP), which serves as the second training stage.

MERL Contacts
- Jonathan Le Roux
- Gordon Wichern

Related Publications
Boeddeker, C., Subramanian, A.S., Wichern, G., Haeb-Umbach, R., Le Roux, J., "TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings", IEEE/ACM Transactions on Audio, Speech, and Language Processing, DOI: 10.1109/TASLP.2024.3350887, Vol. 32, pp. 1185-1197, February 2024.
@article{Boeddeker2024feb,
  author = {Boeddeker, Christoph and Subramanian, Aswin Shanmugam and Wichern, Gordon and Haeb-Umbach, Reinhold and {Le Roux}, Jonathan},
  title = {{TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings}},
  journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year = 2024,
  volume = 32,
  pages = {1185--1197},
  month = feb,
  doi = {10.1109/TASLP.2024.3350887},
  issn = {2329-9304},
  url = {https://www.merl.com/publications/TR2024-006}
}

Access software at https://github.com/merlresearch/tssep.
