TR2007-031

Sparse Overcomplete Decomposition for Single Channel Speaker Separation


    •  Shashanka, M.V.S., Raj, B., Smaragdis, P., "Sparse Overcomplete Decomposition for Single Channel Speaker Separation", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2007, vol. 2, pp. 641-644.
      @inproceedings{Shashanka2007apr,
        author = {Shashanka, M.V.S. and Raj, B. and Smaragdis, P.},
        title = {Sparse Overcomplete Decomposition for Single Channel Speaker Separation},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2007,
        volume = 2,
        pages = {641--644},
        month = apr,
        issn = {1520-6149},
        url = {https://www.merl.com/publications/TR2007-031}
      }
  • Research Area:

    Speech & Audio

Abstract:

We present an algorithm for separating multiple speakers from a mixed single-channel recording. The algorithm builds on a model proposed by Raj and Smaragdis (2005), which extracts characteristic spectro-temporal basis functions from training data for each individual speaker and decomposes the mixed signal as a linear combination of these learned bases. In other words, their model extracts a compact code of basis functions that explains the space spanned by the spectral vectors of a speaker. In our model, we instead generate a sparse, distributed code that is overcomplete: it has more basis functions than the dimensionality of the space. We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in the data and hence leads to better separation.
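The decomposition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each speaker's bases are already learned and stored as normalized spectra (distributions over frequency bins), fits non-negative mixture weights to one mixture spectrum with EM-style multiplicative updates, and then splits the observed energy in each bin in proportion to each speaker's share of the model. The names `estimate_weights`, `separate`, `BASES_A`, `BASES_B`, and the `sparsity_exponent` parameter are hypothetical. The exponent is only a simple stand-in heuristic for encouraging sparse weights; the probabilistic sparsity framework of the paper is not reproduced here.

```python
def estimate_weights(spectrum, bases, iterations=100, sparsity_exponent=1.0):
    """Fit non-negative weights over fixed bases to one magnitude spectrum.

    Each basis is a list of non-negative values summing to 1. The spectrum
    is assumed to be non-silent (its sum must be positive).
    Returns weights that sum to 1.
    """
    total = sum(spectrum)
    target = [value / total for value in spectrum]
    n_bins, n_bases = len(target), len(bases)
    weights = [1.0 / n_bases] * n_bases
    for _ in range(iterations):
        # Current model of the normalized spectrum: sum_z w[z] * basis_z[f].
        model = [sum(weights[z] * bases[z][f] for z in range(n_bases))
                 for f in range(n_bins)]
        updated = []
        for z in range(n_bases):
            # Share of the target spectrum that basis z accounts for.
            ratio = sum(target[f] * bases[z][f] / model[f]
                        for f in range(n_bins) if model[f] > 0.0)
            # Assumption: an exponent > 1 sharpens the weights as a crude
            # sparsity heuristic; 1.0 gives the plain EM update.
            updated.append((weights[z] * ratio) ** sparsity_exponent)
        norm = sum(updated)
        weights = [w / norm for w in updated]
    return weights


def separate(spectrum, bases_a, bases_b, **kwargs):
    """Split one mixture spectrum into per-speaker estimates."""
    bases = bases_a + bases_b
    weights = estimate_weights(spectrum, bases, **kwargs)
    n_bins, n_a = len(spectrum), len(bases_a)
    estimate_a = []
    for f in range(n_bins):
        part_a = sum(weights[z] * bases[z][f] for z in range(n_a))
        part_all = sum(weights[z] * bases[z][f] for z in range(len(bases)))
        # Proportional mask; bins the model cannot explain are split evenly.
        share_a = part_a / part_all if part_all > 0.0 else 0.5
        estimate_a.append(spectrum[f] * share_a)
    estimate_b = [spectrum[f] - estimate_a[f] for f in range(n_bins)]
    return estimate_a, estimate_b


# Toy bases over 4 frequency bins (made-up values, not from the paper).
BASES_A = [[0.7, 0.3, 0.0, 0.0], [0.3, 0.7, 0.0, 0.0]]
BASES_B = [[0.0, 0.0, 0.6, 0.4], [0.0, 0.0, 0.4, 0.6]]

mixture = [2.0, 1.0, 3.0, 4.0]
speaker_a, speaker_b = separate(mixture, BASES_A, BASES_B)
print(speaker_a, speaker_b)
```

Because the two toy basis sets occupy disjoint frequency bins, the mask assigns bins 0 and 1 to speaker A and bins 2 and 3 to speaker B. With real speech the bases overlap, and an overcomplete dictionary would hold more bases per speaker than there are frequency bins, which is where a sparsity constraint on the weights becomes important.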

 

  • Related News & Events

    •  NEWS    ICASSP 2007: 4 publications by Anthony Vetro, Paris Smaragdis and others
      Date: April 15, 2007
      Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
      MERL Contact: Anthony Vetro
      Brief
      • The papers "Using Distributed Source Coding to Secure Fingerprint Biometrics" by Draper, S.C., Khisti, A., Martinian, E., Vetro, A. and Yedidia, J.S., "A Framework for Secure Speech Recognition" by Smaragdis, P. and Shashanka, M., "Sparse Overcomplete Decomposition for Single Channel Speaker Separation" by Shashanka, M.V.S., Raj, B. and Smaragdis, P., and "Bandwidth Expansion with a Polya Urn Model" by Raj, B., Singh, R., Shashanka, M. and Smaragdis, P. were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).