TR2010-015

Ultrasonic Sensing for Robust Speech Recognition

- Srinivasan, S., Raj, B., Ezzat, T., "Ultrasonic Sensing for Robust Speech Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2010.
  BibTeX:
  @inproceedings{Srinivasan2010mar,
    author = {Srinivasan, S. and Raj, B. and Ezzat, T.},
    title = {{Ultrasonic Sensing for Robust Speech Recognition}},
    booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2010,
    month = mar,
    url = {https://www.merl.com/publications/TR2010-015}
  }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

In this paper, we present our work on ultrasonic sensing of speech for digit recognition. First, a set of spectral ultrasonic features is developed and tuned to achieve optimal performance on the digit recognition task. Using these features, we demonstrate an overall accuracy of 33.00% on a digit recognition task using HMMs with recordings from 6 speakers. The results indicate that ultrasonic sensing of speech is viable, but that further work is needed to achieve word accuracies that match those of audio. Finally, experimental results are presented which demonstrate that fusing information from ultrasound and audio sources yields marginal improvements over audio-only performance.

Related News & Events

NEWS ICASSP 2010: 9 publications by Anthony Vetro, Shantanu D. Rane and Petros T. Boufounos
Date: March 14, 2010
Where: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
MERL Contacts: Anthony Vetro; Petros T. Boufounos
Brief
- The following papers were presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP):
  - "Privacy and Security of Features Extracted from Minutiae Aggregates" by Nagar, A., Rane, S.D. and Vetro, A.
  - "Hiding Information Inside Structured Shapes" by Das, S., Rane, S.D. and Vetro, A.
  - "Ultrasonic Sensing for Robust Speech Recognition" by Srinivasan, S., Raj, B. and Ezzat, T.
  - "Reconstruction of Sparse Signals from Distorted Randomized Measurements" by Boufounos, P.T.
  - "Disparity Search Range Estimation: Enforcing Temporal Consistency" by Min, D., Yea, S., Arican, Z. and Vetro, A.
  - "Synthesizing Speech from Doppler Signals" by Toth, A.R., Raj, B., Kalgaonkar, K. and Ezzat, T.
  - "Spectrogram Dimensionality Reduction with Independence Constraints" by Wilson, K.W. and Raj, B.
  - "Robust Regression using Sparse Learning for High Dimensional Parameter Estimation Problems" by Mitra, K., Veeraraghavan, A.N. and Chellappa, R.
  - "Subword Unit Approaches for Retrieval by Voice" by Gouvea, E., Ezzat, T. and Raj, B.
