TR2024-072
Deep Neural Room Acoustics Primitive
-
- "Deep Neural Room Acoustics Primitive", International Conference on Machine Learning (ICML), June 2024, pp. 17842-17857.BibTeX TR2024-072 PDF
- @inproceedings{He2024jun,
- author = {He, Yuhang and Cherian, Anoop and Wichern, Gordon and Markham, Andrew}},
- title = {Deep Neural Room Acoustics Primitive},
- booktitle = {International Conference on Machine Learning (ICML)},
- year = 2024,
- pages = {17842--17857},
- month = jun,
- url = {https://www.merl.com/publications/TR2024-072}
- }
,
- "Deep Neural Room Acoustics Primitive", International Conference on Machine Learning (ICML), June 2024, pp. 17842-17857.
-
MERL Contacts:
-
Research Areas:
Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
Abstract:
The primary objective of room acoustics is to model the intricate sound propagation dynamics from any source to receiver position within en- closed 3D spaces. These dynamics are encapsulated in the form of a 1D room impulse response (RIR). Precisely measuring RIR is diffi- cult due to the complexity of sound propagation encompassing reflection, diffraction, and absorption. In this work, we propose to learn a contin- uous neural room acoustics field that implicitly encodes all essential sound propagation primitives for each enclosed 3D space, so that we can infer the RIR corresponding to arbitrary source- receiver positions unseen in the training dataset. Our framework, dubbed DeepNeRAP, is trained in a self-supervised manner without requiring direct access to RIR ground truth that is often needed in prior methods. The key idea is to design two cooperative acoustic agents to actively probe a 3D space, one emitting and the other receiving sound at various locations. Analyzing this sound helps to inversely characterize the acoustic primitives. Our framework is well-grounded in the fundamental physical principles of sound propagation, including reciprocity and globality, and thus is acoustically interpretable and meaningful. We present experiments on both synthetic and real- world datasets, demonstrating superior quality in RIR estimation against closely related methods.