TR2023-030
Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT
- "Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT", IEEE ICASSP Satellite Workshop on Self-supervision in Audio, Speech and Beyond (SASB), DOI: 10.1109/ICASSPW59220.2023.10193575, May 2023.
@inproceedings{Chen2023may,
  author = {Chen, Ke and Wichern, Gordon and Germain, Francois and Le Roux, Jonathan},
  title = {Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT},
  booktitle = {IEEE ICASSP Satellite Workshop on Self-supervision in Audio, Speech and Beyond (SASB)},
  year = 2023,
  month = may,
  doi = {10.1109/ICASSPW59220.2023.10193575},
  isbn = {979-8-3503-0261-5},
  url = {https://www.merl.com/publications/TR2023-030}
}
Abstract:
In spite of the progress in music source separation research, the small amount of publicly-available clean source data remains a constant limiting factor for performance. Thus, recent advances in self-supervised learning present a largely-unexplored opportunity for improving separation models by leveraging unlabelled music data. In this paper, we propose a self-supervised learning framework for music source separation inspired by the HuBERT speech representation model. We first investigate the potential impact of the original HuBERT model by inserting an adapted version of it into the well-known Demucs V2 time-domain separation architecture. We then propose Pac-HuBERT, a time-frequency-domain self-supervised model, that we later use in combination with a Res-U-Net decoder for source separation. Pac-HuBERT uses primitive auditory features of music as unsupervised clustering labels to initialize the self-supervised pretraining process using the Free Music Archive (FMA) dataset. The resulting framework achieves better source-to-distortion ratio (SDR) performance on the MusDB18 test set than the original Demucs V2 and Res-U-Net models. We further demonstrate that it can boost performance with small amounts of supervised data. Ultimately, our proposed framework is an effective solution to the challenge of limited clean source data for music source separation.
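For readers unfamiliar with the SDR metric mentioned in the abstract, the following is a minimal sketch of the plain energy-ratio definition, SDR = 10 log10(||s||^2 / ||s - s_hat||^2). This is an illustration only: the function name `sdr_db`, the `eps` stabilizer, and the toy signals are assumptions made here, and the paper's MusDB18 evaluation may rely on a different SDR variant (for example, one computed by a BSS Eval toolkit).

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Return the SDR in dB between a reference source and its estimate.

    Uses the plain energy-ratio definition (an assumption for illustration):
    10 * log10(energy of reference / energy of the error signal).
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero or log of zero for silent or perfect signals
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy signals (hypothetical): one second of a 440 Hz sine at 8 kHz,
# and an "estimate" that is the same sine attenuated to 90% amplitude.
reference = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
estimate = [0.9 * r for r in reference]

print(round(sdr_db(reference, estimate), 2))
```

In this toy case the error is 10% of the reference amplitude, so the energy ratio is 100 and the result is about 20 dB. Higher values indicate an estimate closer to the clean source.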
Related News & Events
- EVENT: MERL Contributes to ICASSP 2023
Date: Sunday, June 4, 2023 - Saturday, June 10, 2023
Location: Rhodes Island, Greece
MERL Contacts: Petros T. Boufounos; François Germain; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Suhas Lohit; Yanting Ma; Hassan Mansour; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Signal Processing, Speech & Audio
Brief: MERL has made numerous contributions to both the organization and technical program of ICASSP 2023, which is being held in Rhodes Island, Greece from June 4-10, 2023.
Organization
Petros Boufounos is serving as General Co-Chair of the conference this year, where he has been involved in all aspects of conference planning and execution.
Perry Wang is the organizer of a special session on Radar-Assisted Perception (RAP), which will be held on Wednesday, June 7. The session will feature talks on signal processing and deep learning for radar perception, pose estimation, and mutual interference mitigation with speakers from both academia (Carnegie Mellon University, Virginia Tech, University of Illinois Urbana-Champaign) and industry (Mitsubishi Electric, Bosch, Waveye).
Anthony Vetro is the co-organizer of the Workshop on Signal Processing for Autonomous Systems (SPAS), which will be held on Monday, June 5, and feature invited talks from leaders in both academia and industry on timely topics related to autonomous systems.
Sponsorship
MERL is proud to be a Silver Patron of the conference and will participate in the student job fair on Thursday, June 8. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.
MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Rabab Ward, the recipient of the 2023 IEEE Fourier Award for Signal Processing, and Prof. Alexander Waibel, the recipient of the 2023 IEEE James L. Flanagan Speech and Audio Processing Award.
Technical Program
MERL is presenting 13 papers in the main conference on a wide range of topics including source separation and speech enhancement, radar imaging, depth estimation, motor fault detection, time series recovery, and point clouds. One workshop paper has also been accepted for presentation on self-supervised music source separation.
Perry Wang has been invited to give a keynote talk on Wi-Fi sensing and related standards activities at the Workshop on Integrated Sensing and Communications (ISAC), which will be held on Sunday, June 4.
Additionally, Anthony Vetro will present a Perspective Talk on Physics-Grounded Machine Learning, which is scheduled for Thursday, June 8.
About ICASSP
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
Related Publication
@article{Chen2023apr,
  author = {Chen, Ke and Wichern, Gordon and Germain, Francois and Le Roux, Jonathan},
  title = {Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT},
  journal = {arXiv},
  year = 2023,
  month = apr,
  url = {https://arxiv.org/abs/2304.02160}
}