Machine Learning
Data-driven approaches to designing intelligent algorithms.
MERL has a long history of research activity in machine learning, including the development of various boosting algorithms and contributions to the theory and practice of highly scalable collaborative filtering. Our recent work has focused on deep learning and reinforcement learning, with applications in a wide range of domains including automotive, robotics, factory automation, and transportation, as well as building and home systems.
-
Researchers
Toshiaki Koike-Akino
Ye Wang
Jonathan Le Roux
Ankush Chakrabarty
Anoop Cherian
Gordon Wichern
Tim K. Marks
Philip V. Orlik
Michael J. Jones
Stefano Di Cairano
Daniel N. Nikovski
Kieran Parsons
Devesh K. Jha
Christopher R. Laughman
Diego Romeres
Pu (Perry) Wang
Karl Berntorp
Chiori Hori
Bingnan Wang
Yebin Wang
Suhas Lohit
Mouhacine Benosman
Hassan Mansour
Matthew Brand
Petros T. Boufounos
Arvind Raghunathan
Moitreya Chatterjee
Abraham P. Vinod
Jianlin Guo
Siddarth Jain
Kuan-Chuan Peng
Scott A. Bortoff
Vedang M. Deshpande
Jing Liu
Hongtao Qiao
William S. Yerazunis
Radu Corcodel
François Germain
Chungwei Lin
Dehong Liu
Saviz Mowlavi
Hongbo Sun
Wataru Tsujita
Sameer Khurana
Pedro Miraldo
Yanting Ma
James Queeney
Anthony Vetro
Jinyun Zhang
Jose Amaya
Abraham Goldsmith
Joshua Rapp
Avishai Weiss
Janek Ebbers
-
Awards
-
AWARD MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
Date: August 29, 2024
Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.
The LAP Challenge workshop and award ceremony was hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.
The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). From a library of subjects, RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions. The HRTF of the retrieved subject at the target direction is then fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
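The retrieval step described above can be illustrated with a small sketch. This is not the team's implementation: the array layout, the RMS distance, the function names `retrieve_subjects` and `build_field_input`, and the plain averaging over retrieved subjects (a simple stand-in for the transform-average-concatenate idea) are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np


def retrieve_subjects(target_meas, library, measured_idx, k=1):
    """Rank library subjects by closeness to the target at the measured directions.

    target_meas:  (M, F) target HRTF features at the M measured directions.
    library:      (S, D, F) HRTF features for S subjects on a dense grid of D directions.
    measured_idx: length-M indices of the measured directions within the dense grid.
    Returns the indices of the k closest subjects and the distance to every subject.
    The RMS distance used here is an assumption, not the metric from the paper.
    """
    lib_at_meas = library[:, measured_idx, :]            # (S, M, F)
    diff = lib_at_meas - target_meas[None, :, :]
    dist = np.sqrt(np.mean(diff ** 2, axis=(1, 2)))      # (S,)
    return np.argsort(dist)[:k], dist


def build_field_input(direction, library, subject_ids, target_idx):
    """Concatenate the query direction with the retrieved HRTF at that direction.

    Retrieved subjects are averaged so that any number of them yields a
    fixed-size input (hypothetical simplification).
    """
    retrieved = library[subject_ids, target_idx, :]      # (k, F)
    return np.concatenate([np.asarray(direction, dtype=float), retrieved.mean(axis=0)])


# Synthetic example: 4 library subjects, 6 grid directions, 3 frequency bins.
rng = np.random.default_rng(0)
library = rng.normal(size=(4, 6, 3))
measured_idx = np.array([0, 3, 5])
# The target is a lightly perturbed copy of subject 2 at the measured directions.
target_meas = library[2, measured_idx, :] + 0.01 * rng.normal(size=(3, 3))

ids, dist = retrieve_subjects(target_meas, library, measured_idx, k=2)
field_input = build_field_input([0.5, -0.2], library, ids, target_idx=1)
```

In a full system, `field_input` would be passed to the neural field that predicts the target subject's HRTF at the queried direction; that network is omitted here.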
-
AWARD Jonathan Le Roux elevated to IEEE Fellow
Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
-
AWARD Honorable Mention Award at NeurIPS 23 Instruction Workshop
Date: December 15, 2023
Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief: MERL researchers received an Honorable Mention award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for large language models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the oral presentation session at the workshop.
-
News & Events
-
NEWS MERL researchers present 9 papers at ACC 2024
Date: July 10, 2024 - July 12, 2024
Where: Toronto, Canada
MERL Contacts: Karl Berntorp; Ankush Chakrabarty; Vedang M. Deshpande; Stefano Di Cairano; Christopher R. Laughman; Arvind Raghunathan; Abraham P. Vinod; Yebin Wang; Avishai Weiss
Research Areas: Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
Brief: MERL researchers presented 9 papers at the recently concluded American Control Conference (ACC) 2024 in Toronto, Canada. The papers covered a wide range of topics including data-driven spatial monitoring using heterogeneous robots, aircraft approach management near airports, computational fluid dynamics-based motion planning for drones facing winds, trajectory planning for coordinated monitoring using a team of drones and a ground carrier vehicle, ensemble Kalman smoothing-based model predictive control for motion planning for autonomous vehicles, system identification for lithium-ion batteries, physics-constrained deep Kalman filters for vapor compression systems, switched reference governors for constrained systems, and distributed road-map monitoring using onboard sensors.
As a sponsor of the conference, MERL maintained a booth for open discussions with researchers and students, and hosted a special session to discuss highlights of MERL research and work philosophy.
In addition, Abraham Vinod served as a panelist at the Student Networking Event at the conference. The student networking event provides an opportunity for all interested students to network with professionals working in industry, academia, and national laboratories during a structured event, and encourages their continued participation as the future leaders in the field.
-
NEWS Jianlin Guo delivered a keynote in IEEE ICC 2024 Workshop
Date: June 13, 2024
Where: IEEE International Conference on Communications (ICC)
MERL Contacts: Jianlin Guo; Philip V. Orlik; Kieran Parsons; Pu (Perry) Wang
Research Areas: Communications, Machine Learning, Signal Processing
Brief: Jianlin Guo delivered a keynote titled "Private IoT Networks" at the IEEE International Conference on Communications (ICC) 2024 Workshop "Industrial Private 5G-and-Beyond Wireless Networks", held in Denver, Colorado from June 9-13. The ICC is one of the IEEE Communications Society's two flagship conferences.
Abstract: With the advent of private 5G-and-Beyond communication technologies, private IoT networks have been emerging. In private IoT networks, network owners have full control over network resource management. However, to fully realize private IoT networks, the upper-layer technologies need to be developed as well. This keynote presents machine learning-based anomaly detection in manufacturing systems, innovative multipath TCP technologies over heterogeneous wireless IoT networks, novel channel resource scheduling in private 5G networks, and efficient wireless coexistence of heterogeneous wireless systems.
-
Research Highlights
-
PS-NeuS: A Probability-guided Sampler for Neural Implicit Surface Rendering
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling
Steered Diffusion
Edge-Assisted Internet of Vehicles for Smart Mobility
Robust Machine Learning
mmWave Beam-SNR Fingerprinting (mmBSF)
Video Anomaly Detection
Biosignal Processing for Human-Machine Interaction
MERL Shopping Dataset
-
-
Internships
-
CA2132: Optimization Algorithms for Motion Planning and Predictive Control
MERL is looking for a highly motivated and qualified individual to work on tailored computational algorithms for optimization-based motion planning and predictive control applications in autonomous systems (vehicles, mobile robots). The ideal candidate should have experience in one or more of the following topics: convex and non-convex optimization, stochastic predictive control (e.g., scenario trees), interaction-aware motion planning, machine learning, learning-based model predictive control, mathematical programs with complementarity constraints (MPCCs), optimal control, and real-time optimization. PhD students in engineering or mathematics, especially those whose research relates to any of the above topics, are encouraged to apply. Publication of relevant results in conference proceedings or journals is expected. The ability to implement the designs and algorithms in MATLAB/Python is required; coding parts of the algorithms in C/C++ is a plus. The expected duration of the internship is 3 months, and the start date is flexible.
-
OR2196: Visuo-tactile Learning for Dexterous Manipulation
MERL is looking for a highly motivated individual to work on robotic manipulation using visuo-tactile learning. The research will develop robot motor skills for complex, dexterous manipulation using vision and tactile perception. The ideal candidate should have experience in one or more of the following topics: manipulation, tactile sensing, reinforcement learning, sim-to-real techniques for manipulation, and grasping. Senior PhD students in robotics and engineering with a focus on contact-rich manipulation are encouraged to apply. Prior experience working with physical robotic systems (and vision and tactile sensors) is required, as results need to be implemented on physical hardware. Good coding skills in Python ML libraries such as PyTorch are required. A successful internship will result in submission of results to a peer-reviewed robotics journal in collaboration with MERL researchers. The expected duration of the internship is 4-5 months, with a start date in Aug/Sept 2024. This internship is preferred to be onsite at MERL.
-
CI2091: Robust AI for Operational Technology Security
MERL is seeking a highly motivated and qualified intern to work on operational technology security. The ideal candidate would have significant research experience in cybersecurity for operational technology, anomaly detection, robust machine learning, and defenses against adversarial examples. A mature understanding of modern machine learning methods, proficiency with Python, and familiarity with deep learning frameworks are expected. Candidates at or beyond the middle of their Ph.D. program are encouraged to apply. The expected duration is 3 months with flexible start dates.
-
Openings
-
EA2051: Research Scientist - Control & Learning
-
OR2137: Research Scientist - Optimization & Intelligent Robotics
-
Recent Publications
- "Few-shot Transparent Instance Segmentation for Bin Picking", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2024. TR2024-127
  @inproceedings{Cherian2024sep,
    author = {Cherian, Anoop and Jain, Siddarth and Marks, Tim K.},
    title = {Few-shot Transparent Instance Segmentation for Bin Picking},
    booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-127}
  }
- "Disentangled Acoustic Fields For Multimodal Physical Scene Understanding", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2024. TR2024-125
  @inproceedings{Yin2024sep,
    author = {Yin, Jie and Luo, Andrew and Du, Yilun and Cherian, Anoop and Marks, Tim K. and Le Roux, Jonathan and Gan, Chuang},
    title = {Disentangled Acoustic Fields For Multimodal Physical Scene Understanding},
    booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-125}
  }
- "PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation", Interspeech, September 2024. TR2024-124
  @inproceedings{Pan2024sep,
    author = {Pan, Zexu and Wichern, Gordon and Germain, François G and Saijo, Kohei and Le Roux, Jonathan},
    title = {PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation},
    booktitle = {Interspeech},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-124}
  }
- "MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception", European Conference on Computer Vision (ECCV), September 2024. TR2024-117
  @inproceedings{Rahman2024sep,
    author = {Rahman, Mahbub and Yataka, Ryoma and Kato, Sorachi and Wang, Pu and Li, Peizhao and Cardace, Adriano and Boufounos, Petros T.},
    title = {MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-117}
  }
- "Supervised Contrastive Learning for Electric Motor Bearing Fault Detection", International Conference on Electrical Machines (ICEM), September 2024. TR2024-120
  @inproceedings{Zhang2024sep,
    author = {Zhang, Hengrui and Wang, Bingnan},
    title = {Supervised Contrastive Learning for Electric Motor Bearing Fault Detection},
    booktitle = {International Conference on Electrical Machines (ICEM)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-120}
  }
- "MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models", International Conference on Automation Science and Engineering (CASE), August 2024. TR2024-115
  @inproceedings{Yan2024aug,
    author = {Yan, Jiaqi and Chakrabarty, Ankush and Rupenyan, Alisa and Lygeros, John},
    title = {MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models},
    booktitle = {International Conference on Automation Science and Engineering (CASE)},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-115}
  }
- "Deep Calibration and Operator Learning for Ground Penetrating Radar Imaging", European Signal Processing Conference (EUSIPCO), August 2024. TR2024-128
  @inproceedings{Shastri2024aug,
    author = {Shastri, Saurav and Ma, Yanting and Boufounos, Petros T. and Mansour, Hassan},
    title = {Deep Calibration and Operator Learning for Ground Penetrating Radar Imaging},
    booktitle = {European Signal Processing Conference (EUSIPCO)},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-128}
  }
- "Assessing Building Control Performance Using Physics-Based Simulation Models and Deep Generative Networks", IEEE Conference on Control Technology and Applications (CCTA) 2024, August 2024. TR2024-113
  @inproceedings{Chakrabarty2024aug,
    author = {Chakrabarty, Ankush and Vanfretti, Luigi and Bortoff, Scott A. and Deshpande, Vedang M. and Wang, Ye and Paulson, Joel A. and Zhan, Sicheng and Laughman, Christopher R.},
    title = {Assessing Building Control Performance Using Physics-Based Simulation Models and Deep Generative Networks},
    booktitle = {IEEE Conference on Control Technology and Applications (CCTA) 2024},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-113}
  }
-
Videos
-
Software & Data Downloads
-
DeepBornFNO
ComplexVAD Dataset
Millimeter-wave Multi-View Radar Dataset
Gear Extensions of Neural Radiance Fields
Long-Tailed Anomaly Detection (LTAD) Dataset
Target-Speaker SEParation
Pixel-Grounded Prototypical Part Networks
Steered Diffusion
BAyesian Network for adaptive SAmple Consensus
Simple Multimodal Algorithmic Reasoning Task Dataset
Partial Group Convolutional Neural Networks
SOurce-free Cross-modal KnowledgE Transfer
Audio-Visual-Language Embodied Navigation in 3D Environments
Nonparametric Score Estimators
3D MOrphable STyleGAN
Instance Segmentation GAN
Audio Visual Scene-Graph Segmentor
Generalized One-class Discriminative Subspaces
Hierarchical Musical Instrument Separation
Generating Visual Dynamics from Sound and Context
Adversarially-Contrastive Optimal Transport
Online Feature Extractor Network
MotionNet
FoldingNet++
Quasi-Newton Trust Region Policy Optimization
Landmarks’ Location, Uncertainty, and Visibility Likelihood
Robust Iterative Data Estimation
Gradient-based Nikaido-Isoda
Circular Maze Environment
Discriminative Subspace Pooling
Kernel Correlation Network
Fast Resampling on Point Clouds via Graphs
FoldingNet
Deep Category-Aware Semantic Edge Detection
MERL Shopping Dataset
-