Artificial Intelligence
Making machines smarter for improved safety, efficiency and comfort.
Our AI research encompasses advances in computer vision, speech and audio processing, and data analytics. Key research themes include improved perception based on machine learning techniques, learning control policies through model-based reinforcement learning, and cognition and reasoning based on learned semantic representations. We apply our work to a broad range of automotive and robotics applications, as well as building and home systems.
-
Researchers

Jonathan Le Roux
Toshiaki Koike-Akino
Ye Wang
Gordon Wichern
Anoop Cherian
Tim K. Marks
Chiori Hori
Michael J. Jones
Kieran Parsons
Daniel N. Nikovski
Devesh K. Jha
Jing Liu
Suhas Lohit
Matthew Brand
Kuan-Chuan Peng
Pu (Perry) Wang
Philip V. Orlik
Moitreya Chatterjee
Yoshiki Masuyama
Diego Romeres
Hassan Mansour
Petros T. Boufounos
Siddarth Jain
William S. Yerazunis
Radu Corcodel
Pedro Miraldo
Arvind Raghunathan
Jianlin Guo
Hongbo Sun
Yebin Wang
Ankush Chakrabarty
Chungwei Lin
Yanting Ma
Bingnan Wang
Christoph Benedikt Josef Boeddeker
Stefano Di Cairano
Saviz Mowlavi
Anthony Vetro
Jinyun Zhang
Vedang M. Deshpande
Christopher R. Laughman
Dehong Liu
Alexander Schperberg
Abraham P. Vinod
Kenji Inomata
Kei Suzuki
-
Awards
-
AWARD MERL Wins Awards at NeurIPS LLM Privacy Challenge
Date: December 15, 2024
Awarded to: Jing Liu, Ye Wang, Toshiaki Koike-Akino, Tsunato Nakai, Kento Oonishi, Takuya Higashi
MERL Contacts: Toshiaki Koike-Akino; Jing Liu; Ye Wang
Research Areas: Artificial Intelligence, Machine Learning, Information Security
Brief: The Mitsubishi Electric Privacy Enhancing Technologies (MEL-PETs) team, a collaboration of MERL and Mitsubishi Electric researchers, won awards at the NeurIPS 2024 Large Language Model (LLM) Privacy Challenge. In the Blue Team track of the challenge, we won the 3rd Place Award, and in the Red Team track, we won the Special Award for Practical Attack.
-
AWARD University of Padua and MERL team wins the AI Olympics with RealAIGym competition at IROS24
Date: October 17, 2024
Awarded to: Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
MERL Contact: Diego Romeres
Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Robotics
Brief: The team composed of the control group at the University of Padua and MERL's Optimization and Robotic team ranked 1st among the 4 finalist teams in the 2nd AI Olympics with RealAIGym competition at IROS 24, which focused on control of under-actuated robots. The team consisted of Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres. The competition was organized by the German Research Center for Artificial Intelligence (DFKI), Technical University of Darmstadt and Chalmers University of Technology.
The competition and award ceremony were hosted by the IEEE International Conference on Intelligent Robots and Systems (IROS) on October 17, 2024 in Abu Dhabi, UAE. Diego Romeres presented the team's method, based on a model-based reinforcement learning algorithm called MC-PILCO.
-
AWARD MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
Date: August 29, 2024
Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
MERL Contacts: Jonathan Le Roux; Gordon Wichern; Yoshiki Masuyama
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.
The LAP Challenge workshop and award ceremony were hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.
The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions from a library of subjects. The HRTF of the retrieved subject at the target direction is fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
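The retrieval and aggregation steps described above can be illustrated with a small NumPy sketch. This is not the team's implementation: the distance metric (mean squared log-magnitude difference), the toy array sizes, the random weights, and the function names `retrieve_subjects` and `tac_combine` are all assumptions made for illustration, and the neural field itself is omitted.

```python
import numpy as np


def retrieve_subjects(target_meas, library_meas, k=1):
    """Indices of the k library subjects closest to the target subject.

    target_meas:  (M, F) HRTF magnitudes at the M measured directions.
    library_meas: (S, M, F) the same quantities for S library subjects.
    The mean squared log-magnitude distance is an assumed metric.
    """
    diff = np.log(library_meas) - np.log(target_meas)[None, :, :]
    dist = np.mean(diff ** 2, axis=(1, 2))  # one distance per subject
    return np.argsort(dist)[:k]


def tac_combine(features, w_t, w_a, w_c):
    """Transform-average-concatenate over K retrieved subjects.

    features: (K, D), one feature vector per retrieved subject.
    Works for any K because subjects interact only through the mean.
    """
    h = np.tanh(features @ w_t)                 # transform: (K, H)
    avg = np.tanh(h.mean(axis=0) @ w_a)         # average:   (H,)
    cat = np.concatenate([h, np.tile(avg, (h.shape[0], 1))], axis=1)
    return np.tanh(cat @ w_c)                   # concatenate + mix: (K, H)


rng = np.random.default_rng(0)
S, M, F, H = 4, 3, 8, 5                         # toy sizes (hypothetical)
library_meas = rng.uniform(0.5, 2.0, size=(S, M, F))
library_at_target_dir = rng.uniform(0.5, 2.0, size=(S, F))

# Toy target: nearly identical to library subject 2 at the measured directions.
target_meas = library_meas[2] * 1.01
idx = retrieve_subjects(target_meas, library_meas, k=2)

# Input per retrieved subject: its HRTF at the target direction plus the
# desired source direction (azimuth and elevation in radians, assumed here).
direction = np.array([0.3, -0.1])
features = np.stack(
    [np.concatenate([library_at_target_dir[i], direction]) for i in idx]
)

w_t = rng.normal(size=(F + 2, H))
w_a = rng.normal(size=(H, H))
w_c = rng.normal(size=(2 * H, H))
fused = tac_combine(features, w_t, w_a, w_c)
print(idx, fused.shape)
```

Because the retrieved subjects only interact through the averaged term, the same weights apply whether one subject or several are retrieved, and reordering the retrieved subjects simply reorders the output rows.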
See All Awards for Artificial Intelligence -
-
News & Events
-
NEWS MERL Papers, Workshops, and Talks at ICCV 2025
Date: October 19, 2025 - October 23, 2025
Where: Honolulu, HI, USA
MERL Contacts: Petros T. Boufounos; Anoop Cherian; Toshiaki Koike-Akino; Hassan Mansour; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Pu (Perry) Wang
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Signal Processing
Brief: MERL researchers presented 3 conference papers and 3 workshop papers, co-organized 2 workshops, and delivered 2 invited talks at the IEEE International Conference on Computer Vision (ICCV) 2025, which was held in Honolulu, HI, USA from October 19-23, 2025. ICCV is one of the most prestigious and competitive international conferences in the area of computer vision. Details of MERL contributions are provided below:
Main Conference Papers:
1. "SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity" by V. Piedade, C. Sidhartha, J. Gaspar, V. M. Govindu, and P. Miraldo. (Highlight Paper)
Paper: https://www.merl.com/publications/TR2025-146
2. "Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts" by C.-A. Yang, K.-C. Peng, and R. A. Yeh.
Paper: https://www.merl.com/publications/TR2025-124
3. "Manual-PA: Learning 3D Part Assembly from Instruction Diagrams" by J. Zhang, A. Cherian, C. Rodriguez-Opazo, W. Deng, and S. Gould.
Paper: https://www.merl.com/publications/TR2025-139
MERL Co-Organized Workshops:
1. "The Workshop on Anomaly Detection with Foundation Models (ADFM)" by K.-C. Peng, Y. Zhao, and A. Aich.
Workshop link: https://adfmw.github.io/iccv25/
2. "The 8th International Workshop on Computer Vision for Physiological Measurement (CVPM)" by D. McDuff, W. Wang, S. Stuijk, T. Marks, H. Mansour, and V. R. Shenoy.
Workshop link: https://sstuijk.estue.nl/cvpm/cvpm25/
MERL Keynote Talks at Workshops:
1. Tim K. Marks, Keynote Speaker at the Workshop on Computer Vision for Physiological Measurement (CVPM).
Workshop website: https://vineetrshenoy.github.io/cvpmSeptember2025/
2. Tim K. Marks, Keynote Speaker at the Workshop on Analysis and Modeling of Faces and Gestures (AMFG).
Workshop website: https://fulab.sites.northeastern.edu/amfg2025/
Workshop Papers:
1. "Joint Training of Image Generator and Detector for Road Defect Detection" by K.-C. Peng.
Paper: https://www.merl.com/publications/TR2025-149
2. "Radar-Conditioned 3D Bounding Box Diffusion for Indoor Human Perception" by R. Yataka, P. Wang, P. T. Boufounos, and R. Takahashi.
Paper: https://www.merl.com/publications/TR2025-154
3. "L-GGSC: Learnable Graph-based Gaussian Splatting Compression" by S. Kato, T. Koike-Akino, and T. Fujihashi.
Paper: https://www.merl.com/publications/TR2025-148
See All News & Events for Artificial Intelligence -
-
Research Highlights
-
PS-NeuS: A Probability-guided Sampler for Neural Implicit Surface Rendering -
Quantum AI Technology -
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models -
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling -
Steered Diffusion -
Sustainable AI -
Robust Machine Learning -
mmWave Beam-SNR Fingerprinting (mmBSF) -
Video Anomaly Detection -
Biosignal Processing for Human-Machine Interaction -
Task-aware Unified Source Separation - Audio Examples
-
-
Internships
-
SA0191: Human-Robot Interaction Based on Multimodal Scene Understanding
-
EA0183: Internship - Machine Learning for Predictive Maintenance
-
SA0187: Internship - Sound event and anomaly detection
See All Internships for Artificial Intelligence -
-
Openings
See All Openings at MERL -
Recent Publications
- Takuya Fujihashi, Akihiro Kuwabara, Toshiaki Koike-Akino, "QKAN-GS: Quantum-Empowered 3D Gaussian Splatting", ACM Multimedia Workshop, October 2025. TR2025-156
  @inproceedings{Fujihashi2025oct,
    author = {Fujihashi, Takuya and Kuwabara, Akihiro and Koike-Akino, Toshiaki},
    title = {{QKAN-GS: Quantum-Empowered 3D Gaussian Splatting}},
    booktitle = {ACM Multimedia Workshop},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-156}
  }
- Kuan-Chuan Peng, "Joint Training of Image Generator and Detector for Road Defect Detection", IEEE International Conference on Computer Vision (ICCV) Workshops, October 2025. TR2025-149
  @inproceedings{Peng2025oct,
    author = {Peng, Kuan-Chuan},
    title = {{Joint Training of Image Generator and Detector for Road Defect Detection}},
    booktitle = {IEEE International Conference on Computer Vision (ICCV) Workshops},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-149}
  }
- Chiao-An Yang, Kuan-Chuan Peng, Raymond Yeh, "Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts", IEEE International Conference on Computer Vision (ICCV), October 2025. TR2025-124
  @inproceedings{Yang2025oct,
    author = {Yang, Chiao-An and Peng, Kuan-Chuan and Yeh, Raymond},
    title = {{Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts}},
    booktitle = {IEEE International Conference on Computer Vision (ICCV)},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-124}
  }
- Vineet Shenoy, Shaoju Wu, Armand Comas, Suhas Lohit, Hassan Mansour, Tim K. Marks, "Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography", IEEE Access, October 2025. TR2025-145
  @article{Shenoy2025oct,
    author = {Shenoy, Vineet and Wu, Shaoju and Comas, Armand and Lohit, Suhas and Mansour, Hassan and Marks, Tim K.},
    title = {{Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography}},
    journal = {IEEE Access},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-145}
  }
- Yoshiki Masuyama, François G Germain, Gordon Wichern, Christopher Ick, Jonathan Le Roux, "Physics-Informed Direction-Aware Neural Acoustic Fields", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 2025. TR2025-142
  @inproceedings{Masuyama2025oct,
    author = {Masuyama, Yoshiki and Germain, François G and Wichern, Gordon and Ick, Christopher and {Le Roux}, Jonathan},
    title = {{Physics-Informed Direction-Aware Neural Acoustic Fields}},
    booktitle = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-142}
  }
- Francesco Paissan, Gordon Wichern, Yoshiki Masuyama, Ryo Aihara, François G Germain, Kohei Saijo, Jonathan Le Roux, "FasTUSS: Faster Task-Aware Unified Source Separation", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 2025. TR2025-143
  @inproceedings{Paissan2025oct,
    author = {Paissan, Francesco and Wichern, Gordon and Masuyama, Yoshiki and Aihara, Ryo and Germain, François G and Saijo, Kohei and {Le Roux}, Jonathan},
    title = {{FasTUSS: Faster Task-Aware Unified Source Separation}},
    booktitle = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
    year = 2025,
    month = oct,
    url = {https://www.merl.com/publications/TR2025-143}
  }
- Yuyang Hu, Suhas Lohit, Ulugbek Kamilov, Tim K. Marks, "Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal", IEEE Transactions on Geoscience and Remote Sensing, DOI: 10.1109/TGRS.2025.3604654, Vol. 63, September 2025. TR2025-138
  @article{Hu2025sep2,
    author = {Hu, Yuyang and Lohit, Suhas and Kamilov, Ulugbek and Marks, Tim K.},
    title = {{Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal}},
    journal = {IEEE Transactions on Geoscience and Remote Sensing},
    year = 2025,
    volume = 63,
    month = sep,
    doi = {10.1109/TGRS.2025.3604654},
    issn = {1558-0644},
    url = {https://www.merl.com/publications/TR2025-138}
  }
- Jiahao Zhang, Anoop Cherian, Cristian Rodriguez, Weijian Deng, Stephen Gould, "Manual-PA: Learning 3D Part Assembly from Instruction Diagrams", IEEE International Conference on Computer Vision (ICCV), September 2025. TR2025-139
  @inproceedings{Zhang2025sep,
    author = {Zhang, Jiahao and Cherian, Anoop and Rodriguez, Cristian and Deng, Weijian and Gould, Stephen},
    title = {{Manual-PA: Learning 3D Part Assembly from Instruction Diagrams}},
    booktitle = {IEEE International Conference on Computer Vision (ICCV)},
    year = 2025,
    month = sep,
    url = {https://www.merl.com/publications/TR2025-139}
  }
-
Software & Data Downloads
-
MEL-PETs Joint-Context Attack for LLM Privacy Challenge -
Subject- and Dataset-Aware Neural Field for HRTF Modeling -
MEL-PETs Defense for LLM Privacy Challenge -
Learned Born Operator for Reflection Tomographic Imaging -
Long-Tailed Online Anomaly Detection dataset -
Group Representation Networks -
Task-Aware Unified Source Separation -
Local Density-Based Anomaly Score Normalization for Domain Generalization -
Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization -
Self-Monitored Inference-Time INtervention for Generative Music Transformers -
Transformer-based model with LOcal-modeling by COnvolution -
Sound Event Bounding Boxes -
Enhanced Reverberation as Supervision -
Gear Extensions of Neural Radiance Fields -
Long-Tailed Anomaly Detection Dataset -
Neural IIR Filter Field for HRTF Upsampling and Personalization -
Target-Speaker SEParation -
Pixel-Grounded Prototypical Part Networks -
Steered Diffusion -
Hyperbolic Audio Source Separation -
Simple Multimodal Algorithmic Reasoning Task Dataset -
Partial Group Convolutional Neural Networks -
SOurce-free Cross-modal KnowledgE Transfer -
Audio-Visual-Language Embodied Navigation in 3D Environments -
Nonparametric Score Estimators -
3D MOrphable STyleGAN -
Instance Segmentation GAN -
Audio Visual Scene-Graph Segmentor -
Generalized One-class Discriminative Subspaces -
Goal directed RL with Safety Constraints -
Hierarchical Musical Instrument Separation -
Generating Visual Dynamics from Sound and Context -
Adversarially-Contrastive Optimal Transport -
Online Feature Extractor Network -
MotionNet -
FoldingNet++ -
Quasi-Newton Trust Region Policy Optimization -
Landmarks’ Location, Uncertainty, and Visibility Likelihood -
Robust Iterative Data Estimation -
Gradient-based Nikaido-Isoda -
Discriminative Subspace Pooling -
Open Vocabulary Attribute Detection Dataset
-