Artificial Intelligence

Making machines smarter for improved safety, efficiency and comfort.

Our AI research encompasses advances in computer vision, speech and audio processing, and data analytics. Key research themes include improved perception based on machine learning techniques, learning control policies through model-based reinforcement learning, and cognition and reasoning based on learned semantic representations. We apply our work to a broad range of automotive and robotics applications, as well as building and home systems.

Awards
- AWARD MERL Wins Awards at NeurIPS LLM Privacy Challenge
  Date: December 15, 2024
  Awarded to: Jing Liu, Ye Wang, Toshiaki Koike-Akino, Tsunato Nakai, Kento Oonishi, Takuya Higashi
  MERL Contacts: Toshiaki Koike-Akino; Jing Liu; Ye Wang
  Research Areas: Artificial Intelligence, Machine Learning, Information Security
  Brief
  - The Mitsubishi Electric Privacy Enhancing Technologies (MEL-PETs) team, a collaboration between MERL and Mitsubishi Electric researchers, won awards at the NeurIPS 2024 Large Language Model (LLM) Privacy Challenge. In the Blue Team track of the challenge, we won the 3rd Place Award, and in the Red Team track, we won the Special Award for Practical Attack.
- AWARD University of Padua and MERL team wins the AI Olympics with RealAIGym competition at IROS24
  Date: October 17, 2024
  Awarded to: Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
  MERL Contact: Diego Romeres
  Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Robotics
  Brief
  - The team composed of the control group at the University of Padua and MERL's Optimization and Robotics team ranked 1st out of the 4 finalist teams that reached the 2nd AI Olympics with RealAIGym competition at IROS 24, which focused on control of under-actuated robots. The team comprised Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres. The competition was organized by the German Research Center for Artificial Intelligence (DFKI), the Technical University of Darmstadt and Chalmers University of Technology.
    
    The competition and award ceremony were hosted by the IEEE International Conference on Intelligent Robots and Systems (IROS) on October 17, 2024 in Abu Dhabi, UAE. Diego Romeres presented the team's method, based on a model-based reinforcement learning algorithm called MC-PILCO.
- AWARD MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
  Date: August 29, 2024
  Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
  MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern; Yoshiki Masuyama
  Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
  Brief
  - MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.
    
    The LAP Challenge workshop and award ceremony were hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.
    
    The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions from a library of subjects. The HRTF of the retrieved subject at the target direction is fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
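The retrieval step described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each subject's HRTFs are stored as a dictionary from an (azimuth, elevation) direction to a list of magnitude values, and that closeness is measured by the root-mean-square difference over the measured directions only. The source does not specify the representation or the distance metric, and the function name `retrieve_closest_subjects` is illustrative; this is not the team's implementation.

```python
import math


def retrieve_closest_subjects(target_hrtfs, library, k=1):
    """Return the IDs of the k library subjects closest to the target.

    target_hrtfs: dict mapping a measured direction (azimuth, elevation)
        to a list of magnitude values (hypothetical representation).
    library: dict mapping subject ID -> dict in the same format, covering
        at least the measured directions (it may contain more directions).
    """
    if not target_hrtfs:
        raise ValueError("at least one measured direction is required")

    def distance(subject_hrtfs):
        # Assumed metric: RMS difference, computed only at the directions
        # that were actually measured for the target subject.
        squared_sum, count = 0.0, 0
        for direction, target_values in target_hrtfs.items():
            for t, s in zip(target_values, subject_hrtfs[direction]):
                squared_sum += (t - s) ** 2
                count += 1
        return math.sqrt(squared_sum / count)

    ranked = sorted(library, key=lambda subject_id: distance(library[subject_id]))
    return ranked[:k]


# Toy data: two measured directions, two magnitude values per direction.
library = {
    "A": {(0, 0): [1.0, 2.0], (90, 0): [0.5, 0.5], (45, 0): [0.8, 1.2]},
    "B": {(0, 0): [1.1, 2.1], (90, 0): [0.4, 0.6], (45, 0): [0.9, 1.1]},
    "C": {(0, 0): [5.0, 5.0], (90, 0): [5.0, 5.0], (45, 0): [5.0, 5.0]},
}
target = {(0, 0): [1.0, 2.0], (90, 0): [0.5, 0.6]}

closest = retrieve_closest_subjects(target, library, k=2)
print(closest)  # ['A', 'B']

# The retrieved subject's HRTF at the unmeasured target direction is what
# would be passed to the neural field together with the direction itself.
extra_input = library[closest[0]][(45, 0)]
```

Returning `k` subjects rather than one reflects the text's mention of handling an arbitrary number of retrieved subjects; how those are fused in the network is not shown here.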
News & Events
- NEWS Diego Romeres Delivers Invited Talks at Fraunhofer Italia and the University of Padua
  Date: July 16, 2025 - July 18, 2025
  MERL Contact: Diego Romeres
  Research Areas: Artificial Intelligence, Control, Machine Learning, Optimization, Robotics, Human-Computer Interaction
  Brief
  - MERL researcher Diego Romeres was invited to present MERL's latest research at two institutions in Italy this July, focusing on human-robot collaboration and LLM-driven assembly systems.
    
    On July 16th, Dr. Romeres delivered a talk titled “Human-Robot Collaborative Assembly” at Fraunhofer Italia – Innovation Engineering Center (EIC) in Bolzano. His presentation showcased research on human-robot collaboration for efficient and flexible assembly processes. Fraunhofer Italia EIC is a non-profit research institute focused on enabling digital and sustainable transformation through applied innovation in close collaboration with both public and private sectors.
    
    Two days later, on July 18th, Dr. Romeres was hosted by the University of Padua, one of Europe’s oldest and most renowned universities. His invited lecture, “Robot Assembly through Human Collaboration & Large Language Models”, explored how artificial intelligence can enhance human-robot synergy in complex assembly tasks.
- NEWS Toshiaki Koike-Akino to give a tutorial talk at ISIT 2025 Quantum Hackathon
  Date: June 22, 2025
  Where: IEEE International Symposium on Information Theory (ISIT)
  MERL Contact: Toshiaki Koike-Akino
  Research Areas: Artificial Intelligence, Communications, Data Analytics, Machine Learning, Optimization, Signal Processing, Human-Computer Interaction, Information Security
  Brief
  - Toshiaki Koike-Akino is invited to present a tutorial talk at the IEEE ISIT 2025 Quantum Hackathon, to be held in Ann Arbor, Michigan, USA. The talk, entitled "Emerging Quantum AI Technology", will discuss the recent trends, challenges, and applications of quantum artificial intelligence (QAI) technologies.
    
    The ISIT 2025 Quantum Hackathon invites participants to explore the intersection of quantum computing and information theory. Participants will work with quantum simulators, available quantum hardware, and state-of-the-art development kits to create innovative solutions that connect quantum advancements with challenges in communication and signal processing.
    
    The IEEE International Symposium on Information Theory (ISIT) is the flagship conference of the IEEE Information Theory Society. The symposium centers on the presentation of research in all areas of information theory, including source and channel coding, communication theory and systems, cryptography and security, detection and estimation, networks, pattern recognition and learning, statistics, stochastic processes and complexity, and signal processing.
Internships
- CI0082: Internship - Quantum AI
  
  MERL is excited to announce an internship opportunity in the field of Quantum Machine Learning (QML) and Quantum AI (QAI). We are seeking a highly motivated and talented individual to join our research team. This is an exciting opportunity to make a real impact in the field of quantum computing and AI, with the aim of publishing at leading research venues.
  Responsibilities:
  - Conduct cutting-edge research in quantum machine learning.
  - Collaborate with a team of experts in quantum computing, deep learning, and signal processing.
  - Develop and implement algorithms using PyTorch and PennyLane.
  - Publish research results at leading research venues.
  Qualifications:
  - Currently pursuing a PhD, or working as a post-graduate researcher, in a relevant field.
  - Strong background and a solid publication record in quantum computing, deep learning, and signal processing.
  - Proficiency in programming with PyTorch and PennyLane is highly desirable.
  What We Offer:
  - An opportunity to work on groundbreaking research in a leading research lab.
  - Collaboration with a team of experienced researchers.
  - A stimulating and supportive work environment.
  If you are passionate about quantum machine learning and meet the above qualifications, we encourage you to apply. Please submit your resume and a brief cover letter detailing your research experience and interests. Join us at MERL and contribute to the future of quantum machine learning!
- OR0164: Internship - Robotic 6D grasp pose estimation
  
  MERL is looking for a highly motivated and qualified intern to work on methods for task-oriented 6-DoF grasp pose detection using vision and tactile sensing. The objective is to enable a robot to identify multiple 6-DoF grasp poses tailored to specific tasks, allowing it to effectively grasp and manipulate objects. The ideal candidate would be a Ph.D. student familiar with state-of-the-art methods for robotic grasping, object tracking, and imitation learning. This role involves developing, fine-tuning and deploying models on hardware. The successful candidate will work closely with MERL researchers to develop and implement novel algorithms, conduct experiments, and publish research findings at a top-tier conference. The start date and expected duration of the internship are flexible. Interested candidates are encouraged to apply with their updated CV and list of relevant publications.
  Required Specific Experience
  - Prior experience in robotic grasping
  - Experience in Machine Learning
  - Excellent programming skills
- CI0169: Internship - Robotic AI Agent
  
  If you are passionate about pushing the boundaries of embodied AI, join our cutting-edge research team as an intern and contribute to the development of generalist AI agents for humanoid robots. This is a unique opportunity to work on impactful projects aimed at publishing in top-tier AI and robotics venues.
  What We’re Looking For
  We’re seeking highly motivated individuals with:
  - Advanced research experience in robotic AI, edge AI, and agentic AI systems.
  - Hands-on expertise in Large Language Models (LLMs), Vision-Language-Action (VLA) models and Foundation Models
  - Strong proficiency with Python, PyTorch, deep learning, and robotic agent frameworks
  Internship Details
  - Duration: ~3 months
  - Start Date: Flexible
  - Goal: Publish research at leading AI/robotics conferences and journals
  If you're excited about shaping the future of humanoid robotics and AI agents, we’d love to hear from you!
Recent Publications
- Steinmetz, C., Uhle, C., Everardo, F., Mitcheltree, C., McElveen, J.K., Jot, J.-M., Wichern, G., "Audio Signal Processing in the Artificial Intelligence Era: Challenges and Directions", Journal of the Audio Engineering Society, August 2025.
  BibTeX TR2025-116 PDF
  @article{Steinmetz2025aug,
    author = {Steinmetz, Christian and Uhle, Christian and Everardo, Flavio and Mitcheltree, Christopher and McElveen, J. Keith and Jot, Jean-Marc and Wichern, Gordon},
    title = {{Audio Signal Processing in the Artificial Intelligence Era: Challenges and Directions}},
    journal = {Journal of the Audio Engineering Society},
    year = 2025,
    month = aug,
    url = {https://www.merl.com/publications/TR2025-116}
  }
- Lewis, A., White, M., Liu, J., Koike-Akino, T., Parsons, K., Wang, Y., "Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in Product QA Agents", ACL 2025 workshop on Generation, Evaluation & Metrics (GEM), July 2025.
  BibTeX TR2025-114 PDF
  @inproceedings{Lewis2025jul2,
    author = {Lewis, Ashley and White, Michael and Liu, Jing and Koike-Akino, Toshiaki and Parsons, Kieran and Wang, Ye},
    title = {{Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in Product QA Agents}},
    booktitle = {ACL 2025 workshop on Generation, Evaluation \& Metrics (GEM)},
    year = 2025,
    month = jul,
    url = {https://www.merl.com/publications/TR2025-114}
  }
- Almudévar, A., Hernández-Lobato, J.M., Khurana, S., Marxer, R., Ortega, A., "Aligning Multimodal Representations through an Information Bottleneck", International Conference on Machine Learning (ICML), July 2025.
  BibTeX TR2025-109 PDF
  @inproceedings{Almudévar2025jul,
    author = {Almudévar, Antonio and Hernández-Lobato, José M. and Khurana, Sameer and Marxer, Ricard and Ortega, Alfonso},
    title = {{Aligning Multimodal Representations through an Information Bottleneck}},
    booktitle = {International Conference on Machine Learning (ICML)},
    year = 2025,
    month = jul,
    url = {https://www.merl.com/publications/TR2025-109}
  }
- Koike-Akino, T., Liu, J., Wang, Y., "u-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts", International Conference on Machine Learning (ICML) Workshop, July 2025.
  BibTeX TR2025-112 PDF
  @inproceedings{Koike-Akino2025jul,
    author = {Koike-Akino, Toshiaki and Liu, Jing and Wang, Ye},
    title = {{u-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts}},
    booktitle = {International Conference on Machine Learning (ICML) Workshop},
    year = 2025,
    month = jul,
    url = {https://www.merl.com/publications/TR2025-112}
  }
- Liu, J., Koike-Akino, T., Wang, Y., Mansour, H., Brand, M., "AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent", International Conference on Machine Learning (ICML) workshop, July 2025.
  BibTeX TR2025-111 PDF
  @inproceedings{Liu2025jul,
    author = {Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mansour, Hassan and Brand, Matthew},
    title = {{AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent}},
    booktitle = {International Conference on Machine Learning (ICML) workshop},
    year = 2025,
    month = jul,
    url = {https://www.merl.com/publications/TR2025-111}
  }
- Wang, R., Wang, Y., Liu, J., Koike-Akino, T., "Quantum Diffusion Models for Few-Shot Learning", ICAD, June 2025.
  BibTeX TR2025-095 PDF
  @inproceedings{Wang2025jun2,
    author = {Wang, Ruhan and Wang, Ye and Liu, Jing and Koike-Akino, Toshiaki},
    title = {{Quantum Diffusion Models for Few-Shot Learning}},
    booktitle = {ICAD},
    year = 2025,
    month = jun,
    url = {https://www.merl.com/publications/TR2025-095}
  }
- Masuyama, Y., "Single- and Multi-Channel Speech Enhancement and Separation for Far-Field Conversation Recognition," Tech. Rep. TR2025-097, Jelinek Summer Workshop on Speech and Language Technology (JSALT), June 2025.
  BibTeX TR2025-097 PDF
  @techreport{Masuyama2025jun,
    author = {{{Masuyama, Yoshiki}}},
    title = {{{Single- and Multi-Channel Speech Enhancement and Separation for Far-Field Conversation Recognition}}},
    institution = {Jelinek Summer Workshop on Speech and Language Technology (JSALT)},
    year = 2025,
    month = jun,
    url = {https://www.merl.com/publications/TR2025-097}
  }
- Chen, X., Liu, J., Wang, Y., Brand, M., Wang, P., Koike-Akino, T., "TuneComp: Joint Fine-Tuning and Compression for Large Foundation Models", IEEE Conference on Computer Vision and Pattern Recognition (CVPR) workshop on Efficient and On-Device Generation, June 2025.
  BibTeX TR2025-079 PDF
  @inproceedings{Chen2025jun,
    author = {Chen, Xiangyu and Liu, Jing and Wang, Ye and Brand, Matthew and Wang, Pu and Koike-Akino, Toshiaki},
    title = {{TuneComp: Joint Fine-Tuning and Compression for Large Foundation Models}},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR) workshop on Efficient and On-Device Generation},
    year = 2025,
    month = jun,
    url = {https://www.merl.com/publications/TR2025-079}
  }
Videos

[CVPR 2025] TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection

[MERL Seminar Series Spring 2025] Red Teaming AI Agents in-the-wild: Revealing Deployment Vulnerabilities

[MERL Seminar Series Spring 2025] The Emergence of Generalizability and Semantic Low-Dim Subspaces in Diffusion Models

[MERL Seminar Series Spring 2025] Amplifying human performance in combinatorial competitive programming

[WACV 2025] Towards Zero-shot 3D Anomaly Localization

[NeurIPS 2024] MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track

[NeurIPS 2024] MEL-PETs Joint-Context Attack for the NeurIPS 2024 LLM Privacy Challenge Red Team Track

[MERL Seminar Series Fall 2024] AI-assisted Power Grid Dispatch and Control: Optimization, Safety, and Real-world Demonstrations

[NeurIPS 2024] Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads

[MERL Seminar Series Fall 2024] Audio for Object and Spatial Awareness

[IROS 2024] Few-shot Transparent Instance Segmentation for Bin Picking

[MERL Seminar Series Fall 2024] Tools from cognitive science to understand the behavior of large language models

[ECCV 2024] PS-NEUS: A Probability-guided Sampler for Neural Implicit Surface Rendering

[ECCV 2024] Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection

Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling

[MERL Seminar Series Spring 2024] Are Emergent Abilities of Large Language Models a Mirage?

MERL's Quantum AI Technology

[MERL Seminar Series Spring 2024] The Debate Over 'Understanding' in AI's Large Language Models

[MERL Seminar Series Spring 2024] Computational models of human auditory and language processing

[MERL Seminar Series Fall 2023] Multiplicity in Machine Learning

Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection

[MERL Seminar Series Fall 2023] Visual Programming - A compositional approach to building General Purpose Vision Systems

[MERL Seminar Series Fall 2023] The Confluence of Vision, Language, and Robotics

Are Deep Neural Networks SMARTer than Second Graders?

[MERL Seminar Series Spring 2023] Fine-grained wildlife sound recognition: Towards the accuracy of a naturalist

[MERL Seminar Series Spring 2023] Pitfalls and Opportunities in Interpretable Machine Learning

Human Perspective Scene Understanding via Multimodal Sensing

[MERL Seminar Series Spring 2022] Self-Supervised Scene Representation Learning

[MERL Seminar Series Spring 2022] Learning Speech Representations with Multimodal Self-Supervision

[MERL Seminar Series 2021] Deep probabilistic regression

[MERL Seminar Series 2021] Learning to See by Moving: Self-supervising 3D scene representations for perception, control, and visual reasoning

[MERL Seminar Series 2021] Look and Listen: From Semantic to Spatial Audio-Visual Perception

Application of Deep Learning for Nanophotonic Device Design (Invited)

Machine Learning Power Amplifier

Scene-Aware Interaction Technology
Software & Data Downloads

MERL is making Artificial Intelligence software and data available to the research community:

Task-Aware Unified Source Separation (TUSS)
Local Density-Based Anomaly Score Normalization for Domain Generalization (anomaly-score-normalization)
Long-Tailed Online Anomaly Detection dataset (LTOAD)
Group Representation Networks (G-RepsNets)
Self-Monitored Inference-Time INtervention for Generative Music Transformers (SMITIN)
Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization (ranf-hrtf)
