Computer Vision

Extracting meaning and building representations of visual objects and events in the world.

Our main research themes cover the areas of deep learning and artificial intelligence for object and action detection, classification and scene understanding, robotic vision and object manipulation, 3D processing and computational geometry, as well as simulation of physical systems to enhance machine learning systems.

Quick Links
Researchers
Awards
- AWARD Best Paper - Honorable Mention Award at WACV 2021
  Date: January 6, 2021
  Awarded to: Rushil Anirudh, Suhas Lohit, Pavan Turaga
  MERL Contact: Suhas Anand Lohit
  Research Areas: Computational Sensing, Computer Vision, Machine Learning
  Brief
  - A team of researchers from Mitsubishi Electric Research Laboratories (MERL), Lawrence Livermore National Laboratory (LLNL) and Arizona State University (ASU) received the Best Paper Honorable Mention Award at WACV 2021 for their paper "Generative Patch Priors for Practical Compressive Image Recovery".
    
    The paper proposes a novel model of natural images as a composition of small patches which are obtained from a deep generative network. This is unlike prior approaches where the networks attempt to model image-level distributions and are unable to generalize outside training distributions. The key idea in this paper is that learning patch-level statistics is far easier. As the authors demonstrate, this model can then be used to efficiently solve challenging inverse problems in imaging such as compressive image recovery and inpainting even from very few measurements for diverse natural scenes.
- AWARD MERL Researchers win Best Paper Award at ICCV 2019 Workshop on Statistical Deep Learning in Computer Vision
  Date: October 27, 2019
  Awarded to: Abhinav Kumar, Tim K. Marks, Wenxuan Mou, Chen Feng, Xiaoming Liu
  MERL Contact: Tim K. Marks
  Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
  Brief
  - MERL researcher Tim Marks, former MERL interns Abhinav Kumar and Wenxuan Mou, and MERL consultants Professor Chen Feng (NYU) and Professor Xiaoming Liu (MSU) received the Best Oral Paper Award at the IEEE/CVF International Conference on Computer Vision (ICCV) 2019 Workshop on Statistical Deep Learning in Computer Vision (SDL-CV) held in Seoul, Korea. Their paper, entitled "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss," describes a method which, given an image of a face, estimates not only the locations of facial landmarks but also the uncertainty of each landmark location estimate.
- AWARD CVPR 2011 Longuet-Higgins Prize
  Date: June 25, 2011
  Awarded to: Paul A. Viola and Michael J. Jones
  Awarded for: "Rapid Object Detection using a Boosted Cascade of Simple Features"
  Awarded by: Conference on Computer Vision and Pattern Recognition (CVPR)
  MERL Contact: Michael J. Jones
  Research Area: Machine Learning
  Brief
  - Paper from 10 years ago with the largest impact on the field: "Rapid Object Detection using a Boosted Cascade of Simple Features", originally published at Conference on Computer Vision and Pattern Recognition (CVPR 2001).
See All Awards for MERL
News & Events
- NEWS MERL Papers, Workshops, and Talks at ICCV 2025
  Date: October 19, 2025 - October 23, 2025
  Where: Honolulu, HI, USA
  MERL Contacts: Petros T. Boufounos; Anoop Cherian; Toshiaki Koike-Akino; Hassan Mansour; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Pu (Perry) Wang
  Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Signal Processing
  Brief
  - MERL researchers presented 3 conference papers and 3 workshop papers, co-organized 2 workshops, and delivered 2 invited talks at the IEEE International Conference on Computer Vision (ICCV) 2025, which was held in Honolulu, HI, USA from October 19-23, 2025. ICCV is one of the most prestigious and competitive international conferences in the area of computer vision. Details of MERL contributions are provided below:
    
    Main Conference Papers:
    
    1. "SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity" by V. Piedade, C. Sidhartha, J. Gaspar, V. M. Govindu, and P. Miraldo. (Highlight Paper)
    Paper: https://www.merl.com/publications/TR2025-146
    
    2. "Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts" by C.-A. Yang, K.-C. Peng, and R. A. Yeh.
    Paper: https://www.merl.com/publications/TR2025-124
    
    3. "Manual-PA: Learning 3D Part Assembly from Instruction Diagrams" by J. Zhang, A. Cherian, C. Rodriguez-Opazo, W. Deng, and S. Gould.
    Paper: https://www.merl.com/publications/TR2025-139
    
    MERL Co-Organized Workshops:
    
    1. "The Workshop on Anomaly Detection with Foundation Models (ADFM)" by K.-C. Peng, Y. Zhao, and A. Aich.
    Workshop link: https://adfmw.github.io/iccv25/
    
    2. "The 8th International Workshop on Computer Vision for Physiological Measurement (CVPM)" by D. McDuff, W. Wang, S. Stuijk, T. Marks, H. Mansour, V. R. Shenoy.
    Workshop link: https://sstuijk.estue.nl/cvpm/cvpm25/
    
    MERL Keynote Talks at Workshops:
    
    1. Tim K. Marks, Keynote Speaker at the Workshop on Computer Vision for Physiological Measurement (CVPM).
    Workshop website: https://vineetrshenoy.github.io/cvpmSeptember2025/
    
    2. Tim K. Marks, Keynote Speaker at the Workshop on Analysis and Modeling of Faces and Gestures (AMFG).
    Workshop website: https://fulab.sites.northeastern.edu/amfg2025/
    
    Workshop Papers:
    
    1. "Joint Training of Image Generator and Detector for Road Defect Detection" by K.-C. Peng.
    paper: https://www.merl.com/publications/TR2025-149
    
    2. "Radar-Conditioned 3D Bounding Box Diffusion for Indoor Human Perception" by R. Yataka, P. Wang, P.T. Boufounos, and R. Takahashi.
    paper: https://www.merl.com/publications/TR2025-154
    
    3. "L-GGSC: Learnable Graph-based Gaussian Splatting Compression" by S. Kato, T. Koike-Akino, and T. Fujihashi.
    paper: https://www.merl.com/publications/TR2025-148
- NEWS MERL Papers, Workshops, and Talks at ICCV 2025
  Date: October 19, 2025 - October 23, 2025
  Where: Honolulu, HI, USA
  MERL Contacts: Petros T. Boufounos; Anoop Cherian; Toshiaki Koike-Akino; Hassan Mansour; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Pu (Perry) Wang
  Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Signal Processing
  Brief
  - MERL researchers presented 3 conference papers and 3 workshop papers, co-organized 2 workshops, and delivered 2 invited talks at the IEEE International Conference on Computer Vision (ICCV) 2025, which was held in Honolulu, HI, USA from October 19-23, 2025. ICCV is one of the most prestigious and competitive international conferences in the area of computer vision. Details of MERL contributions are provided below:
    
    Main Conference Papers:
    
    1. "SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity" by V. Piedade, C. Sidhartha, J. Gaspar, V. M. Govindu, and P. Miraldo. (Highlight Paper)
    Paper: https://www.merl.com/publications/TR2025-146
    
    2. "Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts" by C.-A. Yang, K.-C. Peng, and R. A. Yeh.
    Paper: https://www.merl.com/publications/TR2025-124
    
    3. "Manual-PA: Learning 3D Part Assembly from Instruction Diagrams" by J. Zhang, A. Cherian, C. Rodriguez-Opazo, W. Deng, and S. Gould.
    Paper: https://www.merl.com/publications/TR2025-139
    
    MERL Co-Organized Workshops:
    
    1. "The Workshop on Anomaly Detection with Foundation Models (ADFM)" by K.-C. Peng, Y. Zhao, and A. Aich.
    Workshop link: https://adfmw.github.io/iccv25/
    
    2. "The 8th International Workshop on Computer Vision for Physiological Measurement (CVPM)" by D. McDuff, W. Wang, S. Stuijk, T. Marks, H. Mansour, V. R. Shenoy.
    Workshop link: https://sstuijk.estue.nl/cvpm/cvpm25/
    
    MERL Keynote Talks at Workshops:
    
    1. Tim K. Marks, Keynote Speaker at the Workshop on Computer Vision for Physiological Measurement (CVPM).
    Workshop website: https://vineetrshenoy.github.io/cvpmSeptember2025/
    
    2. Tim K. Marks, Keynote Speaker at the Workshop on Analysis and Modeling of Faces and Gestures (AMFG).
    Workshop website: https://fulab.sites.northeastern.edu/amfg2025/
    
    Workshop Papers:
    
    1. "Joint Training of Image Generator and Detector for Road Defect Detection" by K.-C. Peng.
    paper: https://www.merl.com/publications/TR2025-149
    
    2. "Radar-Conditioned 3D Bounding Box Diffusion for Indoor Human Perception" by R. Yataka, P. Wang, P.T. Boufounos, and R. Takahashi.
    paper: https://www.merl.com/publications/TR2025-154
    
    3. "L-GGSC: Learnable Graph-based Gaussian Splatting Compression" by S. Kato, T. Koike-Akino, and T. Fujihashi.
    paper: https://www.merl.com/publications/TR2025-148
See All News & Events for Computer Vision
Research Highlights
Internships
See All Internships for Computer Vision
Openings
- CI0177: Postdoctoral Research Fellow - Agentic AI
See All Openings at MERL
Recent Publications
- Gambill, P., Jha, D.K., Krishnamoorthy, B., Raghunathan, A., Yerazunis, W.S., "DamageEst: Accurate Estimation of Damage for Repair using Additive Manufacturing", Solid Freeform Fabrication Symposium (SFF), November 2025.
  BibTeX TR2025-158 PDF Presentation
  - @inproceedings{Gambill2025nov,
  - author = {{{Gambill, Patrick and Jha, Devesh K. and Krishnamoorthy, Bala and Raghunathan, Arvind and Yerazunis, William S.}}},
  - title = {{{DamageEst: Accurate Estimation of Damage for Repair using Additive Manufacturing}}},
  - booktitle = {Solid Freeform Fabrication Symposium (SFF)},
  - year = 2025,
  - month = nov,
  - url = {https://www.merl.com/publications/TR2025-158}
  - }
- Nikovski, D.N., "Observation-Based Inverse Kinematics for Visual Servo Control", 22nd International Conference on Informatics in Control, Automation and Robotics (ICINCO), October 2025.
  BibTeX TR2025-153 PDF
  - @inproceedings{Nikovski2025oct,
  - author = {Nikovski, Daniel N.},
  - title = {{Observation-Based Inverse Kinematics for Visual Servo Control}},
  - booktitle = {22nd International Conference on Informatics in Control, Automation and Robotics (ICINCO)},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-153}
  - }
- Yataka, R., Wang, P., Boufounos, P.T., Takahashi, R., "Radar-Conditioned 3D Bounding Box Diffusion for Indoor Human Perception", IEEE International Conference on Computer Vision (ICCV) Workshop, October 2025.
  BibTeX TR2025-154 PDF
  - @inproceedings{Yataka2025oct,
  - author = {Yataka, Ryoma and Wang, Pu and Boufounos, Petros T. and Takahashi, Ryuhei},
  - title = {{Radar-Conditioned 3D Bounding Box Diffusion for Indoor Human Perception}},
  - booktitle = {IEEE International Conference on Computer Vision (ICCV) Workshop},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-154}
  - }
- Karumanchi, S., Rokaha, B., Schperberg, A., Vinod, A.P., "Energy-constrained multi-robot exploration for autonomous map building", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2025.
  BibTeX TR2025-131 PDF
  - @inproceedings{Karumanchi2025oct,
  - author = {Karumanchi, Sambhu and Rokaha, Bhagawan and Schperberg, Alexander and Vinod, Abraham P.},
  - title = {{Energy-constrained multi-robot exploration for autonomous map building}},
  - booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-131}
  - }
- Peng, K.-C., "Joint Training of Image Generator and Detector for Road Defect Detection", IEEE International Conference on Computer Vision (ICCV) Workshops, October 2025.
  BibTeX TR2025-149 PDF Video Presentation
  - @inproceedings{Peng2025oct,
  - author = {{{Peng, Kuan-Chuan}}},
  - title = {{{Joint Training of Image Generator and Detector for Road Defect Detection}}},
  - booktitle = {IEEE International Conference on Computer Vision (ICCV) Workshops},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-149}
  - }
- Piedade, V., Chitturi, S., Gaspar, J., Govindu, V., Miraldo, P., "SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity", IEEE International Conference on Computer Vision (ICCV), October 2025.
  BibTeX TR2025-146 PDF Presentation
  - @inproceedings{Piedade2025oct,
  - author = {{{Piedade, Valter and Chitturi, Sidhartha and Gaspar, Jose and Govindu, Venu and Miraldo, Pedro}}},
  - title = {{{SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity}}},
  - booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-146}
  - }
- Yang, C.-A., Peng, K.-C., Yeh, R., "Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts", IEEE International Conference on Computer Vision (ICCV), October 2025.
  BibTeX TR2025-124 PDF Video Data Presentation
  - @inproceedings{Yang2025oct,
  - author = {{{Yang, Chiao-An and Peng, Kuan-Chuan and Yeh, Raymond}}},
  - title = {{{Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts}}},
  - booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-124}
  - }
- Shenoy, V., Wu, S., Comas, A., Lohit, S., Mansour, H., Marks, T.K., "Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography", IEEE Access, October 2025.
  BibTeX TR2025-145 PDF
  - @article{Shenoy2025oct,
  - author = {Shenoy, Vineet and Wu, Shaoju and Comas, Armand and Lohit, Suhas and Mansour, Hassan and Marks, Tim K.},
  - title = {{Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography}},
  - journal = {IEEE Access},
  - year = 2025,
  - month = oct,
  - url = {https://www.merl.com/publications/TR2025-145}
  - }
See All Publications for Computer Vision
Videos

[ICCV 2025] SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity

[BMVC 2025] Towards Open-Vocabulary Multimodal 3D Object Detection with Attributes

[ICCV Workshop 2025] Joint Training of Image Generator and Detector for Road Defect Detection

[ICCV 2025] Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts

[CVPR 2025] TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection

[MERL Seminar Series Spring 2025] Amplifying human performance in combinatorial competitive programming

[MERL Seminar Series Spring 2025] Imaging Dynamic Scenes from Seconds to Picoseconds

[WACV 2025] Towards Zero-shot 3D Anomaly Localization

[NeurIPS 2024] Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads

[IROS 2024] Few-shot Transparent Instance Segmentation for Bin Picking

[ECCV 2024] PS-NEUS: A Probability-guided Sampler for Neural Implicit Surface Rendering

[ECCV 2024] Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection

Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling

[MERL Seminar Series Fall 2023] Visual Programming - A compositional approach to building General Purpose Vision Systems

[MERL Seminar Series Fall 2023] The Confluence of Vision, Language, and Robotics

Are Deep Neural Networks SMARTer than Second Graders?

[CVPR 2023] EVAL: Explainable Video Anomaly Localization

[MERL Seminar Series Spring 2023] Pitfalls and Opportunities in Interpretable Machine Learning

Human Perspective Scene Understanding via Multimodal Sensing

[MERL Seminar Series Spring 2022] Self-Supervised Scene Representation Learning

[MERL Seminar Series Spring 2022] Learning Speech Representations with Multimodal Self-Supervision

HealthCam: A system for non-contact monitoring of vital signs

[MERL Seminar Series 2021] Learning to See by Moving: Self-supervising 3D scene representations for perception, control, and visual reasoning

[MERL Seminar Series 2021] Look and Listen: From Semantic to Spatial Audio-Visual Perception

Towards Human-Level Learning of Complex Physical Puzzles

Scene-Aware Interaction Technology

3D Object Discovery and Modeling Using Single RGB-D Images Containing Multiple Object Instances

Joint 3D Reconstruction of a Static Scene and Moving Objects

Direct Multichannel Tracking

FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds

FasTFit: A fast T-spline fitting algorithm

CASENet: Deep Category-Aware Semantic Edge Detection

Object Detection and Tracking in RGB-D SLAM via Hierarchical Feature Grouping

Pinpoint SLAM: A Hybrid of 2D and 3D Simultaneous Localization and Mapping for RGB-D Sensors

Action Detection Using A Deep Recurrent Neural Network

Obstacle Detection

Saffron - Digital Type System

Robot Bin Picking

Sapphire - High Accuracy NC Milling Simulation

Dose optimization for particle beam therapy

MERL Research on Autonomous Vehicles

Semantic Scene Labeling

3D Reconstruction

Deep Hierarchical Parsing for Semantic Segmentation

Global Local Face Upsampling Network

Gaussian Conditional Random Field Network for Semantic Segmentation

Fast Graspability Evaluation on Single Depth Maps for Bin Picking with General Grippers

Point-Plane SLAM for Hand-Held 3D Sensors

Tracking an RGB-D Camera Using Points and Planes

Fast Plane Extraction in Organized Point Clouds Using Agglomerative Hierarchical Clustering

Calibration of Non-Overlapping Cameras Using an External SLAM System

Voting-Based Pose Estimation for Robotic Assembly Using a 3D Sensor

Fast Object Localization and Pose Estimation in Heavy Clutter for Robotic Bin Picking

Learning to rank 3D features
Software & Data Downloads

Quick Links

AnoopCherian

Tim K.Marks

Michael J.Jones

ChioriHori

Suhas AnandLohit

JonathanLe Roux

HassanMansour

Kuan-ChuanPeng

MatthewBrand

MoitreyaChatterjee

SiddarthJain

Devesh K.Jha

PedroMiraldo

DiegoRomeres

RaduCorcodel

Petros T.Boufounos

Daniel N.Nikovski

YeWang

AnthonyVetro

GordonWichern

William S.Yerazunis

DehongLiu

ArvindRaghunathan

ToshiakiKoike-Akino

Abraham P.Vinod

AvishaiWeiss

StefanoDi Cairano

Pu(Perry)Wang

YantingMa

YoshikiMasuyama

Philip V.Orlik

JoshuaRapp

AlexanderSchperberg

HuifangSun

YebinWang

KenjiInomata

HuaizuJiang

JinKato

KaenKogashi

JingLiu

LalitManam

KeiSuzuki

CV0252: Internship - Vital Signs from Video using Computer Vision & AI

CA0153: Internship - High-Fidelity Visualization and Simulation for Space Applications

CI0213: Internship - Efficient Foundation Models for Edge Intelligence

CI0177: Postdoctoral Research Fellow - Agentic AI

Anoop
Cherian

Tim K.
Marks

Michael J.
Jones

Chiori
Hori

Suhas Anand
Lohit

Jonathan
Le Roux

Hassan
Mansour

Kuan-Chuan
Peng

Matthew
Brand

Moitreya
Chatterjee

Siddarth
Jain

Devesh K.
Jha

Pedro
Miraldo

Diego
Romeres

Radu
Corcodel

Petros T.
Boufounos

Daniel N.
Nikovski

Ye
Wang

Anthony
Vetro

Gordon
Wichern

William S.
Yerazunis

Dehong
Liu

Arvind
Raghunathan

Toshiaki
Koike-Akino

Abraham P.
Vinod

Avishai
Weiss

Stefano
Di Cairano

Pu
(Perry)
Wang

Yanting
Ma

Yoshiki
Masuyama

Philip V.
Orlik

Joshua
Rapp

Alexander
Schperberg

Huifang
Sun

Yebin
Wang

Kenji
Inomata

Huaizu
Jiang

Jin
Kato

Kaen
Kogashi

Jing
Liu

Lalit
Manam

Kei
Suzuki