Anoop Cherian

  • Biography

    Anoop was a postdoctoral researcher in the LEAR group at Inria from 2012 to 2015, where his research focused on the estimation and tracking of human poses in videos. From 2015 to 2017, he was a Research Fellow at the Australian National University, where he worked on recognizing human activities in video sequences. Anoop received the Best Student Paper award at the International Conference on Image Processing in 2012. Currently, his research focuses on modeling the semantics of video data.

  • Recent News & Events

    •  TALK    [MERL Seminar Series 2026] Jialong Wu presents talk titled World Models and Human-like Reasoning
      Date & Time: Wednesday, March 25, 2026; 11:00 AM
      Speaker: Jialong Wu, Tsinghua University
      MERL Host: Anoop Cherian
      Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
      Abstract
      • This talk introduces the background and key findings of our recent work, "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models," which answers the question of when and how visual generation enabled by unified multimodal models (UMMs) benefits reasoning. We take a world model perspective, inspired by human cognition. Specifically, humans construct mental models of the world, representing information and knowledge through two complementary channels—verbal and visual—to support reasoning, planning, and decision-making. In contrast, recent advances in large language models (LLMs) and vision–language models (VLMs) largely rely on verbal chain-of-thought reasoning, leveraging primarily symbolic and linguistic world knowledge. Unified multimodal models (UMMs) open a new paradigm by using visual generation for visual world modeling, advancing more human-like reasoning on tasks grounded in the physical world. In this work, we formalize the atomic capabilities of world models and world model-based chain-of-thought reasoning. We highlight the richer informativeness and complementary prior knowledge afforded by visual world modeling, leading to our visual superiority hypothesis for tasks grounded in the physical world. We identify and design tasks that necessitate interleaved visual-verbal CoT reasoning, constructing a new evaluation suite, VisWorld-Eval. Through controlled experiments on BAGEL, we show that interleaved CoT significantly outperforms purely verbal CoT on tasks that favor visual world modeling, strongly supporting our insights.
    •  NEWS    MERL Researchers at NeurIPS 2025 presented 2 conference papers and 5 workshop papers, and organized a workshop.
      Date: December 2, 2025 - December 7, 2025
      Where: San Diego
      MERL Contacts: Petros T. Boufounos; Anoop Cherian; Radu Corcodel; Stefano Di Cairano; Chiori Hori; Christopher R. Laughman; Suhas Lohit; Pedro Miraldo; Saviz Mowlavi; Kuan-Chuan Peng; Arvind Raghunathan; Abraham P. Vinod; Pu (Perry) Wang
      Research Areas: Artificial Intelligence, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio
      Brief
      • MERL researchers presented 2 main-conference papers and 5 workshop papers and organized a workshop at NeurIPS 2025.

        Main Conference Papers:

        1) Sorachi Kato, Ryoma Yataka, Pu Wang, Pedro Miraldo, Takuya Fujihashi, and Petros Boufounos, "RAPTR: Radar-based 3D Pose Estimation using Transformer", Code available at: https://github.com/merlresearch/radar-pose-transformer

        2) Runyu Zhang, Arvind Raghunathan, Jeff Shamma, and Na Li, "Constrained Optimization From a Control Perspective via Feedback Linearization"

        Workshop Papers:

        1) Yuyou Zhang, Radu Corcodel, Chiori Hori, Anoop Cherian, and Ding Zhao, "SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs", NeurIPS 2025 Workshop on SPACE in Vision, Language, and Embodied AI (SpaVLE) (Best Paper Runner-up)

        2) Xiaoyu Xie, Saviz Mowlavi, and Mouhacine Benosman, "Smooth and Sparse Latent Dynamics in Operator Learning with Jerk Regularization", Workshop on Machine Learning and the Physical Sciences (ML4PS)

        3) Spencer Hutchinson, Abraham Vinod, François Germain, Stefano Di Cairano, Christopher Laughman, and Ankush Chakrabarty, "Quantile-SMPC for Grid-Interactive Buildings with Multivariate Temporal Fusion Transformers", Workshop on UrbanAI: Harnessing Artificial Intelligence for Smart Cities (UrbanAI)

        4) Yuki Shirai, Kei Ota, Devesh Jha, and Diego Romeres, "Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch", Workshop on Embodied World Models for Decision Making

        5) Mark Van der Merwe and Devesh Jha, "In-Context Policy Iteration for Dynamic Manipulation", Workshop on Embodied World Models for Decision Making

        Workshop Organized:

        MERL members co-organized the Multimodal Algorithmic Reasoning (MAR) Workshop (https://marworkshop.github.io/neurips25/). Organizers: Anoop Cherian (Mitsubishi Electric Research Laboratories), Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories), Suhas Lohit (Mitsubishi Electric Research Laboratories), Honglu Zhou (Salesforce AI Research), Kevin Smith (Massachusetts Institute of Technology), and Joshua B. Tenenbaum (Massachusetts Institute of Technology).

  • MERL Publications

    •  Kogashi, K., Cherian, A., Kuo, M.-Y.J., "MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions", IEEE Winter Conference on Applications of Computer Vision (WACV), March 2026, pp. 1512-1521.
      BibTeX TR2026-029 PDF Video Data
      • @inproceedings{Kogashi2026mar,
      • author = {Kogashi, Kaen and Cherian, Anoop and Kuo, Meng-Yu Jennifer},
      • title = {{MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions}},
      • booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
      • year = 2026,
      • pages = {1512--1521},
      • month = mar,
      • url = {https://www.merl.com/publications/TR2026-029}
      • }
    •  Mumcu, F., Jones, M.J., Yilmaz, Y., Cherian, A., "Leveraging Multimodal LLM Descriptions of Activity for Explainable Semi-Supervised Video Anomaly Detection", Transactions on Machine Learning Research, February 2026.
      BibTeX TR2026-027 PDF
      • @article{Mumcu2026feb2,
      • author = {Mumcu, Furkan and Jones, Michael J. and Yilmaz, Yasin and Cherian, Anoop},
      • title = {{Leveraging Multimodal LLM Descriptions of Activity for Explainable Semi-Supervised Video Anomaly Detection}},
      • journal = {Transactions on Machine Learning Research},
      • year = 2026,
      • month = feb,
      • url = {https://www.merl.com/publications/TR2026-027}
      • }
    •  Zhang, Y., Corcodel, R., Hori, C., Cherian, A., Zhao, D., "AxisBench: What Can Go Wrong in VLMs’ Spatial Reasoning?", Advances in Neural Information Processing Systems (NeurIPS) workshop, December 2025.
      BibTeX TR2025-168 PDF
      • @inproceedings{Zhang2025dec2,
      • author = {Zhang, Yuyou and Corcodel, Radu and Hori, Chiori and Cherian, Anoop and Zhao, Ding},
      • title = {{AxisBench: What Can Go Wrong in VLMs’ Spatial Reasoning?}},
      • booktitle = {Advances in Neural Information Processing Systems (NeurIPS) workshop},
      • year = 2025,
      • month = dec,
      • url = {https://www.merl.com/publications/TR2025-168}
      • }
    •  Zhang, J., Cherian, A., Rodriguez, C., Deng, W., Gould, S., "Manual-PA: Learning 3D Part Assembly from Instruction Diagrams", IEEE International Conference on Computer Vision (ICCV), September 2025, pp. 6304-6314.
      BibTeX TR2025-139 PDF
      • @inproceedings{Zhang2025sep,
      • author = {Zhang, Jiahao and Cherian, Anoop and Rodriguez, Cristian and Deng, Weijian and Gould, Stephen},
      • title = {{Manual-PA: Learning 3D Part Assembly from Instruction Diagrams}},
      • booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
      • year = 2025,
      • pages = {6304--6314},
      • month = sep,
      • url = {https://www.merl.com/publications/TR2025-139}
      • }
    •  Ni, Y., Wen, S., Koniusz, P., Cherian, A., "Noise Consistency Regularization for Improved Subject-Driven Image Synthesis", IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR), June 2025, pp. 3116-3126.
      BibTeX TR2025-073 PDF
      • @inproceedings{Ni2025jun,
      • author = {Ni, Yao and Wen, Song and Koniusz, Piotr and Cherian, Anoop},
      • title = {{Noise Consistency Regularization for Improved Subject-Driven Image Synthesis}},
      • booktitle = {IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR)},
      • year = 2025,
      • pages = {3116--3126},
      • month = jun,
      • publisher = {CVF},
      • url = {https://www.merl.com/publications/TR2025-073}
      • }
  • Other Publications

    •  Anoop Cherian and Stephen Gould, "Second-order Temporal Pooling for Action Recognition", International Journal of Computer Vision (IJCV), 2018.
      BibTeX
      • @Article{cherian2018ijcv,
      • author = {Cherian, Anoop and Gould, Stephen},
      • title = {Second-order Temporal Pooling for Action Recognition},
      • journal = {International Journal of Computer Vision (IJCV)},
      • year = 2018,
      • publisher = {Springer}
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "Visual Permutation Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
      BibTeX
      • @Article{cherian2018permutation,
      • author = {Santa Cruz, Rodrigo and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {Visual Permutation Learning},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2018,
      • publisher = {IEEE}
      • }
    •  Jue Wang, Anoop Cherian, Fatih Porikli and Stephen Gould, "Video Representation Learning Using Discriminative Pooling", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_representation_cvpr18,
      • author = {Wang, Jue and Cherian, Anoop and Porikli, Fatih and Gould, Stephen},
      • title = {Video Representation Learning Using Discriminative Pooling},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Suryansh Kumar, Anoop Cherian, Yuchao Dai and Hongdong Li, "Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_rigid_cvpr18,
      • author = {Kumar, Suryansh and Cherian, Anoop and Dai, Yuchao and Li, Hongdong},
      • title = {Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Suvrit Sra, Stephen Gould and Richard Hartley, "Non-Linear Temporal Subspace Representations for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_temporal_cvpr18,
      • author = {Cherian, Anoop and Sra, Suvrit and Gould, Stephen and Hartley, Richard},
      • title = {Non-Linear Temporal Subspace Representations for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Basura Fernando, Mehrtash Harandi and Stephen Gould, "Generalized Rank Pooling for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cherian2017generalized,
      • author = {Cherian, Anoop and Fernando, Basura and Harandi, Mehrtash and Gould, Stephen},
      • title = {Generalized Rank Pooling for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Panagiotis Stanitsas, Mehrtash Harandi, Vassilios Morellas and Nikolaos Papanikolopoulos, "Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices", International Conference on Computer Vision (ICCV), 2017.
      BibTeX
      • @Inproceedings{cherian_rigid_iccv17,
      • author = {Cherian, Anoop and Stanitsas, Panagiotis and Harandi, Mehrtash and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2017
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "DeepPermNet: Visual Permutation Learning", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cruz2017deeppermnet,
      • author = {Cruz, Rodrigo Santa and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {{DeepPermNet: Visual Permutation Learning}},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Bayesian Non-Parametric clustering for positive definite matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
      BibTeX
      • @Article{cherian2016bayesian,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Bayesian Non-Parametric clustering for positive definite matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2016,
      • publisher = {IEEE}
      • }
    •  Piotr Koniusz and Anoop Cherian, "Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
      BibTeX
      • @Inproceedings{koniusz2016sparse,
      • author = {Koniusz, Piotr and Cherian, Anoop},
      • title = {Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2016
      • }
    •  Piotr Koniusz, Anoop Cherian and Fatih Porikli, "Tensor representations via kernel linearization for action recognition from 3D skeletons", European Conference on Computer Vision (ECCV), 2016.
      BibTeX
      • @Inproceedings{koniusz2016tensor,
      • author = {Koniusz, Piotr and Cherian, Anoop and Porikli, Fatih},
      • title = {Tensor representations via kernel linearization for action recognition from {3D} skeletons},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2016,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Julien Mairal, Karteek Alahari and Cordelia Schmid, "Mixing body-part sequences for human pose estimation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
      BibTeX
      • @Inproceedings{cherian2014mixing,
      • author = {Cherian, Anoop and Mairal, Julien and Alahari, Karteek and Schmid, Cordelia},
      • title = {Mixing body-part sequences for human pose estimation},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2014
      • }
    •  Anoop Cherian, "Nearest neighbors using compact sparse codes", International Conference on Machine Learning (ICML), 2014.
      BibTeX
      • @Inproceedings{cherian2014nearest,
      • author = {Cherian, Anoop},
      • title = {Nearest neighbors using compact sparse codes},
      • booktitle = {International Conference on Machine Learning (ICML)},
      • year = 2014
      • }
    •  Anoop Cherian and Suvrit Sra, "Riemannian sparse coding for positive definite matrices", European Conference on Computer Vision (ECCV), 2014.
      BibTeX
      • @Inproceedings{cherian2014riemannian,
      • author = {Cherian, Anoop and Sra, Suvrit},
      • title = {Riemannian sparse coding for positive definite matrices},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2014,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013.
      BibTeX
      • @Article{cherian2013jensen,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {{Jensen-Bregman} logdet divergence with application to efficient similarity search for covariance matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2013,
      • publisher = {IEEE}
      • }
    •  Anoop Cherian, Vassilios Morellas, Nikolaos Papanikolopoulos and Saad J Bedros, "Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications", Computer Vision and Pattern Recognition (CVPR), 2011.
      BibTeX
      • @Inproceedings{cherian2011dirichlet,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos and Bedros, Saad J},
      • title = {Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications},
      • booktitle = {Computer Vision and Pattern Recognition (CVPR)},
      • year = 2011
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence", International Conference on Computer Vision (ICCV), 2011.
      BibTeX
      • @Inproceedings{cherian2011efficient,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2011
      • }
    •  Suvrit Sra and Anoop Cherian, "Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval", Machine Learning and Knowledge Discovery in Databases (ECML), 2011.
      BibTeX
      • @Article{sra2011generalized,
      • author = {Sra, Suvrit and Cherian, Anoop},
      • title = {Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval},
      • journal = {Machine Learning and Knowledge Discovery in Databases (ECML)},
      • year = 2011
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Accurate 3D ground plane estimation from a single image", International Conference on Robotics and Automation, 2009.
      BibTeX
      • @Inproceedings{cherian2009accurate,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Accurate 3D ground plane estimation from a single image},
      • booktitle = {International Conference on Robotics and Automation},
      • year = 2009
      • }
  • MERL Issued Patents

    • Title: "System and Method for Anomaly Detection using an Attention Model"
      Inventors: Cherian, Anoop
      Patent No.: 12,474,699
      Issue Date: Nov 18, 2025
    • Title: "System and Method for Controlling a Robot"
      Inventors: Cherian, Anoop; Chatterjee, Moitreya; Liu, Xiulong; Paul, Sudipta
      Patent No.: 12,459,115
      Issue Date: Nov 4, 2025
    • Title: "Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation"
      Inventors: Cherian, Anoop; Sullivan, Alan; Marks, Tim
      Patent No.: 12,406,374
      Issue Date: Sep 2, 2025
    • Title: "A Method and System for Scene-Aware Audio-Video Representation"
      Inventors: Cherian, Anoop; Chatterjee, Moitreya; Le Roux, Jonathan
      Patent No.: 12,056,213
      Issue Date: Aug 6, 2024
    • Title: "Artificial Intelligence System for Classification of Data Based on Contrastive Learning"
      Inventors: Cherian, Anoop; Aeron, Shuchin
      Patent No.: 11,809,988
      Issue Date: Nov 7, 2023
    • Title: "System and Method for Manipulating Two-Dimensional (2D) Images of Three-Dimensional (3D) Objects"
      Inventors: Marks, Tim; Medin, Safa; Cherian, Anoop; Wang, Ye
      Patent No.: 11,663,798
      Issue Date: May 30, 2023
    • Title: "InSeGAN: A Generative Approach to Instance Segmentation in Depth Images"
      Inventors: Cherian, Anoop; Pais, Goncalo; Marks, Tim; Sullivan, Alan
      Patent No.: 11,651,497
      Issue Date: May 16, 2023
    • Title: "Method and System for Scene-Aware Interaction"
      Inventors: Hori, Chiori; Cherian, Anoop; Chen, Siheng; Marks, Tim; Le Roux, Jonathan; Hori, Takaaki; Harsham, Bret A.; Vetro, Anthony; Sullivan, Alan
      Patent No.: 11,635,299
      Issue Date: Apr 25, 2023
    • Title: "Scene-Aware Video Encoder System and Method"
      Inventors: Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan; Marks, Tim; Sullivan, Alan
      Patent No.: 11,582,485
      Issue Date: Feb 14, 2023
    • Title: "Low-latency Captioning System"
      Inventors: Hori, Chiori; Hori, Takaaki; Cherian, Anoop; Marks, Tim; Le Roux, Jonathan
      Patent No.: 11,445,267
      Issue Date: Sep 13, 2022
    • Title: "Anomaly Detector for Detecting Anomaly using Complementary Classifiers"
      Inventors: Cherian, Anoop; Wang, Jue
      Patent No.: 11,423,698
      Issue Date: Aug 23, 2022
    • Title: "System and Method for a Dialogue Response Generation System"
      Inventors: Hori, Chiori; Cherian, Anoop; Marks, Tim; Hori, Takaaki
      Patent No.: 11,264,009
      Issue Date: Mar 1, 2022
    • Title: "Scene-Aware Video Dialog"
      Inventors: Geng, Shijie; Gao, Peng; Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan
      Patent No.: 11,210,523
      Issue Date: Dec 28, 2021