TR2026-091

Partial Ring Scan: Revisiting Scan Order in Vision State Space Models

- Hsieh, Y.-K., Peng, K.-C., Li, X., Chang, M.-C., Tseng, Y.-C., Hsieh, J.-W., "Partial Ring Scan: Revisiting Scan Order in Vision State Space Models", International Conference on Machine Learning (ICML), July 2026.
  @inproceedings{Hsieh2026jul,
    author = {Hsieh, Yi-Kuan and Peng, Kuan-Chuan and Li, Xin and Chang, Ming-Ching and Tseng, Yu-Chee and Hsieh, Jun-Wei},
    title = {{Partial Ring Scan: Revisiting Scan Order in Vision State Space Models}},
    booktitle = {International Conference on Machine Learning (ICML)},
    year = 2026,
    month = jul,
    url = {https://www.merl.com/publications/TR2026-091}
  }
MERL Contact:
- Kuan-Chuan Peng
Research Areas:

Artificial Intelligence, Computer Vision, Machine Learning

Abstract:

State Space Models (SSMs) provide linear-time alternatives to attention for vision, but require serializing 2D images into 1D sequences using a predefined scan order. We identify scan order as a previously underexplored inductive bias that fundamentally shapes spatial dependency modeling in Vision SSMs. Fixed scan paths distort local adjacency, fragment object structure, and induce anisotropic representations that are brittle under geometric transformations such as rotation. We propose Partial RIng Scan Mamba (PRIS-Mamba), a rotation-robust traversal that decomposes images into concentric rings, performs permutation-invariant aggregation within each ring, and models cross-ring dependencies via short radial SSMs. This design induces a structured factorization of spatial dependencies that preserves isotropy while maintaining linear complexity. To improve efficiency without sacrificing expressivity, we introduce partial channel filtering, selectively applying recurrent modeling to informative channels while routing others through a residual pathway. Empirically, PRIS-Mamba improves accuracy, efficiency, and rotation robustness over prior Vision SSMs on ImageNet-1K. Our results position scan-order design as a core representational choice in Vision SSMs, with implications for robustness and generalization beyond architectural scaling. The code will be released upon paper acceptance.
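
The ring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which has not been released. It rests on several assumptions: rings are taken as square rings defined by distance to the nearest image border, the permutation-invariant aggregation is a plain mean, and the radial SSM is replaced by a toy scalar linear recurrence. The names `ring_index`, `ring_tokens`, `radial_recurrence`, and the `decay` parameter are all hypothetical.

```python
import numpy as np


def ring_index(height, width):
    """Assign each pixel to a concentric square ring (0 = outermost).

    Assumption: rings are defined by distance to the nearest border.
    """
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    return np.minimum(np.minimum(rows, height - 1 - rows),
                      np.minimum(cols, width - 1 - cols))


def ring_tokens(features):
    """Mean-pool an (H, W, C) feature map into one token per ring."""
    height, width, _ = features.shape
    rings = ring_index(height, width)
    num_rings = int(rings.max()) + 1
    # A mean ignores the order of pixels inside a ring, so it is
    # permutation-invariant (assumed aggregator, chosen for simplicity).
    return np.stack([features[rings == r].mean(axis=0)
                     for r in range(num_rings)])


def radial_recurrence(tokens, decay=0.9):
    """Toy stand-in for a radial SSM: h_k = decay * h_{k-1} + x_k."""
    state = np.zeros(tokens.shape[1])
    outputs = []
    for token in tokens:  # outermost ring first, innermost ring last
        state = decay * state + token
        outputs.append(state.copy())
    return np.stack(outputs)


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8, 4))

tokens = ring_tokens(feature_map)
rotated_tokens = ring_tokens(np.rot90(feature_map, k=1, axes=(0, 1)))
radial_states = radial_recurrence(tokens)
```

In this sketch an 8x8 map yields four ring tokens, so the recurrence runs over a sequence of length 4 rather than 64. Because each square ring maps onto itself under a 90-degree rotation, `tokens` and `rotated_tokens` agree up to floating-point error. That exactness holds only for multiples of 90 degrees under the square-ring assumption; arbitrary rotation angles would need a different ring definition and are not covered here.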

Related News & Events

NEWS MERL Presents 4 Main Conference Papers and 6 Workshop Papers at ICML 2026
Date: July 6, 2026 - July 11, 2026
Where: COEX, Seoul, South Korea
MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Stefano Di Cairano; Toshiaki Koike-Akino; Christopher R. Laughman; Jing Liu; Suhas Lohit; Kuan-Chuan Peng; Alexander Schperberg; Ye Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Signal Processing
Brief
- MERL researchers are proud to present 4 main conference papers and 6 workshop papers at ICML 2026. ICML, taking place from July 6-11 in Seoul, South Korea, is a premier international conference in machine learning.
  
  Main Conference Papers with MERL Authors:
  
  1. Understanding Dynamic Compute Allocation in Recurrent Transformers by Ibraheem Muhammad Moosa, Suhas Lohit, Ye Wang, Moitreya Chatterjee, and Wenpeng Yin.
  
  2. LLawCo: Learning Laws of Cooperation for Modeling Embodied Multi-Agent Behavior by Qinhong Zhou, Chuang Gan, and Anoop Cherian.
  
  3. Memory-Distilled Selection for Noise-Robust Anomaly Detection by Sirojbek Safarov, Jaewoo Park, Yoon G. Jung, Kuan-Chuan Peng, Wonchul Kim, Seongdeok Bang, and Octavia Camps.
  
  4. Partial Ring Scan: Revisiting Scan Order in Vision State Space Models by Yi-Kuan Hsieh, Kuan-Chuan Peng, Xin Li, Ming-Ching Chang, Yu-Chee Tseng, and Jun-Wei Hsieh.
  
  Workshop Papers with MERL Authors:
  
  1. WISE: Weighted Iterative Society-of-Experts for Multimodal Multi-Agent Debate with Probabilistic Consensus by Anoop Cherian, Suhas Lohit, and Kuan-Chuan Peng. (Workshop on Scalable Learning and Optimization for Efficient Multimodal AI Agents (SCALE))
  
  2. MIRROR: Multisensory Implicit Rejection-sampled RObotic policy by Amisha Bhaskar, Pratap Tokekar, Stefano Di Cairano, and Alexander Schperberg. (Workshop on Structured Probabilistic Inference & Generative Modeling)
  
  3. Reinforced Neural Processes: Memory-Efficient Time-Series Forecasting with a World-Feedback-Trained Memory Policy by Nibraas Khan, Gordon Wichern, and Christopher R. Laughman. (Workshop on Reinforcement Learning from World Feedback (RLxF))
  
  4. Connecting Low-Rank Adapters and Policy Stability in GRPO Fine-Tuning by Antonin Rottman, Francesco Tonin, Yongtao Wu, Toshiaki Koike-Akino, and Volkan Cevher. (Workshop on Connecting Low-rank Representations in AI (CoLorAI))
  
  5. EinSort: Sorting is All We Need for Tensorizing LLM by Toshiaki Koike-Akino, Jing Liu, and Ye Wang. (Workshop on Connecting Low-rank Representations in AI (CoLorAI))
  
  6. Temper and Tilt Lead to SLOP: Reward Hacking Mitigation with Inference-Time Alignment by Ye Wang, Jing Liu, and Toshiaki Koike-Akino. (Workshop on Agents in the Wild: Safety, Security, and Beyond)
