TR2004-028

Visual Tracking and Recognition Using Appearance-Adaptive Models in Particle Filters


    •  Shaohua Zhou, Rama Chellappa, Baback Moghaddam, "Visual Tracking and Recognition Using Appearance-Adaptive Models in Particle Filters", Tech. Rep. TR2004-028, Mitsubishi Electric Research Laboratories, Cambridge, MA, November 2004.
      @techreport{MERL_TR2004-028,
        author = {Shaohua Zhou and Rama Chellappa and Baback Moghaddam},
        title = {Visual Tracking and Recognition Using Appearance-Adaptive Models in Particle Filters},
        institution = {MERL - Mitsubishi Electric Research Laboratories},
        address = {Cambridge, MA 02139},
        number = {TR2004-028},
        month = nov,
        year = 2004,
        url = {https://www.merl.com/publications/TR2004-028/}
      }
Research Area:

Computer Vision

Abstract:

We propose an approach that incorporates appearance-based models in a particle filter to realize robust visual tracking and recognition algorithms. In conventional tracking algorithms, the appearance model is either fixed or rapidly changing, and the motion model is simply a random walk with fixed noise variance. The number of particles is also typically fixed. All these factors make the visual tracker unstable. To stabilize the tracker, we propose the following features: an observation model arising from an adaptive appearance model, an adaptive-velocity motion model with adaptive noise variance, and an adaptive number of particles. The adaptive-velocity model is derived using a first-order linear predictor based on the appearance difference between the incoming observation and the previous particle configuration. Occlusion analysis is implemented using robust statistics. Experimental results on tracking visual objects in long outdoor and indoor video sequences demonstrate the effectiveness and robustness of our tracking algorithm. We then perform simultaneous tracking and recognition by embedding them in one particle filter. For recognition purposes, we model the appearance changes between frames and gallery images by constructing the intra- and extra-personal spaces. Accurate recognition is achieved even in the presence of pose and view variations.
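
The adaptive motion model and adaptive particle count described in the abstract can be illustrated with a toy sketch. The code below is not the report's algorithm: it tracks a scalar position rather than an image appearance, it estimates velocity from the difference between successive state estimates instead of the first-order linear predictor on appearance differences, and it omits the adaptive appearance model and occlusion handling. All names (`step`, `adapt`, `track`) and all thresholds are assumptions chosen for illustration.

```python
import math
import random


def step(particles, z, velocity, sigma, obs_std=1.0):
    """One predict/update cycle of a 1-D particle filter (illustrative only)."""
    # Predict: shift each particle by the velocity estimate plus diffusion noise.
    predicted = [x + velocity + random.gauss(0.0, sigma) for x in particles]
    # Update: weight each particle by a Gaussian likelihood of the observation.
    weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2) for x in predicted]
    total = sum(weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    estimate = sum(w * x for w, x in zip(weights, predicted))
    return predicted, weights, estimate


def adapt(error, sigma_min=0.2, sigma_max=5.0, n_min=50, n_max=500):
    """Map a prediction error to a noise level and a particle count.

    The linear mapping and all bounds are assumptions, not values from the report.
    """
    sigma = min(max(abs(error), sigma_min), sigma_max)
    frac = (sigma - sigma_min) / (sigma_max - sigma_min)
    n = int(round(n_min + frac * (n_max - n_min)))
    return sigma, n


def track(observations, n_init=200, seed=0):
    """Track a scalar signal, adapting velocity, noise, and particle count."""
    random.seed(seed)
    particles = [observations[0] + random.gauss(0.0, 1.0) for _ in range(n_init)]
    velocity, sigma = 0.0, 1.0
    prev_estimate = observations[0]
    estimates, counts = [], []
    for z in observations[1:]:
        predicted, weights, estimate = step(particles, z, velocity, sigma)
        # Prediction error: how far the motion model alone was from the observation.
        error = z - (prev_estimate + velocity)
        # A larger error widens the noise and raises the particle count.
        sigma, n = adapt(error)
        # Resample n particles in proportion to their weights.
        particles = random.choices(predicted, weights=weights, k=n)
        # Stand-in velocity update: difference of successive estimates.
        velocity = estimate - prev_estimate
        prev_estimate = estimate
        estimates.append(estimate)
        counts.append(n)
    return estimates, counts
```

Calling `track([0.5 * t for t in range(30)])` returns one estimate and one particle count per frame after the first. In this sketch a poor prediction spreads the particles more widely and uses more of them, while a good prediction keeps the noise low and the particle set small, which is the intuition behind adapting both quantities.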