TR2004-046

A Time Series Clustering based Framework for Multimedia Mining and Summarization


    •  Radhakrishnan, R., Divakaran, A., Xiong, Z., "A Time Series Clustering based Framework for Multimedia Mining and Summarization", ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR), October 2004, pp. 157-164.
      @inproceedings{Radhakrishnan2004oct,
        author = {Radhakrishnan, R. and Divakaran, A. and Xiong, Z.},
        title = {A Time Series Clustering based Framework for Multimedia Mining and Summarization},
        booktitle = {ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR)},
        year = 2004,
        pages = {157--164},
        month = oct,
        isbn = {1-58113-940-3},
        url = {https://www.merl.com/publications/TR2004-046}
      }
  • Research Areas:

    Artificial Intelligence, Speech & Audio

Abstract:

Past work on multimedia analysis has shown the utility of detecting specific temporal patterns for different content genres. In this paper, we propose a unified, content-adaptive, unsupervised mining framework to bring out such temporal patterns from different multimedia genres. We formulate the problem of pattern discovery from video as a time series clustering problem. We treat the sequence of low- and mid-level audio-visual features extracted from the video as a time series and perform a temporal segmentation based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the time series. Our temporal segmentation detects transition points and outliers in a sequence of observations drawn from a stationary background process. We define a confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the mining parameters and the confidence measure using bootstrapping and kernel density estimation, thereby enabling a systematic method to choose the mining parameters for any application. Furthermore, the confidence measure can be used to rank the detected transitions in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that highlight events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out patterns from surveillance videos without any a priori knowledge. Finally, we show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length.
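The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the details assumed here: each window is modeled by a single 1-D Gaussian, window models are compared with a symmetrised KL divergence, the kernel width is set to the mean pairwise divergence, the split uses the second eigenvector of a normalized graph Laplacian, and the confidence is the bootstrap-plus-KDE estimate of how rarely a pure background window departs that far from the background model. All function names are hypothetical.

```python
import math
import numpy as np

def window_models(series, win, hop):
    """Fit a 1-D Gaussian (mean, variance) to each sliding window (assumed model)."""
    starts = np.arange(0, len(series) - win + 1, hop)
    stats = [(series[s:s + win].mean(), series[s:s + win].var() + 1e-6) for s in starts]
    return starts, np.array(stats)

def jeffreys(m1, v1, m2, v2):
    """KL(p||q) + KL(q||p) for two 1-D Gaussians."""
    return 0.5 * (v1 / v2 + v2 / v1 - 2.0) + 0.5 * (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2)

def affinity_matrix(models):
    """Pairwise affinities between window models."""
    m, v = models[:, 0], models[:, 1]
    d = jeffreys(m[:, None], v[:, None], m[None, :], v[None, :])
    scale = d.mean()  # assumed kernel width; requires at least two distinct models
    return np.exp(-d / scale)

def find_outliers(W):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    lap = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)            # eigenvalues come back in ascending order
    second = inv_sqrt * vecs[:, 1]
    outlier = second > 0
    if outlier.sum() > len(outlier) / 2:     # treat the minority side as outliers
        outlier = ~outlier
    return outlier

def outlier_confidence(series, starts, win, outlier, n_boot=500, seed=0):
    """Bootstrap background windows, smooth their divergences with a Gaussian KDE,
    and score each outlier window by the KDE's cumulative probability."""
    rng = np.random.default_rng(seed)
    bg = np.concatenate([series[s:s + win] for s in starts[~outlier]])
    bg_m, bg_v = bg.mean(), bg.var()
    resampled = rng.choice(bg, size=(n_boot, win))   # sampling with replacement
    boot = jeffreys(resampled.mean(axis=1), resampled.var(axis=1) + 1e-6, bg_m, bg_v)
    h = 1.06 * boot.std() * n_boot ** -0.2           # rule-of-thumb bandwidth

    def cdf(d):
        z = [(d - b) / (h * math.sqrt(2.0)) for b in boot]
        return float(np.mean([0.5 * (1.0 + math.erf(t)) for t in z]))

    scores = {}
    for s in starts[outlier]:
        seg = series[s:s + win]
        scores[int(s)] = cdf(jeffreys(seg.mean(), seg.var() + 1e-6, bg_m, bg_v))
    return scores

# Synthetic example: stationary background with one short burst.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 400)
x[200:240] += 8.0
starts, models = window_models(x, win=20, hop=20)
outlier = find_outliers(affinity_matrix(models))
conf = outlier_confidence(x, starts, 20, outlier)
print(starts[outlier], conf)
```

Sorting the detected windows by their confidence and keeping the top entries until a target duration is reached would give a summary of the desired length, in the spirit of the ranking the abstract describes. The window length and hop play the role of the mining parameters here, and their effect on the confidence could be studied by rerunning the sketch with different values.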

 
