TR2016-027

Exemplar Learning for Extremely Efficient Anomaly Detection in Real-Valued Time Series


    •  Jones, M.J., Nikovski, D.N., Imamura, M., Hirata, T., "Exemplar Learning for Extremely Efficient Anomaly Detection in Real-Valued Time Series", Journal of Data Mining and Knowledge Discovery, DOI: 10.1007/s10618-015-0449-3, Vol. 30, No. 6, pp. 1427-1454, March 2016.
      BibTeX:

      @article{Jones2016mar,
        author = {Jones, Michael J. and Nikovski, Daniel N. and Imamura, Makoto and Hirata, Takahisa},
        title = {Exemplar Learning for Extremely Efficient Anomaly Detection in Real-Valued Time Series},
        journal = {Journal of Data Mining and Knowledge Discovery},
        year = 2016,
        volume = 30,
        number = 6,
        pages = {1427--1454},
        month = mar,
        doi = {10.1007/s10618-015-0449-3},
        issn = {1573-756X},
        url = {https://www.merl.com/publications/TR2016-027}
      }
Research Areas:

Artificial Intelligence, Data Analytics

Abstract:

We investigate algorithms for efficiently detecting anomalies in real-valued one-dimensional time series. Past work has shown that a simple brute-force algorithm is an effective anomaly detector: it uses as an anomaly score the Euclidean distance between each subsequence of a testing time series and its nearest neighbor among the subsequences of a training time series. We investigate a very efficient implementation of this method and show that it is still too slow for most real-world applications. Next, we present a new method based on summarizing the training time series with a small set of exemplars. The exemplars we use are feature vectors that capture both the high-frequency and low-frequency information in sets of similar subsequences of the time series. We show that this exemplar-based method is much faster than both the efficient brute-force method and a prediction-based method, and that it also handles a wider range of anomalies. Our exemplar-based algorithm is able to process in minutes time series that would take other methods days to process.
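
The brute-force baseline described in the abstract can be illustrated with a minimal sketch. This is not the paper's efficient implementation and not the exemplar method. It is a naive version under stated assumptions: fixed-length sliding windows of width w, no normalization, and toy data made up for illustration. The function names (subsequences, euclidean, brute_force_scores) are hypothetical.

```python
import math


def subsequences(series, w):
    """Return every contiguous window of length w from series."""
    return [series[i:i + w] for i in range(len(series) - w + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_scores(train, test, w):
    """Score each test window by its distance to the nearest training window.

    Assumption: the anomaly score of a test window is the minimum
    Euclidean distance to any training window of the same length.
    """
    train_windows = subsequences(train, w)
    return [
        min(euclidean(tw, rw) for rw in train_windows)
        for tw in subsequences(test, w)
    ]


# Toy data (invented for illustration): a repeating pattern, then a spike.
train = [0.0, 1.0, 0.0, -1.0] * 5
test = [0.0, 1.0, 0.0, -1.0, 0.0, 5.0, 0.0, -1.0]

scores = brute_force_scores(train, test, w=4)

# Index of the first test window with the highest anomaly score.
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

Every test window is compared against every training window, so the cost grows with the product of the two series lengths and the window width. That quadratic behavior is the kind of expense the abstract says makes even an efficient brute-force implementation too slow for most real-world applications, and it is what summarizing the training series with a small set of exemplars is meant to avoid.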