TR2004-091
The Spam-Filtering Accuracy Plateau at 99.9% Accuracy and How to Get Past It
"The Spam-Filtering Accuracy Plateau at 99.9% Accuracy and How to Get Past It", MIT Spam Conference, January 2004.
MERL Contact: William S. Yerazunis
Abstract:
Bayesian filters have now become the standard for spam filtering; unfortunately, most Bayesian filters seem to reach a plateau of accuracy at 99.9 percent. We experimentally compare the training methods TEFT, TOE, and TUNE, as well as pure Bayesian, token-bag, token-sequence, SBPH, and Markovian discriminators. The results demonstrate that TUNE is indeed best for training but computationally exorbitant; that Markovian discrimination is considerably more accurate than Bayesian but not sufficient to reach four-nines (99.99 percent) accuracy; and that other techniques, such as inoculation, are needed.
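The abstract does not expand the three training-method acronyms. As generally used in this line of work, TEFT is "Train Everything", TOE is "Train Only Errors", and TUNE is "Train Until No Errors". The sketch below illustrates only how the three regimes differ in when they call the trainer. `ToyFilter`, its token-count score, and the `MESSAGES` data are hypothetical stand-ins invented for illustration; they are not any of the discriminators compared in the paper.

```python
from collections import Counter

class ToyFilter:
    """Hypothetical stand-in classifier: NOT one of the paper's discriminators."""
    def __init__(self):
        self.spam = Counter()   # token counts seen in spam
        self.ham = Counter()    # token counts seen in non-spam
        self.trained = 0        # number of train() calls, a rough cost proxy

    def is_spam(self, tokens):
        # Crude score: spam occurrences minus ham occurrences of each token.
        return sum(self.spam[t] - self.ham[t] for t in tokens) > 0

    def train(self, tokens, label):
        (self.spam if label else self.ham).update(tokens)
        self.trained += 1

def teft(f, stream):
    """Train Everything: classify each message, then always train on it."""
    errors = 0
    for tokens, label in stream:
        errors += f.is_spam(tokens) != label
        f.train(tokens, label)
    return errors

def toe(f, stream):
    """Train Only Errors: train only on messages that were misclassified."""
    errors = 0
    for tokens, label in stream:
        if f.is_spam(tokens) != label:
            errors += 1
            f.train(tokens, label)
    return errors

def tune(f, stream, max_passes=10):
    """Train Until No Errors: repeat TOE passes until one pass is error-free.

    `stream` must be re-iterable (e.g. a list). Returns the passes used.
    """
    for n in range(1, max_passes + 1):
        if toe(f, stream) == 0:
            return n
    return max_passes

# Made-up example data: (tokens, is_spam)
MESSAGES = [
    ("buy pills now".split(), True),
    ("meeting at noon".split(), False),
    ("cheap pills".split(), True),
    ("lunch at noon".split(), False),
]
```

Because TUNE re-reads the whole corpus on every pass, its cost grows with both corpus size and the number of passes needed, which is consistent with the abstract's remark that it is computationally exorbitant.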
Related News & Events
NEWS: MIT Spam Conference 2004: publication by William Yerazunis
Date: January 21, 2004
Where: MIT Spam Conference
MERL Contact: William S. Yerazunis
Research Area: Data Analytics
Brief: The paper "The Spam-Filtering Accuracy Plateau at 99.9% Accuracy and How to Get Past It" by Yerazunis, W.S. was presented at the MIT Spam Conference.