Why is Bayesian filtering the best way to catch spam?
The information in this article applies to:
- GFI MailEssentials 2010 for Exchange/SMTP
- GFI MailEssentials for Exchange/SMTP 10
- GFI MailEssentials for Exchange/SMTP 11
- GFI MailEssentials for Exchange/SMTP 12
- GFI MailEssentials for Exchange/SMTP 14
- GFI MailEssentials for Exchange/SMTP 9
Article ID: KBID001813
Query keywords: anti-spam, bayesian
Bayesian filtering is widely acknowledged by leading experts and publications to be the best way to catch spam. A Bayesian filter estimates the probability that a message is spam from the words and other characteristics found in known spam and known legitimate email (ham). This gives it a tremendous advantage over other spam solutions that only check for keywords or rely on downloading signatures of known spam.
GFI’s Bayesian filter uses an advanced mathematical formula and a dataset that is custom-created for your installation: the spam data is continuously updated by GFI and automatically downloaded by GFI MailEssentials, whereas the ham data is automatically collected from your outbound mail. This means that the Bayesian filter is constantly learning new spam tricks, and spammers cannot circumvent the dataset used. The result is a spam detection rate of over 98% once the required two-week learning period is complete.
In short, Bayesian filtering has the following advantages:
- Looks at the whole message, not just keywords or known spam signatures
- Learns from your outbound mail (ham) and therefore greatly reduces false positives
- Adapts itself over time by learning about new spam and new valid mail
- Dataset is unique to your company, making it impossible to bypass
- Multilingual and international
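To illustrate the general idea, the following is a minimal Python sketch of a Bayesian spam score in the style described in Paul Graham's articles. It is not the formula used by GFI MailEssentials, which this article does not publish. The class name ToyBayesFilter, the tokenizer, the add-one smoothing and the sample messages are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class ToyBayesFilter:
    """Hypothetical toy filter; NOT the GFI MailEssentials implementation."""

    def __init__(self):
        self.spam_counts = Counter()  # number of spam messages containing each token
        self.ham_counts = Counter()   # number of ham messages containing each token
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, text, is_spam):
        """Learn from one message known to be spam or ham."""
        tokens = set(tokenize(text))  # count each token once per message
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_msgs += 1

    def token_spam_prob(self, token):
        """Estimate P(spam | token), assuming spam and ham are equally likely a priori."""
        # Add-one smoothing keeps the estimate strictly between 0 and 1.
        p_token_given_spam = (self.spam_counts[token] + 1) / (self.spam_msgs + 2)
        p_token_given_ham = (self.ham_counts[token] + 1) / (self.ham_msgs + 2)
        return p_token_given_spam / (p_token_given_spam + p_token_given_ham)

    def spam_probability(self, text):
        """Combine the evidence from every token in the message into one score."""
        log_odds = 0.0
        for token in set(tokenize(text)):
            p = self.token_spam_prob(token)
            log_odds += math.log(p) - math.log(1.0 - p)
        # Clamp so that math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    toy = ToyBayesFilter()
    toy.train("Buy cheap pills now", is_spam=True)
    toy.train("Cheap offer click now", is_spam=True)
    toy.train("Meeting agenda for Monday", is_spam=False)
    toy.train("Please review the quarterly report", is_spam=False)
    print(toy.spam_probability("cheap pills offer"))
    print(toy.spam_probability("quarterly report agenda"))
```

Because the score is built from every token in the message and from counts learned on both spam and ham, a word that appears often in your own legitimate mail pulls the score towards ham. This is the mechanism behind the advantages listed above, shown here in a heavily simplified form.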
Notes:
- Articles by Bayesian guru Paul Graham can be found at: http://www.paulgraham.com/spam.html and http://www.paulgraham.com/sofar.html
- GFI white paper about Bayesian filtering: http://www.gfi.com/whitepapers/why-bayesian-filtering.pdf
- BBC report: http://news.bbc.co.uk/1/hi/technology/3014029.stm
- Sorting the ham from the spam: http://www.smh.com.au/articles/2003/06/23/1056220528960.html
- Information about the methods available to train the Bayesian filter can be found at: KBID002947
- Recommendations and notes on training and using the Bayesian filter can be found at: KBID002946
- Information on why the Bayesian filter may not block spam properly can be found in the knowledge base article: KBID002691
- Information on how to reset the Bayesian filter can be found in the knowledge base article: KBID002082