By Stas Bekman.
Published: May 15th 2006
The majority of anti-spam email companies do heuristic (also known as signature-based) content filtering. A typical product receives the message from the client, saves it to disk, and then applies a variety of checks that try to match certain patterns. A score is assigned based on how well a pattern was matched, or on how many patterns were matched. The higher the score, the higher the probability that the email is undesired. It's up to the user to decide at what scores a given email should be dumped, saved to a spam-maybe folder (quarantine) or delivered to the INBOX.
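The scoring idea can be illustrated with a minimal Python sketch. It is not taken from any particular product: the patterns, their weights, the threshold values and the function names are all made up for illustration, and a real filter would use far more rules and might also weigh how well each pattern matched.

```python
import re

# Hypothetical rules: (compiled pattern, points added when it matches).
# Real products ship many more rules and tune the weights carefully.
RULES = [
    (re.compile(r"\bviagra\b", re.IGNORECASE), 3.0),
    (re.compile(r"\bclick here\b", re.IGNORECASE), 2.0),
    (re.compile(r"\b100% free\b", re.IGNORECASE), 2.0),
]


def score_message(text):
    """Sum the weights of all rules whose pattern occurs in the text."""
    return sum(weight for pattern, weight in RULES if pattern.search(text))


def disposition(score, quarantine_at=2.0, discard_at=5.0):
    """Map a score to an action using user-chosen (here assumed) thresholds."""
    if score >= discard_at:
        return "discard"
    if score >= quarantine_at:
        return "quarantine"
    return "inbox"


if __name__ == "__main__":
    for msg in (
        "Click here for 100% free Viagra",
        "Click here to see the agenda",
        "Meeting moved to 3pm",
    ):
        s = score_message(msg)
        print(f"{s:4.1f}  {disposition(s):10}  {msg}")
```

The two thresholds correspond to the user's choice described above: anything below the lower threshold is delivered, anything at or above the upper one is dumped, and the range in between goes to quarantine.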
One of the techniques used to assign a score is a pattern signature. The problem with this approach is that it requires a special lab with spamtraps, which attract as much undesired email as possible. Humans armed with analysis tools then go over the undesired email and extract signatures, which are distributed to the customers (usually via very frequent product updates).
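As a rough sketch of the lab-to-customer flow, assume for simplicity that a signature is nothing more than a digest of the normalized message body. That is an assumption made for this example only; real signatures are typically richer patterns. The names `signature`, `spamtrap_samples`, `signature_update` and `is_known_spam` are hypothetical.

```python
import hashlib
import re


def signature(body):
    """Digest of the body after lowercasing and collapsing whitespace.

    Assumption for this sketch: a signature is just a SHA-256 hash of
    the normalized text.
    """
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Lab side: messages caught by spamtraps are reviewed and turned into
# signatures, which are then shipped to customers as an update.
spamtrap_samples = [
    "Buy   cheap\nwatches NOW",
    "You have won a prize, reply with your bank details",
]
signature_update = {signature(sample) for sample in spamtrap_samples}


# Customer side: an incoming message is checked against the latest update.
def is_known_spam(body, known_signatures):
    return signature(body) in known_signatures


if __name__ == "__main__":
    print(is_known_spam("buy cheap watches now", signature_update))
    print(is_known_spam("Lunch tomorrow?", signature_update))
```

A message can only be caught after its signature has been extracted in the lab and has reached the customer's copy of the signature set.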
The main disadvantage of this approach is the huge delay between a spam, virus or phishing outbreak and the time the customers receive a signature update that allows them to filter out the unwanted messages. It is also a disadvantage for the anti-spam company, which has to employ lots of people to take care of this semi-manual, labour-intensive task.
Another disadvantage is that those signatures aren't perfect, i.e. they may assign a high spam score to a totally legitimate email, which is known as a false positive.
Here are some vendors supporting this technique (including open-source solutions):
Please notify me if you know of others.
And here are some pointers for additional information on the subject:
and Malware with Open Source (http://www.brettglass.com/spam/paper.html)
Spam Filters (http://www.emailcash.com/heuristic-spam-filters.html)
Companies to Squash Your Spam (http://www.darwinmag.com/read/020104/spam.html)