Gamecraft

This blog is all about the craft of making games, and in particular, computer games. Gamecraft involves a broad range of topics, including design, development, quality control, packaging, marketing, management, and life experience.

Name: Gregg Seelhoff
Location: East Lansing, Michigan, United States

Sunday, June 07, 2009

A Tale of a Good Anti-spam Tool

Spam, spam, go away... You are not welcome ANY day.

My approach to my primary email address, from the very start (more than 13.5 years ago) was that potential clients and customers should be able to contact me without jumping through hoops, so I have never bothered to hide or obscure my address: seelhoff@sophsoft.com . I have always published it in plain view (and to do otherwise would now be closing the barn door long after the horse has bolted and gone on to live free and happy until dying of old age).

Of course, this also allows any spamming slimebag with an address harvester to easily add me to each and every email database on the planet, so I do get spam. Lots of spam. To be honest, though, the level of spam to my "open" account seemed to plateau fairly quickly, although I never really kept track. Over the years, it may have been slowly and steadily rising, but I know that my patience has been slowly and steadily declining, so a while ago, I added some tools to stem the tide.

Let's talk numbers, first. Since the beginning of April, my primary email account has received 75,000 email messages. Of those, almost exactly 98% are spam. Of the other (legitimate) messages, 80% are business (1.6% of the total), and the remaining 20% (0.4% of the total) are personal. Both of these categories include active mailing lists, such as Carbon and DirectX development (business) and community events (personal). I set up my email client to automatically sort these (and marketing messages) into appropriate folders, and the number of messages specifically to me, from clients, customers, family, and friends, is just a handful per day. These are the only ones that actually hit my inbox and trigger a notification sound.
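
As a rough check, the figures above can be worked through in a few lines of Python. This is only a sketch using the rounded numbers quoted in this post; the start date of April 1, 2009 is an assumption for "the beginning of April", and the variable names are purely illustrative.

```python
from datetime import date

# Rounded figures quoted in the post.
total_messages = 75_000
spam_share = 0.98               # "almost exactly 98% are spam"
business_share_of_legit = 0.80  # 80% of the legitimate mail is business

legit = round(total_messages * (1 - spam_share))
business = round(legit * business_share_of_legit)
personal = legit - business

# Assumption: counting from April 1, 2009 up to the post date, June 7, 2009.
days = (date(2009, 6, 7) - date(2009, 4, 1)).days

print(f"legitimate: {legit} ({legit / total_messages:.1%} of total)")
print(f"business:   {business} ({business / total_messages:.1%} of total)")
print(f"personal:   {personal} ({personal / total_messages:.1%} of total)")
print(f"average per day: {total_messages / days:.0f} total, {legit / days:.0f} legitimate")
```

Note that the legitimate count still includes mailing list traffic, which is why the number of messages that actually reach the inbox is smaller still.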

Not all Bayesian filtering is created equal, and the filter built into my email client is probably about average. It handled much of the junk, but an annoying number of spam messages were being missed, each one signalling (incorrectly) that I had a legitimate message. When I finally had enough, I downloaded and installed POPFile upon a recommendation from somebody in the ASP. I had been leery about installing an interim mail server on my system simply for filtering email, but it turned out to be an excellent choice.

After several months of training, POPFile is 99.92% accurate in selecting among business, personal, and spam classifications and, importantly, I have gone for more than a month without a false positive for spam. (Most of the classification "errors" are simply unclassified messages that need to be trained.) With POPFile used in series with my email client, I can review any message that either filter thinks may be legitimate (ideally, to never miss a valid email), but I am only notified of incoming mail if both filters agree that it is valid. This has greatly reduced interruptions and made my days more productive.
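
The "in series" arrangement boils down to a simple rule, sketched below. The function name, parameter names, and folder labels are hypothetical and chosen only for illustration; the real behavior comes from POPFile's classification plus the email client's own filter rules, not from code like this.

```python
def route(popfile_says_legit: bool, client_says_legit: bool) -> str:
    """Decide where a message lands when two spam filters run in series."""
    if popfile_says_legit and client_says_legit:
        # Both filters agree the message is valid: deliver and notify.
        return "inbox"
    if popfile_says_legit or client_says_legit:
        # Only one filter thinks it may be legitimate: keep it for a
        # manual look later, but do not play the notification sound.
        return "review"
    # Both filters call it spam.
    return "junk"


for verdicts in [(True, True), (True, False), (False, True), (False, False)]:
    print(verdicts, "->", route(*verdicts))
```

The point of the middle case is that a disagreement costs only a quiet review later, never an interruption and never a lost message.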

Of course, there is some training involved so POPFile can "learn" the difference between legitimate messages and spam, but the initial process goes pretty quickly (and when one averages more than 1000 messages per day, there is lots of data). If I were to start all over again, I would not have chosen to have business and personal messages separated, since that distinction is not particularly necessary for me (and not always clear, either, such as when a family member reports a server problem, or a business associate invites me to a party).
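
For anyone curious what "training" means here, the following is a toy sketch of Bayesian text classification in general. It is not POPFile's actual code; the class name, bucket names, and sample messages are all made up for illustration, and a real filter handles tokenizing, headers, and storage far more carefully.

```python
import math
from collections import Counter, defaultdict


class ToyBayes:
    """A tiny naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> messages trained

    def train(self, bucket, text):
        """Record one example message as belonging to the given bucket."""
        self.message_counts[bucket] += 1
        self.word_counts[bucket].update(text.lower().split())

    def classify(self, text):
        """Return the most probable bucket, or 'unclassified' if untrained."""
        if not self.message_counts:
            return "unclassified"
        vocab = {w for counts in self.word_counts.values() for w in counts}
        # One extra slot is reserved for words never seen in training.
        vocab_size = len(vocab) + 1
        total_messages = sum(self.message_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket, n_messages in self.message_counts.items():
            counts = self.word_counts[bucket]
            total_words = sum(counts.values())
            # Log prior for the bucket plus log likelihood of each word.
            score = math.log(n_messages / total_messages)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


mail_filter = ToyBayes()
mail_filter.train("spam", "cheap pills buy now")
mail_filter.train("spam", "buy cheap watches now")
mail_filter.train("business", "invoice for the DirectX contract")
mail_filter.train("personal", "party at our house saturday")

print(mail_filter.classify("buy cheap pills"))
print(mail_filter.classify("invoice for the contract"))
```

Each corrected message simply adds to the word counts for its bucket, which is why a high volume of mail makes the initial training go quickly.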

If you are looking for an anti-spam solution local to your own system, I strongly recommend POPFile.

1 Comment:

Anonymous sheila manning said...

Sounds Good.

June 09, 2009 7:59 AM  
