Killing comment spam with Bayes
I’m not a fan of comment moderation, nor of CAPTCHAs, registration requirements, or anything else that makes it hard to leave a comment. I don’t use them on this website, and I get a lot of comment spam as a direct result. I needed a solution.
I’ve gone through three phases of spam fighting. In the first phase, I disabled comments on old posts that had been found by the evil robots. That worked fine, until they struck some posts that were still getting genuine comments.
In phase two, I moved to a simple blacklist system. I wrote a little scrubber script in Ruby that runs every few minutes, looks through recent comments, and hides them if they contain one of a small number of spammy features. Spammers generally peddle the same old crap, so this was pretty effective. But I had to keep the blacklist up to date by hand.
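A minimal sketch of that kind of scrubber, for illustration only. The patterns, the `spammy?` and `scrub` names, and the hash-based comment records are all assumptions; the real script works against the site's own comment store and its own list.

```ruby
# Hypothetical blacklist; a real one is built up from spam actually received.
BLACKLIST = [/viagra/i, /cheap\s+pills/i, /casino/i].freeze

# True if the text matches any blacklisted pattern.
def spammy?(text)
  BLACKLIST.any? { |pattern| pattern.match?(text.to_s) }
end

# Mark matching comments as hidden. Comments are plain hashes here;
# a real script would load recent comments from wherever they are stored.
def scrub(comments)
  comments.each { |c| c[:hidden] = true if spammy?(c[:body]) }
end
```

The weakness is visible in the first line: every new flavour of spam means another entry in `BLACKLIST`.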
Phase three came about during an idle moment at RailsConf Europe last week. To pass the time, I dug up a bit of code I’d written earlier that attempted to use a Bayesian classifier to partition comments into ham and spam. I got it running and trained it against the old comments (I keep them all, ham or spam). It worked reasonably well, with a few false positives and false negatives. With a bit more work today, I’ve got it working with 100% accuracy. I’m suffering a storm of spam right now, and it’s correctly identifying and hiding all of it.
I’ve found two things particularly effective in improving reliability:
- Use the Robinson-Fisher combiner algorithm.
- Tokenise and include everything.
By everything, I mean:
- Every word and non-word in the comment text
- Every part of a URL supplied
- Every word in the commenter’s name
- Every non-punctuation part of the email address
- The post the comment refers to
- The first two octets of the originating IP address
Things that aren’t in the comment text are tokenised as fake words with a prefix indicating their origin, so that the classifier can weight them separately from regular words.
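Roughly, the tokenising described above could look like the sketch below. The `Comment` struct, its field names, and the prefixes (`url:`, `name:`, `email:`, `post:`, `ip:`) are made up for illustration and are not necessarily the ones used on this site.

```ruby
# Hypothetical comment record; the field names are assumptions.
Comment = Struct.new(:body, :url, :name, :email, :post_id, :ip, keyword_init: true)

def tokenise(comment)
  # Words and runs of punctuation ("non-words") from the comment text.
  tokens = comment.body.to_s.downcase.scan(/\w+|[^\w\s]+/)

  # Everything outside the text becomes a prefixed fake word.
  tokens += comment.url.to_s.downcase.scan(/[a-z0-9]+/).map { |t| "url:#{t}" }
  tokens += comment.name.to_s.downcase.scan(/\w+/).map { |t| "name:#{t}" }
  tokens += comment.email.to_s.downcase.scan(/[a-z0-9]+/).map { |t| "email:#{t}" }
  tokens << "post:#{comment.post_id}" if comment.post_id

  # Only the first two octets of an IPv4 address are kept.
  octets = comment.ip.to_s.split(".")
  tokens << "ip:#{octets.first(2).join('.')}" if octets.size == 4

  tokens
end
```

With this scheme, "pills" in the comment body and "url:pills" in the link are separate tokens, so each gets its own spam probability.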
By adding these extra features into the mix, it seems to be possible for the classifier to distinguish copy-and-paste spam from genuine comments. I’ll see how well it performs in the weeks ahead.
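For reference, here is a compact sketch of the Robinson-Fisher combiner mentioned above, written from the published description of the technique rather than lifted from this site's code. The method names, the smoothing constants `s = 1` and `x = 0.5`, and the clamping bounds are assumptions.

```ruby
# Upper-tail probability of a chi-square distribution with an even number
# of degrees of freedom: e^(-m) * sum of m^i / i! for i in 0...(df / 2).
def chi2_q(x2, df)
  m = x2 / 2.0
  term = Math.exp(-m)
  sum = term
  (1...(df / 2)).each do |i|
    term *= m / i
    sum += term
  end
  [sum, 1.0].min
end

# Robinson's smoothed spam probability for one token, pulled towards the
# neutral value x when the token has been seen only a few times.
def token_probability(spam_count, ham_count, n_spam, n_ham, s: 1.0, x: 0.5)
  spam_freq = n_spam.zero? ? 0.0 : spam_count.to_f / n_spam
  ham_freq  = n_ham.zero?  ? 0.0 : ham_count.to_f / n_ham
  return x if (spam_freq + ham_freq).zero?
  n = spam_count + ham_count
  p = spam_freq / (spam_freq + ham_freq)
  (s * x + n * p) / (s + n)
end

# Fisher's method: combine per-token probabilities into a score between
# 0 (ham) and 1 (spam), with 0.5 meaning "no idea".
def fisher_score(probs)
  return 0.5 if probs.empty?
  clamped = probs.map { |p| p.clamp(0.01, 0.99) } # avoid log(0)
  df = 2 * clamped.size
  spam_evidence = 1.0 - chi2_q(-2.0 * clamped.sum { |p| Math.log(1.0 - p) }, df)
  ham_evidence  = 1.0 - chi2_q(-2.0 * clamped.sum { |p| Math.log(p) }, df)
  (1.0 + spam_evidence - ham_evidence) / 2.0
end
```

A comment would then be hidden when `fisher_score` of its token probabilities exceeds some cutoff. The cutoff is a tuning choice: a higher one trades missed spam for fewer hidden genuine comments.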