Anti Spam Hints and Tips

Friday, September 29, 2006

ASSP

OK, this is the anti-spam beasty I (mostly) use. It's very feature rich. In fact it has so many layers of spam defense, you will rarely want all of them running.
It is from http://assp.sourceforge.net/
First up, install the Linux version on a Linux box. The Windows edition sux0r, mainly due to the lack of good Perl support, esp. for the networking and DNS libraries. It does work, but some features are not going to work without putting in a heap of extra time.
Next, you need to decide how paranoid you wish to be.
This will help you choose which aspects of the filtering to have turned on. In particular you need to decide whether you prefer type one or type two errors, ie false positives (legit mail blocked) or false negatives (spam let through).
Pros and cons, in no particular order.
Bayesian Filtering: Unless you're very tolerant of spam, you probably should opt for Bayesian filtering. You can adjust how aggressive it is, so that lets you adjust how much you trust it, so there's no real reason to turn it off. Remember to run in test mode until the bayes check is getting more reliable. I ended up waiting a whole week, and even then got a few false positives. But then I was being quite aggressive with it.
Validate HELOs: This is really worth using. The messages that this filters are 100% junk. No real mail server will ever send you this garbage, only bad CGI scripts will.
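To give a feel for what a HELO sanity check can look like, here is a minimal sketch in Python. It is not ASSP's actual code. The rules (reject empty names, bare IP addresses, single-label names, and anyone claiming to be one of your own hosts) and the names `LOCAL_NAMES` and `helo_looks_valid` are my own assumptions for illustration.

```python
import ipaddress
import re

# Hypothetical: names this server answers to. A remote sender should never claim them.
LOCAL_NAMES = {"mail.example.com", "example.com"}

# One DNS label: letters, digits, hyphens, no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def _is_ip(text: str) -> bool:
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def helo_looks_valid(helo: str, local_names=LOCAL_NAMES) -> bool:
    """Return True if the HELO/EHLO argument passes some basic sanity rules."""
    name = helo.strip().rstrip(".").lower()
    if not name:
        return False
    # A bracketed address literal such as [192.0.2.1] is legal syntax.
    if name.startswith("[") and name.endswith("]"):
        literal = name[1:-1]
        if literal.startswith("ipv6:"):
            literal = literal[len("ipv6:"):]
        return _is_ip(literal)
    # A bare IP address without brackets is not a valid HELO name.
    if _is_ip(name):
        return False
    # Nobody outside should introduce themselves as us.
    if name in local_names:
        return False
    labels = name.split(".")
    if len(labels) < 2:  # "localhost", "friend", and so on
        return False
    return all(_LABEL.match(label) for label in labels)
```

For example, `helo_looks_valid("192.0.2.1")` and `helo_looks_valid("example.com")` are both rejected, while `helo_looks_valid("mail.sender.org")` passes. How strict you make the rules is a matter of taste.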
Validate destination address: You may think this is useless, as these won't get delivered anyway. But, if you use the penalty box it is invaluable for catching spam bombers and shoving them into the penalty box incredibly quickly. If you're running a mail server that includes an LDAP service then by far, this is the best (low-maintenance) way to check email addresses.
Missing DNS Info, both PTR and MX checks. These are mostly checking that you would be able to send back to the sending server. A missing MX record means that the sending domain doesn't have an MX, which means nothing specifically points to an email server. Bingo: problem. And a missing PTR means the sending IP doesn't have a reverse DNS entry, theoretically meaning that the IP isn't listed in any DNS. In reality, a lot of domains have flaky DNS entries. SMTP will fall back to A records if MX records don't exist, so many domains are running happily without an MX and don't know it. And when clusters are used it's possible the sending IP is not DNS listed at all. That's not counting DNS services that simply fail to update PTR records along with their A records. I log these, but I don't block them. I blocked them for a few hours once and got a heap of false positives. So I don't use them.
Penalty Box: This is one to be careful with. I use it, but with a fairly high threshold that must be reached in fifteen minutes, and I've adjusted most of my weightings down to 1, with the exception of bad addresses. Bad addresses I weight at 20. What this does is essentially penalty box the people who just try lots of different email addresses for a single domain. I hardly get any normal blocks that aren't also "Extreme", because that type of idiot spammer doesn't look at their bounces, and runs heaps of SMTP connections in a short time frame. And the penalty box provides an IP based block, so it happens at the HELO/EHLO stage, before you get to even the email header. But it does need to be used with care.
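The weighting scheme above can be sketched roughly like this. It is an illustration only, not ASSP's implementation. The weights (20 for a bad address, 1 for everything else) and the fifteen minute window come from the paragraph above; the threshold of 50, the offence names, and the class name `PenaltyBox` are assumptions of mine.

```python
import time
from collections import defaultdict, deque


class PenaltyBox:
    """Per-IP badness score over a sliding time window."""

    def __init__(self, threshold=50, window_seconds=15 * 60, weights=None):
        self.threshold = threshold          # assumed value, tune to taste
        self.window = window_seconds        # fifteen minutes, as in the text
        self.weights = weights or {"bad_address": 20}
        self.default_weight = 1             # everything else weighs 1
        self._events = defaultdict(deque)   # ip -> deque of (timestamp, weight)

    def record(self, ip, offence, now=None):
        """Add one offence for this IP."""
        now = time.time() if now is None else now
        weight = self.weights.get(offence, self.default_weight)
        self._events[ip].append((now, weight))

    def score(self, ip, now=None):
        """Sum of weights for offences still inside the window."""
        now = time.time() if now is None else now
        events = self._events[ip]
        while events and now - events[0][0] > self.window:
            events.popleft()                # forget offences that have aged out
        return sum(weight for _, weight in events)

    def is_blocked(self, ip, now=None):
        return self.score(ip, now) >= self.threshold
```

With these numbers, three bad addresses from one IP inside fifteen minutes puts it over the line, while an odd bad HELO on its own barely registers.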
RBL: This is nifty because it happens at the IP level, like the penalty box. At first, I thought I would end up getting a lot of false positives with this, especially the XBL (exploit block list) and the open relay lists, but in practice, I haven't had any. Bloody excellent, if you ask me. Most can be done with the lists from spamhaus.org, but I ended up splitting theirs, then adding ordb.org, and then others, because if it grabs stuff from an open relay, then it can provide more diagnostic info to the real victims.

Thursday, September 28, 2006

This article is for server administration, not for single clients. This is because as a single client, most of the damage is done before you can see enough of the mail to filter it, and there just is no effective solution. The solution needs to be at the mail server, so spam is not just thrown away, but the spammer can receive an SMTP rejection of some sort.
RBLs vs Content Verification.
So do you block known bad IPs or do you check the contents of the mail for signatures that indicate spam? I've personally never been a fan of the false dichotomy so I won't linger on this for very long, but the answer is clearly "both". But you check the IP first, because the IP address can be checked against a string of reverse DNS RBLs at the connection level, way before the message arrives. Hell, if everything is working nicely, before you even get the header. HELO, Shaadup! EHLO, Shaadup! Easy.
Then, if the email doesn't flag anything on the RBLs, look at the header, and then look at the message.
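For anyone who hasn't seen how a DNS-based RBL lookup works: you reverse the octets of the connecting IP, stick the list's zone on the end, and do an ordinary A record lookup. If the name resolves, the IP is listed. A minimal Python sketch, IPv4 only. The function names are mine, and treating every lookup failure as "not listed" is a simplifying assumption (a real filter would want to tell a DNS outage apart from a clean result, and check the list's documented return codes).

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """True if the list's zone has an A record for this IP."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (or the lookup failed): treat as not listed.
        return False
```

The `resolve` parameter is only there so the lookup can be swapped out for testing. In normal use you would call something like `is_listed(client_ip, zone)` with the zone name of whichever list you have chosen.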
RBLs and Content checking are just the two most common methods for checking spam. They are, for the most part, quite easy to understand. RBLs are very simple, Content checking has several variants and several commercial interpretations. Some newer systems are beginning to include image content checking as well, but this is likely to take a while to filter through to those of us not running supercomputers for mail servers. But searching for more effective anti-spam solutions has opened my eyes to many other options. Here are a few of the tricks I've come across, or use and think need more air-time.
Header Checking:
Just checking the formatting of SMTP headers excludes a nice chunk of spam. Sad but true. Spammers often go to great lengths to hide their IPs, and as such include lots of spoofed headers, which are usually not quite formatted correctly.
Check that the supposed sender's email domain exists, ie if the mail comes from spammer@spam.co.uk, check that spam.co.uk actually has an MX record. Sure, this isn't going to kill off 100% of your spam, but when you're looking at a multi-layered defense, this sort of additional check is useful. The secondary check is that the IP that sent the HELO (EHLO) has a reverse DNS (PTR) record. I personally have this one turned off, because too many legit people have a slightly screwy mail set-up, either because their ISP has stuffed them around or they just didn't know what they were doing. Potentially useful though.
One additional check I'd like to see, but haven't yet seen in any spam filter in the wild, would be checking that the MX record for the reply-to or from address matches at least the network address of the sending IP.
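As a rough sketch of that idea, assuming you have already resolved the sender domain's MX hosts to IP addresses some other way: the function name `sender_near_mx` and the default /24 prefix are my own assumptions, not something any filter I know of actually does.

```python
import ipaddress


def sender_near_mx(sending_ip: str, mx_ips, prefix_len: int = 24) -> bool:
    """True if any MX address falls in the same network as the sending IP.

    prefix_len is the size of the "network" to compare on (24 = same /24).
    """
    network = ipaddress.ip_network(f"{sending_ip}/{prefix_len}", strict=False)
    return any(ipaddress.ip_address(mx) in network for mx in mx_ips)
```

So `sender_near_mx("198.51.100.7", ["198.51.100.20"])` passes, while the same MX list against `203.0.113.5` fails. Plenty of legit outfits send from a different network than they receive on, so this would be something to score rather than block on.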
Delaying:
This is simply responding to all unknown incoming emails with a non-fatal error. When that particular email is retried, pass it through the delayer code. What this means is that unless the email is coming from a mail server with a queue, the mail will probably just fail. No prizes for guessing what type of email that usually means. The problems arise with the other sources of non-queued email. Web-forms often send mail direct with no queue. And some mail servers do not respond nicely to being delayed. Thankfully some people (http://projects.puremagic.com/greylisting/) have built lists of these misbehaving servers.
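The core of the delaying logic is tiny. Here is a minimal sketch, assuming the usual approach of keying on the (client IP, sender, recipient) triplet and a five minute minimum delay. Both of those, and the names `Greylist` and `check`, are my assumptions. A real delayer also needs expiry of old entries and an exception list for the misbehaving servers mentioned above.

```python
import time


class Greylist:
    """Temp-fail the first attempt of each triplet, accept a later retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay   # seconds a sender must wait before retrying
        self._first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return "tempfail" (answer with a 4xx error) or "accept"."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        first = self._first_seen.get(key)
        if first is None:
            self._first_seen[key] = now   # never seen: remember it and delay
            return "tempfail"
        if now - first < self.min_delay:
            return "tempfail"             # retried too quickly
        return "accept"                   # a proper queue came back later
```

A server with a real queue comes back after the delay and gets through. Fire-and-forget spamware never retries, so its mail just dies.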
Penalty boxing:
One of my favourites. Any IP with more than a specified badness gets blocked. I give most spammy type stuff a very low weight towards this badness, but weight bad email addresses fairly highly. What this does for me is fairly quickly penalize spammers that just try every possible combo of letters and numbers for usernames at a domain. You would be surprised how often that sort of thing happens. The penalty box can act like a local RBL, and reject connections before they become bandwidth. Happiness. One of the good points with penalty boxing is that in most cases, spam will be stopped by one of either penalty boxing or delaying. If the spammer tries moving IPs and sending from different addresses, then the delaying will kill off the email, and if they sit still and retry to bypass the delayer, then the penalty box will shut them down.
The Other Side
Then there's the whitelisting. What most systems have is a manually entered list of email addresses to always accept email from. The system I use has an outgoing mail proxy that automatically adds the destinations of all outgoing emails to the whitelist. This basically means that most (~60-80%) of legit email is whitelisted. I hear you thinking, "so what?". The good thing about this is it means you can be that little bit more aggressive with non-whitelisted emails.
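The auto-whitelisting idea boils down to something like the sketch below. This is just an illustration of the principle, not how ASSP stores its whitelist. The class and method names are made up.

```python
class AutoWhitelist:
    """Whitelist built from the recipients of outgoing mail."""

    def __init__(self):
        self._addresses = set()

    def note_outgoing(self, recipients):
        """Call from the outgoing proxy with each message's recipient list."""
        for address in recipients:
            self._addresses.add(address.strip().lower())

    def is_whitelisted(self, sender):
        """Call on incoming mail before the more aggressive checks."""
        return sender.strip().lower() in self._addresses
```

Anyone your users have written to gets waved through, and everyone else goes through the full set of filters.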

This is basically just a holding spot for my SMTP server layout.
Port numbers have been changed to protect the innocent. :-)

Incoming Route
[Port 25]
[I'Net Gateway]
[Port 25]
[ASSP, on CentOS]
[Port 25]
[Trend Micro Interscan MSS on Win 2003]
[Port 25]
[Exchange 2003 on Win 2003]

Outgoing
[Client]
[Port 75]
[I'Net Gateway]
[Port 75]
[SMTP VirtServ1, Exchange 2003 on Win 2003]
[Port 55]
[ASSP, on CentOS]
[Port 45]
[SMTP VirtServ2, Exchange 2003 on Win 2003]
[Port 25]