Anti Spam Hints and Tips

Monday, October 09, 2006

Delaying AKA Greylisting vs Spam collecting

Bayesian (spam) filters rely on being able to collect at least some spam emails as well as some good emails. And if you want to keep up with the changing face of spam, you need to be collecting recent spam as well as just looking through the old archived spam from months back.
However, the assorted "trickery" used by ASSP at the network level means a lot of spam is blocked very early on. In particular, the delaying/greylisting seems to stop a lot of spam quite dead, and of course it never collects the spam email itself. It simply answers with a temporary error early in the SMTP conversation, before any message data is sent. So you can end up in a situation (like I find myself in) where, although you're blocking 87% of incoming emails as spam, you can never actually collect any of them for the spam database.
I've ended up running for half a day without delaying to boost my database, and I've turned off delaying for my little group of SpamLovers. A long-term solution still needs to be found, however.
And I'm trying to better exploit the Spamtrap address. The spam trap address I'm using already existed before ASSP. It's the address that belonged to one of our technicians who worked here for a few years and seemed to put his name down for all sorts of crap. He's been gone for a long time now, but still seems to get heaps of junk email. So far, that passive spam-trapping has been all I've needed, but with Delaying, I need to take it further.
I'm going to try posting the address on as many forums, newsgroups and bulletin boards as will let me, and see if the volume increases. Hell, I may even try to "Remove" him from some mailing lists.

Tuesday, October 03, 2006

Happy Bounce messages

OK, so a bit bored and I've rewritten my ASSP bounce messages. This does serve a purpose. It makes it easier to spot my bounce messages, and then track them back to ASSP. As incoming emails can be blocked (in my set-up) by five different filters, it's quite useful to be able to isolate which program has blocked stuff.

Bayes

577 Message Integrity Failure. Spell check (Khoisan:!Kung) Failed. Server Memory Leak, advise top-up immediately. This may be a permanent failure. Message was not delivered. Send error reports to admin @

Relay

530 Replaying Failure. UPS Blocked IRQ packet. UPS Responded with; No routable interface found for that DNS protocol. Please rewind tape to beginning.

Invalid recipient

550 5.1.1 Mailbox . There is no one available to take your mail. If you would like to leave a message for EMAILADDRESS, please send after the beep, and be sure to include your return address. If you have been routed to this incorrectly, please verify address details at http://www.imdb.com/search and retry.

PB Blocking

554 5.7.1 Astropneumatic oscillations in the servers' water-cooling have created too many packet collisions for available packet ambulances. Packets are being routed to /dev/null, but /dev/null is reporting full. Upgrade of /dev/null to handle cavitation in water cooled packet ambulance routing recommended.

Bad Sender

550 5.5.0 EHLO/HELO rejected by server. SMTP 666 Error. Specific server error was " REASON. " Probable failure in CPU alignment or positron focusing. Repack bearings and rotate CPU.

Delay

451 4.7.1 Transient recursive traversal of loopback mount points failed. Please try again later.

RBLs (sbl.spamhaus.org|xbl.spamhaus.org|list.dsbl.org|relays.ordb.org|combined.njabl.org|smtp.dnsbl.sorbs.net|zombie.dnsbl.sorbs.net|nomail.rhsbl.sorbs.net)

471 4.7.1 Delivery NOT Authorized. Please insert $1 coin to continue. Message was refused at this time. Blacklisted by RBLLISTED. This error indicates that the sending IP address (the email server) has been listed as a source of spam. We (Manaccom) cannot remove you from these lists. You (or more accurately your email system administrator) need to contact the listing authority (http://RBLLISTED) for methods to remove the server from the list. XBL or ORDB lists => THIS MAY INDICATE YOUR EMAIL SERVER HAS BEEN EXPLOITED BY SPAMMERS! ACT NOW! XBL lists are listing IP addresses known to be being used to send mail through exploited systems, via virus infections, trojans or other "zombie" methods. OpenRelays (ordb) are mail servers that will forward anyone's mail for them. Spammers use exploited systems and open relays to hide behind other systems when sending their mail.

Bad Attach

575 Microelectronic Riemannian curved-space fault in write-only file system. Part 2 and later of multipart SMTP message failed to authenticate with Cerberus. Homework eaten error. Remainder of message delivered successfully to /dev/null.

See? Fun. :-)

Friday, September 29, 2006

ASSP

OK, this is the anti-spam beasty I (mostly) use. It's very feature rich. In fact it has so many layers of spam defense, you will rarely want all of them running.
It is from http://assp.sourceforge.net/
First up, install the Linux version on a Linux box. The Windows edition sux0r, mainly due to the lack of good Perl support, esp. for the networking and DNS libraries. It does work, but some features are not going to work without putting in a heap of extra time.
Next, you need to decide how paranoid you wish to be.
This will help you choose which aspects of the filtering to have turned on. In particular you need to decide whether you prefer type one or type two errors, i.e. false positives (good mail blocked) or false negatives (spam let through).
Pros and cons, in no particular order.
Bayesian Filtering: Unless you're very tolerant of spam, you probably should opt for Bayesian filtering. You can adjust how aggressive it is, so that lets you adjust how much you trust it, so there's no real reason to turn it off. Remember to run in test mode until the bayes check is getting more reliable. I ended up waiting a whole week, and even then got a few false positives. But then I was being quite aggressive with it.
Validate HELOs: This is really worth using. The messages that this filters out are 100% junk. No real mail server will ever send you this garbage; only bad CGI will.
Validate destination address: You may think this is useless, as these won't get delivered anyway. But, if you use the penalty box it is invaluable for catching spam bombers and shoving them into the penalty box incredibly quickly. If you're running a mail server that includes an LDAP service then by far, this is the best (low-maintenance) way to check email addresses.
Missing DNS Info, both PTR and MX checks: These mostly check that you would be able to send mail back to the sending server. A missing MX record means that the sending domain doesn't have an MX, so nothing specifically points to an email server. Bingo: problem. A missing PTR means the sending IP doesn't have a reverse DNS entry, theoretically meaning that the IP isn't listed in any DNS. In reality, a lot of domains have flaky DNS entries. SMTP will fall back to A records if MX records don't exist, so many domains are running happily without an MX and don't know it. And when clusters are used, it's possible the sending IP is not DNS listed at all. That's not counting DNS services that simply fail to update PTR records along with their A records. I log these, but I don't block on them. I blocked them for a few hours once and got a heap of false positives, so I don't use them.
Penalty Box: This is one to be careful with. I use it, but with a fairly high threshold that must be reached in fifteen minutes, and I've adjusted most of my weightings down to 1, with the exception of bad addresses. Bad addresses I weight at 20. What this does is essentially penalty box the people who just try lots of different email addresses for a single domain. I hardly get any normal blocks that aren't also "Extreme", because that type of idiot spammer doesn't look at their bounces, and runs heaps of SMTP connections in a short time frame. And the penalty box provides an IP based block, so it happens at the HELO/EHLO stage, before you get to even the email header. But it does need to be used with care.
RBL: This is nifty because it happens at the IP level, like the penalty box. At first, I thought I would end up getting a lot of false positives with this, especially the XBL (exploit block list) and the open relay lists, but in practice, I haven't had any. Blury excellent. If you ask me. Most can be done with the lists from spamhaus.org, but I ended up splitting theirs, then adding ordb.org, and then others, because if it grabs stuff from an open relay, then it can provide more diagnostic info to the real victims.
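Going back to the Bayesian filtering item above, here is a toy sketch of the idea, mostly to show why the filter needs both spam and good mail to learn from, and what "adjusting how aggressive it is" amounts to. This is not ASSP's code. The class name TinyBayes, the tokenizer and the 0.9 default threshold are all made up for illustration, and a real filter does a lot more than this.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and pull out word-like tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class TinyBayes:
    """Toy naive Bayes scorer with add-one smoothing (hypothetical, not ASSP's)."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one collected message to the 'spam' or 'ham' corpus."""
        self.tokens[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the text is spam, between 0 and 1."""
        vocab_size = len(set(self.tokens["spam"]) | set(self.tokens["ham"])) or 1
        spam_total = sum(self.tokens["spam"].values())
        ham_total = sum(self.tokens["ham"].values())
        # Start from the prior odds, smoothed so an empty corpus can't divide by zero.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in tokenize(text):
            p_spam = (self.tokens["spam"][token] + 1) / (spam_total + vocab_size)
            p_ham = (self.tokens["ham"][token] + 1) / (ham_total + vocab_size)
            log_odds += math.log(p_spam / p_ham)
        # Clamp before exponentiating so very long messages can't overflow.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))

    def is_spam(self, text, threshold=0.9):
        """A lower threshold is more aggressive; a higher one is more cautious."""
        return self.spam_probability(text) >= threshold
```

Train it with train(text, "spam") and train(text, "ham"), then call is_spam() on new mail. With nothing in the spam corpus, every token looks equally likely either way, which is exactly the problem you hit when the network-level checks stop the spam before it can be collected.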

Thursday, September 28, 2006

This article is for server administration, not for single clients. This is because as a single client, most of the damage is done before you can see enough of the mail to filter it, and there just is no effective solution. The solution needs to be at the mail server, so spam is not just thrown away, but the spammer can receive an SMTP rejection of some sort.
RBLs vs Content Verification.
So do you block known bad IPs, or do you check the contents of the mail for signatures that indicate spam? I've personally never been a fan of the false dichotomy so I won't linger on this for very long, but the answer is clearly "both". But you check the IP first, because the IP address can be checked against a string of DNS-based RBLs at the connection level, way before the message arrives. Hell, if everything is working nicely, before you even get the header. HELO, Shaadup! EHLO, Shaadup! Easy.
Then, if the email doesn't flag anything on the RBLs, look at the header, and then look at the message.
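For anyone wondering what an RBL check actually involves, here is a rough sketch of the usual DNS-based lookup: reverse the octets of the connecting IP, stick the list's zone on the end, and see whether the name resolves. The function names are mine, dnsbl.example.org is a placeholder zone rather than a real list, and the resolver is passed in as a parameter so the logic can be tried without touching the network. Treat it as an illustration, not as how any particular filter does it.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the list answers with a 127.x.x.x address for this IP.

    An unlisted IP normally produces a lookup failure, which the standard
    resolver reports as socket.gaierror.
    """
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return answer.startswith("127.")


def first_listing(ip, zones, resolve=socket.gethostbyname):
    """Check the lists in order and return the first zone that lists the IP."""
    for zone in zones:
        if is_listed(ip, zone, resolve):
            return zone
    return None
```

Because the lookup only needs the connecting IP, it can run as soon as the connection opens, which is what makes it so cheap compared with looking at the message itself.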
RBLs and Content checking are just the two most common methods for checking spam. They are, for the most part, quite easy to understand. RBLs are very simple; Content checking has several variants and several commercial interpretations. Some newer systems are beginning to include image content checking as well, but this is likely to take a while to filter through to those of us not running supercomputers for mail servers. But searching for more effective anti-spam solutions has opened my eyes to many other options. Here are a few of the tricks I've come across, or use and think need more air-time.
Header Checking:
Just checking the formatting of SMTP headers excludes a nice chunk of spam. Sad but true. Spammers often go to great lengths to hide their IPs, and as such include lots of spoofed headers, which are usually not quite formatted correctly.
Check that the supposed sender's email domain exists, i.e. if the mail comes from spammer@spam.co.uk, check that spam.co.uk actually has an MX record. Sure, this isn't going to kill off 100% of your spam, but when you're looking at a multi-layered defense, this sort of additional check is useful. The secondary check is to check that the IP that sent the HELO (EHLO) has a reverse DNS (PTR) record. I personally have this one turned off, because too many legit people have a slightly screwy mail set-up, either because their ISP has stuffed them around or they just didn't know what they were doing. Potentially useful though.
One additional check I'd like to see, but haven't yet seen in any spam filter in the wild, would be checking that the MX record for the reply-to or from address matches the sending IP, at least to within the same network address.
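To make that last wish a bit more concrete, here is a sketch of the comparison itself. It assumes something else has already looked up the MX hosts for the sender's domain and resolved them to IP addresses (Python's standard library can't do MX lookups on its own), so the function just takes those addresses as a list. The name sender_near_mx and the /24 default are my own guesses at what "within the network address" could mean.

```python
import ipaddress


def sender_near_mx(sending_ip, mx_ips, prefix_len=24):
    """Return True if the sending IP shares a /prefix_len network with any MX address.

    mx_ips is assumed to be the already-resolved addresses of the sender
    domain's MX hosts. The /24 default only makes sense for IPv4.
    """
    sender = ipaddress.ip_address(sending_ip)
    for mx_ip in mx_ips:
        # strict=False lets us build the enclosing network from a host address.
        network = ipaddress.ip_network(f"{mx_ip}/{prefix_len}", strict=False)
        if sender in network:
            return True
    return False
```

For example, sender_near_mx("192.0.2.55", ["192.0.2.10"]) is true because both sit in 192.0.2.0/24. I'd expect a check like this to be better as a score than as a hard block, since plenty of legit outfits send from one network and receive on another.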
Delaying:
This is simply responding to all unknown incoming emails with a non-fatal error. When that particular email is retried, pass it through the delayer code. What this means is that unless the email is coming from a mail server with a queue, then the mail will probably just fail. No prizes for guessing what type of email that usually means. The problems arise with the other sources of non-queued email. Web-forms often send mail direct with no queue. And some mail servers do not respond nicely to being delayed. Thankfully some people (http://projects.puremagic.com/greylisting/) have built lists of these misbehaving servers.
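Here is a bare-bones sketch of the delaying logic, to show how little there is to it. It assumes the common greylisting approach of remembering each (sending IP, sender, recipient) combination and refusing it with a temporary error until a minimum delay has passed. The class name, the 300 second delay and the reply text are my own choices, and a real delayer would also expire old entries and skip known-good servers.

```python
import time


class Greylist:
    """Minimal delayer: temp-fail a new triplet, accept it once it retries late enough."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay   # seconds a new triplet must wait (assumed value)
        self.clock = clock           # injectable so the logic can be tried without waiting
        self.first_seen = {}         # triplet -> time of the first attempt

    def check(self, ip, sender, recipient):
        """Return an SMTP-style (code, text) reply for this delivery attempt."""
        key = (ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts keep the original timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return 451, "4.7.1 Please try again later"
        return 250, "OK"
```

A server with a queue simply tries again after a while and gets the 250. Anything that fires once and forgets never comes back, so its mail never arrives.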
Penalty boxing:
One of my favourites. Any IP with more than a specified badness gets blocked. I weight most spammy type stuff very lowly to add to this badness, but weight bad email addresses fairly highly. What this does for me is fairly quickly penalize spammers that just try every possible combo of letters and numbers for usernames at a domain. You would be surprised how often that sort of thing happens. The penalty box can act like a local RBL, and reject connections before they become bandwidth. Happyness. One of the good points with Penalty boxing is that in most cases, spam will be stopped by one of either penalty boxing or delaying. If the spammer tries moving IPs and sending from different addresses, then the delaying will kill off the email, and if they sit still and retry to bypass the delayer, then the penalty box will shut them down.
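As a rough illustration of the badness scoring, here is a sketch of a penalty box that adds up weighted events per IP inside a sliding time window. The weights of 1 for most things and 20 for bad addresses, and the fifteen minute window, echo the settings I describe in my ASSP post. The threshold of 50, the event names and the class itself are made up for the example and are not how ASSP implements it.

```python
import time
from collections import defaultdict, deque


class PenaltyBox:
    """Per-IP badness score over a sliding window (illustrative only)."""

    def __init__(self, threshold=50, window=15 * 60, clock=time.time):
        self.threshold = threshold          # score at which an IP is blocked (assumed)
        self.window = window                # seconds an event keeps counting
        self.clock = clock                  # injectable for experimenting
        self.weights = {"bad_address": 20}  # bad recipients weigh heavily
        self.default_weight = 1             # everything else counts as 1
        self.events = defaultdict(deque)    # ip -> deque of (timestamp, weight)

    def record(self, ip, kind):
        """Note one bad event of the given kind against the IP."""
        weight = self.weights.get(kind, self.default_weight)
        self.events[ip].append((self.clock(), weight))

    def score(self, ip):
        """Sum the weights of events still inside the window."""
        events = self.events.get(ip)
        if not events:
            return 0
        cutoff = self.clock() - self.window
        while events and events[0][0] < cutoff:
            events.popleft()
        return sum(weight for _, weight in events)

    def is_blocked(self, ip):
        return self.score(ip) >= self.threshold
```

With these numbers, three guesses at non-existent mailboxes inside fifteen minutes are enough to get an IP boxed, while ten lesser offences only add up to 10.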
The Other Side
Then there's the whitelisting. What most systems have is a manually entered list of email addresses to always accept email from. The system I use has an outgoing mail proxy that automatically adds the destination addresses of all outgoing emails to the whitelist. This basically means that most (~60-80%) of legit email is whitelisted. I hear you thinking, "so what?". The good thing about this is it means you can be that little bit more aggressive with non-whitelisted emails.
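The automatic whitelisting is simple enough to sketch in a few lines. This assumes the outgoing proxy can hand over the recipient list of each message it relays. The class and method names are invented for the example, and it ignores things like expiring old entries.

```python
from email.utils import parseaddr


def normalise(address):
    """Strip any display name and lower-case the bare address."""
    return parseaddr(address)[1].lower()


class AutoWhitelist:
    """Whitelist built from the recipients of outgoing mail (illustrative only)."""

    def __init__(self):
        self.addresses = set()

    def note_outgoing(self, recipients):
        """Call for every outgoing message with its list of recipients."""
        for recipient in recipients:
            address = normalise(recipient)
            if address:
                self.addresses.add(address)

    def is_whitelisted(self, sender):
        """True if we have previously sent mail to this sender."""
        return normalise(sender) in self.addresses
```

The incoming side then only has to ask is_whitelisted() about the sender before deciding how hard to push the rest of the filters.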

This is basically just a holding spot for my SMTP server layout.
Port numbers have been changed to protect the innocent. :-)

Incoming Route
[Port 25]
[I'Net Gateway]
[Port 25]
[ASSP, on CentOS]
[Port 25]
[Trend Micro Interscan MSS on Win 2003]
[Port 25]
[Exchange 2003 on Win 2003]

Outgoing
[Client]
[Port 75]
[I'Net Gateway]
[Port 75]
[SMTP VirtServ1, Exchange 2003 on Win 2003]
[Port 55]
[ASSP, on CentOS]
[Port 45]
[SMTP VirtServ2, Exchange 2003 on Win 2003]
[Port 25]