What's worse? The spam itself? Or how anti-spam solutions block legitimate mail?

After being in New York City last week and being busy almost the entire time, I spent a good part of the weekend catching up on e-mail. I have more inboxes than I care to admit and use more technologies than I should to access them (via the Web, Thunderbird, Outlook, etc.). Somewhere on my to-do list is a day or two's worth of purging and consolidation. These e-mail marathons usually include the tedious job of searching junk mail and spam folders for any legitimate e-mails. When it comes to e-mail, most of us (e-mail users and the e-mail solutions providers that serve them) have lost our sensibilities. The facts that (a) legitimate e-mail finds its way into our junk mail folders, (b) we must spend our time searching through junk mail folders for that legitimate e-mail, and (c) we somehow think this is normal are proof that we're gluttons for punishment.

So, let me make this abundantly clear: the very second you must access your junk e-mail folder to make sure there's no legitimate e-mail in it is the very second in which your anti-spam technology has become entirely useless to you. After all, the whole idea of anti-spam technology is to make it so that you don't have to wade through illegitimate e-mail in order to read your legitimate e-mail -- all of your legitimate e-mail. One statistic that anti-spam solution providers pride themselves on is the fewest number of false positives. That is, they'll boast that their systems make the fewest number of mistakes when it comes to misclassifying a legitimate e-mail as spam (and dumping that legit mail into your junk mail or spam folder). To me, this would be like an amusement park bragging about the fewest number of deaths.

Let's be clear. Even one false positive is unacceptable. In fact, I'd argue that just one false positive is even worse than a bunch. What's harder to spot? One needle in a giant haystack? Or 20 or 30? If you've ever scanned a junk mail folder with hundreds of entries, only one of which is legit, spotting that one is actually harder than spotting a bunch because of the way we are so easily desensitized by seas of text. Ultimately, however, it doesn't matter. The fact that we have to look at all completely defeats the purpose of having a junk mail folder. We might as well just let the spam flow into inboxes because having to look in two folders for our e-mail is just the same as having to look in two inboxes. You still have to look.

OK, so you don't have to look. That is, so long as there isn't a chance that a critical e-mail may have ended up there. But what's non-critical? Several recent scans of my junk mail folders revealed the following false positives, along with how each one affected me:

  • Someone that I would drop everything for was coming to town on short notice and wrote to me to see if I wanted to get together with them. I missed the opportunity.
  • A service that I pay for annually was about to expire and was reminding me via e-mail to renew. Luckily, I caught it before it expired.
  • A vendor wrote to me with a correction to something I wrote on ZDNet. It took longer than either of us would have liked for me to correct the text.
  • Readers write to me with tips for Technology Shakedowns or their own thoughts on what I've written. I like to write back to readers on a timely basis and thank them for writing to me. But that sometimes doesn't happen because their e-mail is getting falsely accused of being spam.
  • My bank actually sent me a real e-mail having to do with security measures. With so much phishing going on, it is nearly impossible to tell the difference between an e-mail from your real bank and some imposter pretending to be your bank. Fortunately, I look closely at anything that says it's from my bank just in case it really is. But should I have to do this?
  • Someone I had an appointment with had to change that appointment. I showed up at the originally scheduled time and bumped into a competitor. Can you say "uncomfortable"?

I could keep going but won't. You get the picture. Once you realize that these sorts of mission-critical e-mails are being routed to your spam folders, you have no choice but to keep an eye on those folders too.

It raises a serious question. What's worse? The spam itself? Or the nasty side-effect of anti-spam solutions whereby important e-mail isn't getting to its recipients on time, if at all? For me, all it takes is one missed deadline, one canceled appointment, or one missed critical business communication to realize that any one of those snafus is far more costly to me as a businessperson than all of the spam taken together. Can't decide? Put yourself in the sender's shoes. Actually, you don't have to do that.

Chances are, you send e-mail. What's worse? The spam you're getting or the fact that some mail you are sending is getting falsely classified as spam on the other end by an anti-spam system that you have no control over? During my junk mail folder cleansing operation this weekend, I decided to do something differently (perhaps you're one of the lucky few who heard from me?). For every false positive (and there were many), I wrote back to the person with the following message or something similar:

Just fyi... outlook rejects your e-mails as spam. No idea why. Outlook doesn't tell you.

So, first, a couple of qualifiers. Here at CNET Networks, we use Spam Assassin at the server level and Outlook's built-in filtering at the client level. When Spam Assassin catches something, it adds an attachment that tries to explain why the e-mail in question was flagged by the corporately set "Is this spam?" test. When this attachment was present, I furnished that information as well. But for e-mail that makes it past Spam Assassin's watchful eyes (and plenty of spam does), Outlook 2003 has its own anti-spam technology to serve as a backup. When Outlook 2003 thinks something is spam, it doesn't tell you why the way Spam Assassin does.
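For anyone curious what that explanation looks like to a program, here is a minimal Python sketch. It assumes the filter also records its verdict in an `X-Spam-Status` header made of a yes/no verdict followed by `key=value` pairs; the exact layout varies by version and configuration, and the sample message, addresses, score, and rule names below are made up for illustration. It is not a description of how CNET's setup works.

```python
from email import message_from_string

# Made-up sample message; the X-Spam-Status layout is an assumption.
RAW_MESSAGE = """\
From: sender@example.com
To: reader@example.com
Subject: Appointment change
X-Spam-Status: Yes, score=6.3 required=5.0 tests=HTML_MESSAGE,MIME_HTML_ONLY

Can we move our meeting to 3pm?
"""


def summarize_spam_status(raw_message: str) -> dict:
    """Pull the verdict, score, threshold, and rule names out of X-Spam-Status."""
    message = message_from_string(raw_message)
    status = message.get("X-Spam-Status", "")
    verdict, _, details = status.partition(",")
    # The rest of the header is treated as space-separated key=value pairs.
    fields = dict(part.split("=", 1) for part in details.split() if "=" in part)
    return {
        "flagged": verdict.strip().lower() == "yes",
        "score": float(fields["score"]) if "score" in fields else None,
        "required": float(fields["required"]) if "required" in fields else None,
        "rules": [rule for rule in fields.get("tests", "").split(",") if rule],
    }


summary = summarize_spam_status(RAW_MESSAGE)
print(summary)
```

Even in this tidy form, all the sender learns is a score, a threshold, and a list of terse rule names, which is roughly why the explanation is so hard for a non-specialist to act on.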

I wasn't about to dig around these e-mails to figure it out. It's not my job and I don't have the time for every false positive that comes in (now that there are so many). But I'd hate to have to be the poor IT guy on the other side where now, they've been notified that their business-critical communications may not be getting through to the intended recipients. How many e-mails didn't get through? Don't know. What was causing the problem? Don't know (even when Spam Assassin tells you, you have to be a rocket scientist to figure out what it means). It's a complete breakdown of a system that senders everywhere are depending on.

This, my friends, is known as the "deliverability problem." If you've noticed legitimate mail getting falsely classified as spam on your end, then you know it's happening to your outbound e-mail on the other end. How many times have you said to someone "Didn't you get my e-mail?" and had the other person say "No, maybe it got trapped by my spam filters"?

Invariably, in response to my rants about spam, my inbox and my junk mail folder get loaded with pitches from anti-spam solution providers who will swear until they're blue in the face that I must try their system because of how much more accurate it is than the rest of the solutions on the market (especially mine). The funny thing is that even though they don't realize it, they all say the exact same things. Here are some bullet points. Feel free to cut and paste if you work for an anti-spam vendor:

  • Our system is patented (whoop dee doo. Some kid filed for and was awarded a patent for swinging sideways on an ordinary swing).
  • It was developed through man-years of research by security experts in Tel Aviv (that's right, Tel Aviv attracts better spam researchers than any other city in the world).
  • The inventor of our system has a Ph.D. (no comment, I don't want hate mail from Ph.D.s unless my anti-spam system will falsely classify it as spam).
  • I've seen this, Dave, and I'm telling you, it really works (Your definition of "works" and mine are very different).
  • The Gartner Group has seen this and they agree, there's nothing quite like it (It's one of the most unfortunate facts about the anti-spam ecosystem -- no two solutions are created equal. That's part of the problem).
  • So and so Fortune 500 company is using it (oy vey, the blind leading the blind).
  • No, honestly, Dave, I swear to you. Try this system and you'll agree that it's better than anything else out there.

I'm so tired of this e-mail that I usually ignore it. Occasionally, I respond and the first question I ask is, "What does your solution do to solve the deliverability problem?" Answer: nothing. Case in point? I'm still arguing with one anti-spam solution provider and, irony of all ironies, most of the e-mails that he's sending to me, telling me about how his system is so much better than everyone else's, are showing up in my junk mail folder.

He does, however, admit that there's one way to solve the problem: everyone needs to run the same system. In his case, he just thinks it should be his system. In my case, the answer is to make sure the fundamental technologies are baked, as standards, into all e-mail systems. It's simply unrealistic to think that every e-mail administrator in the world is going to go out and buy the same system. But if the so-called system involves standards that are baked into every solution that's out there, then we stand a chance of rectifying the problem.

It isn't just one standard either. Fixing the problem requires layers of standards just the same way that retrieving e-mail today involves layers. For example, when e-mail servers transmit or receive e-mail from across the Internet, those servers must comply with the Simple Mail Transfer Protocol (SMTP). But for you to get your e-mail into your PC from one of those servers usually requires your e-mail client (Outlook, Thunderbird, etc.) to connect with an SMTP-compliant server over a different protocol. It might be a proprietary protocol like the one Outlook uses to speak with Microsoft's Exchange Servers (for both mail and calendaring) or it might be the POP3 or IMAP standards for e-mail retrieval. The point is that layers are involved and that bit of complexity, which will be required here, shouldn't deter us from going after the right solution.
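To make the layering concrete, here is a small Python sketch that lists the standard protocols named above next to the default ports exposed by Python's own mail libraries. The "hop" descriptions are my shorthand rather than official terms, and the proprietary Outlook-to-Exchange protocol is left out because Python's standard library has no client for it.

```python
import imaplib
import poplib
import smtplib

# Each entry: (what the hop does, protocol name, default port from the stdlib).
MAIL_LAYERS = [
    ("server-to-server transfer", "SMTP", smtplib.SMTP_PORT),
    ("client retrieval by download", "POP3", poplib.POP3_PORT),
    ("client retrieval with server-side folders", "IMAP", imaplib.IMAP4_PORT),
]

for hop, protocol, port in MAIL_LAYERS:
    print(f"{protocol:<5} port {port:<4} {hop}")
```

Each layer is a separate standard with its own client library, which is the same shape a layered set of anti-spam standards would likely take.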

For example, going back to my bit of manual labor over the weekend where I wrote back to a bunch of people telling them that their e-mail had been falsely classified as spam, there's no reason the system could not have done that. In other words, over the SMTP protocol, there could be a variety of error codes that the suspicious system sends back to the suspect to let them know that (a) the e-mail didn't get to its intended recipient and (b) why. Imagine, for example, if all the people who received my manually generated "non-delivery e-mail" received the same sort of non-delivery message for every e-mail that was falsely categorized as spam from all the other recipients? At least they'd know they have a problem and with whom. They might even be able to zero in on the problem and eliminate it, thereby increasing the chances of deliverability next time.
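Here is a rough Python sketch of what such a reply might look like. SMTP already has a 550 reply for refused mail, and the enhanced status code 5.7.1 means "delivery not authorized, message refused"; everything else here (the `SpamVerdict` type, the wording of the reply, and the sample rule names) is my own assumption for illustration, not part of any existing standard or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamVerdict:
    """Hypothetical result handed over by whatever filter the receiver runs."""
    is_spam: bool
    score: float
    threshold: float
    reasons: tuple[str, ...]


def smtp_reply(verdict: SpamVerdict) -> str:
    """Build the one-line SMTP reply the receiving server sends to the sender."""
    if not verdict.is_spam:
        return "250 2.0.0 OK: message accepted"
    why = ", ".join(verdict.reasons) or "unspecified"
    # 550 + 5.7.1: permanent refusal for policy reasons, plus the (b) "why" part.
    return (
        f"550 5.7.1 Message refused as suspected spam "
        f"(score {verdict.score:.1f}, limit {verdict.threshold:.1f}; rules: {why})"
    )


flagged = SpamVerdict(True, 6.3, 5.0, ("HTML_MESSAGE", "MIME_HTML_ONLY"))
print(smtp_reply(flagged))
print(smtp_reply(SpamVerdict(False, 0.4, 5.0, ())))
```

Because the refusal happens while the sending server is still connected, that server knows right away that the message did not get through and can pass the reason along to the person who wrote it.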

Arm-chair anti-spam quarterbacks will tell you that this sort of automated response is a terrible idea because it notifies the sender that they've found an active inbox. They talk about this like it's the equivalent of letting the spammer have one foot in the door. This is pure BS. Does it really matter? The system is so broken today that we'd be conceding very little in exchange for something that long term stands a chance. That's because this would simply be a layer in the system. Other layers (for example, authentication) would take care of spammers' other means of flying below our radars and weaseling their way into our inboxes.

Finally, as I have said many times before, we can't make this sort of progress on anti-spam standards (or layers of anti-spam standards) until the world's largest e-mail solution providers, Microsoft, AOL, Google, and Yahoo (MAGY: pronounced "Maggie"), decide to work together to (1) agree on what the anti-spam protocols should be, (2) get their systems interoperating over those standards, and (3) announce a date in the future at which point non-conforming e-mail will be refused entry into their systems. Why they can't come together to at least take a stab at this on behalf of everyone who is plagued by both spam and non-deliverability (heck, nothing else is working) remains a mystery to me.