The biggest spam challenge: defining it

Written by David Berlind
Whether he knows it or not, Charlie Rose flushed out one of the thorniest issues when it comes to battling spam. During a recent taping of his show on the topic, Rose's first directive out of the gate had to do with defining spam. He turned to me: "David, help us agree on what spam is."

For some time now, I've been saying that we won't make progress in the battle until we set aside any and all attempts to define spam. There is no single definition, and relying on solutions that respond to a single definition is a big mistake. Today, most solutions--such as those used to protect the inboxes of certain ISPs' customers or corporate users--take this approach.

Unfortunately, Rose turned to me first to provide a definition. Had he turned to me last, after his other guests--Microsoft chief counsel Brad Smith, FTC commissioner Orson Swindle, and AOL senior vice president Joe Barrett--had taken their turns, my point would have been much easier to make. Each guest volunteered a different definition.

Microsoft's Smith--the lawyer spearheading Microsoft's 15 separate lawsuits against spammers--defined spam as "unsolicited commercial e-mail sent to advertise a product or a service." AOL's Barrett called spam "an unwanted automated message." Finally, the FTC's Swindle responded to Rose's question with "anything I don't like." This "Swindle Rule" was the definition I liked best of the three. Swindle understands the spam problem better than most people I know. The Swindle Rule speaks volumes about Swindle's belief that to solve the problem, end users must be empowered to decide for themselves what is and what is not spam. Said Swindle during the show, "If we empower consumers at their computer to screen out that which they prefer to screen out, and I know that can be done in a number of ways, I think that will be one part of the solution."

My interpretation is that ISPs and other centralized solutions in a position to intercept e-mail before it reaches its intended recipient shouldn't be making that decision on behalf of users, which is what is happening today. Before a decision to filter an e-mail can be made at any central interception point, the administrators of that interception point must decide what the common definition of spam will be for all downstream users. This approach is flawed since, as the Swindle Rule says, spam is anything the end user doesn't like. What satisfies the Swindle Rule for Orson Swindle may be very different from what satisfies the rule for Charlie Rose, Brad Smith, or Joe Barrett.
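
To make the contrast concrete, here is a minimal sketch of filtering driven by each recipient's own rules rather than by one central definition. Everything in it is an assumption made up for illustration: the `UserFilters` class, the dictionary-shaped messages, and the example addresses are hypothetical and do not describe any ISP's system.

```python
from typing import Callable, Dict, List

# A rule returns True when its owner does not want the message.
Rule = Callable[[dict], bool]


class UserFilters:
    """Hypothetical per-recipient rule store: each user defines spam for themselves."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Rule]] = {}

    def add_rule(self, user: str, rule: Rule) -> None:
        # Rules are stored per user, so one user's choice never affects another.
        self._rules.setdefault(user, []).append(rule)

    def is_unwanted(self, user: str, message: dict) -> bool:
        # A user with no rules has rejected nothing, so everything is delivered.
        return any(rule(message) for rule in self._rules.get(user, []))


filters = UserFilters()
# One recipient rejects sales pitches; another has set no rules at all.
filters.add_rule("orson", lambda m: "sale" in m["subject"].lower())

message = {"from": "shop@example.com", "subject": "Big SALE today"}
for user in ("orson", "charlie"):
    verdict = "unwanted" if filters.is_unwanted(user, message) else "wanted"
    print(user, verdict)
```

The same message gets a different verdict for each recipient, which is the behavior a single shared definition cannot provide.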

Brad Smith apparently doesn't like unsolicited commercial e-mail, aka UCE. As a side note, there's an entire organization focused specifically on UCE called CAUCE, the Coalition Against Unsolicited Commercial E-mail. Smith's definition is understandable.

When one party sues another (as in Microsoft suing spammers), the plaintiff is usually seeking some form of damages from the defendant. In Microsoft's case, the company is seeking both injunctive relief and monetary damages. Not surprisingly, Smith's definition of spam includes the phrase "commercial e-mail." The only spammers worth suing, if you're a lawyer, are the ones making money on it. While unsolicited commercial e-mail undoubtedly constitutes a significant portion of the e-mail we don't want, singling out distributors of commercial e-mails still leaves us vulnerable on the non-commercial e-mail front. UCE is not the only type of e-mail that satisfies the Swindle Rule for me. For example, virus-bearing e-mail, which has the potential to do far more harm than most UCE, satisfies the Swindle Rule for me as well.

Barrett's definition--"an unwanted automated message"--also contains a limiting qualifier: the word "automated." Barrett was on the right track when he started with "unwanted." But the minute "automated" entered the definition, we got an interesting peek at one decision that ISPs are making at the aforementioned interception point. Because so much e-mail is flowing through the systems at ISPs like AOL, Yahoo, MSN, and Earthlink, these companies are in the position to compare those e-mails to each other and determine which ones are automated and which are not.

For example, no human being is capable of sending the exact same e-mail to 10,000 different recipients over the span of a couple of minutes without some degree of automation. Designing technology that looks for this pattern--analyzing headers, content, and timestamps--wouldn't be difficult. But just in case they weren't sure, the major ISPs eventually joined forces to share such intelligence.
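
The pattern described above (identical content, many recipients, a short time span) can be sketched in a few lines. This is only an illustration, not how any ISP actually does it: the `Message` record, the `find_bulk_bodies` function, and the default thresholds are assumptions invented for this example, and the sketch looks only at message bodies and timestamps, not at headers.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    body: str
    timestamp: float  # seconds since an arbitrary reference point


def find_bulk_bodies(messages, min_recipients=10_000, window_seconds=120.0):
    """Return SHA-256 digests of bodies that reached at least `min_recipients`
    distinct recipients within any span of `window_seconds`."""
    by_body = defaultdict(list)
    for msg in messages:
        digest = hashlib.sha256(msg.body.encode("utf-8")).hexdigest()
        by_body[digest].append(msg)

    flagged = set()
    for digest, group in by_body.items():
        group.sort(key=lambda m: m.timestamp)
        counts = defaultdict(int)  # recipient -> copies inside the current window
        start = 0
        for msg in group:
            counts[msg.recipient] += 1
            # Drop messages that have fallen out of the time window.
            while msg.timestamp - group[start].timestamp > window_seconds:
                old = group[start]
                counts[old.recipient] -= 1
                if counts[old.recipient] == 0:
                    del counts[old.recipient]
                start += 1
            if len(counts) >= min_recipients:
                flagged.add(digest)
                break
    return flagged
```

Because the sketch compares exact hashes of the body, changing even one character per copy would slip past it. A real system would need fuzzier matching.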

While Barrett's definition isn't nearly as narrow as Smith's, it still isn't as good as the Swindle Rule. For example, I wonder if the forensics that look for automated e-mail would catch e-mails that were automated by a virus. Any programmer worth their salt--especially one tenacious enough to write viruses--could easily fool a system that's designed to look for automation.

Of course, that's one reason we're in the mess we're in. The so-called spammers always seem to be one step ahead of the spam forensics community. In addition to crafty virus writers, we also have the propagators of chain letters, political messages, jokes, and e-mail denial-of-service attacks (via SMTP flooding). There's also what I call reverse-spam: the e-mail that's bounced back to your inbox when a spammer uses your e-mail address as his or her return address. None of these are commercial and, although some of us may find these to be little more than a nuisance, others would prefer to group their transmissions in with the rest of the mail that satisfies the Swindle Rule: mail we don't want.

It's a pretty simple rule and it should be used as the guiding principle behind the search for legal and technological solutions.

Rather than starting with a patchwork of conflicting solutions (that start with conflicting definitions), the technology and legislative communities need to look at ways that end users can reliably start to refuse mail they don't want, and then build from there. This effort would require a more reliable way to determine an e-mail's origin, and doing that requires some work at the credential level of the Internet's e-mail standard, the Simple Mail Transfer Protocol (SMTP). This could be done in a number of ways, including one recently suggested by Domain Name System inventor Paul Mockapetris that involves a specification called DNSSEC.

Once we can determine with a greater degree of reliability from where and whom an e-mail is coming, we will have made a quantum leap in being able to fight spam with the Swindle Rule. Spam is not only e-mail we don't want; it's usually from people we don't want it from. Sure, there are exceptions. You may want to receive mail from me until a virus invades my e-mail system, crawls my address book, and sends itself to you on my behalf. If I were you, I would terminate your relationship with me until I was able to give you some assurance that my system was clean.

Speaking of relationship termination, the mechanism that a recipient engages to terminate a relationship with a sender--currently and vaguely referred to as opt-out--should not be left up to the sender to determine. The mechanism should be built into the e-mail protocols, and our e-mail clients should have a terminate command in the menu structure much the same way that they have commands like "send." Relationship termination, however, is highly dependent on the inclusion of tamperproof credentials with an e-mail. E-mails without the proper credentials could either be made illegal, or simply refused by e-mail systems (perhaps resulting in an automated reply that instructs the original sender on how to comply with the credential standard). Attempts to tamper with the credentials should definitely be made illegal. So far, none of these mechanisms attempts to define spam.
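
As a rough sketch of how credential checking and a recipient-side terminate command might fit together, consider the following. It is a toy under loud assumptions: no such mechanism exists in SMTP, the `Mailbox` class and `sign` function are hypothetical, and a shared-key HMAC stands in for whatever tamperproof credential a real standard would define (most likely public-key signatures with a trusted way to publish keys).

```python
import hashlib
import hmac


def sign(sender: str, body: str, key: bytes) -> str:
    """Stand-in credential: binds the sender address to the message body."""
    payload = sender.encode("utf-8") + b"\n" + body.encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class Mailbox:
    """Hypothetical recipient mailbox that checks credentials before delivery."""

    def __init__(self, sender_keys: dict) -> None:
        self._sender_keys = sender_keys  # sender address -> key (assumed registry)
        self._terminated = set()         # senders this recipient has cut off
        self.inbox = []

    def terminate(self, sender: str) -> None:
        # The recipient, not the sender, decides when the relationship ends.
        self._terminated.add(sender)

    def deliver(self, sender: str, body: str, credential: str) -> str:
        key = self._sender_keys.get(sender)
        if key is None:
            return "refused: no credential on file"
        if not hmac.compare_digest(sign(sender, body, key), credential):
            return "refused: credential does not verify"
        if sender in self._terminated:
            return "refused: relationship terminated"
        self.inbox.append((sender, body))
        return "accepted"
```

Note that nothing in the sketch inspects the message content or tries to classify it. Delivery depends only on who verifiably sent the message and whether the recipient still wants mail from that sender.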

As I said in response to Rose's opening question about agreeing on what spam is: "That's the biggest problem. There is no agreement on what spam is."

Nor should there be.

Use TalkBack to let your fellow ZDNet readers know what you think. Or write to me at david.berlind@cnet.com. If you're looking for my commentaries on other IT topics, check the archives.
