MessageLabs: Filtering your email sewage

Behind the Security Lines: In the last part of a special report on security research labs, ZDNet UK reports on its visit to MessageLabs UK's main research facility

"ISPs do the equivalent of pumping out raw sewage into your home. You wouldn't expect to have to filter your own water, so why do home users have to filter their own data?"

Paul Wood, MessageLabs senior analyst, has some very forthright views on just who should share the responsibility for the ever-growing virus and spam burden on businesses and consumers. The comments came during a guided tour of MessageLabs UK's main research facility near Gloucester.

For more, see the rest of our special report:

Inside Symantec's nuclear bunker

Sophos: Protecting the world from The Pentagon

"ISPs take the view that if they start looking at data packets, then this changes the legal position of the company," adds Wood, explaining why service providers are reluctant to get involved with security filtering. MessageLabs, on the other hand, regards screening out spam and malware for its customers as its core business, or "messaging security and management" as the company describes it.

The company claims that 1 in 50 emails contains some form of malware, and is in a good position to comment on how ISPs should behave having grown out of Star — an ISP. MessageLabs claims that ISPs should collaborate more to minimise the threat caused by malware.

"ISPs need to talk to each other, and share information proactively," says Alex Shipp, MessageLabs senior antivirus technologist and 'imagineer'. "When we started out, MessageLabs used to send emails to ISPs saying spam was coming from their IP addresses, but ISPs hated that. They sent us rude emails. We had to stop, because we were finding so many compromised IP addresses — 1.5 million per day. If we sent out 1.5 million abuse reports per day to ISPs, we'd be spamming them!"

Shipp claims that he recently discovered that 700 different accounts were used to host spam Web sites on one ISP. "If we reported this to the ISP and they did something about it, and managed to shut down new compromised accounts every two minutes, it would take them all day. And, they would just have 700 new compromised accounts tomorrow," he adds.
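Shipp's "all day" estimate can be checked with a few lines of arithmetic. The sketch below uses only the two figures he quotes; the function name is ours, and it assumes the accounts are shut down one after another.

```python
# Figures quoted by Shipp; processing the accounts sequentially is an assumption.
ACCOUNTS = 700
MINUTES_PER_SHUTDOWN = 2


def hours_to_clear(accounts: int, minutes_each: float) -> float:
    """Return the hours needed to shut down the accounts one at a time."""
    return accounts * minutes_each / 60


hours = hours_to_clear(ACCOUNTS, MINUTES_PER_SHUTDOWN)
print(f"{hours:.1f} hours")  # 700 * 2 = 1,400 minutes, or roughly 23.3 hours
```

At that rate the clean-up would fill almost a full 24 hours, which is Shipp's point: the next day's 700 accounts would arrive just as the first batch was finished.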

Although MessageLabs scans some 150 million emails per day, the UK antivirus operations are run by a relatively small team. "We have eight people in the UK office on the full-time team, plus the Network Operations Centre guys doing anti-virus and anti-spam work."

The company is able to be effective with a small team by escalating anti-malware work, and by using third-party antivirus engines. It also has offices in Sydney, Hong Kong, Singapore, New York and two sites in the UK — Gloucester and London. MessageLabs can follow the sun, an essential prerequisite for security companies to tackle a global problem. Both Symantec and Sophos can also respond 24/7.

MessageLabs' antivirus team deals with a mixture of long and short-term projects running concurrently. Long-term projects include looking at different ways to roll out malware signatures over the company infrastructure and measuring the efficacy of other vendors' antivirus engines used by MessageLabs. Currently, the monitoring company uses antivirus engines from McAfee and F-Secure, having switched last year from Sophos.

Short-term projects arrive as-and-when for ad hoc fire-fighting. Every day MessageLabs stops 12,000 items that are not stopped by the antivirus engines alone. Dedicated mailservers are used to filter emails for malware by analysing how much 'chaos' is contained in the code. Good files such as legitimate updates have a different statistical distribution of values within the code. If the code has a large number of different values, it is classed as chaotic. "If the code has 64 bytes, and every single byte is different, then the code is likely to be malware," said Shipp. For example, bad files are often encrypted, and look different from good files because they are trying to hide themselves.
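The 'chaos' idea can be illustrated with a short sketch. This is not MessageLabs' scanner, whose details are not public; it simply counts distinct byte values in 64-byte windows, following Shipp's example, and also computes Shannon entropy, a standard measure of the same property. The function names and the 0.9 threshold are hypothetical.

```python
import math
from collections import Counter

WINDOW = 64  # window size taken from Shipp's 64-byte example


def distinct_ratio(window: bytes) -> float:
    """Fraction of the bytes in a window that are distinct values."""
    return len(set(window)) / len(window) if window else 0.0


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_chaotic(data: bytes, threshold: float = 0.9) -> bool:
    """Flag data if any full 64-byte window is almost all distinct values.

    The threshold is a hypothetical tuning value, not a MessageLabs figure.
    """
    for start in range(0, max(len(data) - WINDOW + 1, 1), WINDOW):
        window = data[start:start + WINDOW]
        if len(window) == WINDOW and distinct_ratio(window) >= threshold:
            return True
    return False


print(looks_chaotic(bytes(range(64))))  # every byte different
print(looks_chaotic(b"A" * 64))         # one value repeated
```

Encrypted or packed payloads tend to score high on both measures, while ordinary program code and text repeat many values. A real filter would need more than this, since compressed but legitimate files also look 'chaotic'.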

MessageLabs also compares new code with its signature file database, which holds between 2GB and 3GB of information. This database is constantly being updated, "so having caught variant A, we're confident of catching B, C, and D," says Shipp.

Initially defining viruses is "processor intensive". MessageLabs take the potentially malicious code and analyse it. Unusual features in email immediately mark code down as being suspicious. "If the code has IRC, FTP and email — not many legitimate programs have all of those capabilities," says Shipp.
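Shipp's rule of thumb about combined capabilities can be sketched as a simple check. The marker strings below are real IRC, FTP and SMTP protocol keywords, but choosing them as indicators is our assumption; the article does not say how MessageLabs detects each capability.

```python
from typing import Set

# Hypothetical indicator strings: genuine protocol keywords, chosen for illustration.
CAPABILITY_MARKERS = {
    "irc": (b"PRIVMSG", b"JOIN #"),
    "ftp": (b"ftp://", b"STOR "),
    "email": (b"MAIL FROM:", b"RCPT TO:"),
}


def capabilities(code: bytes) -> Set[str]:
    """Return the names of the capabilities whose markers appear in the code."""
    return {
        name
        for name, markers in CAPABILITY_MARKERS.items()
        if any(marker in code for marker in markers)
    }


def is_suspicious(code: bytes) -> bool:
    """Flag code only when IRC, FTP and email markers are all present."""
    return capabilities(code) == set(CAPABILITY_MARKERS)


sample = b"JOIN #ctrl\r\nftp://203.0.113.5/up\r\nMAIL FROM:<a@example.com>"
print(is_suspicious(sample))
```

Requiring all three together is what keeps the rule useful: plenty of legitimate programs speak one of these protocols, but few speak them all.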

MessageLabs also look for profanity, and virus writer handles. "Virus writers have big egos — they like putting their own names into the code. This never appears in good files," he adds.

Knowing their code contains indicators has led hackers to attempt more subtle social engineering tactics to propagate malicious code, including sending links in emails. This sidesteps the scanners, as the malicious code is not actually contained in the email. "That's why the bad guys are sending links," said Shipp. One example of social engineering tactics is an email pretending the recipient has been sent an e-card. When the person clicks on the link to the card, they are redirected to a site containing malware, and infected.

MessageLabs work around this by detecting if the links have been obfuscated in an email to hide the URL or URI of the site the user would go to. There is also a link-following system which feeds into a discrete network that is dedicated to analysing the links.
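The article does not say which obfuscation tricks MessageLabs looks for, so the sketch below checks three commonly seen ones as an assumption: a decoy name placed before an "@" in the address, a raw IP address instead of a host name, and percent-encoding in the host part. The function name and the labels it returns are hypothetical.

```python
import ipaddress
from typing import List
from urllib.parse import urlsplit


def obfuscation_signs(url: str) -> List[str]:
    """Return labels for simple obfuscation tricks found in a URL."""
    signs = []
    parts = urlsplit(url)
    host = parts.hostname or ""

    # "http://bank.example@203.0.113.5/": the part before "@" is only a decoy.
    if parts.username is not None:
        signs.append("userinfo")

    # A bare IP address hides which site the reader is being sent to.
    try:
        ipaddress.ip_address(host)
        signs.append("ip-host")
    except ValueError:
        pass

    # Percent-encoding in the host part disguises the real name.
    if "%" in parts.netloc:
        signs.append("encoded-host")

    return signs


print(obfuscation_signs("http://www.bank.example@203.0.113.5/card"))
print(obfuscation_signs("http://example.com/card"))
```

A link that trips any of these checks would be a natural candidate to hand to a separate link-following system of the kind described above.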

Antivirus knowledge is also increased by MessageLabs sharing virus information with other companies and law enforcement agencies. The company provides virus samples to sharing networks such as AV Gurus. This site maintains and publishes a collection of viruses using PGP encryption, and can only be accessed by legitimate users, according to Shipp.

The threat landscape
A new threat that the antivirus team has seen is data-stealing Trojans sent in spam. The email only has to be opened and the Trojan, hidden in a Word document, is activated. These are being repeatedly sent to banks and government agencies in the hope that some information can be stolen.

"High-end criminals" are targeting aerospace companies with just these kinds of Trojans in the hope of gaining valuable information that can be sold on, according to MessageLabs. Unlike the majority of spam, these emails have no grammatical or syntactical errors, and the code is spot on, says Maksym Schipka, anti-virus technical architect. Attacks are also increasingly blended to target both instant messaging and the web.

Monitoring botnets
Botnets are another growing problem. They consist of PCs that have been hijacked by hackers to send spam or other code. Botnets can be traced by looking at specific patterns of behaviour, according to MessageLabs. If different machines are sending the same spam, it's likely they will use the same IRC channels. MessageLabs has ways of monitoring the compromised server. If a new bot is seen that contains the address of the IRC server, MessageLabs can follow the link through a command-and-control channel.
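The first step described here, spotting different machines that send the same spam, can be sketched by grouping sender addresses under a hash of the message body. This is only an illustration: the function name and input format are hypothetical, and a real system would need fuzzy matching, since spammers vary their text.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def group_senders_by_spam(messages: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each message-body hash to the sender IPs seen with it.

    Only bodies sent from more than one address are kept, since those
    are the candidates for machines belonging to the same botnet.
    """
    groups = defaultdict(set)
    for sender_ip, body in messages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].add(sender_ip)
    return {digest: ips for digest, ips in groups.items() if len(ips) > 1}


seen = [
    ("198.51.100.7", "You have received an e-card"),
    ("203.0.113.9", "You have received an e-card"),
    ("192.0.2.44", "Quarterly report attached"),
]
clusters = group_senders_by_spam(seen)
print(list(clusters.values()))
```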

The current threat from bots is spam carrying malware, and the installation of spyware to steal sensitive information. This is very much financially motivated, with botmasters charging 6 US cents per install. "Some spyware code is particularly interesting as it activates itself half an hour after someone has visited a site, to disassociate itself from that site," says Shipp.

Monitoring MessageLabs' infrastructure
The Network Operations Centre (NOC) scans all mail destined for a client, before deciding whether that mail is spam or contains malware. MessageLabs has over 100 server towers dealing with managed mail services for customers. Within a tower are between 14 and 36 mail servers in a cluster. A new client is given a host name through which to route its mail and all the towers take on mail for that customer. Altogether, one billion emails a week are processed by the towers, says MessageLabs.

The arrangement of the towers makes the service more flexible: if one of the servers crashes, others can pick up the slack and continue delivering mail. "This gives greater resilience within a cluster. If one of the servers crashes, or there's another issue with a third-party datacentre that affects it as a bandwidth provider, customers won't see a delay in their mailflow," says Andy Davies, NOC infrastructure support team leader.

MessageLabs also has a tool to monitor the bandwidth from its various third-party datacentres. Graphics related to each server are displayed on a system called 'Big Brother'. Graphs on the left-hand side of Big Brother represent the different towers. Each bar on a graph represents a server. If the colour of the bar is red, that's a warning that the server has crashed and needs rebooting, or that the mail queue has been delayed because the scanner has crashed. If the bar is yellow, it means the server is approaching its spec threshold, based on the mail flow within the tower.
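The colour rules described for the display can be written down as a small function. This is a sketch of the logic as reported, not the monitoring tool's own configuration: the parameter names, the 80 per cent warning level and the "green" colour for a healthy server are all assumptions.

```python
def bar_colour(
    crashed: bool,
    scanner_down: bool,
    load: float,
    threshold: float,
    warn_fraction: float = 0.8,  # hypothetical: how close counts as "approaching"
) -> str:
    """Return the colour of a server's bar on the status display."""
    # Red: the server has crashed, or mail is queued behind a crashed scanner.
    if crashed or scanner_down:
        return "red"
    # Yellow: the server is approaching its specified threshold.
    if load >= warn_fraction * threshold:
        return "yellow"
    # "green" for a healthy server is assumed; the article does not name it.
    return "green"


print(bar_colour(crashed=False, scanner_down=False, load=85, threshold=100))
```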

Mail is funnelled through split directories. If the mail gets stuck, it is copied to a central location. All mail is scanned and if a particular mail has been identified as containing a virus that MessageLabs has not previously seen, the NOC personnel can start the process of writing the antivirus program or signature.