[Spooks] Random Acts of Spamness

Tue, 13 Jan 2004 13:13:32 -0500

Recently we talked a bit about 5lg's appearing in spam. This article might 
shed some light on why they appear.

>http://www.wired.com/news/infostructure/0,1377,61886,00.html
>
>By Michelle Delio
>Jan. 13, 2004
>
>"Daphnia blue-crested fish cattle, darkorange fountain moss,
>beaverwood educating, eyeblinking advancing, dulltuned amazons...."
>
>This is not a failed attempt at free-form prose. It's a snippet of a
>spam message intended to promote a sexual stimulant, a deliberate
>crack at sneaking past and spoiling some of the most popular antispam
>filters.
>
>Antispam experts agreed that this isn't a brand-new technique, but
>said the addition of potentially filter-foiling gibberish is rapidly
>becoming a common component of spam.
>
>"I'd say at least half of the spam that I bother to look at now
>contains a paragraph or two of random blather. Until recently we'd see
>it in only one or two spams a week at the most," said Anthony Baxter,
>one of the developers of SpamBayes, a free, open-source Bayesian
>antispam filter.
>
>"This is yet another escalation of the arms race between spammers and
>those people who like to have a useful e-mail inbox," Baxter added.
>
>The addition of seemingly nonsensical words is aimed at confusing the
>antispam filters that incorporate Bayesian analysis techniques, such
>as SpamBayes and SpamAssassin. These filters examine each incoming
>e-mail message and calculate the probability that it is spam based on
>its contents.
>
>But unlike simple content filters that just trawl text looking for
>specific words like Nigeria, money and opt, Bayesian spam filters
>evolve according to each user's needs, analyzing all mail to determine
>what words and phrases are apt to appear in a user's legitimate e-mail
>and which are not. This process is called training, and results in a
>highly personalized and efficient filtering system.
>
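For anyone who wants to see the training idea in miniature, here is a rough sketch of a naive Bayes token scorer. It is not how SpamBayes or SpamAssassin are actually implemented; the function names, the whitespace tokenizer, the add-one smoothing and the four toy messages are all my own assumptions for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count, per class, how many messages contain each token.

    `messages` is a list of (text, label) pairs, label "spam" or "ham".
    """
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in messages:
        totals[label] += 1
        # Use a set so a token counts once per message.
        counts[label].update(set(text.lower().split()))
    return counts, totals

def spam_probability(text, counts, totals):
    """Combine per-token evidence into one probability (naive Bayes)."""
    # Start from the prior odds of spam versus ham in the training mail.
    log_odds = math.log(totals["spam"] / totals["ham"])
    for tok in set(text.lower().split()):
        # Add-one smoothing keeps unseen tokens from producing log(0).
        p_spam = (counts["spam"][tok] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][tok] + 1) / (totals["ham"] + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Toy training set (made up for this example).
training = [
    ("buy cheap pills now", "spam"),
    ("cheap pills money back", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
counts, totals = train(training)
print(spam_probability("cheap pills", counts, totals))
print(spam_probability("meeting at noon", counts, totals))
```

The first message scores well above 0.5 and the second well below it, purely because of which words this particular user's mail has contained before. That per-user dependence is what the article means by "personalized".
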
>By throwing a hundred or so random words rarely used in sales spiels
>into each e-mail missive, spammers hope to thwart Bayesian filters by
>making the spam appear to be personal correspondence. Incorporating
>words that might be used in legitimate e-mails is also intended to
>poison the checklist the filter uses, forcing it to mark, for example,
>e-mails with somewhat common words like Amazon and fish as spam
>indicators.
>
>The strange strings of words, which usually appear at the bottom of
>spam and sometimes in the subject line, are automatically added by
>spammers' mass-mailer software, according to Steve Linford of
>Spamhaus, an antispam advocacy organization.
>
>"This random noise is technically known as a 'hash buster,'" Linford
>explained. "Hashing" is a technique used by some spam filters to
>quickly compare incoming mail to known spam.
>
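To make the "hash buster" name concrete, here is a small sketch of exact-match hashing and how trailing gibberish defeats it. Real filters reportedly use fuzzier checksums than this; the normalization step, the `fingerprint` and `is_known_spam` names and the sample messages are assumptions of mine, not anything from the article.

```python
import hashlib

def fingerprint(body):
    """Hash a message body after folding case and collapsing whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# A hypothetical database of fingerprints of previously reported spam.
known_spam = {fingerprint("Buy cheap pills now!")}

def is_known_spam(body):
    """Return True if the body hashes to a fingerprint seen before."""
    return fingerprint(body) in known_spam

# Trivial changes in spacing or case still match ...
print(is_known_spam("buy  CHEAP pills   now!"))
# ... but a few random words appended to each copy change the digest.
print(is_known_spam("Buy cheap pills now! daphnia beaverwood amazons"))
```

Because every copy of the spam carries different random words, every copy hashes differently, so a lookup against known fingerprints never hits.
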
>"Most of the illegal-exploit spammers use hash busters and any other
>trick they can to get past filters, refusing to accept that people use
>spam filters because they really don't want spam," Linford added.
>
>Baxter and Linford said that spammers' use of hash busting is
>definitely on the rise, but such tricks can rarely circumvent a
>well-trained Bayesian filter.
>
>"To slip past the filters, spam messages need a lot of 'good' words in
>the hash buster," Baxter explained. "Good words vary a lot by person
>-- for instance, I would have a lot of computer terms in my e-mail,
>while a friend of mine uses e-mail to discuss his love of 1960s
>Corvettes. Words that my filter says are good wouldn't work that well
>for my friend's e-mail."
>
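Baxter's point about "good" words varying by person can be sketched with two users whose filters have learned different vocabularies. The per-token weights below are invented numbers standing in for whatever a trained filter would have learned; positive means spammy, negative means typical of that user's real mail, and unknown words count as neutral.

```python
import math

def spam_score(tokens, token_log_odds):
    """Sum per-token log-odds and squash the total into a probability."""
    total = sum(token_log_odds.get(tok, 0.0) for tok in tokens)
    return 1 / (1 + math.exp(-total))

# Hypothetical learned weights for two different users.
programmer = {"cheap": 1.5, "pills": 2.0, "python": -2.0, "compiler": -2.0}
car_fan = {"cheap": 1.5, "pills": 2.0, "corvette": -2.0, "carburetor": -2.0}

pitch = ["cheap", "pills"]
padding = ["python", "compiler"]  # words the spammer guessed might look "good"

print(spam_score(pitch + padding, programmer))
print(spam_score(pitch + padding, car_fan))
```

The same padded message drops below 0.5 for the programmer but stays close to 1 for the Corvette fan, since the padding words mean nothing to his filter. A spammer sending one message to millions of people cannot pick padding that is "good" for all of them.
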
>Content filters, which just look for specific words, can get hung up
>on analyzing a torrent of jumbled jargon, but the use of a hash buster
>in an e-mail is also a prime way of identifying e-mail marketers who
>are knowingly and deliberately spewing spam, said Linford.
>
>"What spammers probably don't realize is that the mere presence of
>hash busters screams 'Spam!' and it's impossible for spammers to claim
>they're not spamming when the spam contains hash busters," Linford
>said. "Spamhaus sees hash busters as proof a spammer knows he's
>spamming and is deliberately trying to get past filters, so we
>actually come down on them harder when they're using hash busters."
>
>And as much as spammers would like to believe that they can cleverly
>disguise their unsolicited missives, there's just no way to cloak
>sappy sales pitches.
>
>"Spam is trying to sell you something," Baxter said. "So they still
>need to include their sales spiel, and they can't put too much garbage
>in the message or else the people they're trying to reach will not
>read the message."
>
>Some spammers have started hiding hash busters from consumers by
>formatting the filter-fouling gibberish in white text on a white
>background. Users probably won't see it, but the filters will still be
>able to "read" it.
>
>But it's not hard to filter for that trick, either.
>
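As a rough illustration of why the white-on-white trick is easy to catch, here is a sketch that pulls out text wrapped in a white `<font>` tag. It assumes the page background is white and only looks at the `color` attribute of `<font>`; CSS styles, near-white colors and tiny font sizes are ignored, and the `HiddenTextFinder` and `hidden_text` names are my own.

```python
from html.parser import HTMLParser

WHITE = {"white", "#fff", "#ffffff"}

class HiddenTextFinder(HTMLParser):
    """Collect text that sits inside <font color=white> elements."""

    def __init__(self):
        super().__init__()
        self._font_stack = []  # one bool per open <font>: is it white?
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = dict(attrs).get("color") or ""
            self._font_stack.append(color.strip().lower() in WHITE)

    def handle_endtag(self, tag):
        if tag == "font" and self._font_stack:
            self._font_stack.pop()

    def handle_data(self, data):
        # Text is treated as hidden if any enclosing <font> is white.
        if any(self._font_stack):
            self.hidden.append(data)

def hidden_text(html):
    """Return the white-on-white text in `html` as one string."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return " ".join(" ".join(finder.hidden).split())

sample = '<p>Buy now</p><font color="#FFFFFF">daphnia beaverwood</font>'
print(hidden_text(sample))
```

A filter could treat any non-empty result as a strong spam signal on its own, since legitimate mail has little reason to contain invisible text.
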
>"In the end spammers who use hash busters are just making it easier
>for filters to spot spam," said Suresh Ramasubramanian, manager of
>security and antispam operations for Outblaze, a Hong Kong-based
>provider of e-mail and messaging solutions. "You just train your
>Bayesian filters to look for the presence of white noise, and treat
>that as a sure sign that the message is spam.
>
>"Happily, spammers are sometimes a bit too clever for their own good."