Michael Hutchinson wrote:
>> -----Original Message-----
>> From: Mike Pepe [mailto:lamune@doki-doki.net]
>> Sent: Thursday, 20 March 2008 5:18 a.m.
>> To: users@spamassassin.apache.org
>> Subject: Cyrillic spam
>> For some strange reason, I'm seeing Cyrillic spams very frequently
>> lately.
>> None of my users read any Eastern European languages- is there a quick
>> way to catch these?
>> thanks
>> -Mike

> You could use the ok_languages and ok_locales settings. I'm sure
> discussions on those can be found in the archives.
> I employed these rules for my site:

I'll have to check those myself.

Since I do have users that get Cyrillic content, I have to include
Cyrillic in my ok_locales.
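
A minimal sketch of what such settings could look like in local.cf. The
language and locale lists are illustrative assumptions, not settings taken
from this thread, and ok_languages only has an effect when the TextCat
plugin is loaded:

```
# local.cf (sketch; the lists below are assumptions)

# Character sets that are expected at this site. Including "ru"
# keeps Cyrillic charsets from being treated as foreign.
ok_locales    en ru

# Languages that are expected in message text (TextCat plugin).
ok_languages  en ru
```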

I wrote a simple header rule that does a raw search for koi8. On top of
that, I added a couple of meta rules that give suitably high scores to
the combination of Cyrillic plus at least one of: The Bat! as the
sending client, SPAMMY_XMAILER, or OUTLOOK_3416. My Cyrillic spam has
pretty much vanished.
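
Roughly, rules of that shape could be sketched as follows. All LOCAL_*
names, the regular expressions, and the 3.5 score are illustrative
assumptions rather than the actual rules; SPAMMY_XMAILER and OUTLOOK_3416
are referenced by name only and are assumed to be defined in the installed
ruleset. The header tests look only at the top-level message headers, so
a charset declared inside a MIME part would need a separate test.

```
# Sketch only: names, patterns, and scores are assumptions.

# Raw search for a KOI8 charset label (koi8-r / koi8-u).
# The "__" prefix makes these unscored sub-rules.
header   __LOCAL_KOI8_CT    Content-Type =~ /charset\s*=\s*"?koi8-[ru]/i
header   __LOCAL_KOI8_SUBJ  Subject:raw =~ /=\?koi8-[ru]\?/i
meta     __LOCAL_CYRILLIC   (__LOCAL_KOI8_CT || __LOCAL_KOI8_SUBJ)

# The Bat! as the sending client, scored modestly on its own.
header   LOCAL_XM_THEBAT    X-Mailer =~ /^The Bat!/
describe LOCAL_XM_THEBAT    X-Mailer is The Bat!
score    LOCAL_XM_THEBAT    2.1

# Cyrillic plus at least one suspicious-client indicator.
meta     LOCAL_CYR_CLIENT   (__LOCAL_CYRILLIC && (LOCAL_XM_THEBAT || SPAMMY_XMAILER || OUTLOOK_3416))
describe LOCAL_CYR_CLIENT   KOI8 charset plus suspicious mail client
score    LOCAL_CYR_CLIENT   3.5
```

Running spamassassin --lint after adding rules like these is a cheap way
to catch syntax errors before they go live.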

Before I implemented these, I checked with my users who read Cyrillic,
and I have had no complaints from them since. Even though there is a
Russian spell-checking module for The Bat!, as far as I can tell none of
my users exchange mail with Russian-speaking users of The Bat!

It's been discussed on this list before that going after content sent
with The Bat! is dangerous, because it's a legitimate client, but among
my users, the frequency of inbound mail from The Bat! is virtually zero.
Even so, I score only 2.1 points for The Bat! on its own, and mostly use
that rule in metas that combine it with other, more frequently hit rules.

To me, this is some of the real elegance of SpamAssassin: you can give
low scores to a number of common patterns, and then, beyond the
cumulative score of whatever turns up, use meta rules to look for
particular combinations of this, this and that, and assign a suitably
high score when one of those combinations gets a hit.
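
As a generic sketch of that pattern, with entirely hypothetical rule
names, patterns, and scores: three weak indicators each get a low score,
and a meta rule adds more only when at least two of them hit the same
message. Meta rules accept arithmetic as well as boolean expressions,
which is what the sum below relies on.

```
# Sketch only: every name, pattern, and score here is hypothetical.
header   LOCAL_SIGN_A      X-Mailer =~ /ExampleMailer/
score    LOCAL_SIGN_A      0.5

body     LOCAL_SIGN_B      /example phrase/i
score    LOCAL_SIGN_B      0.5

header   LOCAL_SIGN_C      Subject =~ /example subject/i
score    LOCAL_SIGN_C      0.5

# Hits only when at least two of the three weak indicators are present.
meta     LOCAL_SIGN_COMBO  ((LOCAL_SIGN_A + LOCAL_SIGN_B + LOCAL_SIGN_C) >= 2)
describe LOCAL_SIGN_COMBO  Two or more weak indicators together
score    LOCAL_SIGN_COMBO  3.0
```

With these numbers, a message that hits only LOCAL_SIGN_A gets 0.5, while
one that hits both LOCAL_SIGN_A and LOCAL_SIGN_B gets 0.5 + 0.5 + 3.0 =
4.0 from these rules.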