From: Troy Settle
Date: Mon, 10 Nov 2008 11:30:27 -0500

I received a piece of junkmail this morning:
http://home.psknet.com/troy/1.txt

In the spam report, I see this: BAYES_00=-2.599

So, I run it through sa-learn with --spam:

Learned tokens from 1 message(s) (1 message(s) examined)

Then, I re-scan it using spamc, and still I get:

BAYES_00=-2.599

What gives? I don't expect the total score to come up much, but the
Bayes score should at least go from a negative number to a positive
number, shouldn't it?

The answer depends on how many tokens Bayes is looking at and how
spammy those tokens are. You can see what Bayes thinks about each
token in the --debug output. I get BAYES_40 on your message.

% wget http://home.psknet.com/troy/1.txt
% spamassassin -D --test-mode --debug all,bayes < 1.txt 2>&1 | grep bayes:
...
[14389] dbg: bayes: corpus size: nspam = 426975, nham = 53737
[14389] dbg: bayes: token 'Dodge' => 0.999612090680101
[14389] dbg: bayes: token 'sincerely' => 0.999492864983535
[14389] dbg: bayes: token 'decode' => 0.0344385308520192
[14389] dbg: bayes: token 'I'll' => 0.0365668821340277
[14389] dbg: bayes: token 'Perspective' => 0.0404549158471554
...
[14389] dbg: bayes: score = 0.310353325094371
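
To connect the raw score to the BAYES_NN rule that fires, here is a
rough Python sketch. The bucket ranges are what I believe the stock
Bayes rules use, so treat them as an assumption and check your own rule
files. The function name bayes_rule is made up for illustration, and
the exact handling of a score that lands right on a boundary is a
simplification.

```python
# Assumed ranges of the stock BAYES_NN rules (low, high, rule name).
# Verify against the rule files shipped with your SpamAssassin install.
BAYES_BUCKETS = [
    (0.00, 0.01, "BAYES_00"),
    (0.01, 0.05, "BAYES_05"),
    (0.05, 0.20, "BAYES_20"),
    (0.20, 0.40, "BAYES_40"),
    (0.40, 0.60, "BAYES_50"),
    (0.60, 0.80, "BAYES_60"),
    (0.80, 0.95, "BAYES_80"),
    (0.95, 0.99, "BAYES_95"),
    (0.99, 1.00, "BAYES_99"),
]


def bayes_rule(score):
    """Return the name of the bucket that contains a raw Bayes score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("Bayes score must be between 0 and 1")
    for low, high, name in BAYES_BUCKETS:
        # Simplification: lower bound inclusive, upper bound exclusive.
        if low <= score < high:
            return name
    return "BAYES_99"  # only reached when score == 1.0


print(bayes_rule(0.310353325094371))  # BAYES_40
```

With these ranges, a message has to drop below 0.01 to hit BAYES_00,
so learning it once as spam will only move it out of that bucket if
enough of its tokens become spammy.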

After you learn a message as spam, the token probabilities and the raw
score should increase somewhat, depending on how many times each token
has been seen. I get BAYES_60 on the message after learning.

% sa-learn --spam < 1.txt
% sa-learn --sync
% spamassassin -D --test-mode --debug all,bayes < 1.txt 2>&1 | grep bayes:
[14618] dbg: bayes: corpus size: nspam = 426990, nham = 53737
...
[14618] dbg: bayes: token 'Dodge' => 0.999615320566195
[14618] dbg: bayes: token 'sincerely' => 0.999498371335505
[14618] dbg: bayes: token 'decode' => 0.0348456512323892
[14618] dbg: bayes: token 'I'll' => 0.0366062570517363
[14618] dbg: bayes: token 'Perspective' => 0.0670493467695761
...
[14618] dbg: bayes: token 'omaha' => 0.958
[14618] dbg: bayes: token 'elsasser' => 0.958
[14618] dbg: bayes: token 'riders' => 0.958
...
[14618] dbg: bayes: score = 0.659988861825694
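
If you want to see which tokens moved between the two runs without
eyeballing the logs, something like the following Python sketch should
do. It assumes the debug lines look exactly like the ones above; other
versions may format them differently. The helper names token_probs and
changes are made up for illustration, and the sample text is a few
lines copied from the two runs.

```python
import re

# Matches lines such as: [14389] dbg: bayes: token 'Dodge' => 0.9996
# The greedy (.*) keeps tokens that contain an apostrophe, e.g. I'll.
TOKEN_RE = re.compile(r"dbg: bayes: token '(.*)' => ([0-9.]+)")


def token_probs(debug_text):
    """Collect {token: probability} from spamassassin debug output."""
    probs = {}
    for line in debug_text.splitlines():
        match = TOKEN_RE.search(line)
        if match:
            probs[match.group(1)] = float(match.group(2))
    return probs


def changes(before, after):
    """Return {token: (old, new)} for tokens whose probability differs.

    old is None when the token only shows up in the second run.
    """
    return {
        token: (before.get(token), new)
        for token, new in after.items()
        if before.get(token) != new
    }


BEFORE = """\
[14389] dbg: bayes: token 'I'll' => 0.0365668821340277
[14389] dbg: bayes: token 'Perspective' => 0.0404549158471554
"""

AFTER = """\
[14618] dbg: bayes: token 'I'll' => 0.0366062570517363
[14618] dbg: bayes: token 'Perspective' => 0.0670493467695761
[14618] dbg: bayes: token 'omaha' => 0.958
"""

for token, (old, new) in changes(token_probs(BEFORE), token_probs(AFTER)).items():
    print(token, old, "->", new)
```

In practice you would save the output of each spamassassin run to a
file and pass the file contents to token_probs instead of the inline
samples.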


-jeff