RE: Seeing Bayes token matches for an email - SpamAssassin
Theo Van Dinter wrote:
> On Fri, May 23, 2008 at 12:29:21PM -0400, Bowie Bailey wrote:
> > I have an email that does not look at all spammy to me. On an
> > account where Bayes is trained manually (no auto-learn at all), it
> > got marked with BAYES_99. Is there a way to see what tokens Bayes
> > is keying on? I tried running it through "spamassassin -D", but I
> > didn't see it there.
>
> "spamassassin -D bayes"
>
> It's too noisy for the standard debug output, but the bayes channel
> will give you the full info.
I thought I already tried that. Oh well...
That gives me the info, now how do I interpret it?
[26088] dbg: bayes: token 'fax' => 0.0466125715533146
[26088] dbg: bayes: token 'may' => 0.951423151521604
[26088] dbg: bayes: token 'send' => 0.947622860965791
[26088] dbg: bayes: token 'sent' => 0.0562793425809908
The numbers seem to go from 0 to 1. Is 0 non-spammy and 1 spammy, or is
there more to it than that? If that's the case, there are quite a few
common words that bayes doesn't seem to like.
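For anyone else wading through this output, here is a rough Python sketch that pulls the token lines out of the debug text and lists the most extreme ones first. It assumes the line format is exactly what is pasted above, and that values near 1 lean spammy and values near 0 lean hammy. The names `parse_tokens`, `rank_tokens` and `SAMPLE` are made up for illustration and are not part of SpamAssassin.

```python
import re

# Matches lines shaped like the debug output quoted in this thread, e.g.
#   [26088] dbg: bayes: token 'fax' => 0.0466125715533146
TOKEN_RE = re.compile(r"dbg: bayes: token '(.*)' => ([0-9.eE+-]+)")

def parse_tokens(debug_text):
    """Return a list of (token, probability) pairs found in the debug text."""
    tokens = []
    for line in debug_text.splitlines():
        match = TOKEN_RE.search(line)
        if match:
            tokens.append((match.group(1), float(match.group(2))))
    return tokens

def rank_tokens(tokens):
    """Sort tokens by distance from 0.5, most extreme first."""
    return sorted(tokens, key=lambda t: abs(t[1] - 0.5), reverse=True)

SAMPLE = """\
[26088] dbg: bayes: token 'fax' => 0.0466125715533146
[26088] dbg: bayes: token 'may' => 0.951423151521604
[26088] dbg: bayes: token 'send' => 0.947622860965791
[26088] dbg: bayes: token 'sent' => 0.0562793425809908
"""

if __name__ == "__main__":
    for token, prob in rank_tokens(parse_tokens(SAMPLE)):
        leaning = "spammy" if prob > 0.5 else "hammy"
        print(f"{token:10s} {prob:.4f}  {leaning}")
```

The debug output goes to stderr, so something like `spamassassin -D bayes < message.eml 2> bayes.log` (with `message.eml` standing in for your own file) should capture it for a script like this.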
--
Bowie
-
On Fri, 23 May 2008, Bowie Bailey wrote:
> That gives me the info, now how do I interpret it?
>
> [26088] dbg: bayes: token 'fax' => 0.0466125715533146
> [26088] dbg: bayes: token 'may' => 0.951423151521604
> [26088] dbg: bayes: token 'send' => 0.947622860965791
> [26088] dbg: bayes: token 'sent' => 0.0562793425809908
>
> The numbers seem to go from 0 to 1. Is 0 non-spammy and 1 spammy, or is
> there more to it than that? If that's the case, there are quite a few
> common words that bayes doesn't seem to like.
Remember, Bayes interprets tokens in relation to other tokens. Those
high-scoring tokens may be high-scoring due to context.
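To illustrate why a handful of high-scoring common words need not decide the result on their own, here is a small Python sketch of Robinson/Fisher-style chi-square combining, the general family of method used by Bayes-style filters. This is an assumption-laden illustration, not SpamAssassin's actual code: the function names `chi2q` and `combine` are hypothetical, and the real combiner also selects which tokens count and may differ in detail.

```python
import math

def chi2q(x2, dof):
    """Survival function of the chi-square distribution for even dof."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Guard against rounding pushing the sum slightly above 1.
    return min(total, 1.0)

def combine(probs):
    """Combine per-token probabilities into one score between 0 and 1."""
    n = len(probs)
    # Evidence for spam: large when many probabilities are close to 1.
    spam = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Evidence for ham: large when many probabilities are close to 0.
    ham = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam - ham + 1.0) / 2.0

if __name__ == "__main__":
    # The four token values quoted in this thread.
    quoted = [0.0466125715533146, 0.951423151521604,
              0.947622860965791, 0.0562793425809908]
    print(combine(quoted))
```

With only the four quoted tokens, two spammy and two hammy, the evidence roughly cancels and the combined score lands near the middle. A BAYES_99 hit would therefore need the rest of the token list to lean heavily toward spam, so it is worth looking at the whole list rather than a few entries.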
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79