RE: Seeing Bayes token matches for an email - SpamAssassin



Thread: RE: Seeing Bayes token matches for an email

  1. RE: Seeing Bayes token matches for an email

    Theo Van Dinter wrote:
    > On Fri, May 23, 2008 at 12:29:21PM -0400, Bowie Bailey wrote:
    > > I have an email that does not look at all spammy to me. On an
    > > account where Bayes is trained manually (no auto-learn at all), it
    > > got marked with BAYES_99. Is there a way to see what tokens Bayes
    > > is keying on? I tried running it through "spamassassin -D", but I
    > > didn't see it there.
    >
    > "spamassassin -D bayes"
    >
    > It's too noisy for the standard debug output, but the bayes channel
    > will give you the full info.


    I thought I already tried that. Oh well...

    That gives me the info, now how do I interpret it?

    [26088] dbg: bayes: token 'fax' => 0.0466125715533146
    [26088] dbg: bayes: token 'may' => 0.951423151521604
    [26088] dbg: bayes: token 'send' => 0.947622860965791
    [26088] dbg: bayes: token 'sent' => 0.0562793425809908

    The numbers seem to go from 0 to 1. Is 0 non-spammy and 1 spammy, or is
    there more to it than that? If that's the case, there are quite a few
    common words that bayes doesn't seem to like.

    --
    Bowie
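For reading a long run of these debug lines, a small script can pull out the token/value pairs and sort them. The following is a minimal sketch, assuming the line format is exactly the one quoted above and that values nearer 1 lean toward spam (which is consistent with the BAYES_99 hit). The names `parse_bayes_tokens` and `rank_tokens` are hypothetical helpers, not part of SpamAssassin.

```python
import re

# Matches lines such as:
#   [26088] dbg: bayes: token 'fax' => 0.0466125715533146
# The pattern is inferred from the four lines quoted in the post (assumption).
TOKEN_LINE = re.compile(
    r"dbg: bayes: token '(.*)' => ([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*$"
)


def parse_bayes_tokens(debug_text):
    """Return a list of (token, value) pairs found in the debug output."""
    pairs = []
    for line in debug_text.splitlines():
        match = TOKEN_LINE.search(line)
        if match:
            pairs.append((match.group(1), float(match.group(2))))
    return pairs


def rank_tokens(pairs):
    """Sort pairs so the highest values (assumed most spam-leaning) come first."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# The four lines quoted in the post.
SAMPLE = """\
[26088] dbg: bayes: token 'fax' => 0.0466125715533146
[26088] dbg: bayes: token 'may' => 0.951423151521604
[26088] dbg: bayes: token 'send' => 0.947622860965791
[26088] dbg: bayes: token 'sent' => 0.0562793425809908
"""

if __name__ == "__main__":
    for token, value in rank_tokens(parse_bayes_tokens(SAMPLE)):
        print(f"{value:.4f}  {token}")
```

Sorting puts 'may' and 'send' at the top and 'sent' and 'fax' at the bottom, which makes it easier to see which tokens are pushing the message toward BAYES_99.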


  2. RE: Seeing Bayes token matches for an email

    On Fri, 23 May 2008, Bowie Bailey wrote:

    > That gives me the info, now how do I interpret it?
    >
    > [26088] dbg: bayes: token 'fax' => 0.0466125715533146
    > [26088] dbg: bayes: token 'may' => 0.951423151521604
    > [26088] dbg: bayes: token 'send' => 0.947622860965791
    > [26088] dbg: bayes: token 'sent' => 0.0562793425809908
    >
    > The numbers seem to go from 0 to 1. Is 0 non-spammy and 1 spammy, or is
    > there more to it than that? If that's the case, there are quite a few
    > common words that bayes doesn't seem to like.


    Remember, Bayes interprets tokens in relation to other tokens. Those
    high-scoring tokens may be high-scoring due to context.

    --
    John Hardin KA7OHZ http://www.impsec.org/~jhardin/
    jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
    key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
    -----------------------------------------------------------------------
    Think Microsoft cares about your needs at all?
    "A company wanted to hold off on upgrading Microsoft Office for a
    year in order to do other projects. So Microsoft gave a 'free' copy
    of the new Office to the CEO -- a copy that of course generated
    errors for anyone else in the firm reading his documents. The CEO
    got tired of getting the 'please re-send in XX format' so he
    ordered other projects put on hold and the Office upgrade to be top
    priority." -- Cringely, 4/8/2004
    -----------------------------------------------------------------------
    2 days until the Mars Phoenix lander arrives at Mars
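One possible reading of the point about context is that a token's value reflects how often it appeared in the mail this particular database was trained on, not anything about the word itself. The sketch below uses a generic textbook per-token estimate to illustrate that idea. It is not SpamAssassin's actual formula, and the function name and all counts are hypothetical.

```python
def token_spam_probability(spam_hits, ham_hits, total_spam, total_ham):
    """Generic per-token estimate (an illustration, not SpamAssassin's code).

    spam_hits / ham_hits: trained spam / ham messages containing the token.
    total_spam / total_ham: total trained spam / ham messages.
    """
    if total_spam <= 0 or total_ham <= 0:
        raise ValueError("need at least one trained spam and one trained ham")
    spam_rate = spam_hits / total_spam
    ham_rate = ham_hits / total_ham
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


if __name__ == "__main__":
    # Hypothetical counts: an ordinary word that happened to show up in
    # 40 of 100 trained spam messages but only 2 of 100 trained ham messages.
    print(token_spam_probability(40, 2, 100, 100))
```

Under these made-up counts an everyday word ends up with a value above 0.95, so a common word scoring high may say more about what was fed to the trainer than about the word.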

