RE: spam getting through because of bayes confidence - SpamAssassin



  1. RE: spam getting through because of bayes confidence

    Alex Woick wrote:
    > > BAYES_00 means that the bayes engine thinks the message is
    > > definitely not spam. If this rule is hitting on spam messages, you
    > > have a problem. Unless this is just a really hammy looking spam,
    > > you may want to consider retraining your bayes database. And
    > > regardless, you should always manually retrain bayes with any
    > > messages that you catch being misclassified.

    >
    > Generally, you should always train *all* messages that were not
    > trained already. Even mail that was classified correctly. This makes
    > the database definitely more stable than only training a few
    > misclassified messages.
    >
    > The only mail you should omit is mail that was already auto-learned
    > correctly, bounces, backscatter and generally most of the
    > auto-generated stuff (DSN's, statistic reports, cron reports...)


    For best results, you should manually train everything, but that is not
    always practical. I try to manually train bayes for our main company
    email addresses, but even with a low-volume server, I frequently find
    that I just don't have time to go through all of the messages.

    I was just trying to point out that regardless of anything else you do,
    you MUST manually train misclassified mail to keep bayes running well.

    --
    Bowie
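
    As a rough illustration of the manual retraining described above, here
    is a minimal sketch that builds `sa-learn` command lines for
    misclassified mail. The helper names (`sa_learn_command`, `retrain`)
    and the example paths are hypothetical; only the `--spam`, `--ham` and
    `--mbox` options come from `sa-learn` itself. It assumes `sa-learn` is
    on the PATH and is run as the user whose Bayes database should be
    updated.

```python
import subprocess


def sa_learn_command(kind, path, mbox=False):
    """Build an sa-learn argv for retraining on misclassified mail.

    kind: "spam" for spam that slipped through, "ham" for false positives.
    path: a message file, a directory of messages, or an mbox file
          (the paths used with this helper are hypothetical).
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", "--" + kind]
    if mbox:
        # Tell sa-learn that path is an mbox file, not individual messages.
        cmd.append("--mbox")
    cmd.append(path)
    return cmd


def retrain(kind, path, mbox=False):
    """Run sa-learn; raises CalledProcessError if it exits non-zero."""
    return subprocess.run(sa_learn_command(kind, path, mbox), check=True)
```

    For example, `retrain("spam", "missed-spam/")` would feed a folder of
    spam that got through back into Bayes, and
    `retrain("ham", "false-positives.mbox", mbox=True)` would do the same
    for wrongly flagged ham.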


  2. Re: spam getting through because of bayes confidence

    Hi,

    Our system does train ham, and I train spam that gets through (where
    possible).
    I thought, though, that training, say, 5 emails as spam (assuming they
    were all the same) won't necessarily change the Bayes confidence. Is
    this not correct?

    Kate
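
    One way to reason about the question above is with a toy token model.
    The sketch below is NOT SpamAssassin's actual Bayes implementation; it
    is a simplified frequency ratio, with hypothetical function names and
    made-up counts, meant only to show why a handful of extra spam
    trainings can move a rarely seen token a lot while barely moving a
    token with a long ham history.

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Toy estimate of how spammy a token is.

    spam_hits / ham_hits: learned spam / ham messages containing the token.
    n_spam / n_ham: total learned spam / ham messages.
    Returns 0.5 (neutral) when the token has never been seen.
    """
    spam_rate = spam_hits / n_spam if n_spam else 0.0
    ham_rate = ham_hits / n_ham if n_ham else 0.0
    total = spam_rate + ham_rate
    return 0.5 if total == 0 else spam_rate / total


def after_training_spam(spam_hits, ham_hits, n_spam, n_ham, copies):
    """Token estimate after learning `copies` more spam containing it."""
    return token_spam_prob(spam_hits + copies, ham_hits,
                           n_spam + copies, n_ham)


# Made-up counts: 1000 learned spam and 1000 learned ham messages.
common = after_training_spam(0, 400, 1000, 1000, copies=5)  # token in 400 ham
rare = after_training_spam(0, 2, 1000, 1000, copies=5)      # token in 2 ham
print(f"common token: {common:.3f}, rare token: {rare:.3f}")
```

    In this toy model the common token stays strongly hammy after five
    spam trainings, while the rare token flips to the spam side. A real
    classifier combines many tokens per message, so the overall score
    depends on how many of the message's tokens behave like each case.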

    Bowie Bailey wrote:
    > Alex Woick wrote:
    >
    >>> BAYES_00 means that the bayes engine thinks the message is
    >>> definitely not spam. If this rule is hitting on spam messages, you
    >>> have a problem. Unless this is just a really hammy looking spam,
    >>> you may want to consider retraining your bayes database. And
    >>> regardless, you should always manually retrain bayes with any
    >>> messages that you catch being misclassified.
    >>>

    >> Generally, you should always train *all* messages that were not
    >> trained already. Even mail that was classified correctly. This makes
    >> the database definitely more stable than only training a few
    >> misclassified messages.
    >>
    >> The only mail you should omit is mail that was already auto-learned
    >> correctly, bounces, backscatter and generally most of the
    >> auto-generated stuff (DSN's, statistic reports, cron reports...)
    >>

    >
    > For best results, you should manually train everything, but that is not
    > always practical. I try to manually train bayes for our main company
    > email addresses, but even with a low volume server, I frequently find
    > that I just don't have time to go through all of the messages.
    >
    > I was just trying to point out that regardless of anything else you do,
    > you MUST manually train misclassified mail to keep bayes running well.
    >
    >


    --

    Kate Kleinschafer
    Internet Services
    GetRheel

    /A division of Rheel Electronics Ltd /
    Phone +64-3-386 3070 Fax +64-3-386-3071
    Mobile +64-21-386-394

    email: kate@rheel.co.nz
    www.getrheel.co.nz


