On 8 Nov 2008, at 00:09, Matt Kettler wrote:

> Matt Kettler wrote:
>> Neil wrote:
>>
>>> So maybe this is moving slightly off on a tangent, but:
>>> Why does auto-learn sometimes learn spam with a rating of X, but not
>>> spam with a rating of X+Y? What's its methodology?
>>>

>>
>> First, there are several rules involved here.
>>
>> To autolearn as spam *ALL* of the following must be met:
>>
>> -must have at least 3 points from header type rules
>> -must have at least 3 points from body type rules
>> -must not already match a low-scoring bayes rule in the existing
>> training (ie: BAYES_00). This prevents autolearning from contradicting
>> existing training.
>> -After recomputing the score of the message as if bayes and all
>> userconf rules were disabled (including changing the scoreset! This
>> makes a big difference in some cases.), that score must be over the
>> spam learning threshold. This prevents bayes from engaging in
>> self-feedback or feedback based on manual whitelists (which, if
>> misconfigured, would cause a "bayes hangover" of mis-learned mail).
>>
>> Generally speaking, the score you see in the message header has only a
>> loose correlation with the score used for learning checks.
>>
>>

> Oh, one more rule I missed:
>
> -The write lock for the bayes DB must be free. (ie: no other learning or
> expiry going on at the time). It will not block and wait for it, it will
> simply move on, but it will report autolearn=failed instead of
> autolearn=no. This prevents autolearning from log jamming your mail
> queue.
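
For my own notes, here is how I read those checks, as a rough Python
sketch. It is not SpamAssassin's actual code (which is Perl). The
ScoredMessage class, its field names and autolearn_as_spam() are all made
up for illustration. The 12.0 default is what I believe the stock spam
learning threshold to be, so pass in whatever you have configured, and
reading "over the threshold" as strictly greater is my assumption too.

```python
from dataclasses import dataclass


@dataclass
class ScoredMessage:
    """Hypothetical container; none of these names come from SpamAssassin."""
    header_points: float    # points from header-type rules
    body_points: float      # points from body-type rules
    learning_score: float   # score recomputed with bayes/userconf rules off
    hits_low_bayes: bool    # a low-scoring bayes rule (eg BAYES_00) matched


def autolearn_as_spam(msg, spam_threshold=12.0, lock_is_free=True):
    """Return 'spam', 'no' or 'failed', mirroring the autolearn= values."""
    if msg.header_points < 3.0:
        return "no"         # not enough evidence from header rules
    if msg.body_points < 3.0:
        return "no"         # not enough evidence from body rules
    if msg.hits_low_bayes:
        return "no"         # would contradict the existing training
    if msg.learning_score <= spam_threshold:
        return "no"         # recomputed score is not over the threshold
    if not lock_is_free:
        return "failed"     # eligible, but the bayes DB write lock was busy
    return "spam"


# Two made-up messages showing why "X" can be learned while "X+Y" is not.
modest = ScoredMessage(header_points=7.0, body_points=6.5,
                       learning_score=13.5, hits_low_bayes=False)
loud = ScoredMessage(header_points=25.0, body_points=1.0,
                     learning_score=26.0, hits_low_bayes=False)
```

With these invented numbers, "modest" is learned as spam while "loud" is
not, despite its much higher score, because nearly all of its points come
from header rules. And if the lock is busy, "modest" comes back as
"failed" rather than "no".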


Thanks for that in-depth description; it helps me have a (less) vague
idea of what I'm doing.

-N.