How to get clarity on AWL? - SpamAssassin

Thread: How to get clarity on AWL?

  1. How to get clarity on AWL?

    A lot of my mail is tagged with AWL, and I am often baffled. Here are
    what I think are the relevant headers from a perplexing example:

    Return-Path:
    X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on fnord.ir.bbn.com
    X-Spam-Status: Yes, score=6.8 required=1.0 tests=AWL,BAYES_95,DEAR_WINNER,
    HTML_MESSAGE,SUBJ_ALL_CAPS autolearn=spam version=3.2.4
    X-Spam-Report:
    * 2.1 SUBJ_ALL_CAPS Subject is all capitals
    * 3.2 DEAR_WINNER BODY: DEAR_WINNER
    * 0.0 HTML_MESSAGE BODY: HTML included in message
    * 3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
    * [score: 0.9582]
    * -1.5 AWL AWL: From: address is in the auto white-list
    From: "AUSTRALIAN LOTTERY INTL"

    Reading http://wiki.apache.org/spamassassin/AwlWrongWay, I realize why I
    was confused: this sender has a positive average, this message was
    spammier than that average, and so it was given credit for the
    somewhat-less-spammy previous mail.

    I think I should be able to infer that, because this message scored 8.3
    before AWL and AWL was -1.5, the average is 5.3. But if the message said

    * -1.5 AWL AWL: From: address is in the auto white-list at 5.3 for 12 messages

    it would make things easier to follow. Plus, the AutoWhitelist wiki
    entry says that the key also includes the IP address that the mail
    "originated at", and it would be nice to print that out, since it's
    non-obvious what that means (the last hop before a trusted relay, or
    something that relies on maybe-forged Received lines?).
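
    The 5.3 figure above can be checked with a few lines of Python. This is
    only a sketch: it assumes AWL computes its adjustment as
    (mean - score) * auto_whitelist_factor with the default factor of 0.5,
    and the function name awl_mean is made up for illustration.

```python
def awl_mean(pre_awl_score: float, awl_delta: float, factor: float = 0.5) -> float:
    """Recover the sender's historical mean score from an AWL adjustment.

    Assumption: delta = (mean - score) * factor, so mean = score + delta / factor.
    """
    if factor == 0:
        raise ValueError("factor must be non-zero to invert the adjustment")
    return pre_awl_score + awl_delta / factor


# Rule scores taken from the X-Spam-Report in the example message.
pre_awl = 2.1 + 3.2 + 0.0 + 3.0   # SUBJ_ALL_CAPS + DEAR_WINNER + HTML_MESSAGE + BAYES_95
awl_delta = -1.5                  # the AWL line in the report

mean = awl_mean(pre_awl, awl_delta)
final_score = pre_awl + awl_delta

print(f"pre-AWL score: {pre_awl:.1f}")
print(f"implied mean:  {mean:.1f}")
print(f"final score:   {final_score:.1f}")
```

    With a different auto_whitelist_factor in the configuration, the same
    -1.5 would imply a different average, which is one more reason printing
    the average and message count in the report would help.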

    Somewhat separately, the spamassassin program has options to manipulate
    the whitelist and blacklist:

    -W, --add-to-whitelist             Add addresses in mail to persistent address whitelist
    --add-to-blacklist                 Add addresses in mail to persistent address blacklist
    -R, --remove-from-whitelist        Remove all addresses found in mail from
                                       persistent address list
    --add-addr-to-whitelist=addr       Add addr to persistent address whitelist
    --add-addr-to-blacklist=addr       Add addr to persistent address blacklist
    --remove-addr-from-whitelist=addr  Remove addr from persistent address list

    but I don't see any option to print out the lists and scores for
    inspection, and I'm unclear on the AWL vs. the persistent white/black
    lists. I think it would make sense to have

    --print-whitelist
    --print-blacklist
    --print-autowhitelist

    or perhaps only one is needed, and also

    --lookup-in-whitelists=addr

    to print the white/black/auto status of an address.


  2. Re: How to get clarity on AWL?

    Greg Troxel wrote:
    > A lot of my mail is tagged with AWL, and I am often baffled. Here are
    > what I think are the relevant headers from a perplexing example:
    >
    > Return-Path:
    > X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on fnord.ir.bbn.com
    > X-Spam-Status: Yes, score=6.8 required=1.0 tests=AWL,BAYES_95,DEAR_WINNER,
    > HTML_MESSAGE,SUBJ_ALL_CAPS autolearn=spam version=3.2.4
    > X-Spam-Report:
    > * 2.1 SUBJ_ALL_CAPS Subject is all capitals
    > * 3.2 DEAR_WINNER BODY: DEAR_WINNER
    > * 0.0 HTML_MESSAGE BODY: HTML included in message
    > * 3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
    > * [score: 0.9582]
    > * -1.5 AWL AWL: From: address is in the auto white-list
    > From: "AUSTRALIAN LOTTERY INTL"
    >
    > Reading http://wiki.apache.org/spamassassin/AwlWrongWay, I realize I am
    > confused - this sender has a positive average, and this message was more
    > spammy, and thus given credit for somewhat-less-spammy previous mail.
    >
    > I think that I should be able to infer that because this message was 8.3
    > before AWL, and AWL was -1.5, that the average is 5.3. But if the message said
    >
    > * -1.5 AWL AWL: From: address is in the auto white-list at 5.3 for 12 messages
    >
    > it would make things easier to follow. Plus, the AutoWhitelist wiki
    > entry says that the key is also IP address that the mail "originated
    > at", and it would be nice to print that out, since it's non-obvious what
    > that means (last hop before trusted relay, or relying on maybe-forged
    > received lines?).
    >

    Agreed, this would make things clearer. Either that, or have a tag set
    up so you can add these details to the report or to an X-Spam-AWL
    header, should you so choose.

    > Somewhat separately, the spamassassin program has options to manipulate
    > whitelist, blacklist:
    >
    > -W, --add-to-whitelist Add addresses in mail to persistent address whitelist
    > --add-to-blacklist Add addresses in mail to persistent address blacklist
    > -R, --remove-from-whitelist Remove all addresses found in mail from
    > persistent address list
    > --add-addr-to-whitelist=addr Add addr to persistent address whitelist
    > --add-addr-to-blacklist=addr Add addr to persistent address blacklist
    > --remove-addr-from-whitelist=addr Remove addr from persistent address list
    >
    > but I don't see any to print out the lists and scores for inspection,
    > and I'm unclear on the AWL vs persistent white/black lists. I think it would make sense to have
    >

    All of the above pertains to the AWL only. Persistent white/black list
    entries in your local.cf or user_prefs will show up as separate rule
    hits like USER_IN_WHITELIST.

    > --print-whitelist
    > --print-blacklist
    > --print-autowhitelist
    >
    > or perhaps only one is needed, and also
    >
    > --lookup-in-whitelists=addr
    >
    > to print the white/black/auto status of an address.
    >

    There is a tool that does this, but it's not included in the
    distribution. The check_whitelist script is available from the SVN
    repository:

    http://svn.apache.org/repos/asf/spam...heck_whitelist

    However, this tool is a bit crude, and it would be much nicer if this
    was all built into a separate sa-learn-like utility that handled AWL
    learning, forgetting and dumping.


  3. Re: How to get clarity on AWL?

    On Friday 23 May 2008 9:42 am, Greg Troxel wrote:
    > A lot of my mail is tagged with AWL, and I am often baffled. Here are
    > what I think are the relevant headers from a perplexing example:
    >
    > Return-Path:
    > X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on fnord.ir.bbn.com
    > X-Spam-Status: Yes, score=6.8 required=1.0 tests=AWL,BAYES_95,DEAR_WINNER,
    > HTML_MESSAGE,SUBJ_ALL_CAPS autolearn=spam version=3.2.4
    > X-Spam-Report:
    > * 2.1 SUBJ_ALL_CAPS Subject is all capitals
    > * 3.2 DEAR_WINNER BODY: DEAR_WINNER
    > * 0.0 HTML_MESSAGE BODY: HTML included in message
    > * 3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
    > * [score: 0.9582]
    > * -1.5 AWL AWL: From: address is in the auto white-list
    > From: "AUSTRALIAN LOTTERY INTL"
    >
    > Reading http://wiki.apache.org/spamassassin/AwlWrongWay, I realize I am
    > confused - this sender has a positive average, and this message was more
    > spammy, and thus given credit for somewhat-less-spammy previous mail.
    >
    > I think that I should be able to infer that because this message was 8.3
    > before AWL, and AWL was -1.5, that the average is 5.3. But if the message said
    >
    > * -1.5 AWL AWL: From: address is in the auto white-list at 5.3 for 12 messages
    >

    I use a little Perl script that I got somewhere in 2004 that takes your
    AWL and makes a hashed and a plain text version. The entries look like
    this:

    7929be75889dbf08c8efc87d226a1974 2 82.058
    iuuzn@msn.com|ip=220.81 2 82.058

    Here is the explanation from the script itself:

    # The keys of this hash are like
    # pamela4701@eudoramail.com|ip=213.41|totscore
    # and the values are like
    # 8.7472
    # test with values(%hash); and keys(%hash);
    # every mail address has two entries:
    # e.g.
    # pamela4701@eudoramail.com|ip=213.41|totscore
    # pamela4701@eudoramail.com|ip=213.41
    # where totscore is the over-all score (value) and the
    # value of the second line is the count
    # of mails received from this sender
    # write this to a file one entry per line and nice it a little bit
    # replace | with ' '
    # do it with a hash of hashes, keys are mailaddresses, subkeys are totalscore and score
    # IMPORTANT: Every time the hash is accessed it returns the value
    # key triples in a different order
    # (the triples not the keys and values itself of course)
    # just in case you are wondering
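
    As a rough illustration of the key layout those comments describe, here
    is a small Python sketch that pairs each "address|ip=..." count entry
    with its "|totscore" entry and works out the per-sender average. It is
    not the Perl script itself: the plain dict stands in for the real AWL
    database (opening the database file is not shown), and the name
    summarize_awl is hypothetical.

```python
from collections import defaultdict

TOTSCORE_SUFFIX = "|totscore"


def summarize_awl(raw: dict) -> dict:
    """Pair count and totscore entries and compute the mean per sender key."""
    senders = defaultdict(dict)
    for key, value in raw.items():
        if key.endswith(TOTSCORE_SUFFIX):
            # "addr|ip=a.b|totscore" holds the accumulated score.
            senders[key[: -len(TOTSCORE_SUFFIX)]]["total"] = float(value)
        else:
            # "addr|ip=a.b" holds the number of mails seen from this sender.
            senders[key]["count"] = int(value)

    summary = {}
    for sender, stats in senders.items():
        count = stats.get("count", 0)
        if count > 0 and "total" in stats:
            address, _, ip = sender.partition("|ip=")
            summary[sender] = {
                "address": address,
                "ip": ip,
                "count": count,
                "total": stats["total"],
                "mean": stats["total"] / count,
            }
    return summary


# Sample data built from the plain entry shown earlier in this post.
sample = {
    "iuuzn@msn.com|ip=220.81": 2,
    "iuuzn@msn.com|ip=220.81|totscore": 82.058,
}

for sender, info in summarize_awl(sample).items():
    print(f"{info['address']} ip={info['ip']} "
          f"count={info['count']} total={info['total']:.3f} mean={info['mean']:.3f}")
```

    The mean is the number AWL compares each new message's score against, so
    a listing like this gives roughly the "at 5.3 for 12 messages" detail
    asked for in the original post.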

    If this is something like what you're looking for, I could post it at a
    download site.

    --
    Chris
    KeyID 0xE372A7DA98E6705C

