SA experts needed here - SPAM examples - SpamAssassin


Thread: SA experts needed here - SPAM examples

  1. SA experts needed here - SPAM examples

    Hi,

    I am losing confidence in SA: the training process is either very slow or
    it doesn't seem to be learning at all.

    I am training SA with around 30-50 manually identified spam messages. I
    move the spam into a spam folder created in SquirrelMail, and a cron job
    runs the sa-learn command on that folder every hour to train on the
    messages and then delete them.

    The script was tested and working from the shell before I put it in cron.

    However, the learning process is either not working correctly or is rather
    slow.

    I went through the headers of the spam and found that even spams with
    almost identical content always get a score of 0.1, although they were
    received on separate occasions across several days. This has made me lose
    confidence in SA.

    I wonder whether I have set it up correctly to detect and learn spam. I am
    using the default setup from qmail-toaster cnt50. Do I need more filters to
    harden my defense? Any recommendations would be appreciated.

    Here are samples I took from my mailbox on this server (e.g., sample spam
    1 and 8 are almost identical in content, but both scored only 0.1):



    http://www.keac.com/id3303/spam-egs.txt



  2. Re: SA experts needed here - SPAM examples


    Turn on URIBLs and Razor

    Content analysis details: (11.6 points, 5.0 required)

     pts rule name                  description
    ---- -------------------------- --------------------------------------------------
     0.0 TO_MALFORMED               To: has a malformed address
     1.5 RAZOR2_CF_RANGE_E8_51_100  Razor2 gives engine 8 confidence level above 50%
                                    [cf: 100]
     0.5 RAZOR2_CHECK               Listed in Razor2 (http://razor.sf.net/)
     0.5 RAZOR2_CF_RANGE_51_100     Razor2 gives confidence level above 50%
                                    [cf: 100]
     2.0 URIBL_BLACK                Contains an URL listed in the URIBL blacklist
                                    [URIs: jesecretary.com]
     2.1 URIBL_WS_SURBL             Contains an URL listed in the WS SURBL blocklist
                                    [URIs: jesecretary.com]
     2.9 URIBL_JP_SURBL             Contains an URL listed in the JP SURBL blocklist
                                    [URIs: jesecretary.com]
     2.1 URIBL_OB_SURBL             Contains an URL listed in the OB SURBL blocklist
                                    [URIs: jesecretary.com]


  3. Re: SA experts needed here - SPAM examples

    On Tue, 17 Jun 2008, NGSS wrote:

    > I am training SA with around 30-50 manually identified spam (moving spam
    > mails to and spam folder created in squirrelmail and crond the sa-train
    > command on that folder every hour to train and delete them).


    I would suggest hourly is too often (but that may be personal preference),
    and you don't want to delete them. It's a good idea to retain your
    training corpus in case you need to retrain from scratch for some reason.

    > However, I found that the learning process is either not right or it is
    > rather slow.


    What does the learning process report? Are you capturing the output of the
    cron'd sa-learn script?

    --
    John Hardin KA7OHZ http://www.impsec.org/~jhardin/
    jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
    key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
    -----------------------------------------------------------------------
    The world has enough Mouse Clicking System Engineers.
    -- Dave Pooser
    -----------------------------------------------------------------------
    2 days until SWMBO's Birthday


  4. Re: SA experts needed here - SPAM examples

    I could be wrong, but I believe for the learning process to be useful,
    you also need to learn HAM.

    (IIRC, an equal amount of each.)

    Evan



  5. RE: SA experts needed here - SPAM examples

    Hi,
    Thanks for the response.

    May I know how I can capture the output of the SA trainer? I am using the
    following script to do the training:

    cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur
    /usr/bin/sa-learn --spam ./*
    cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/* $DIRCOLLECTSPAM
    rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*
    cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new
    /usr/bin/sa-learn --spam ./*
    cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/* $DIRCOLLECTSPAM
    rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/*


    I also do the same for the ham, using a section of the same script that is
    not shown here.




  6. Re: SA experts needed here - SPAM examples

    On Mon, 16 Jun 2008, Evan Platt wrote:

    > I could be wrong, but I believe for the learning process to be useful, you
    > also need to learn HAM.
    >
    > (IIRC, an equal amount of each.)


    Minimum 100 of each spam and ham. The balance should ideally reflect your
    actual ham/spam balance.

    --
    John Hardin KA7OHZ http://www.impsec.org/~jhardin/


  7. RE: SA experts needed here - SPAM examples

    On Tue, 17 Jun 2008, NGSS wrote:

    > HI,
    > Thanks for the response.
    >
    > May I know how I can capture the output of the sa trainer ?


    Well, if you're running the script from cron, stdout and stderr should
    automatically be emailed to the owner of the cron job - unless you are
    explicitly redirecting that output.

    > I using the follow script to do training,
    >
    > cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur
    > /usr/bin/sa-learn --spam ./*
    > cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/* $DIRCOLLECTSPAM
    > rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*
    > cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new
    > /usr/bin/sa-learn --spam ./*
    > cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/* $DIRCOLLECTSPAM
    > rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/*


    Do you see a report of how many messages were seen and how many were
    learned from when you run that interactively? You should see the same
    output in email from the cron job.

    Question: what sets $DOMAIN and $SPAM for the cron job? Remember, cron
    scripts start out with an empty environment. The cron job may not be
    learning anything because the directory paths are screwed up due to
    $DOMAIN and/or $SPAM not being set.

    It's a good idea to do something like this for dynamic paths:

    if ! cd "/home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur"
    then
        echo "Could not cd to /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur"
        exit 1
    fi

    > I also do the same for the HAM using the same script which section is
    > not shown here .


    Good.

    You might want to add this to the end of your script to get bayes database
    stats afterward:

    /usr/bin/sa-learn --dump magic
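
The advice in this reply (check the cd, keep the corpus, learn before deleting) can be combined into one helper. The following is a minimal sketch, not the poster's script: the function name learn_folder, the SA_LEARN override and the three-argument layout are assumptions made for illustration. Only the sa-learn --spam / --ham options and the Maildir paths come from the thread.

```shell
#!/bin/sh
# Sketch only: SA_LEARN, learn_folder and the argument layout are assumptions
# made for illustration; they do not come from the original script.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

# learn_folder KIND DIR ARCHIVE
#   KIND    "spam" or "ham" (passed to sa-learn as --spam / --ham)
#   DIR     absolute path of a Maildir subfolder such as .../Maildir/cur
#   ARCHIVE absolute path where learned messages are kept as a corpus
# The body runs in a subshell, so the cd does not leak into the caller.
learn_folder() (
    kind="$1"; dir="$2"; archive="$3"

    # Fail loudly if the path is wrong, e.g. a variable that cron never set.
    if ! cd "$dir"; then
        echo "Could not cd to $dir" >&2
        exit 1
    fi

    # An empty folder leaves the pattern unexpanded; treat that as "no mail".
    set -- ./*
    [ -e "$1" ] || exit 0

    # Learn first, then archive; delete only after both steps succeeded.
    "$SA_LEARN" "--$kind" "$@" || exit 1
    mkdir -p "$archive" || exit 1
    cp -p "$@" "$archive"/ || exit 1
    rm -f "$@"
)
```

A call might look like `learn_folder spam "/home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur" "$DIRCOLLECTSPAM"`, repeated for the new folder and for ham. Because errors go to stderr and the function returns non-zero, a failure shows up in the mail that cron sends to the job owner.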

    --
    John Hardin KA7OHZ http://www.impsec.org/~jhardin/


  8. Re: SA experts needed here - SPAM examples

    NGSS wrote:

    | I am losing confident in SA, the training process is pretty slow or it
    | doesn't seem to be learning.

    I don't think training is your first and foremost problem.

    It seems that you are not running network tests [1] (esp. RBLs), which
    greatly reduces effectiveness.

    -- Matthias

    [1] dsl.dynamic81214156119.ttnet.net.tr (81.214.156.119) should trigger
    on half a dozen lists.


  9. Re: SA experts needed here - SPAM examples

    > Here are sample samples I taken from my mailbox on this
    > server,
    > (eg, sample spam 1 and 8 are almost identical in content
    > but they are both scored with only 0.1 . : (
    >
    > http://www.keac.com/id3303/spam-egs.txt


    Mail #1 here

    Content preview: == US Drugstore == Voted as No.1 US pharmacy on Internet
    Over 80 meds on our online store We accept Visa, Master Card, JCB, Dinner
    & eCheck [...]

    Content analysis details: (17.4 points, 5.0 required)

     pts rule name              description
    ---- ---------------------- --------------------------------------------------
     3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
                                [68.243.81.116 listed in zen.spamhaus.org]
     0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
     2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                                [Blocked - see ]
     1.2 INVALID_DATE           Invalid Date: header (not RFC 2822)
     1.2 TO_MALFORMED           To: has a malformed address
     4.0 BOTNET                 Relay might be a spambot or virusbot
                                [botnet0.8,ip=68.243.81.116,rdns=68-243-81-116.area7.spcsdns.net,maildomain=mediafutures.org,client,ipinhostname]
     1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                                [score: 0.6572]
     0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
     4.0 JM_SOUGHT_2            JM_SOUGHT_2




  10. Re: SA experts needed here - SPAM examples

    >> http://www.keac.com/id3303/spam-egs.txt
    >
    > 3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
    > [68.243.81.116 listed in zen.spamhaus.org]


    Indeed.

    Suggestion: put zen.spamhaus.org in your MTA's DNSBL list. That's a
    reliable BL and should be part of your up-front filtering.
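
For readers unfamiliar with how a DNSBL such as zen.spamhaus.org is consulted: the client reverses the octets of the connecting IPv4 address, appends the list's zone, and performs an ordinary DNS lookup on that name. A minimal sketch of the name construction follows; the helper name dnsbl_name is hypothetical, and it only checks that the input has four dot-separated fields, not that each one is a valid octet.

```shell
# dnsbl_name IP ZONE
#   Print the DNS name a DNSBL client would look up for an IPv4 address.
#   Hypothetical helper for illustration; returns non-zero on malformed input.
dnsbl_name() {
    echo "$1" | awk -F. -v zone="$2" '
        NF == 4 { printf "%s.%s.%s.%s.%s\n", $4, $3, $2, $1, zone; ok = 1 }
        END     { exit !ok }'
}
```

For the relay in the report above, `dnsbl_name 68.243.81.116 zen.spamhaus.org` yields a name that can be passed to a lookup tool such as host or dig. An answer in 127.0.0.0/8 conventionally means "listed", while NXDOMAIN means "not listed". Whether a lookup succeeds also depends on the resolver in use, so treat a missing answer with some caution.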

    --
    John Hardin KA7OHZ http://www.impsec.org/~jhardin/


  11. Re: SA experts needed here - SPAM examples

    On Monday 16 June 2008 2:23 pm, NGSS wrote:
    > HI,
    > Thanks for the response.
    >
    > May I know how I can capture the output of the sa trainer ? I using the
    > follow script to do training,
    >
    > cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur
    > /usr/bin/sa-learn --spam ./*
    > cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/* $DIRCOLLECTSPAM
    > rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*
    > cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new
    > /usr/bin/sa-learn --spam ./*
    > cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/* $DIRCOLLECTSPAM
    > rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/*
    >
    >

    Click the link below to download a Perl script that I run once a week or
    so. It will learn all your spam/ham and, if desired, report the spam to
    Razor/Pyzor/DCC and to SpamCop. Prior to the 3.*.* series of SA, the
    script used to send me a report of the number of spam/ham learned and the
    total number of spam/ham in my Bayes db. That suddenly quit, and since I'm
    no Perl programmer I have no idea how to fix it. Maybe someone can take a
    look and see where the problem(s) may lie. I have this located in my
    /usr/local/bin folder and run it like this:
    [chris@cpollock ~]$ reporter.pl

    http://beta.d.thelinkup.com/default....ion=attachment

    HTH
    Chris



  12. Re: SA experts needed here - SPAM examples

    On Monday 16 June 2008 1:42 pm, NGSS wrote:
    > Hi,
    >
    > I am losing confident in SA, the training process is pretty slow or it
    > doesn't seem to be learning.
    >

    I've taken each of your spams and run them through my home setup. Note that
    I don't run a server, and "sa-learn --dump magic" shows:

    147521 0 non-token data: nspam
    28513 0 non-token data: nham

    as the number of spam/ham the system has learned after about 3 years of
    using SA. I've uploaded the results of running "spamassassin -D -t" on each
    spam here:

    http://beta.d.thelinkup.com/default....ion=attachment

    Let me know if you have any problems when/if you go to download the file.

    I've tried sending this multiple times with an example of the score I got
    from one of your spams; however, something seems to prevent it from getting
    to the list, or at least I'm not receiving it. Apologies if some of you
    receive this multiple times.

    --
    Chris


  13. RE: SA experts needed here - SPAM examples

    Hi Jari,

    This is impressive! I am impressed by the high score your machine's
    analysis gave it. I think this is what I am looking for.

    The lowest score among the rules is 0.9, which is well above my total score
    of 0.1. I think I have really missed quite a few things. May I know where I
    can alter the ruleset? Do I require additional plugins? I am using the
    default plugin set from qmail-toaster cnt50.





    From: Jari Fredriksson [mailto:jarif@iki.fi]
    Sent: Tuesday, June 17, 2008 5:07 AM
    To: NGSS; users@spamassassin.apache.org
    Cc: out@netgraphy.com
    Subject: Re: SA experts needed here - SPAM examples











  14. RE: SA experts needed here - SPAM examples

    Hi John,
    I am quite sure that the script is running and that the variables $DOMAIN
    and $SPAM are correct (I define them early in the script, in a part not
    shown here), because I get a copy of each message in $DIRCOLLECTSPAM and
    nothing is left in the learning folder,
    /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*

    I ran the dump command you suggested, and it gave me this:

    0.000 0 3 0 non-token data: bayes db version
    0.000 0 1337 0 non-token data: nspam
    0.000 0 6 0 non-token data: nham
    0.000 0 41188 0 non-token data: ntokens
    0.000 0 920269009 0 non-token data: oldest atime
    0.000 0 1213715208 0 non-token data: newest atime
    0.000 0 0 0 non-token data: last journal sync atime
    0.000 0 0 0 non-token data: last expiry atime
    0.000 0 0 0 non-token data: last expire atime delta
    0.000 0 0 0 non-token data: last expire reduction count





  15. RE: SA experts needed here - SPAM examples

    Hi John,
    I'm afraid I have removed the line "-r zen.spamhaus.org" from
    /var/qmail/control/blacklists, because with this line in place I can't
    send or receive from most external networks using my Outlook. Is that
    what you are talking about?






  16. Re: SA experts needed here - SPAM examples

    NGSS wrote:
    > Hi John
    > I quite sure that the script is running and the variable in $DOMAIN and
    > $SPAM are correct ( I defined it early in the script, which are not shown
    > here) because the I got a copy for each them in $DIRCOLLECTSPAM and nothing
    > in the learning folder, /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*
    >
    > I did the The dump from your command and which had given me this
    >
    > 0.000 0 3 0 non-token data: bayes db version
    > 0.000 0 1337 0 non-token data: nspam
    > 0.000 0 6 0 non-token data: nham


    You need to learn 200 spam _AND_ 200 HAM messages before Bayes will
    start scoring.
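
The check described in this reply can be automated against the output of sa-learn --dump magic. The sketch below is an illustration under stated assumptions: the function name check_bayes_counts is hypothetical, the 200-message threshold is the figure given in this reply, and the field positions (count in the third column, label as the last word) are taken from the dump posted earlier in the thread.

```shell
# check_bayes_counts [MIN]
#   Reads "sa-learn --dump magic" output on stdin, prints the learned
#   spam/ham counts, and succeeds only if both reach MIN (default 200).
#   Hypothetical helper; column layout assumed from the dump in this thread.
check_bayes_counts() {
    awk -v min="${1:-200}" '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !(nspam >= min && nham >= min)
        }'
}
```

Typical use would be `/usr/bin/sa-learn --dump magic | check_bayes_counts 200` at the end of the training script. With the dump shown above (1337 spam, 6 ham) the check fails on the ham count, which matches the diagnosis in this reply.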



    --
    Anthony Pea****
    CHIME, Royal Free & University College Medical School
    WWW: http://www.chime.ucl.ac.uk/~rmhiajp/
    Study Health Informatics - Modular Postgraduate Degree
    http://www.chime.ucl.ac.uk/study-health-informatics/


  17. Re: SA experts needed here - SPAM examples

    * NGSS :
    > Hi John
    > I afraid I had move the ling "-r zen.spamhaus.org" from the
    > /var/qmail/control/blacklists .
    > Because with this line is in, I can't perform send/receive from most of the
    > external network using my Outlook. Is that what you talking about?


    That's a clear case of a misconfiguration. Hosts listed in that RBL may
    not send mail to you, but YOU, as an AUTHORIZED client, may of course send.

    Make sure that the RBL is only applied to non-authorized clients.

    --
    Ralf Hildebrandt (i.A. des IT-Zentrums) Ralf.Hildebrandt@charite.de
    Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
    Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
    IT-Zentrum Standort CBF send no mail to snickebo@charite.de


  18. Can't find re2c

    Hi,
    I tried to run sa-compile for the first time after successfully
    downloading the recommended ruleset, but I got this error:

    .....
    re2c -i -b -o scanner1.c scanner1.re
    Can't exec "re2c": No such file or directory at /usr/bin/sa-compile line 287, <$fh> line 974.


    It seems that it cannot find re2c. I tried installing the latest
    spamassassin + tools RPM, but still had no success in getting this file.
    Does anyone know where I can get it? Is it supposed to come with the
    package?


  19. Re: Can't find re2c

    * NGSS :
    > Hi,
    > I tried to do a sa-compile the first time after successfully downloaded the ruleset recommended. But I got this error.
    >
    > ....
    > re2c -i -b -o scanner1.c scanner1.re
    > Can't exec "re2c": No such file or directory at /usr/bin/sa-compile line 287, <$fh> line 974.
    >
    >
    > It seemed that it cannot find re2c . I tried to installed the latest spamassassin + tools rpm but still no success (in getting this file). Anyone knows where I can get this file ? is it suppose to come with the package?


    $ apt-cache search re2c
    re2c - tool for generating fast C-based recognizers

    It's a separate package.

    --
    Ralf Hildebrandt (i.A. des IT-Zentrums) Ralf.Hildebrandt@charite.de


  20. Re: Can't find re2c

    On 17.06.08 18:52, NGSS wrote:
    > I tried to do a sa-compile the first time after successfully downloaded
    > the ruleset recommended. But I got this error.


    Please, configure your mailer to wrap long lines below 80 characters per
    line.

    > ....
    > re2c -i -b -o scanner1.c scanner1.re
    > Can't exec "re2c": No such file or directory at /usr/bin/sa-compile line 287, <$fh> line 974.
    >
    >
    > It seemed that it cannot find re2c . I tried to installed the latest
    > spamassassin + tools rpm but still no success (in getting this file).
    > Anyone knows where I can get this file ? is it suppose to come with the
    > package?


    re2c is an external package, not part of SA.
    --
    Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
    Warning: I wish NOT to receive e-mail advertising to this address.
    Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
    BSE = Mad Cow Desease ... BSA = Mad Software Producents Desease
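
Since re2c ships separately from SpamAssassin, a script that calls sa-compile can check for it up front and fail with a clearer message than the Perl error above. A minimal sketch follows; the names have_cmd and compile_rules are hypothetical, and it assumes sa-compile itself is on the PATH.

```shell
# have_cmd NAME: succeed if NAME resolves to a command on the current PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# compile_rules: hypothetical wrapper that refuses to start without re2c.
compile_rules() {
    if ! have_cmd re2c; then
        echo "re2c not found; install it from your distribution's packages" >&2
        return 1
    fi
    sa-compile
}
```

The same have_cmd test works for any other external tool the training or compile scripts depend on.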

