What about the junk filter? - Mozilla


Thread: What about the junk filter?

  1. Re: What about the junk filter?

    On 2007-01-20 04:05 (-0700 UTC), Moz Champion (Dan) wrote:

    > Chris Barnes wrote:
    >> Moz Champion (Dan) wrote:
    >>> Well, a Baysien filter doesnt look at only subject lines or such it
    >>> looks at the entire message.
    >>>
    >>> And I never found that to be the case, I had to click on a 'similar'
    >>> spam for perhaps 3 times, and after that it was caught. My JMC
    >>> catches image spams, every other type of spam, you name it. As I said
    >>> it is catching 200 spam for every 1 I see in my inbox..

    >>
    >> Sort of a related question for you: I am fairly familiar with
    >> SpamAssassin (a server side spam scoring system). It's Baysian filter
    >> requires that it you teach it not only what is spam, but also what is
    >> NOT spam (ie. ham).
    >> To do this, I took a corpus of ~500 legitimate messages and "taught"
    >> SA that they were ham.
    >>
    >> Now for the question:
    >> Does TB's JMC learn ham? If so, how (other than marking the rare
    >> message that is mis-identified as spam as "not junk")?

    >
    > In a minimal manner.
    > When you first start JMC, all messages are marked as spam until you mark
    > some as non junk (and restart). Then all messages are not marked as spam
    > until you mark some as junk (and restart)
    >
    > Other than that, its false positives.


    Wrong.

    > But, then, why does it need to? A catch ratio of 99.8% is hardly poor or
    > non effective. In other words, for every 1 spam that 'gets through'
    > there were 200 spams that did not. Quite effective.
    >
    > How much time did you spend marking those 500 messages to teach it ham?
    > Twenty minutes? Well, I could mark all the ones JMC misses for almost 10
    > years with those twenty minutes (2 seconds per). So even if teaching it
    > what is ham improved your catch ratio to 100% (which I doubt) - You have
    > still spend MORE time doing it than I will for 10 years.
    >
    > Teaching it with more ham will NOT improve your catch ratio, it will
    > improve your false positive ratio - which even you admit is rare! So
    > what is the point? You might be 500 times LESS likely to get a false
    > positive than I am. I only get one in six months anyway, do you get one
    > in 250 years .


    If, either through inability or wilfulness, you do not or will not
    understand how something is intended to work, fine -- but please don't try
    to pass off that mis-understanding on others.

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  2. Re: What about the junk filter?

    On 2007-01-20 04:18 (-0700 UTC), Moz Champion (Dan) wrote:

    > Tumbleweed wrote:
    >> "Moz Champion (Dan)" wrote in message
    >> newsoednT4B542Phy3YnZ2dnUVZ_sfinZ2d@mozilla.org...
    >>
    >>> Again, how is 99.8 percent catch ratio and one false postive in over
    >>> six months not working?
    >>>

    >>
    >> does the JMC filter only work if its you? Otherwise, why doesnt it
    >> work for me and others? Is our spam more sophisticated?
    >>
    >> my training.dat is 748kb. Thats after about 2 months of use. Should I
    >> delete and start again?
    >>
    >> Also, how do I teach it what a 'good' message is? is it necessary to
    >> have false hits and tell it they arent spam? I dont get many of those,
    >> just lots of spam that isnt marked as such. I'd have thought that if
    >> every single message that contained the word '0EM' was marked by me as
    >> spam (when it missed it as it seems to) that after at least 20 hits,
    >> maybe many more, it would have worked that out.

    >
    > Thats not what I wrote.


    God only knows what you (meant to) write, because you seem to insist on the
    right to interpret your (own) words in whatever way you wish. :-(

    > IF you are not having problems, then dont even WORRY about the size of
    > your training.dat.
    > PROBLEMS due to training.dat size dont even START till over 1MB anyway.


    You are aware that you seem to be the only one here fixated on this idea
    that there is a threshold at which problems due to the size of training.dat
    begin to occur.

    You also use this odd time argument in defending your specific use of
    training.dat. I really do wish you would take some of the time you spend
    defending your spurious arguments regarding JMC actually to inform yourself
    about Bayesian filters and how they work. :-(

    > Well, I dont know about more sophisticated spam, but whenever a user
    > here has forwarded on spam to see if my JMC 'catches' it, it has, when
    > his has not!
    >
    > What is YOUR catch ratio? To figure it out simply count the number of
    > messages that you see in your inbox (es) and then the number of spam
    > messages caught by JMC over the same period of time. To make it easier
    > to figure use the length of time it takes to accumulate 100 spams
    > (caught by JMC) as the time period.
    >
    > All I do is. Mark any message that is spam as junk, and unmark any
    > message that isnt spam (but JMC caught) as non junk. Period. I dont
    > worry about 'ham' or teaching it ham, or whats good or not. All I care
    > about is is catching most of the spam and putting it aside for me. And
    > it is catching 99.8% of the spam sent my way.


    How nice of you finally to admit that you don't give a rat's a** about how
    Bayesian filters are designed -- /i.e./, how they actually, you know, are
    intended to work. :-(

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  3. Re: What about the junk filter?

    On 2007-01-20 04:50 (-0700 UTC), Tumbleweed wrote:

    > "Brian Heinrich" wrote in message
    > news:jrednZkHcOE2xCzYnZ2dnUVZ_segnZ2d@mozilla.org. ..
    >>> I dont get many of those, just lots of spam that isnt marked as such. I'd
    >>> have thought that if every single message that contained the word '0EM'
    >>> was marked by me as spam (when it missed it as it seems to) that after at
    >>> least 20 hits, maybe many more, it would have worked that out.

    >> Not necessarily. OEM on itself is just a token. If you've marked e-mail
    >> in which you've corresponded with someone named Frank about OEM software
    >> as ham, there's a not-insignificant probability that the tokens |Frank|
    >> and |OEM| in the same e-mail won't be marked as spam.
    >>
    >> The problem is that the token |OEM| might be found in conjunction with
    >> |cheap|, |che ap|, |cheep|, |ch eep|, |ch eap|, |Microsoft|, |Macrosoft|,
    >> |Microsfat|, |Microsaft|, |Macrosaft|, &c, &c; in other words, all those
    >> mis-spellings are just another way of 'poisoning' Bayesian filters.

    >
    > this is '0EM where the '0' is a zero, not a letter. Hence its never
    > encountered elsewhere, which is why I am surprised its missing it. Same for
    > s0ftware where the second letter is a number not a letter.


    I'm not entirely sure what's happening; I just checked my own training.dat,
    and neither |OEM| nor |0EM| nor |s0ftware| appears there (altho'
    |software| does); you might want to have a look at your own.

    Sorry I wasn't able to be of more help. . . .
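
To make the token discussion above concrete, here is a minimal sketch of why |0EM| (with a zero) and |OEM| (with a letter) count as unrelated tokens, and why each misspelling splits the evidence. The whitespace tokenizer and the three sample messages are assumptions for illustration only; this is not how Thunderbird's JMC tokenizes mail or what training.dat contains.

```python
from collections import Counter

def tokenize(text):
    # Assumption: split on whitespace only. A real filter's tokenizer
    # is more involved (headers, punctuation, case handling, etc.).
    return text.split()

# Hypothetical spam subjects, invented for this sketch.
spam_messages = [
    "cheap 0EM s0ftware",
    "che ap OEM software",
    "cheep 0EM Macrosoft",
]

# Count how many spam messages each token appears in.
spam_counts = Counter()
for message in spam_messages:
    spam_counts.update(set(tokenize(message)))

for token in ("0EM", "OEM", "cheap", "cheep", "che", "ap"):
    print(token, spam_counts[token])
```

Each variant accumulates its own, smaller count, so a filter has to see every spelling often enough on its own before that spelling carries much weight.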

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  4. Re: What about the junk filter?

    Brian Heinrich wrote:
    > On 2007-01-20 04:05 (-0700 UTC), Moz Champion (Dan) wrote:
    >
    >> Chris Barnes wrote:
    >>> Moz Champion (Dan) wrote:
    >>>> Well, a Baysien filter doesnt look at only subject lines or such it
    >>>> looks at the entire message.
    >>>>
    >>>> And I never found that to be the case, I had to click on a 'similar'
    >>>> spam for perhaps 3 times, and after that it was caught. My JMC
    >>>> catches image spams, every other type of spam, you name it. As I
    >>>> said it is catching 200 spam for every 1 I see in my inbox..
    >>>
    >>> Sort of a related question for you: I am fairly familiar with
    >>> SpamAssassin (a server side spam scoring system). It's Baysian
    >>> filter requires that it you teach it not only what is spam, but also
    >>> what is NOT spam (ie. ham).
    >>> To do this, I took a corpus of ~500 legitimate messages and "taught"
    >>> SA that they were ham.
    >>>
    >>> Now for the question:
    >>> Does TB's JMC learn ham? If so, how (other than marking the rare
    >>> message that is mis-identified as spam as "not junk")?

    >>
    >> In a minimal manner.
    >> When you first start JMC, all messages are marked as spam until you
    >> mark some as non junk (and restart). Then all messages are not marked
    >> as spam until you mark some as junk (and restart)
    >>
    >> Other than that, its false positives.

    >
    > Wrong.
    >
    >> But, then, why does it need to? A catch ratio of 99.8% is hardly poor
    >> or non effective. In other words, for every 1 spam that 'gets through'
    >> there were 200 spams that did not. Quite effective.
    >>
    >> How much time did you spend marking those 500 messages to teach it
    >> ham? Twenty minutes? Well, I could mark all the ones JMC misses for
    >> almost 10 years with those twenty minutes (2 seconds per). So even if
    >> teaching it what is ham improved your catch ratio to 100% (which I
    >> doubt) - You have still spend MORE time doing it than I will for 10
    >> years.
    >>
    >> Teaching it with more ham will NOT improve your catch ratio, it will
    >> improve your false positive ratio - which even you admit is rare! So
    >> what is the point? You might be 500 times LESS likely to get a false
    >> positive than I am. I only get one in six months anyway, do you get
    >> one in 250 years .

    >
    > If, either through inability or wilfulness, you do not or will
    > understand how something is intended to work, fine -- but please don't
    > try to pass off that mis-understanding on others.
    >
    > /b.
    >



    You keep calling me wrong... yet my catch ratio is 99.8 percent. How
    wrong is that?

    Once again, you remind me of a scientist who claims that flies can't fly,
    because the theory says they can't! Well gum durn it, flies DO fly, seems
    no one taught them they can't!

    What works, works.

  5. Re: What about the junk filter?

    Brian Heinrich wrote:
    > On 2007-01-20 04:18 (-0700 UTC), Moz Champion (Dan) wrote:
    >
    >> Tumbleweed wrote:
    >>> "Moz Champion (Dan)" wrote in message
    >>> newsoednT4B542Phy3YnZ2dnUVZ_sfinZ2d@mozilla.org...
    >>>
    >>>> Again, how is 99.8 percent catch ratio and one false postive in over
    >>>> six months not working?
    >>>>
    >>>
    >>> does the JMC filter only work if its you? Otherwise, why doesnt it
    >>> work for me and others? Is our spam more sophisticated?
    >>>
    >>> my training.dat is 748kb. Thats after about 2 months of use. Should I
    >>> delete and start again?
    >>>
    >>> Also, how do I teach it what a 'good' message is? is it necessary to
    >>> have false hits and tell it they arent spam? I dont get many of
    >>> those, just lots of spam that isnt marked as such. I'd have thought
    >>> that if every single message that contained the word '0EM' was marked
    >>> by me as spam (when it missed it as it seems to) that after at least
    >>> 20 hits, maybe many more, it would have worked that out.

    >>
    >> Thats not what I wrote.

    >
    > God only knows what you (meant to) write, because you seem to insist on
    > the right to interpret your (own) words in whatever way you wish. :-(
    >
    >> IF you are not having problems, then dont even WORRY about the size of
    >> your training.dat.
    >> PROBLEMS due to training.dat size dont even START till over 1MB anyway.

    >
    > You are aware that you seem to be the only one here fixated on this idea
    > that there is a threshold at which problems due to the size of
    > training.dat begin to occur.
    >
    > You also use this odd time argument in defending your specific use of
    > training.dat. I really do wish you would take some of the time you
    > spend defending your spurious arguments regarding JMC actually to inform
    > yourself about Bayesian filters and how they work. :-(
    >
    >> Well, I dont know about more sophisticated spam, but whenever a user
    >> here has forwarded on spam to see if my JMC 'catches' it, it has, when
    >> his has not!
    >>
    >> What is YOUR catch ratio? To figure it out simply count the number of
    >> messages that you see in your inbox (es) and then the number of spam
    >> messages caught by JMC over the same period of time. To make it easier
    >> to figure use the length of time it takes to accumulate 100 spams
    >> (caught by JMC) as the time period.
    >>
    >> All I do is. Mark any message that is spam as junk, and unmark any
    >> message that isnt spam (but JMC caught) as non junk. Period. I dont
    >> worry about 'ham' or teaching it ham, or whats good or not. All I care
    >> about is is catching most of the spam and putting it aside for me. And
    >> it is catching 99.8% of the spam sent my way.

    >
    > How nice of you finally to admit that you don't give a rat's a** about
    > how Bayesian filters are designed -- /i.e./, how they actually, you
    > know, are intended to work. :-(
    >
    > /b.
    >



    And, according to scientists, flies can't fly either. So where does that
    leave you? Denying what works.

    Once again, a 99.8% success ratio is working - regardless of your
    complaints to the contrary, regardless that you complain it can't,
    regardless of your theories.

    It works, plain and simple. And until it doesn't work, all your theories
    don't mean squat (thanks to Bill Horne)

  6. Re: What about the junk filter?

    On 2007-01-20 12:58 (-0700 UTC), Moz Champion (Dan) wrote:

    > Brian Heinrich wrote:




    >> If, either through inability or wilfulness, you do not or will
    >> understand how something is intended to work, fine -- but please don't
    >> try to pass off that mis-understanding on others.

    >
    > You keep calling me wrong... yet my catch ratio is 99.8 percent. How
    > wrong is that?
    >
    > Once again, you remind me of a scientest who claims that flies cant fly,
    > because the theory says they cant! Well gum durn it, flies DO fly, seems
    > no one taught them they cant!
    >
    > What works, works.


    Dan, this *isn't* about *you*, *your* use or mis-use of JMC, or even *your*
    catch ratio -- so little so, in fact, that I've not bothered to raise the
    question of your wonky numbers -- ; it *is* about providing *users* with the
    best-informed answers and best options available.

    Moreover, by constantly hammering home how efficient JMC are for you, you
    implicitly raise unreasonable expectations for the practice you recommend.

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  7. Re: What about the junk filter?

    On 2007-01-20 13:02 (-0700 UTC), Moz Champion (Dan) wrote:



    > And, according to scientists flies cant fly either. So where does that
    > leave you? Denying what works.


    Nope. Not at all.

    > Once again a 99.8% success ratio [ . . . ]


    This has been bugging me for a while now: 99.8% isn't a ratio; nor does the
    number indicate what you seem to think it means. . . .

    > [ . . . ] is working - regardless of your
    > complaints to the contrary - regardless that you complain it cant,
    > regardless of your theories.


    What theories? I've merely indicated how Bayesian filters work.

    > It works, plain and simple. And until it doesnt work all your theories
    > dont mean squat (thanks to Bill Horne)


    Again: What theories?

    Let me point this out to you again (quoting from
    ):


    Bayes's theorem, in the context of spam, says that the probability that an
    email is spam, given that it has certain words in it, is equal to the
    probability of finding those certain words in spam email, times the
    probability that any email is spam, divided by the probability of finding
    those words in any email:

    Pr(spam|words) = (Pr(words|spam) * Pr(spam)) / Pr(words)


    You did catch the bit there about 'the probability of finding those words in
    any email', right?

    So, are you now telling us that the people who actually implemented these
    filtering techniques based on Bayes' theorem were wrong and that you
    understand the mathematics behind it better than they do?
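
The formula quoted above can be worked through numerically. The sketch below is only an illustration of Bayes' theorem for a two-class (spam/ham) case; the function name and all probabilities are made-up assumptions, not values taken from JMC or from anyone's training.dat.

```python
def spam_posterior(p_words_given_spam, p_spam, p_words_given_ham):
    """Pr(spam|words) = Pr(words|spam) * Pr(spam) / Pr(words)."""
    p_ham = 1.0 - p_spam
    # Pr(words) is the chance of seeing the words in *any* email,
    # so both the spam and the ham statistics contribute to it.
    p_words = p_words_given_spam * p_spam + p_words_given_ham * p_ham
    return (p_words_given_spam * p_spam) / p_words

# Hypothetical numbers: the same words appear in 30% of spam, and half of
# all mail is spam. Only the ham side differs between the two cases.
common_in_ham = spam_posterior(0.30, 0.5, 0.20)  # words also common in ham
rare_in_ham = spam_posterior(0.30, 0.5, 0.01)    # words almost never in ham

print(round(common_in_ham, 3))
print(round(rare_in_ham, 3))
```

With identical spam statistics, the result moves from 0.6 to roughly 0.97 purely because of what is known about ham, which is the role the denominator plays in the quoted formula.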

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  8. Re: What about the junk filter?

    On 1/20/2007 5:06 PM Brian Heinrich spake these words of knowledge:

    > On 2007-01-20 13:02 (-0700 UTC), Moz Champion (Dan) wrote:
    >
    >
    >
    >> And, according to scientists flies cant fly either. So where does that
    >> leave you? Denying what works.

    >
    > Nope. Not at all.
    >
    >> Once again a 99.8% success ratio [ . . . ]

    >
    > This has been bugging me for a while now: 99.8% isn't a ratio; nor does the
    > number indicate what you seem to think it means. . . .
    >


    In fact, any number expressed as a fraction (including percentages,
    which are fractions with a denominator of 100) is by definition a ratio.

    Outside of that, FWIW, I agree with you about the entire thread, Brian.


    Dan: the science of aerodynamics once led scientists to state that
    according to accepted and working aerodynamic theories, *bumblebees*
    could not fly. Not flies. This has not been the case for some time.

    Brian: in the words of Robert A. Heinlein, never attempt to teach a pig
    to sing. It wastes your time and annoys the pig. I know you wish to
    continue to be civil, and I applaud that. But the word 'fruitless'
    comes to mind.


    RFT!!!
    Dave Kelsen
    --
    At dinner yesterday, I tried to cut myself a slice of prime rib, but it
    was only divisible by itself and one.

  9. Re: What about the junk filter?

    On 2007-01-20 22:38 (-0700 UTC), Dave Kelsen wrote:

    > On 1/20/2007 5:06 PM Brian Heinrich spake these words of knowledge:
    >
    >> On 2007-01-20 13:02 (-0700 UTC), Moz Champion (Dan) wrote:
    >>
    >>
    >>
    >>> And, according to scientists flies cant fly either. So where does
    >>> that leave you? Denying what works.

    >>
    >> Nope. Not at all.
    >>
    >>> Once again a 99.8% success ratio [ . . . ]

    >>
    >> This has been bugging me for a while now: 99.8% isn't a ratio; nor
    >> does the number indicate what you seem to think it means. . . .

    >
    > In fact, any number expressed as a fraction (including percentages,
    > which are fractions with a denominator of 100) is by definition a ratio.


    Fair 'nuff. :-[

    I was thinking of something that was dinned into me in grade school;
    basically, if Dan's *rate* of successfully identified spam is 99.8%, the
    *ratio* of correctly identified spam to false positives would be 500:1, and
    there would be a *probability* of .998 that JMC will correctly identify spam.
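
The rate/ratio distinction above is easy to check with a few lines of arithmetic. This is a minimal sketch with hypothetical helper names; it assumes "caught" and "missed" both refer to spam only (missed spam being false negatives), which is one reading of the figures in this thread.

```python
def catch_rate(caught, missed):
    # Fraction of all spam that the filter caught.
    return caught / (caught + missed)

def caught_per_miss(rate):
    # How many spam messages are caught for each one that gets through.
    return rate / (1.0 - rate)

rate_from_200_to_1 = catch_rate(200, 1)    # "200 caught for every 1 missed"
ratio_from_998 = caught_per_miss(0.998)    # a 99.8% catch rate

print(round(rate_from_200_to_1, 4))
print(round(ratio_from_998))
```

Under that reading, 200 caught per 1 missed works out to about 99.5%, while a 99.8% rate corresponds to about 499 caught per 1 missed (close to the 500:1 mentioned above), so the two figures quoted in the thread are not the same claim.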

    > Outside of that, FWIW, I agree with you about the entire thread, Brian.
    >
    > Dan: the science of aerodynamics once led scientists to state that
    > according to accepted and working aerodynamic theories, *bumblebees*
    > could not fly. Not flies. This has not been the case for some time.
    >
    > Brian: in the words of Robert A. Heinlein, never attempt to teach a pig
    > to sing. It wastes your time and annoys the pig. I know you wish to
    > continue to be civil, and I applaud that. But the word 'fruitless'
    > comes to mind.


    Unfortunately, I lent my copy of /Time Enough for Love/ to someone in 1993
    and never got it back. I should pro'ly invest in a new copy; after the last
    couple o' years, I think it's definitely time to read 'The Man who was too
    Lazy to Fail' again. :-D

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  10. Re: What about the junk filter?

    Brian Heinrich wrote:
    > On 2007-01-20 12:58 (-0700 UTC), Moz Champion (Dan) wrote:
    >
    >> Brian Heinrich wrote:

    >
    >
    >
    >>> If, either through inability or wilfulness, you do not or will
    >>> understand how something is intended to work, fine -- but please
    >>> don't try to pass off that mis-understanding on others.

    >>
    >> You keep calling me wrong... yet my catch ratio is 99.8 percent. How
    >> wrong is that?
    >>
    >> Once again, you remind me of a scientest who claims that flies cant
    >> fly, because the theory says they cant! Well gum durn it, flies DO
    >> fly, seems no one taught them they cant!
    >>
    >> What works, works.

    >
    > Dan, this *isn't* about *you*, *your* use or mis-use of JMC, or even
    > *your* catch ratio -- so little so, in fact, that I've not bothered to
    > raise the question of your wonky numbers -- ; it *is* about providing
    > *users* with the best-informed answers and best options available.
    >
    > Moreover, by constantly hammering home how efficient JMC are for you,
    > you implicitly raise unreasonable expectations for the practice you
    > recommend.
    >
    > /b.
    >



    And telling them what works so well for me is NOT the best option
    available? It works, it works well, and it takes almost no time at all
    to do it. That's not a good option?

    Once again, you are telling flies they can't fly. Well, the flies ain't
    listening!

    What works, works. Period.


    You can explain again and again and again why flies CAN'T fly. But they
    do. You can explain again and again and again why the way I suggest
    SHOULDN'T work, but the fact of the matter is, it does work. Period.

    In fact, in the thread that started this again, the OP came back and said
    his JMC has improved since he followed the instructions I suggested.
    And so have others in the past. So it works, and it works well.

    So fine, you got a BETTER way to do it? One that takes LESS time, is
    LESS complicated? No? Then why is it better? To do it YOUR way you have
    to download another program, learn how to use it, and then spend some
    time 'doing' it. For what benefit? 1 more message caught out of each 200?

  11. Re: What about the junk filter?

    On 2007-01-21 03:49 (-0700 UTC), Moz Champion (Dan) wrote:

    > Brian Heinrich wrote:
    >> On 2007-01-20 12:58 (-0700 UTC), Moz Champion (Dan) wrote:
    >>
    >>> Brian Heinrich wrote:

    >>
    >>
    >>
    >>>> If, either through inability or wilfulness, you do not or will
    >>>> understand how something is intended to work, fine -- but please
    >>>> don't try to pass off that mis-understanding on others.
    >>>
    >>> You keep calling me wrong... yet my catch ratio is 99.8 percent. How
    >>> wrong is that?
    >>>
    >>> Once again, you remind me of a scientest who claims that flies cant
    >>> fly, because the theory says they cant! Well gum durn it, flies DO
    >>> fly, seems no one taught them they cant!
    >>>
    >>> What works, works.

    >>
    >> Dan, this *isn't* about *you*, *your* use or mis-use of JMC, or even
    >> *your* catch ratio -- so little so, in fact, that I've not bothered to
    >> raise the question of your wonky numbers -- ; it *is* about providing
    >> *users* with the best-informed answers and best options available.
    >>
    >> Moreover, by constantly hammering home how efficient JMC are for you,
    >> you implicitly raise unreasonable expectations for the practice you
    >> recommend.

    >
    > And telling them what works for me so well is NOT the best option
    > available? It works, it works well, and it takes almost no time at all
    > to do it. Thats not a good option?


    How much clearer can I make it that this isn't about you or what works for
    you? It is about the fact that the advice you give overtly ignores --
    indeed, denies -- one part of how Bayes' theorem is applied to spam.

    > One again, you are telling flies they cant fly, well, the flies aint
    > listening!


    This comparison is not at all to the point.

    If a scientist is looking at a bumblebee and wondering how it is that it can
    fly, he or she is not insisting to the bee, 'You can't fly!'; he or she is
    thinking, 'Within our current theoretical understanding of how these things
    work, here is an instance that would seem to contradict that understanding
    -- the proverbial exception that proves the rule -- ; how, then, must that
    theoretical framework be adjusted or reworked in order to include this
    instance?'

    A better example would be Newtonian mechanics: By the latter part of the
    19th century, physicists thought they had it all pretty much sussed out and
    all that was required was to fill in some gaps to arrive at The Grand
    Unified Theory of how Everything Works. Then this Swiss patent clerk came
    along and threw a spanner into the works. Did the physicists say, 'You're
    wrong!'? or did they try to figure out how their theories worked within the
    context of this new theory?

    > What works, works. Period.


    And there's more than one way in which to skin a cat.

    > You can explain again and again and again why flies CANT fly. But they
    > do. You can explain again and again and again why the way I suggest
    > SHOULDNT work, but the fact of the matter is, it does work. Period.


    Would you be so kind as to find a posting in which I've said that what you
    so insistently assert shouldn't or doesn't work?

    What I have said -- seemingly /ad nauseam/ -- is that what you're asserting
    will work as a short-term solution, but, given the conditions of a Bayesian
    filter (and bearing in mind that Bayes' theorem has to do with conditional
    probabilities), performance will eventually degrade.

    I've indicated elsewhere that, using my own method (/i.e./, marking the
    occasional ham message as such), it took close to a year for the performance
    of JMC to degrade enough for me to want to do something about it.

    > In fact in the thread that started this again, the OP came back and said
    > his JMC has improved since he followed the instructions I suggestted.
    > And so have others in the past. So it works, and it works well.


    . . . at least for a time, and for at least two reasons:

    1) not providing adequate good tokens will skew the performance of JMC
    sooner rather than later; /and/
    2) you assume that time spent maintaining JMC is wasted time and that it's
    more time-efficient to reset training.dat and start from scratch when
    performance degrades.

    Again, I've never said that your solution doesn't work; I've said that it's
    a short-term solution that (because it doesn't fully take into account how
    JMC function) has a very strong likelihood of recreating the situation it
    was intended to cure.
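
One way to see why too few "good" tokens can skew results is a per-token estimate in the style popularized by Paul Graham's "A Plan for Spam". The sketch below is a simplified version of that idea (no double-weighting of ham, a neutral 0.5 for unseen tokens); the function name and the counts are assumptions, and nothing here is claimed to match what JMC actually computes.

```python
def token_spam_probability(bad, good, nbad, ngood):
    """Estimate how 'spammy' one token is from training counts.

    bad/good: messages containing the token in the spam/ham corpus.
    nbad/ngood: total spam/ham messages trained.
    """
    b = min(1.0, bad / nbad) if nbad else 0.0
    g = min(1.0, good / ngood) if ngood else 0.0
    if b + g == 0:
        return 0.5  # assumption: an unseen token carries no evidence
    # Clamp so that no single token is treated as absolute proof.
    return max(0.01, min(0.99, b / (b + g)))

# Hypothetical token seen in 3 of 100 trained spam messages.
no_ham_trained = token_spam_probability(3, 0, 100, 0)
with_ham_trained = token_spam_probability(3, 30, 100, 500)

print(no_ham_trained)
print(round(with_ham_trained, 3))
```

In this toy model, a token that has only ever been seen in spam pins to the 0.99 ceiling no matter how ordinary it is, whereas once ham counts exist the same token drops to about 0.33. That is the kind of skew that a lack of ham training could produce, under these assumptions.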

    > So fine, you got a BETTER way to do it?


    Well, um, yes; that's kind of what I've been stating for the past several weeks.

    > One that takes LESS time, is
    > LESS complicated? No? Then why is it better?


    Because it

    1) actually takes into account how JMC are intended to work; /and/
    2) gives the user options beyond just resetting training.dat.

    And trying to suggest that a methodology is somehow 'better' simply because
    it takes less time and is less complicated is so specious as to be laughable.

    > To do it YOUR way you have
    > to download another program, learn how to use it, and then spend some
    > time 'doing' it.


    Trust me, I'd far rather have the abilities in the BJT directly in the UI --
    this is one of those occasions on which it seems that m.o, in the pursuit of
    simplicity and user friendliness, has implemented a function that is
    'stupid' (in the sense that the user has very little control over it) -- ;
    in fact, there would ideally be a user-settable automatic method for
    cleaning the cruft out of training.dat at fixed intervals (which, unlike
    auto-compaction, would be linked to the number of e-mails processed).

    > For what benefit? 1 more message caught out of each 200?


    Well, wouldn't that really depend? -- If you're getting three or four
    hundred pieces of spam a day, taking five minutes every couple of weeks to
    ensure that your JMC are working as effectively as possible strikes me as a
    better use of time than resetting it every month or so.

    But apparently that's just me: I'd rather spend the time now in order to
    save myself the time later. . . .

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  12. Re: What about the junk filter?


    "Moz Champion (Dan)" wrote in message
    news:1_OdnR7UfpWs2C7YnZ2dnUVZ_veinZ2d@mozilla.org. ..

    >
    >
    > And telling them what works for me so well is NOT the best option
    > available? It works, it works well, and it takes almost no time at all to
    > do it. Thats not a good option?
    >
    > One again, you are telling flies they cant fly, well, the flies aint
    > listening!
    >
    > What works, works. Period.
    >


    It's not for me, as it patently doesn't work! If it can't spot 0EM (zero e m)
    in the header after 20 goes, when will it spot it?

    --
    Tumbleweed

    email replies not necessary but to contact use;
    tumbleweednews at hotmail dot com




  13. Re: What about the junk filter?

    On 1/19/2007 10:42 AM, Chris Barnes wrote:

    > Moz Champion (Dan) wrote:
    >> Again, how is 99.8 percent catch ratio and one false postive in
    >> over six months not working?

    >
    > I see that the question I just asked is very much in line with
    > Brian's comments (the quote above was your reply to Brian's statement
    > that teaching ham was integral to an effective Baysian filter).
    >
    > To which my comment is: I'm not getting anywhere near 98%
    > effectiveness with TB's JMC. It's perhaps somewhere around 60%.
    >
    > Now granted, that is 60% of what SpamAssassin on the server missed
    > (SA catches about 80% of the total spam I receive). But even
    > combining SA with JMC, that still works out to a 92% TOTAL effective
    > rate.


    My problem is not false positives; I get very, very few. But I get a
    lot of false negatives. And no matter how much training I do, that
    doesn't change.

    I would guess my "catch" percentage is about 70%, and at over 200 emails
    a day, that is not sufficient. On the other hand, Forte Agent's spam
    filtering, which is also Bayesian, has a catch ratio of 99.9% on 13,450
    emails.
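
The combined SpamAssassin-plus-JMC figure quoted above (80% caught on the server, then 60% of the remainder caught in the client, giving 92%) follows from multiplying the miss rates of each stage. A minimal sketch, with a hypothetical function name and assuming the stages act one after another on whatever the previous stage missed:

```python
def combined_catch_rate(*stage_rates):
    # Each stage only sees what earlier stages let through, so the
    # overall miss rate is the product of the per-stage miss rates.
    missed = 1.0
    for rate in stage_rates:
        missed *= (1.0 - rate)
    return 1.0 - missed

sa_then_jmc = combined_catch_rate(0.80, 0.60)
print(round(sa_then_jmc, 2))
```

The same function shows how much a weak second stage still adds: 20% of spam gets past the first stage here, and the second stage removes more than half of that.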

    --
    Best regards
    Gord McFee

  14. Re: What about the junk filter?

    On Thu, 18 Jan 2007 11:03:30 +0000, Jim S wrote:

    > Can the adaptive junk filter in TB have got less efficient?
    > I used to find it very efficient, but I have found SpamBrave for OE much
    > more reliable and even SpamBayes for Outlook catches more suspects.
    > My provider can tag spam and even with that switched on TB is missing some
    > of the tagged ones.
    > Has something changed?


    Thought I would pop back here to point out that the Junk Filter in SeaMonkey
    is working perfectly. The Thunderbird one was reset at the time of
    installing SM and the same mails are going to both. TB (1.5.0.9) is missing
    some that SM catches.
    --
    Jim S
    Tyneside UK
    http://www.jimscott.co.uk

  15. Re: What about the junk filter?

    On 2007-01-21 14:41 (-0700 UTC), Jim S wrote:

    > On Thu, 18 Jan 2007 11:03:30 +0000, Jim S wrote:
    >
    >> Can the adaptive junk filter in TB have got less efficient?
    >> I used to find it very efficient, but I have found SpamBrave for OE much
    >> more reliable and even SpamBayes for Outlook catches more suspects.
    >> My provider can tag spam and even with that switched on TB is missing some
    >> of the tagged ones.
    >> Has something changed?

    >
    > Thought I would pop back here to point out that the Junk Filter in SeaMonkey
    > is working perfectly. The Thunderbird one was reset at the time of
    > installing SM and the same mails are going to both. TB (1.5.0.9) is missing
    > some that SM catches.


    Do you have build numbers for both SM and Tb?

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  16. Re: What about the junk filter?

    On Sun, 21 Jan 2007 15:12:49 -0700, Brian Heinrich wrote:

    > On 2007-01-21 14:41 (-0700 UTC), Jim S wrote:
    >
    >> On Thu, 18 Jan 2007 11:03:30 +0000, Jim S wrote:
    >>
    >>> Can the adaptive junk filter in TB have got less efficient?
    >>> I used to find it very efficient, but I have found SpamBrave for OE much
    >>> more reliable and even SpamBayes for Outlook catches more suspects.
    >>> My provider can tag spam and even with that switched on TB is missing some
    >>> of the tagged ones.
    >>> Has something changed?

    >>
    >> Thought I would pop back here to point out that the Junk Filter in SeaMonkey
    >> is working perfectly. The Thunderbird one was reset at the time of
    >> installing SM and the same mails are going to both. TB (1.5.0.9) is missing
    >> some that SM catches.

    >
    > Do you have build numbers for both SM and Tb?
    >
    > /b.


    Tb 1.5.0.9(20061207) SM 1.1 (latest)
    --
    Jim S
    Tyneside UK
    http://www.jimscott.co.uk

  17. Re: What about the junk filter?

    On 2007-01-21 16:45 (-0700 UTC), Jim S wrote:

    > On Sun, 21 Jan 2007 15:12:49 -0700, Brian Heinrich wrote:




    >> Do you have build numbers for both SM and Tb?

    >
    > Tb 1.5.0.9(20061207) SM 1.1 (latest)


    Hmm . . . not your fault, but that wasn't quite what I was hoping for. :-\

    Given that the Mozilla codebase tends to branch a lot, I was hoping to be
    able to establish whether or not the two clients were built from the same
    branch, but forgot that some of that info, at least in Tb, isn't that easy
    to access. . . .

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  18. Re: What about the junk filter?

    On Sun, 21 Jan 2007 22:01:02 -0700, Brian Heinrich wrote:

    > On 2007-01-21 16:45 (-0700 UTC), Jim S wrote:
    >
    >> On Sun, 21 Jan 2007 15:12:49 -0700, Brian Heinrich wrote:

    >
    >
    >
    >>> Do you have build numbers for both SM and Tb?

    >>
    >> Tb 1.5.0.9(20061207) SM 1.1 (latest)

    >
    > Hmm . . . not your fault, but that wasn't quite what I was hoping for. :-\
    >
    > Given that the Mozilla codebase tends to branch a lot, I was hoping to be
    > able to establish whether or not the two clients were built from the same
    > branch, but forgot that some of that info, at least in Tb, isn't that easy
    > to access. . . .
    >
    > /b.


    If it's possible I can probably do it.
    BTW TB is having a 25% miss rate this morning.
    --
    Jim S
    Tyneside UK
    http://www.jimscott.co.uk

  19. Re: What about the junk filter?

    On 2007-01-22 03:48 (-0700 UTC), Jim S wrote:

    > On Sun, 21 Jan 2007 22:01:02 -0700, Brian Heinrich wrote:
    >
    >> On 2007-01-21 16:45 (-0700 UTC), Jim S wrote:
    >>
    >>> On Sun, 21 Jan 2007 15:12:49 -0700, Brian Heinrich wrote:

    >>
    >>
    >>>> Do you have build numbers for both SM and Tb?
    >>> Tb 1.5.0.9(20061207) SM 1.1 (latest)

    >> Hmm . . . not your fault, but that wasn't quite what I was hoping for. :-\
    >>
    >> Given that the Mozilla codebase tends to branch a lot, I was hoping to be
    >> able to establish whether or not the two clients were built from the same
    >> branch, but forgot that some of that info, at least in Tb, isn't that easy
    >> to access. . . .

    >
    > If it's possible I can probably do it.
    > BTW TB is having a 25% miss rate this morning.


    What I was really hoping for was a quick way to access the equivalent of
    |Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a2pre) Gecko/20070121
    Minefield/3.0a2pre|.

    states that SM 1.1 is
    '[p]owered by the same engine as Firefox 2 and the upcoming Thunderbird 2',
    but there's nothing on the release notes page to indicate any changes to
    JMC, so I'm not quite sure what to make of it.
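
    For anyone who does manage to get such a string out of each client, the two pieces that matter for comparing branches are the `rv:` token and the Gecko build date. A rough sketch of pulling them out, assuming the string has the shape of the Minefield example above (the helper name `parse_gecko_ua` and the regular expressions are illustrative, not anything shipped with Tb or SM):

```python
import re

# The example string quoted above, joined onto one line.
UA = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a2pre) "
      "Gecko/20070121 Minefield/3.0a2pre")


def parse_gecko_ua(ua):
    """Return (rv, build_date) from a Gecko-style user-agent string.

    Either element is None if the corresponding token is absent.
    """
    rv = re.search(r"rv:([^)\s;]+)", ua)
    build = re.search(r"Gecko/(\d{8})", ua)
    return (rv.group(1) if rv else None,
            build.group(1) if build else None)


print(parse_gecko_ua(UA))  # ('1.9a2pre', '20070121')
```

    Two clients reporting the same `rv:` value would at least suggest the same branch, though it would not by itself prove that the junk-filter code is identical.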

    Sorry I couldn't be of more help. . . .

    /b.

    --
    People are stupid. /A/ person may be smart, but /people/ are stupid.
    --Stephen M. Graham

  20. Re: What about the junk filter?

    On Mon, 22 Jan 2007 09:25:39 -0700, Brian Heinrich wrote:

    > On 2007-01-22 03:48 (-0700 UTC), Jim S wrote:
    >
    >> On Sun, 21 Jan 2007 22:01:02 -0700, Brian Heinrich wrote:
    >>
    >>> On 2007-01-21 16:45 (-0700 UTC), Jim S wrote:
    >>>
    >>>> On Sun, 21 Jan 2007 15:12:49 -0700, Brian Heinrich wrote:
    >>>
    >>>
    >>>>> Do you have build numbers for both SM and Tb?
    >>>> Tb 1.5.0.9(20061207) SM 1.1 (latest)
    >>> Hmm . . . not your fault, but that wasn't quite what I was hoping for. :-\
    >>>
    >>> Given that the Mozilla codebase tends to branch a lot, I was hoping to be
    >>> able to establish whether or not the two clients were built from the same
    >>> branch, but forgot that some of that info, at least in Tb, isn't that easy
    >>> to access. . . .

    >>
    >> If it's possible I can probably do it.
    >> BTW TB is having a 25% miss rate this morning.

    >
    > What I was really hoping for was a quick way to access the equivalent of
    >|Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a2pre) Gecko/20070121
    > Minefield/3.0a2pre|.
    >
    > states that SM 1.1 is
    > '[p]owered by the same engine as Firefox 2 and the upcoming Thunderbird 2',
    > but there's nothing on the release notes page to indicate any changes to
    > JMC, so I'm not quite sure what to make of it.
    >
    > Sorry I couldn't be of more help. . . .
    >
    > /b.


    I've reset both of them today, so we'll see.
    --
    Jim S
    Tyneside UK
    http://www.jimscott.co.uk
