Regex Question - Ubuntu


Thread: Regex Question

  1. Regex Question

    I'm trying to set Pan 0.14.2.91 to reject crossposts here, and my
    newsgroup-field-matching regex expression

    ^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    \,alt\.os\.linux\.ubuntu$

    is grabbing non-crossposts too.

    What is wrong with the above expression?

    (I reely reely hate regex, which in turn makes it harder to use, which
    makes me hate it even more, which . . . .)

    --

    Kill. Kill! KILL!

    < comp.os.linux.misc

    http://rute.2038bug.com/index.html.gz


  2. Re: Regex Question

    * mimus wrote in alt.os.linux.ubuntu:

    > I'm trying to set Pan 0.14.2.91 to reject crossposts here, and my
    > newsgroup-field-matching regex expression
    >
    > ^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    > \,alt\.os\.linux\.ubuntu$
    >
    > is grabbing non-crossposts too.
    >
    > What is wrong with the above expression?
    >
    > (I reely reely hate regex, which in turn makes it harder to use, which
    > makes me hate it even more, which . . . .)
    >


    You should be using Xref: to score crossposts, not Newsgroups:,
    as Newsgroups isn't in the XOVER data.

    Can you post the full scoring rule you are using?

    --
    David

  3. Re: Regex Question

    On Mon, 07 Apr 2008 19:31:20 +0000, SINNER wrote:

    > * mimus wrote in alt.os.linux.ubuntu:
    >
    >> I'm trying to set Pan 0.14.2.91 to reject crossposts here, and my
    >> newsgroup-field-matching regex expression
    >>
    >> ^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    >> \,alt\.os\.linux\.ubuntu$
    >>
    >> is grabbing non-crossposts too.
    >>
    >> What is wrong with the above expression?
    >>
    >> (I reely reely hate regex, which in turn makes it harder to use, which
    >> makes me hate it even more, which . . . .)

    >
    > You should be using Xref: to score crossposts, not Newsgroups:,
    > as Newsgroups isn't in the XOVER data.


    Yeah, I got to there, although it supplies the newsgroups list in the
    "Create Score" window, so it was worth a try . . . .

    I tried Xref: with two different expressions, "\: {3,}" and
    ".*:.* .*:.*", and no luck there either (you can't do Xref from the
    "Create Score" window, so I put them into the scorefile manually).
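    For illustration, here is a minimal Python sketch of the Xref idea. It
    assumes the header value looks like "server group:number group:number ..."
    and that a crosspost is any value with two or more group:number entries.
    The server name, article numbers, and function name are hypothetical, and
    Python's re module may not behave exactly like Pan's regex engine.

```python
import re

# Assumed header layout: "<server> <group>:<number> <group>:<number> ..."
# Two group:number entries in a row are taken to indicate a crosspost.
CROSSPOST_XREF = re.compile(r"\S+:\d+\s+\S+:\d+")

def xref_is_crosspost(xref: str) -> bool:
    """Return True if the Xref value lists at least two group:number entries."""
    return CROSSPOST_XREF.search(xref) is not None

# Hypothetical header values, for illustration only.
one_group = "news.example.com alt.os.linux.ubuntu:12345"
two_groups = ("news.example.com alt.os.linux.ubuntu:12345 "
              "comp.os.linux.advocacy:67890")

# The first expression quoted above asks for a colon followed by three or
# more spaces, which a value in this layout never contains.
colon_spaces = re.compile(r"\: {3,}")
```

    In this sketch, xref_is_crosspost(two_groups) is true and
    xref_is_crosspost(one_group) is false, while colon_spaces finds nothing
    in either value.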

    --

    Usenet: The Biggest and Oldest and Most Powerful Net-Forum of All!


  4. Re: Regex Question

    mimus wrote:
    >
    > ^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    > \,alt\.os\.linux\.ubuntu$
    >
    > is grabbing non-crossposts too.
    >
    > What is wrong with the above expression?
    >
    > (I reely reely hate regex, which in turn makes it harder to use, which
    > makes me hate it even more, which . . . .)
    >


    It is always slightly suspicious when you use anchors like that, as they
    bind tighter than |.
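
    The grouping can be checked with a small Python sketch. It assumes the
    spaces around each "|" in the posted expression were typed literally, and
    it uses Python's re module, which may differ from Pan's regex engine. The
    names POSTED, COMPACT, and is_crosspost are made up for the example.

```python
import re

# Pattern as posted, assuming the spaces around each "|" are literal.
POSTED = (r"^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |"
          r" \,alt\.os\.linux\.ubuntu$")

# Same three alternatives with the spaces removed. Alternation has the
# lowest precedence, so "^" belongs only to the first alternative and "$"
# only to the last: (^group,) | (,group,) | (,group$).
COMPACT = (r"^alt\.os\.linux\.ubuntu,"
           r"|,alt\.os\.linux\.ubuntu,"
           r"|,alt\.os\.linux\.ubuntu$")

def is_crosspost(pattern: str, newsgroups: str) -> bool:
    """Return True if the pattern finds the group next to a comma."""
    return re.search(pattern, newsgroups) is not None

# Hypothetical Newsgroups values, for illustration only.
single = "alt.os.linux.ubuntu"
crossed_first = "alt.os.linux.ubuntu,comp.os.linux.advocacy"
crossed_last = "comp.os.linux.advocacy,alt.os.linux.ubuntu"
```

    Under these assumptions COMPACT matches both crossposted values and
    rejects the single-group one, so the anchors group as intended. POSTED
    matches none of the three, because each alternative then also requires a
    literal space. The sketch does not reproduce the false positives
    described, which may come from how Pan supplies the header.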

  5. Re: Regex Question

    On Mon, 07 Apr 2008 15:15:56 -0400, mimus wrote:

    >I'm trying to set Pan 0.14.2.91 to reject crossposts here, and my
    >newsgroup-field-matching regex expression
    >
    >^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    >\,alt\.os\.linux\.ubuntu$
    >
    >is grabbing non-crossposts too.
    >
    >What is wrong with the above expression?
    >
    >(I reely reely hate regex, which in turn makes it harder to use, which
    >makes me hate it even more, which . . . .)


    Just a guess here, but someone could put a space between the string and
    the commas.

    It might be simpler to give a low score to anything that has
    comp.os.linux.advocacy in its newsgroups: field.
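
    Both ideas above can be sketched in Python without a regex: split the
    Newsgroups value on commas and strip whitespace, so stray spaces around
    the commas stop mattering. This is only an illustration of the logic, not
    something Pan's scorefile can run; the function names are hypothetical.

```python
def parse_newsgroups(header: str) -> list[str]:
    """Split a Newsgroups value on commas, dropping surrounding whitespace."""
    return [name.strip() for name in header.split(",") if name.strip()]

def is_crosspost(header: str) -> bool:
    """A post is treated as a crosspost if it names more than one group."""
    return len(parse_newsgroups(header)) > 1

def mentions_group(header: str, group: str) -> bool:
    """Return True if the exact group name appears in the header."""
    return group in parse_newsgroups(header)

# Hypothetical header value with a space on each side of the comma.
spaced = "alt.os.linux.ubuntu , comp.os.linux.advocacy"
```

    Here is_crosspost(spaced) is true despite the spaces, and
    mentions_group(spaced, "comp.os.linux.advocacy") is true as well.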

    It seems to be working for me. I'm going to add three or four names to be
    filtered, and that should do it.

    This group went from being unmanageable to quite reasonable with that
    one filter.

    BTW, I'm posting from Agent due to a large library of saved posts. If I
    drop Windows completely at some point, I'll most likely run Agent under
    Wine or VirtualBox. Those who filter based on newsreader are likely to
    get false positives.

    Barry
    Barry Jones

  6. Re: Regex Question

    On Tue, 08 Apr 2008 16:11:59 -0400, mimus wrote:
    >
    > "Anchors"? you're talkin' to someone whose regex reference is "man regex",
    > and it mentions no "anchors" (it took some effort not to use the word
    > "steenkin" there).


    Heheh, anchors like start/end of line, begin/end of word, start/end of string, ...


    http://www.greenend.org.uk/rjk/2002/06/regexp.html

  7. Re: Regex Question

    On Tue, 08 Apr 2008 17:15:03 +0200, Henrik wrote:

    > mimus wrote:
    >>
    >> ^alt\.os\.linux\.ubuntu\, | \,alt\.os\.linux\.ubuntu\, |
    >> \,alt\.os\.linux\.ubuntu$
    >>
    >> is grabbing non-crossposts too.
    >>
    >> What is wrong with the above expression?
    >>
    >> (I reely reely hate regex, which in turn makes it harder to use, which
    >> makes me hate it even more, which . . . .)

    >
    > It is always slightly suspicious when you use anchors like that, as they
    > bind tighter than |.


    "Anchors"? you're talkin' to someone whose regex reference is "man regex",
    and it mentions no "anchors" (it took some effort not to use the word
    "steenkin" there).

    But you see, I'm sure, what I was _trying_ to do, to get a match that
    would indicate that "alt.os.linux.ubuntu" was part of a multiple
    newsgroups list . . . .

    (Ah, for the fuzzy software that does what we _want_ instead of what we
    _tell_ it to do!)

    --

    Usenet: The Biggest and Oldest and Most Powerful Net-Forum of All!



  8. Re: Regex Question

    On Tue, 08 Apr 2008 19:18:58 +0000, Bit Twister wrote:

    > On Tue, 08 Apr 2008 16:11:59 -0400, mimus wrote:
    >
    >> "Anchors"? you're talkin' to someone whose regex reference is "man regex",
    >> and it mentions no "anchors" (it took some effort not to use the word
    >> "steenkin" there).

    >
    > Heheh, anchors like, start/end of line, begin/end word, start/end string,...
    >
    > http://www.greenend.org.uk/rjk/2002/06/regexp.html


    Gotcha, but they were deliberate as part of the whole expression.

    I suspect that a simple filtering on subject with regard to "microsoft"
    and "vista" (_Pan's_ regex implementation is case-insensitive) would do
    most of the deed . . . .
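
    A rough Python equivalent of that subject filter, as a sketch only. Pan
    is described above as case-insensitive by default; Python needs the
    re.IGNORECASE flag to behave the same way. The names are hypothetical.

```python
import re

# Match either word anywhere in the subject, ignoring case.
SUBJECT_FILTER = re.compile(r"microsoft|vista", re.IGNORECASE)

def subject_is_filtered(subject: str) -> bool:
    """Return True if the subject mentions either filtered word."""
    return SUBJECT_FILTER.search(subject) is not None
```

    Because this matches substrings, a subject such as "Buena Vista" would
    also be caught; wrapping the alternatives as \b(?:microsoft|vista)\b
    would limit it to whole words.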

    Not that I haven't been enjoying the Vista debacle tremendously.

    --

    It is great Pity, no doubt, fo Fine a Project fhould Mifcarry.

    < _A Modest Defence of Publick Stews_




