On Mon, 2008-04-21 at 22:16 +0200, mouss wrote:
> untested yet:


> uri URI_DEEP5 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.|
> score URI_DEEP5 0.1
>
> uri URI_DEEP6 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.|
> score URI_DEEP6 1.0
>
> uri URI_DEEP7 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.|
> score URI_DEEP7 2.0


Beware, those scores add up. Since you didn't anchor the end of the RE
with ($|/), whatever hits URI_DEEP7 hits the previous ones, too. Effective
score: 3.1
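
To see the stacking, here is a rough sketch in Python. Its re module
stands in for Perl's regex engine, which should behave the same for these
constructs. The sample URI and the helper name total_score are made up
for illustration. The host uses one-character labels because that is all
the quoted patterns can match.

```python
import re

# The three quoted rules and their scores; none anchors the end of the host.
RULES = {
    "URI_DEEP5": (r"https?://" + r"[\w-]\." * 5, 0.1),
    "URI_DEEP6": (r"https?://" + r"[\w-]\." * 6, 1.0),
    "URI_DEEP7": (r"https?://" + r"[\w-]\." * 7, 2.0),
}

def total_score(uri):
    """Return (names of the rules that hit, their summed score) for one URI."""
    hits = [name for name, (pattern, _) in RULES.items() if re.search(pattern, uri)]
    return hits, round(sum(RULES[name][1] for name in hits), 1)

# Hypothetical URI with seven one-character labels in front of the last one.
print(total_score("http://a.b.c.d.e.f.g.example/"))
```

All three rules hit that URI, so the scores are summed rather than only
the URI_DEEP7 score being applied.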

They don't work anyway. You are testing for single chars between the
dots. And the '-' is safest first (or last) in a char class, if it is to
represent itself. Also, I'd prefer to keep them cleaner and more readable
using quantifiers, rather than copying parts 7 times...

uri URI_DEEP7 m,https?://([-\w]+\.){6},

The above requires at least 6 dots, and thus "7 levels". It hits on even
longer URIs, too -- the same caveat about scores adding up applies here.
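
A small Python sketch of the difference, again using re as a stand-in
for Perl regexes. The sample URIs are invented. The EXACT pattern is only
my assumption of what an end-anchored variant could look like, not a
tested SpamAssassin rule, and it ignores ports and userinfo.

```python
import re

# Quoted style: exactly one character per label (no quantifier).
SINGLE_CHAR = re.compile(r"https?://" + r"[\w-]\." * 6)
# Quantifier style: one or more characters per label, repeated six times.
QUANTIFIED = re.compile(r"https?://([-\w]+\.){6}")
# Assumed variant: anchor both ends so only hosts with exactly six dots hit.
EXACT = re.compile(r"^https?://([-\w]+\.){6}[-\w]+($|/)")

seven = "http://aa.bb.cc.dd.ee.ff.example/"     # six dots in the host
eight = "http://zz.aa.bb.cc.dd.ee.ff.example/"  # seven dots in the host

# Print, per pattern, whether it hits the 7-level and the 8-level host.
for name, rx in [("single", SINGLE_CHAR), ("quantified", QUANTIFIED), ("exact", EXACT)]:
    print(name, bool(rx.search(seven)), bool(rx.search(eight)))
```

The single-char pattern misses both hosts, the quantified one hits both,
and the anchored one hits only the host with exactly six dots.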

Oh, and yes -- this one is untested, too.

guenther


--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}