I haven't run any real statistics about this, but it's worth realizing
that unless there's a significant number of spams that have this behavior,
a rule probably costs more in resource use than it provides in hits.

A quick:

pcregrep -ri 'http://(?:[^/.]+\.){7}'

in my corpus shows about 20 spam hits in some 245000 mails. There could be
reasons this RE wouldn't hit, but in general I wouldn't bother.
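
For anyone who wants to repeat the check on their own corpus, here is a minimal Python sketch of the same count. It reuses the expression from the pcregrep call above; the `hit_rate` helper and the sample messages are hypothetical, made up for illustration, and Python's `re` is only standing in for PCRE here.

```python
import re

# Same expression as the pcregrep call; re.IGNORECASE mirrors the -i flag.
DEEP_HOST = re.compile(r'http://(?:[^/.]+\.){7}', re.IGNORECASE)

def hit_rate(messages):
    """Return (hits, total, fraction) for an iterable of message strings."""
    hits = total = 0
    for text in messages:
        total += 1
        if DEEP_HOST.search(text):
            hits += 1
    return hits, total, (hits / total if total else 0.0)

# Hypothetical sample messages, not taken from the corpus.
sample = [
    "see http://a1.b2.c3.d4.e5.f6.g7.example.com/x",  # 7 dotted labels, then more
    "see http://www.example.com/",                    # ordinary host
    "no link here",
]
print(hit_rate(sample))

# The figures quoted above: about 20 hits in some 245000 mails.
print(f"{20 / 245000:.4%}")  # 0.0082%
```

At well under a hundredth of a percent, the rule would have to be very cheap to be worth running.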

On Tue, Apr 22, 2008 at 01:24:37AM +0200, Karsten Bräckelmann wrote:
> On Mon, 2008-04-21 at 22:16 +0200, mouss wrote:
> > untested yet:

> > uri URI_DEEP5 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.|
> > score URI_DEEP5 0.1
> >
> > uri URI_DEEP6 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-=

> > score URI_DEEP6 1.0
> >
> > uri URI_DEEP7 m|https?://[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.[\w-]\.|
> > score URI_DEEP7 2.0

> Beware, those are adding up. Since you didn't anchor the end of the RE
> to ($|/), whatever hits URI_DEEP7 hits the previous ones, too. Effective
> score: 3.1
> They don't work anyway. You are testing for single chars between the
> dots. And the '-' should be first in a char class, if it is to represent
> itself. Also, I'd prefer to keep them cleaner and more readable using
> quantifiers, rather than copying parts 7 times...
> uri URI_DEEP7 m,https?://([-\w]+\.){6},
> The above forces 6 dots, and thus "7 levels". Hits on even longer URIs,
> too -- the same constraint of adding scores applies here.
> Oh, and yes -- this one is untested, too.
> guenther
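
The points in the quoted review can be checked with a short Python sketch. It is only an approximation: Python's `re` stands in for the Perl regexes SpamAssassin uses, the scores are the ones from the quoted rules, and the helper names (`open_rule`, `closed_rule`, `total_score`) and the test URI are hypothetical. The tiers follow the "6 dots, 7 levels" convention, and the exclusive variant anchors the end to ($|/) as suggested.

```python
import re

# Levels -> score, taken from the quoted URI_DEEP5/6/7 rules.
SCORES = {5: 0.1, 6: 1.0, 7: 2.0}

# Original URI_DEEP7: exactly one character between the dots.
ORIGINAL_DEEP7 = re.compile(r'https?://' + r'[\w-]\.' * 7)

def open_rule(levels):
    # Unanchored: hits hosts with `levels` or more labels.
    return re.compile(r'https?://([-\w]+\.){%d}' % (levels - 1))

def closed_rule(levels):
    # End anchored to ($|/): hits hosts with exactly `levels` labels.
    return re.compile(r'https?://([-\w]+\.){%d}[-\w]+($|/)' % (levels - 1))

def total_score(uri, make_rule):
    # Sum the scores of every tier whose rule hits the URI.
    hits = (s for lv, s in SCORES.items() if make_rule(lv).search(uri))
    return round(sum(hits), 1)

uri = "http://a1.b2.c3.d4.e5.f6.example/path"  # hypothetical 7-level host

# Single-character labels only, so a realistic host is missed.
print(bool(ORIGINAL_DEEP7.search(uri)))                         # False
print(bool(ORIGINAL_DEEP7.search("http://a.b.c.d.e.f.g.h/")))   # True

# Unanchored tiers all hit and the scores add up.
print(total_score(uri, open_rule))    # 3.1
# Anchored tiers are mutually exclusive.
print(total_score(uri, closed_rule))  # 2.0
```

Note that the ($|/) anchor as sketched would miss URIs with an explicit port or a query directly after the host, so treat it as a starting point rather than a finished rule.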

Randomly Selected Tagline:
Hear Me, California! Tomorrow you vote. Again. Good luck, and I hope
you get the Governor you deserve. I think it was Adlai Stevenson who said
that there's nothing more inspiring in human society than the spectacle
of the democratic process being bizarrely subverted by a well-funded
partisan exploitation of a constitutional loophole. How true that is.
- Adam Felber, http://www.felbers.net/mt/archives/001654.html
