Re: MD5 Hash of URL's - SpamAssassin
John D. Hardin wrote:
> On Tue, 3 Jul 2007, Matt wrote:
>> Why can't Spamassassin do like a MD5 hash of any URL's in a
>> message and check them against a database? I just think it would
>> help catch things like: geocities.com/spamer123/ or
>> spamer123.tripod.com and etc.
> Too easy to defeat using a URI with random parameters pointing to a
> PHP et. al. page that ignores parameters (assuming you include
> parameters in the hash) or via wildcard DNS using random third- or
> fourth-level hostnames.
Even the path could be made random if they use mod_rewrite or
equivalent. If http://example.com/random/path/gets/ignored always
serves up the contents of salespitch.html, they can generate as many
URLs as they want.
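
To illustrate the point, here is a minimal sketch (not anything SpamAssassin does today) of why hashing the full URL breaks down. The helper names url_md5 and random_url are made up for this example, and example.com stands in for a spammer-controlled host that ignores the path.

```python
import hashlib
import secrets


def url_md5(url: str) -> str:
    """Return the hex MD5 digest of the URL exactly as written."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def random_url(host: str = "example.com") -> str:
    """Build a URL with a random three-segment path (hypothetical spammer trick)."""
    path = "/".join(secrets.token_hex(4) for _ in range(3))
    return f"http://{host}/{path}"


# Five URLs that would all serve the same page behind a rewrite rule...
urls = [random_url() for _ in range(5)]
# ...yet each one hashes to a different digest, so a database keyed on
# the full-URL hash would never see the same entry twice.
digests = {url_md5(u) for u in urls}

for u in urls:
    print(url_md5(u), u)
```

Every message can carry a fresh URL, so a lookup keyed on the whole string has nothing stable to match.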
The concept might still be useful for specific known "grey" hosts with a
mix of legit sites and spam sites (geocities, tripod, blogspot, etc.)
where the URL patterns are known. If you know the pattern is
account.example.com or example.com/account, then throw away the rest of
the URL and list or look up only the base pattern.
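
A rough sketch of that idea, assuming a hand-maintained table of grey hosts. The GREY_HOSTS table, its "path"/"subdomain" labels, and the function names base_pattern and pattern_key are all hypothetical, and the patterns listed for each host are my assumptions about how those services lay out accounts.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical table: where the account name lives for each grey host.
GREY_HOSTS = {
    "geocities.com": "path",       # geocities.com/account (assumed)
    "tripod.com": "subdomain",     # account.tripod.com (assumed)
    "blogspot.com": "subdomain",   # account.blogspot.com (assumed)
}


def base_pattern(url):
    """Reduce a URL on a known grey host to its account-level base pattern.

    Returns None when the host is not in the table or no account is present.
    """
    if "://" not in url:
        url = "//" + url  # let urlsplit treat a bare host/path as a netloc
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    for domain, kind in GREY_HOSTS.items():
        if kind == "path" and host == domain:
            segments = [s for s in parts.path.split("/") if s]
            if segments:
                # Keep only the first path segment; drop the rest and the query.
                return f"{domain}/{segments[0]}"
        elif kind == "subdomain" and host.endswith("." + domain):
            # Keep only the label directly left of the domain, which
            # discards random third- or fourth-level prefixes.
            account = host[: -len(domain) - 1].split(".")[-1]
            return f"{account}.{domain}"
    return None


def pattern_key(url):
    """MD5 of the base pattern, or None if the URL is not on a grey host."""
    base = base_pattern(url)
    return hashlib.md5(base.encode("utf-8")).hexdigest() if base else None


print(base_pattern("http://www.geocities.com/spamer123/index.html?x=1"))
print(base_pattern("http://a1b2.spamer123.tripod.com/r/a/n/d?id=9"))
```

With the URL reduced this way, random paths, parameters and extra hostname labels all collapse to one key per account. Whether you then hash the key or just look up the plain string is a storage detail.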