Re: Starting a URIBL - Howto? [OT] - SpamAssassin
Marc Perkel wrote:
> I was just wondering from those of you who have done it - how to start
> a URIBL. I'm guessing the process (simplified) is:
> 1) Mine messages for links
> 2) Subtract out anything matching a fairly large white list
> So my first question here is - what do most of you use to mine the
> links in a message with? Can someone point me in the right direction?
> Also - I'm willing to work with and share data with others who are
> already doing this.
Just like a regular sender's IP dnsbl (aka "RBL"), the hardest part is
not having FPs... in fact, this is probably *harder* for URIBLs compared
to RBLs. The second hardest part is being able to list spammers' URIs
*quickly* (particularly since trying to do so exacerbates the first
problem).
The process you described is the best way to start... it is where
everyone starts. But many have started with amazing whitelists, done
what you described, and have failed. It takes much more than a great
whitelist to make a great blacklist.
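
For anyone who wants to see steps 1 and 2 of the quoted process in concrete form, here is a minimal sketch in Python. It is only an illustration, not how any existing list does it: the whitelist contents, the regular expression, and the function names (extract_hosts, is_whitelisted, candidate_hosts) are all assumptions made up for this example. It only looks at plain http/https links in already-decoded text, and it ignores MIME decoding, HTML entities, obfuscated links, and reduction to the registered domain, all of which a real miner would have to handle.

```python
import re
from urllib.parse import urlsplit

# Hypothetical whitelist for illustration; a real one would be far larger.
WHITELIST = {"example.com", "example.org"}

# Deliberately simple pattern: plain http/https links in decoded text only.
URL_RE = re.compile(r"""https?://[^\s<>"']+""", re.IGNORECASE)


def extract_hosts(body):
    """Step 1: mine a message body for links and return the host names."""
    hosts = set()
    for url in URL_RE.findall(body):
        # Drop punctuation that commonly trails a link in running text.
        url = url.rstrip(".,;:!?)]}")
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed link, skip it
        if host:
            hosts.add(host.rstrip("."))
    return hosts


def is_whitelisted(host, whitelist):
    """True if the host or any of its parent domains is on the whitelist."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def candidate_hosts(body, whitelist=WHITELIST):
    """Step 2: subtract out anything matching the whitelist."""
    return {h for h in extract_hosts(body) if not is_whitelisted(h, whitelist)}


if __name__ == "__main__":
    sample = (
        "Buy now http://pills.spam-example.test/offer?id=1 "
        "or see https://www.example.com/about for details."
    )
    print(sorted(candidate_hosts(sample)))
```

What comes out is only a set of candidates. Deciding which of them can be listed without causing FPs is the hard part described above.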
In fact, I know someone who frequents these anti-spam lists ...who I
consider smarter than either you or me... and I happen to consider him
the world's foremost authority on how to create and maintain a *great*
RBL. (I'm not allowed to mention who he is... in this context... but
just about everyone reading this would recognize his name... NO, this is
NOT Steve Linford... please, no questions or guesses about this!)
Anyway, over the past several months... he tried to create a great URIBL
and, so far, his URIBL falls far short of SURBL and URIBL and ivmURI.
Marc, if I had to make a short list of those who I thought might be able
to pull this off... you'd definitely be on the short list.
However, don't be discouraged if you come up short and/or if it takes
many months... even years... to accomplish what you seek. If the guy I
described can't do it (at least last I checked...), then believe me,
this is NOT an easy task.
I know MUCH about this. I've been one of the admins for SURBL for the
past 4+ years. Additionally, I created my own URIBL called "ivmURI", which
is now *easily* in the same league as SURBL and URIBL... In fact, ivmURI
is probably even better... at least, according to the hit stats and FP
stats that some of my users have provided me where all three URI
blacklists are compared to each other. (Of course, all three lists are
indispensable... I use ALL of them in my spam filtering... and ALL 3
catch stuff the other 2 miss... FOR EXAMPLE:
At this time, there is no other publicly available URI blacklist that
comes close to SURBL and URIBL and ivmURI. No "close" 4th place. Again,
*not* *even* *close*.
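
If you want to run this kind of side-by-side comparison on your own mail, the arithmetic is simple. The sketch below is one assumed way to do it in Python, not the method my users actually used: the function name compare_lists, the list names, and the domains in the usage part are all placeholders invented for illustration and are not real measurements. It takes the set of domains each list flagged, plus hand-classified spam and ham domains, and reports the hit rate, the FP rate, and what each list caught that the others missed.

```python
def compare_lists(hits, spam_domains, ham_domains):
    """Compare blacklists on a hand-classified corpus.

    hits maps a list name to the set of domains that list flagged.
    spam_domains and ham_domains are the known-bad and known-good domains.
    """
    report = {}
    for name, listed in hits.items():
        # Everything flagged by any of the other lists.
        others = set()
        for other_name, other_listed in hits.items():
            if other_name != name:
                others |= other_listed
        caught = listed & spam_domains
        false_positives = listed & ham_domains
        report[name] = {
            "hit_rate": len(caught) / len(spam_domains) if spam_domains else 0.0,
            "fp_rate": len(false_positives) / len(ham_domains) if ham_domains else 0.0,
            # Spam domains this list caught that no other list caught.
            "unique_hits": sorted(caught - others),
        }
    return report


if __name__ == "__main__":
    # Placeholder data for illustration only, not real measurements.
    spam = {"a.test", "b.test", "c.test", "d.test"}
    ham = {"good.test", "fine.test"}
    hits = {
        "list_a": {"a.test", "b.test"},
        "list_b": {"b.test", "c.test", "good.test"},
        "list_c": {"d.test"},
    }
    for name, stats in compare_lists(hits, spam, ham).items():
        print(name, stats)
```

A non-empty "unique_hits" entry for every list is what "all 3 catch stuff the other 2 miss" looks like in numbers.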
I hope this helps and doesn't discourage you. I had a wise college
professor tell me "big problem, big solution... little problem, little
solution". Spammers' URIs are a big problem that requires a big solution.
Knowing what you're up against in creating a URI blacklist might seem
discouraging in the short term, but might give you the proper long-term
focus and patience you need to really pull this off.
Best wishes for your success in this endeavor!
(creator of the "invaluement.com" DNSBLs, ivmURI & ivmSIP)