A plan for HAM - White list for ham domains
A little play on words spoofing "A plan for spam".
I have been testing a new technique for detecting ham that is working
quite well. It's nearly (or possibly) 100% accurate in the sense that
what it identifies as ham really is ham.
First of all you get a verified RDNS lookup on the host. Verified means
that you do a reverse lookup on the IP and then look up the resulting
host name to see if it resolves back to the same IP. That's something
spammers can't spoof. Then you separate the name at the registrar
barrier (keeping only the registered domain part of the host name) and
look up that name in a list of host domains that never send spam. For
example, all hosts that end in apache.org are considered ham.
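
To make the lookup concrete, here is a minimal Python sketch of the
check described above. It is only an illustration under assumptions, not
a tested SpamAssassin plugin: the function names and the BLESSED_DOMAINS
set are made up for this example, the "registrar barrier" split naively
keeps the last two labels (which is wrong for suffixes like co.uk), and
the forward lookup as written only handles IPv4.

```python
import socket

# Hypothetical example list; the post only names apache.org.
BLESSED_DOMAINS = {"apache.org"}


def verified_rdns(ip, reverse=socket.gethostbyaddr,
                  forward=socket.gethostbyname_ex):
    """Return the host name for ip if that name resolves back to ip."""
    try:
        name = reverse(ip)[0]          # reverse lookup: IP -> name
        addresses = forward(name)[2]   # forward lookup: name -> IPv4 list
    except OSError:                    # covers socket.herror / gaierror
        return None
    return name.rstrip(".").lower() if ip in addresses else None


def registered_domain(hostname):
    """Naive split at the registrar barrier: keep the last two labels."""
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def is_blessed(ip, blessed=BLESSED_DOMAINS, **resolvers):
    """True if ip has verified RDNS and its domain is on the list."""
    name = verified_rdns(ip, **resolvers)
    return name is not None and registered_domain(name) in blessed
```

The resolver functions are passed in as parameters so the logic can be
tried with fake lookups before pointing it at real DNS.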
This idea is different from an IP-based whitelist in that you are really
whitelisting based on a list of blessed host names rather than just
unnamed IP addresses.
Also - a dynamic whitelist could be generated on the fly if someone
could write a custom DNS server. Here's how it would work. You send a
request about an IP address. If the server doesn't already know the IP
then it does a reverse DNS to get the name and then looks up the name to
verify that the name resolves to the same IP address. If it does, you
then break the name at the registrar barrier and do a lookup to see if
the name is on the blessed list. If it is, you return a code indicating
it is whitelisted and you cache the IP of the lookup.
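
The server-side logic could be sketched roughly like this in Python.
This is only the decision and caching part, not an actual DNS server,
and several details are assumptions for illustration: the class name
DynamicWhitelist, the DNSBL-style return code 127.0.0.2, returning None
for "not listed", and the naive last-two-labels split at the registrar
barrier. Only whitelisted IPs are cached, as described above; cache
expiry is left out.

```python
import socket

WHITELISTED = "127.0.0.2"  # assumed return code, DNSBL-style
NOT_LISTED = None          # assumed: no answer when not whitelisted


class DynamicWhitelist:
    def __init__(self, blessed, reverse=socket.gethostbyaddr,
                 forward=socket.gethostbyname_ex):
        self.blessed = set(blessed)  # blessed registered domains
        self.reverse = reverse       # IP -> (name, aliases, addresses)
        self.forward = forward       # name -> (name, aliases, addresses)
        self.cache = set()           # IPs already known to be whitelisted

    def query(self, ip):
        if ip in self.cache:         # already known: no DNS work needed
            return WHITELISTED
        try:
            name = self.reverse(ip)[0].rstrip(".").lower()
            if ip not in self.forward(name)[2]:
                return NOT_LISTED    # name does not resolve back to ip
        except OSError:
            return NOT_LISTED        # no reverse or forward record
        domain = ".".join(name.split(".")[-2:])  # naive registrar split
        if domain not in self.blessed:
            return NOT_LISTED
        self.cache.add(ip)           # remember the whitelisted IP
        return WHITELISTED
```

A second query for the same whitelisted IP is answered from the cache
without touching DNS again, which is where the load saving would come
from.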
The master list of blessed host names could be dynamically generated by
some sort of automated reputation system where ham and spam are reported
by IP address from some trusted sources. Those domains that are
consistently producing nothing but ham make the list.
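
As a rough sketch of how such a list might be built, assuming the
reported IPs have already been mapped to their registered domains by the
verified RDNS step. The function name, the (domain, verdict) report
format and the min_ham threshold are all assumptions for illustration;
the threshold is there so that a domain seen only once or twice doesn't
get blessed by accident.

```python
from collections import Counter


def build_blessed_list(reports, min_ham=100):
    """Return the domains with at least min_ham ham reports and no spam.

    reports: iterable of (domain, verdict), verdict is "ham" or "spam".
    """
    ham, spam = Counter(), Counter()
    for domain, verdict in reports:
        counts = ham if verdict == "ham" else spam
        counts[domain.lower()] += 1
    # Only domains that produced nothing but ham make the list.
    return {d for d, n in ham.items() if n >= min_ham and spam[d] == 0}
```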
The advantage of this is increased accuracy and lower system load.
Domains that are whitelisted need not be further tested and can be
instantly classified as ham and fed into the Bayes learner. This should
greatly reduce false positives.
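
In code the short circuit might look something like the following. The
three callables are hypothetical stand-ins (the whitelist check, the
normal rule scan and the Bayes ham learner); none of them is a real
SpamAssassin API.

```python
def classify(message, sender_ip, is_whitelisted, full_scan, learn_ham):
    """Skip the full scan for whitelisted senders and train on their mail."""
    if is_whitelisted(sender_ip):
        learn_ham(message)     # feed the message to the Bayes learner
        return "ham"           # no further tests needed
    return full_scan(message)  # everyone else gets the normal tests
```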
Who likes this idea?