Greetings.

We are running two BIND 9.2.1 name servers on Solaris. We are having
trouble with a particular domain -- sbj.net. I know there is a problem
with the domain's delegation. The parent (.net) servers list
ns1-auth.sprintlink.net and ns1.corpranet.net as the authoritative
servers for the domain, whereas ns1.corpranet.net and ns1.positech.net
are apparently
*supposed* to be the authoritative servers, and ns1-auth.sprintlink.net
indicates that it is *not* authoritative for sbj.net.
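
To spell out the mismatch, here is a minimal Python sketch. It uses only
the server names given above, typed in by hand rather than queried live;
the variable and function names are made up for illustration.

```python
# NS set handed out in the delegation for sbj.net, as described above.
delegated_ns = {"ns1-auth.sprintlink.net", "ns1.corpranet.net"}

# NS set that is apparently *supposed* to serve sbj.net.
intended_ns = {"ns1.corpranet.net", "ns1.positech.net"}


def compare_ns_sets(delegated, intended):
    """Return which servers both sides agree on and which appear on one side only."""
    return {
        "agreed": sorted(delegated & intended),
        "delegation_only": sorted(delegated - intended),
        "intended_only": sorted(intended - delegated),
    }


result = compare_ns_sets(delegated_ns, intended_ns)
for label, names in result.items():
    print(f"{label}: {', '.join(names)}")
```

The only server in the "delegation_only" group is ns1-auth.sprintlink.net,
which is the one that says it is not authoritative for sbj.net.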

If I flush the cache (rndc flush) on our servers, they will successfully
resolve the A record for sbj.net. A dump of the database at that point
shows that our servers have cached ns1.corpranet.net and ns1.positech.net
as nameservers for sbj.net:

----------------------------------------------------------------
; authauthority
sbj.net.        3554    NS      ns1.positech.net.
                3554    NS      ns1.corpranet.net.
; authanswer
                3554    A       69.27.136.10
; authanswer
www.sbj.net.    3554    CNAME   sbj.net.
----------------------------------------------------------------


After the NS records for sbj.net expire (after 1 hour), our servers then
return SERVFAIL for sbj.net. A dump of the database at that point shows
that our servers have cached ns1.corpranet.net and ns1-auth.sprintlink.net
as nameservers for sbj.net:

------------------------------------------------------------------
; glue
sbj.NET.        155685  NS      ns1.corpranet.net.
                155685  NS      ns1-auth.sprintlink.net.
; glue
sbs2003.NET.    149675  NS      ns1.sbs2003.net.
------------------------------------------------------------------


My questions are:

1) Why do our servers sometimes cache ns1.corpranet.net and
ns1.positech.net as the nameservers for sbj.net, and why do they sometimes
cache ns1.corpranet.net and ns1-auth.sprintlink.net instead? Why are they
not consistent?

2) *Should* our nameservers be caching ns1-auth.sprintlink.net as a
nameserver for sbj.net, since that server is lame for sbj.net?

3) If the answer to (2) is yes, is there any way to configure our servers
to keep them from caching lame servers (JUST the lame servers without
affecting caching for anything else)?

4) Why are our nameservers returning SERVFAIL when ns1-auth.sprintlink.net
is in the cache, since ns1.corpranet.net is also in the cache and is
authoritative for sbj.net? (In other words, why don't our servers go
ahead and try to query ns1.corpranet.net even though
ns1-auth.sprintlink.net is lame for sbj.net?)

I'm not saying our servers are doing anything wrong. I just want to
understand why they are doing what they are doing.

Thanks.

Ben Bridges
Network Engineer
SpringNet / City Utilities of Springfield, MO