It's been a while since I posted, but I thought I would follow up with
what the problem turned out to be. It had nothing to do with BIND 9.x,
Checkpoint, RHEL3 or RH 7.2 (the old server). What we were experiencing
was intermittent drops in DNS traffic and queries timing out on the
resolver (the answers would eventually come back after 20 seconds or so,
but that was far too long for client resolver libraries).

We had four T1s, and the incoming side was being saturated with HTTP,
RTSP and similar traffic. After we upgraded to six T1s, the DNS issues
`magically` went away and have not recurred since early December. We've
made no further changes to named; the only change was the bandwidth
upgrade. Thanks to all who responded.

>>> Mark Andrews 11/24/2004 3:46:06 PM >>>


> BIND 9.2.4 on RHEL3 (Update 3) recompiled from a source rpm w/
> --disable-ipv6 added in the configure flags.
>
> Background - named runs fine for a while then quits resolving external
> names as illustrated below. When I issue # rndc dumpdb then grep for
> www.yahoo.com in named_dump.db there is no answer. If I wait about 15
> minutes www.yahoo.com will resolve OK - no action taken - it's found in
> the named_dump.db. Recursion is turned on for the nameserver 172.16.8.4
> (my.nameserver), the client is 172.17.217.55 trying to resolve
> www.yahoo.com and the secondary nameserver is 172.16.8.104
> (2nd.nameserver).
>
> Finally, we have a hunch this may be related to bandwidth issues as
> executing $ dig +trace www.yahoo.com does resolve ... after 20 seconds.
> Any thoughts are greatly appreciated.
>
> # tcpdump -vvvnnttttXs 1512 host 172.17.217.55
>
> 11/23/2004 23:13:21.233812 172.17.217.55.32890 > 172.16.8.4.53: [udp
> sum ok] 50298+ AAAA? my.nameserver.. (37) (DF) (ttl 63, id 50678, len
> 65)
> 0x0000 4500 0041 c5f6 4000 3f11 3c58 ac11 d937
> E..A..@.?.<X...7
> 0x0010 ac10 0804 807a 0035 002d 4a86 c47a 0100
> .....z.5.-J..z..
> 0x0020 0001 0000 0000 0000 0362 7732 096a 6566
> .........my.name
> 0x0030 6665 7273 6f6e 0263 6f02 7573 0000 1c00
> server..........
> 0x0040 01 .
> 11/23/2004 23:13:21.233955 172.16.8.4.53 > 172.17.217.55.32890: [bad
> udp cksum 948a!] 50298* q: AAAA? my.nameserver. 0/1/0 ns:
> my.domain.name. SOA my.nameserver. root.my.nameserver. 2004112202 14400
> 3600 3600000 3600 (78) (DF) (ttl 64, id 31953, len 106)
> 0x0000 4500 006a 7cd1 4000 4011 8454 ac10 0804
> E..j|.@.@..T....
> 0x0010 ac11 d937 0035 807a 0056 39c5 c47a 8580
> ...7.5.z.V9..z..
> 0x0020 0001 0000 0001 0000 0362 7732 096a 6566
> .........my.name
> 0x0030 6665 7273 6f6e 0263 6f02 7573 0000 1c00
> server..........
> 0x0040 01c0 1000 0600 0100 000e 1000 1dc0 0c04
> ................
> 0x0050 726f 6f74 c00c 7774 534a 0000 3840 0000
> root..wtSJ..8@..
> 0x0060 0e10 0036 ee80 0000 0e10 ...6......
> 11/23/2004 23:13:21.234153 172.17.217.55.32890 > 172.16.8.4.53: [udp
> sum ok] 50299+ A? my.nameserver. (37) (DF) (ttl 63, id 50678, len 65)
> 0x0000 4500 0041 c5f6 4000 3f11 3c58 ac11 d937
> E..A..@.?.<X...7
> 0x0010 ac10 0804 807a 0035 002d 6585 c47b 0100
> .....z.5.-e..{..
> 0x0020 0001 0000 0000 0000 0362 7732 096a 6566
> .........my.name
> 0x0030 6665 7273 6f6e 0263 6f02 7573 0000 0100
> server..........
> 0x0040 01 .
> 11/23/2004 23:13:21.234311 172.16.8.4.53 > 172.17.217.55.32890: [bad
> udp cksum 953c!] 50299* q: A? my.nameserver. 1/2/1 my.nameserver. A
> 172.16.8.4 ns: my.domain.name. NS 2nd.nameserver., my.domain.name. NS
> my.nameserver. ar: 2nd.nameserver. A 172.16.8.104 (103) (DF) (ttl 64, id
> 31954, len 131)
> 0x0000 4500 0083 7cd2 4000 4011 843a ac10 0804
> E...|.@.@..:....
> 0x0010 ac11 d937 0035 807a 006f 39de c47b 8580
> ...7.5.z.o9..{..
> 0x0020 0001 0001 0002 0001 0362 7732 096a 6566
> .........my.name
> 0x0030 6665 7273 6f6e 0263 6f02 7573 0000 0100
> server..........
> 0x0040 01c0 0c00 0100 0100 0151 8000 04ac 1008
> .........Q......
> 0x0050 04c0 1000 0200 0100 0151 8000 0805 6973
> .........Q....2n
> 0x0060 6161 63c0 10c0 1000 0200 0100 0151 8000
> d.nameserver.Q..
> 0x0070 02c0 0cc0 4100 0100 0100 0151 8000 04ac
> ....A......Q....
> 0x0080 1008 68 ..h
> 11/23/2004 23:13:21.235059 172.17.217.55.32890 > 172.16.8.4.53: [udp
> sum ok] 33145+ A? www.yahoo.com. (31) (DF) (ttl 63, id 0, len 59)
> 0x0000 4500 003b 0000 4000 3f11 0255 ac11 d937
> E..;..@.?..U...7
> 0x0010 ac10 0804 807a 0035 0027 fd08 8179 0100
> .....z.5.'...y..
> 0x0020 0001 0000 0000 0000 0377 7777 0579 6168
> .........www.yah
> 0x0030 6f6f 0363 6f6d 0000 0100 01 oo.com.....
> 11/23/2004 23:13:26.233446 172.17.217.55.32890 > 172.16.8.4.53: [udp
> sum ok] 33145+ A? www.yahoo.com. (31) (DF) (ttl 63, id 1, len 59)
> 0x0000 4500 003b 0001 4000 3f11 0254 ac11 d937
> E..;..@.?..T...7
> 0x0010 ac10 0804 807a 0035 0027 fd08 8179 0100
> .....z.5.'...y..
> 0x0020 0001 0000 0000 0000 0377 7777 0579 6168
> .........www.yah
> 0x0030 6f6f 0363 6f6d 0000 0100 01 oo.com.....
> 11/23/2004 23:13:51.253575 172.16.8.4.53 > 172.17.217.55.32890: [bad
> udp cksum f042!] 33145 ServFail q: A? www.yahoo.com. 0/0/0 (31) (DF)
> (ttl 64, id 32144, len 59)
> 0x0000 4500 003b 7d90 4000 4011 83c4 ac10 0804
> E..;}.@.@.......
> 0x0010 ac11 d937 0035 807a 0027 3996 8179 8182
> ...7.5.z.'9..y..
> 0x0020 0001 0000 0000 0000 0377 7777 0579 6168
> .........www.yah
> 0x0030 6f6f 0363 6f6d 0000 0100 01 oo.com.....
> 11/23/2004 23:13:51.253592 172.16.8.4.53 > 172.17.217.55.32890: [bad
> udp cksum f042!] 33145 ServFail q: A? www.yahoo.com. 0/0/0 (31) (DF)
> (ttl 64, id 32145, len 59)
> 0x0000 4500 003b 7d91 4000 4011 83c3 ac10 0804
> E..;}.@.@.......
> 0x0010 ac11 d937 0035 807a 0027 3996 8179 8182
> ...7.5.z.'9..y..
> 0x0020 0001 0000 0000 0000 0377 7777 0579 6168
> .........www.yah
> 0x0030 6f6f 0363 6f6d 0000 0100 01 oo.com.....


I would be looking at a broken / misconfigured firewall.
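
As a rough illustration of how to read the capture quoted above, the
12-byte DNS header of the final reply can be decoded by hand. This is
only a sketch in Python using the standard library; the function name
decode_header and the RCODES table are made up for this example, and the
sample bytes are copied from the ServFail packet in the tcpdump output
(offset 0x001c onward, after the IP and UDP headers).

```python
import struct

# Response codes from RFC 1035; names follow tcpdump's spelling.
RCODES = {0: "NoError", 1: "FormErr", 2: "ServFail",
          3: "NXDomain", 4: "NotImp", 5: "Refused"}


def decode_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header (hypothetical helper)."""
    qid, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", packet[:12])
    rcode = flags & 0x000F
    return {
        "id": qid,
        "response": bool(flags & 0x8000),             # QR bit
        "authoritative": bool(flags & 0x0400),        # AA bit
        "truncated": bool(flags & 0x0200),            # TC bit
        "recursion_desired": bool(flags & 0x0100),    # RD bit
        "recursion_available": bool(flags & 0x0080),  # RA bit
        "rcode": RCODES.get(rcode, str(rcode)),
        "counts": (qd, an, ns, ar),  # question/answer/authority/additional
    }


# Header bytes of the ServFail reply in the capture: id 0x8179, flags 0x8182.
servfail_header = bytes.fromhex("8179 8182 0001 0000 0000 0000")
info = decode_header(servfail_header)
print(info["id"], info["rcode"], info["counts"])
```

The id decodes to 33145 and the counts to 0/0/0, matching what tcpdump
printed. The reply is not truncated and not authoritative; it is the
recursive server giving up about 30 seconds after the client's first
query (23:13:21.235 to 23:13:51.253), which points at the path between
the nameserver and the outside world rather than at the client.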

The following two queries should return a referral to the
COM servers from A.ROOT-SERVERS.NET (198.41.0.4). The first
one is plain DNS, the second is EDNS and the answer size will
exceed the 512 bytes supported by plain DNS.

dig soa com +norec @198.41.0.4
dig soa com +norec @198.41.0.4 +bufsize=4096
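
To make the difference between the two queries concrete, here is a
minimal sketch in Python (standard library only) that builds both wire
formats. The helper names encode_name and build_query, and the query id
0x1234, are assumptions made for this example; it is not what dig does
internally, and real dig versions may add further options. The only
difference on the wire is the 11-byte OPT pseudo-record defined by
RFC 2671, whose CLASS field carries the advertised UDP buffer size. A
firewall that drops such queries, or drops replies larger than 512
bytes, breaks the second form.

```python
import struct

TYPE_SOA = 6
TYPE_OPT = 41   # EDNS0 pseudo-record type (RFC 2671)
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype, qid, recurse=False, edns_bufsize=None):
    """Build a DNS query; add an OPT record when edns_bufsize is given."""
    flags = 0x0100 if recurse else 0          # RD bit; clear it for +norec
    arcount = 1 if edns_bufsize is not None else 0
    packet = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, arcount)
    packet += encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    if edns_bufsize is not None:
        # OPT RR: root name, TYPE=41, CLASS=buffer size, TTL=0, RDLENGTH=0
        packet += b"\x00" + struct.pack("!HHIH", TYPE_OPT, edns_bufsize, 0, 0)
    return packet


plain = build_query("com", TYPE_SOA, qid=0x1234)
edns = build_query("com", TYPE_SOA, qid=0x1234, edns_bufsize=4096)
print(len(plain), len(edns))  # 21 32
```

Sending the packets is left out on purpose; dig already does that, and
the point here is only to show what a firewall has to let through.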

Your firewall should be capable of supporting EDNS as it has been
on the Standards Track for 5 years now.

Network Working Group                                          P. Vixie
Request for Comments: 2671                                          ISC
Category: Standards Track                                   August 1999


Extension Mechanisms for DNS (EDNS0)

If you upgrade to 9.3.0 you can use "edns-udp-size 512;"
to work around the firewall but the extra answer space
provided by EDNS *is* required for correct DNS operation. We
knew 5 years ago that it would be required, we just didn't
know when we would exceed the capabilities of plain DNS.
That time has now come.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: Mark_Andrews@isc.org