On 14 Jul 2006, at 08:29 , Kevin Darcy wrote:

> Merton Campbell Crockett wrote:
>> On 13 Jul 2006, at 11:43 , Smith, William E. (Bill), Jr. wrote:
>>> -----Original Message-----
>>> From: Mark_Andrews@isc.org [mailto:Mark_Andrews@isc.org]
>>> Sent: Thursday, July 13, 2006 1:55 PM
>>> To: Smith, William E. (Bill), Jr.
>>> Cc: bind-users@isc.org
>>> Subject: Re: Strange / Frustrating Caching Problems
>>>> For the past few months, I have been trying (unsuccessfully to this
>>>> point) to resolve a problem with a trio of caching-only name servers
>>>> that we have in place. The general nature of the problem is as
>>>> follows. A DHCP client originally gets an IP address on subnet A but
>>>> at some point prior to lease expiration moves to subnet B, where it
>>>> obtains a new IP address successfully. The problem that I am seeing
>>>> is that after the move to subnet B, one or more of our caching-only
>>>> name servers are still returning the old IP address when a lookup of
>>>> the hostname occurs. This behavior seems reasonable at first glance,
>>>> since caching-only servers should retain the information they have
>>>> in cache until the TTL expires and/or the cache is flushed. After
>>>> digging into this further, I'm finding that the TTLs for the hosts
>>>> whose forward lookups are returning the wrong IP are set to 604800
>>>> seconds, or 168 hours. I've determined this by dumping and viewing
>>>> the cache. In addition, I've discovered that the TTL for the reverse
>>>> record for the same client is also set to this high value. This
>>>> behavior would seem reasonable if this high value were the TTL
>>>> configured for the domain, which is not the case here. We have the
>>>> default TTL in our environment set to 10800 seconds, or 3 hours.
>>>> Thus, I'm a little baffled as to why the TTLs for some of these DHCP
>>>> clients are being set to such a high value when other clients have
>>>> their TTLs set to the 10800 value configured at the domain level.
>>>> I've checked the registration at the object level (in our IP
>>>> management application) and the TTL field is blank, thus implying
>>>> the default TTL is in place.
>>>>
>>>> Aside from the above details, I can also note that the problematic
>>>> lookups seem to involve the same DHCP clients. The only reason I
>>>> know about these clients is that they are unable to SSH to some Unix
>>>> boxes in a DMZ that restrict access to hosts for which they can
>>>> perform both forward and reverse lookups.
>>>>
>>>> In this scenario, the forward lookup is failing since it's returning
>>>> the old IP address of the client. When this problem occurs, it tends
>>>> to affect one or two of the caching servers but not all three.
>>>> Furthermore, it is somewhat random as to which of the three servers
>>>> are affected.
>>>>
>>>> The caching servers in question are all Solaris 9 running BIND
>>>> 9.3.2. If anyone can provide some insight here, it would be much
>>>> appreciated. I can provide additional information and/or elaborate
>>>> on something as needed.
>>>> Bill Smith
>>>> ISS Server Systems Group
>>>> Johns Hopkins University Applied Physics Laboratory
>>>> 11100 Johns Hopkins Road
>>>> Laurel, MD 20723
>>>> Phone: 443-778-5523
>>>> Web: http://www.jhuapl.edu
>>> Nameservers do what the dhcp servers tell them to do. The TTL
>>> is set by the DHCP server. Try lowering the dhcp lease time as
>>> that influences the DNS TTL.

>> In an environment where people can wander with their laptops from
>> subnet to subnet, why do you have caching only name servers?
>> These name servers should, at least, have the local zones defined as
>> forward or stub zones to minimize the amount of erroneous data being
>> returned in a volatile environment.

> Uh, how will that help? Caching still occurs -- and TTLs are honored --
> even for names in "forward" or "stub" zones.
>
> The only way I can think of to speed up this propagation, short of
> reducing the TTLs that are set by the DHCP server, or running a
> modified version of BIND (e.g. QIP's version, in which secondaries can
> receive Dynamic Updates), or an out-of-band replication mechanism, is
> to set up all of the servers as stealth slaves enumerated in the
> relevant also-notify(s), so that the changes should replicate fairly
> quickly.

Right you are. A momentary brain-fade. You can, however, configure a
name server to override the TTL value received in a query response by
defining max-cache-ttl in the global options of the configuration file.

If max-cache-ttl is not defined in the configuration file, BIND sets
the value of max-cache-ttl to its default value of 7 days. Any query
response with a TTL greater than max-cache-ttl will have the TTL
replaced with max-cache-ttl.

This is why Bill Smith was seeing a TTL of 7 days in the caching only
name server for a system with a TTL of 14 days. Depending upon the
volume of DNS queries in the DMZs, it might be reasonable to define
"max-cache-ttl 1800;" to force the name server to perform a new query
after 30 minutes.
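
As a rough sketch, the relevant fragment of named.conf on each caching
server might look like the following. The 1800-second value is only the
example figure mentioned above, and the rest of the options block is
assumed to stay as it is.

```
options {
    // ... existing options left unchanged ...

    // Cap the TTL of any cached answer at 30 minutes (1800 seconds).
    // If this statement is absent, BIND uses its default of 7 days.
    max-cache-ttl 1800;
};
```

The name server needs to reload its configuration to pick up the
change. Entries already in the cache may keep their old TTLs until they
expire or the cache is flushed (for example with "rndc flush").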

Merton Campbell Crockett