Kevin Darcy wrote:
>>> Perhaps the interim solution is to tune openser's lookup timeout-retry
>>> parameters.
>>>

>> That gives faster timeouts - but I want to get rid of timeouts
>> completely - of course the first lookup will time out, but the name
>> servers should be marked as down for some time and sequential lookups
>> should be avoided.
>>

> I think you're setting yourself up for a very fragile and fickle lookup
> subsystem, since we're talking primarily about an unreliable protocol
> (UDP) being used over long-distance networks with varying latencies.
> Packet delays and drops are commonplace.
>
> But go ahead with your experiments/modifications and let us know how it
> works out.


Implementing, that's the point ;-)

It's not that straightforward because (as you said) packets will
sometimes be lost. E.g. if the recursive NS itself goes offline for
some time, all queries will time out and it would mark every
authoritative nameserver as down.

Thus there must be some intelligence which detects when all queries
time out and does not treat this as the failure of a particular
authoritative name server. Further, a name server should not be marked
as "down" after one timeout, but only after several queries to it have
timed out.
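
To make the idea concrete, here is a rough sketch in Python of the
bookkeeping I have in mind. It is not openser or BIND code; the class
name, the method names and all thresholds are made up for illustration.
The assumption is that a server is only blamed for its timeouts if some
other server has answered recently, which is taken as evidence that the
recursive NS and the local network are still working.

```python
import time


class NameServerHealth:
    """Hypothetical per-server timeout tracker (illustrative names only)."""

    def __init__(self, fail_threshold=3, down_period=60.0,
                 evidence_window=10.0, clock=time.monotonic):
        self.fail_threshold = fail_threshold    # timeouts needed before "down"
        self.down_period = down_period          # seconds a server stays "down"
        self.evidence_window = evidence_window  # how recent a success must be
        self.clock = clock                      # injectable for testing
        self.failures = {}      # server -> consecutive timeout count
        self.last_success = {}  # server -> time of last answer
        self.down_until = {}    # server -> time the "down" mark expires

    def record_success(self, server):
        # Any answer resets the counter and clears the "down" mark.
        self.failures[server] = 0
        self.last_success[server] = self.clock()
        self.down_until.pop(server, None)

    def record_timeout(self, server):
        now = self.clock()
        self.failures[server] = self.failures.get(server, 0) + 1
        if self.failures[server] < self.fail_threshold:
            return  # a single lost packet is not enough
        # Only blame this server if another one answered recently;
        # otherwise the resolver or the local link is the likelier culprit.
        others_ok = any(
            other != server and now - seen <= self.evidence_window
            for other, seen in self.last_success.items()
        )
        if others_ok:
            self.down_until[server] = now + self.down_period

    def is_down(self, server):
        # Unknown servers are treated as up.
        return self.clock() < self.down_until.get(server, float("-inf"))
```

The caller would check is_down() before sending a query and skip the
server while the mark is active, so that only the first few lookups pay
for the timeout. Once down_period has passed, the server is tried again.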

I suspect I won't have enough programming/BIND experience to do this
myself :-(

regards
klaus