JINMEI Tatuya / 神明達哉 wrote, On 07/23/2008 09:59 PM:
> At Wed, 23 Jul 2008 06:50:42 -0400,
> Alan Clegg wrote:
>
>>>> We've heard this failure from others. I've been thinking about it,
>>>> but could not yet come up with an idea of how that can happen (much
>>>> less how to fix it).
>>> Same problem here, running bind 9.4.3b2 on Solaris 9.
>>>
>>> 23-Jul-2008 08:52:44.906 general: resolver.c:5494: REQUIRE((((query) !=
>>> 0) && (((const isc__magic_t *)(query))->magic == ((('Q') << 24 | ('!')
>>> << 16 | ('!') << 8 | ('!')))))) failed
>>> 23-Jul-2008 08:52:44.906 general: exiting (due to assertion failure)

>> At this time, ISC is recommending that everyone stay within their branch
>> and move to the -P1 releases (i.e., if you are at 9.3.x, move to 9.3.5-P1,
>> 9.4.x users should move to 9.4.2-P1).
>>
>> Unless you have a definitive need to run the beta code, please remain
>> with the -P1 releases to reduce the number of changes that are being
>> introduced into your environment.

>
> This is all completely true. That said, we'd still very much welcome
> beta version testers. Regarding this crash, we've noticed it and
> chased it, but have not yet identified how it could happen. Without a
> core it's unlikely we can debug it further, and even if we have one, I
> suspect it won't be very informative due to the nature of this bug
> (the underlying fault must have happened at an earlier stage of the
> code, and is just revealed at this point).


We run 9.4.3b2 because when we deployed 9.4.2-P1 we got flooded with
errors like the ones reported in
http://marc.info/?l=bind-users&m=121694398401977&w=2 (even though Solaris
reported unlimited resources).

> For those who are willing to help debug this crash: please apply the
> patch available at:
> http://www.jinmei.org/bind-9.4.3b2-dispatch.diff
>
> It doesn't change the behavior and shouldn't do any more harm. It
> just adds some debug information we can examine when the crash
> happens. And please report the backtrace next time named crashes at
> this point.
>
> For those who are willing to help test beta but don't want to see this
> crash: if you can live with the possibly decreased performance without
> threads, please rebuild named without threads. This is most likely an
> inter-thread race, and shouldn't happen without threads.


We don't want to disable threads (yet), so we applied the above patch.
We also applied the one mentioned at
http://marc.info/?l=bind-users&m=121694051627133&w=2 and have increased
ISC_SOCKET_MAXEVENTS to 128 because of the problem mentioned in
http://marc.info/?l=bind-users&m=121619301702838&w=2.

After 12 hours of running, no problems yet.
Sotiris.