> -----Original Message-----
> From: bind-users-bounce@isc.org [mailto:bind-users-bounce@isc.org] On
> Behalf Of JINMEI Tatuya / 神明達哉
> Sent: Thursday, August 07, 2008 3:56 AM
> To: Vinny Abello
> Cc: bind-users@isc.org
> Subject: Re: dnsperf and BIND memory consumption
>
> At Thu, 7 Aug 2008 00:58:23 -0400,
> Vinny Abello wrote:
>
> > OK. I've recompiled BIND 9.5.0-P2 (from ports) without threads
> > enabled. I no longer see the memory leak at all. I'm running dnsperf
> > and I see a constant of 18MB which is much more reasonable for what
> > I am doing. For me it's easy to reproduce. Some more information
> > that may help reproduce it:
> >
> > FreeBSD 7.0 STABLE AMD64 (cvsup'ed within the past week)
> > BIND 9.5.0-P2 installed via ports with threads enabled
> > Server is a Dell PowerEdge 2850 with 2 CPU's, Hyperthreading
> > disabled, 4GB of RAM and a 36GB RAID1 array on a Perc4 controller (LSI
> > MegaRAID chipset)
> > Dnsperf run from a different server on the same network segment over
> > Gig-E
>
> This looks quite similar to a case we heard about before. I suspect it
> is due to some bad interaction between BIND9 and FreeBSD's thread
> library or kernel, rather than an application memory leak (which
> you could confirm by stopping named while its memory is growing
> and seeing it crash). Here is what I suggested at that time to
> identify the memory eater (unfortunately we couldn't get any
> feedback on it then). Could you try it?


Sure, I can give it a shot.

> =======================================================================
> - create a symbolic link from "/etc/malloc.conf" to "X":
> # ln -s X /etc/malloc.conf


What exactly is this trying to accomplish? JFYI, I don't have a file /etc/malloc.conf on my server. Did you mean /etc/make.conf? Where is X being referenced?

> - start named with a moderate limitation of virtual memory size, e.g.
> # /usr/bin/limits -v 384m $path_to_named/named
>
> Then the named process will eventually abort itself with a core dump
> due to malloc failure. Please show us the stack trace at that point.
> Hopefully it will reveal the malloc call that keeps consuming memory.


How would I show the trace that you require once this happens?

>
> Notes:
> - of course, this is a very radical way of diagnosing; you need to
> keep watching the process because it's "guaranteed" to be aborted.
> - the VM size must be carefully chosen so that malloc failure won't
> happen due to normal named processing. I think 384MB is reasonable
> enough according to the statistics you provided so far, but I'm not
> 100% sure about that.
> - it's better to keep my latest patch to adb.c and to run named with
> '-n 1' so that the mutex_init in adb.c won't trigger the malloc
> failure.
> - the global symbolic link from /etc/malloc.conf affects other
> processes. So, if you're running a different process than named
> that can consume a lot of memory or can cause malloc failure, we
> should find an alternative approach (there are some, but they are
> more complicated so let's discuss those only when they are really
> necessary).


Shouldn't be a problem here. Again, it's just being tested and this is the only thing the server is doing.

> =======================================================================
>
> BTW, you should be able to find the previous discussion on this matter
> by searching the bind-users@isc.org list with the subject of
> "max-cache-size doesn't work with 9.5.0b1".


I may have to go back and find this thread.

>
> ---
> JINMEI, Tatuya
> Internet Systems Consortium, Inc.
>
> p.s. I'm pretty sure it's different from the 'memory leak' issue of
> BIND9/Windows. Let's forget it in this context.


Fair enough. I'll trust you on that.
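
For reference, the quoted procedure could be scripted roughly as below. This is only a sketch under stated assumptions, not something taken from the thread: the install directory /usr/local/sbin, the core file name named.core, the gdb step, and the final cleanup step are all guesses to adjust for the actual system. On FreeBSD, /etc/malloc.conf is read as a symbolic link whose target name is a string of malloc option flags, and "X" asks malloc to abort with a core dump instead of returning failure. The script only prints the commands unless RUN=1 is set.

```sh
#!/bin/sh
# Sketch of the quoted diagnostic procedure (FreeBSD, run as root).
# Dry-run by default: each command is only printed. Set RUN=1 to execute.

# Assumptions, not from the thread: adjust both to your system.
path_to_named=${path_to_named:-/usr/local/sbin}
core_file=${core_file:-named.core}

run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# 1. Set the malloc "X" option system-wide: a failed allocation aborts
#    the process with a core dump instead of returning an error.
run ln -s X /etc/malloc.conf

# 2. Start named with virtual memory capped at 384 MB, using '-n 1'
#    as suggested in the quoted notes.
run /usr/bin/limits -v 384m "$path_to_named/named" -n 1

# Steps 3 and 4 only make sense after named has aborted.
if [ "${RUN:-0}" = 1 ]; then
    printf 'Press Enter once named has aborted and dumped core: '
    read -r reply
fi

# 3. Open the core file in gdb; at the (gdb) prompt type "bt"
#    (or "thread apply all bt") to get the stack trace.
run gdb "$path_to_named/named" "$core_file"

# 4. Remove the global malloc setting again (cleanup step added here).
run rm /etc/malloc.conf
```

In dry-run mode the output is just the list of commands, which can then be run by hand one at a time while watching the process.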