Hi all

Following up on my earlier post (quoted below), we've been
able to capture a gdb backtrace. Can anyone offer guidance as
to what it means? See below:

(gdb) bt
#0 0xffffe410 in ?? ()
#1 0x00000001 in ?? ()
#2 0x00000000 in ?? ()
#3 0xbfffc9d8 in ?? ()
#4 0x402b36e3 in __waitpid_nocancel () from /lib/tls/libc.so.6
#5 0x4025ef58 in do_system () from /lib/tls/libc.so.6
#6 0x402268dd in system () from /lib/tls/libpthread.so.0
#7 0x0822b612 in smb_panic (why=0x0) at lib/util.c:1608
#8 0x08219b3f in fault_report (sig=-512) at lib/fault.c:47
#9 0x08219b50 in sig_fault (sig=-512) at lib/fault.c:70
#10
#11 0x40292d1b in strlen () from /lib/tls/libc.so.6
#12 0x40268242 in vfprintf () from /lib/tls/libc.so.6
#13 0x40285e76 in vsnprintf () from /lib/tls/libc.so.6
#14 0x08219956 in dbgtext (format_str=0x6d2e5c73 "") at lib/debug.c:1011
#15 0x0825b360 in oplock_timeout_handler (te=0x844ce10, now=0xbfffd9c0, private_data=0x84492f0) at smbd/oplock.c:351
#16 0x08242d7d in run_events () at lib/events.c:102
#17 0x080f2801 in receive_message_or_smb (buffer=0x40433008 "", buffer_len=131137, timeout=60000) at smbd/process.c:457
#18 0x080f4122 in smbd_process () at smbd/process.c:1649
#19 0x082beea9 in main (argc=1831754867, argv=0xbfffdd34) at smbd/server.c:1024
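
One thing I noticed while reading the trace: the format_str
value in frame #14 and the argc value in frame #19 look like
the same 32-bit word. Below is a minimal Python sketch to check
that and to see what the bytes look like as text. It assumes a
little-endian 32-bit x86 host (which the addresses suggest, but
treat that as my assumption), and the helper name as_ascii is
just something I made up for this check.

```python
import struct

# Values copied by hand from the gdb backtrace above.
FORMAT_STR_PTR = 0x6d2e5c73   # frame #14: dbgtext(format_str=...)
ARGC_VALUE = 1831754867       # frame #19: main(argc=...)


def as_ascii(word):
    """Return the 4 bytes of a 32-bit word, in little-endian
    memory order, decoded as text (assumption: x86 byte order)."""
    return struct.pack("<I", word).decode("ascii", errors="replace")


# Show argc in hex so it can be compared with the pointer by eye.
print(hex(ARGC_VALUE))
# True if gdb printed the same word for both "arguments".
print(FORMAT_STR_PTR == ARGC_VALUE)
# The bytes of that word as they would sit in memory.
print(repr(as_ascii(FORMAT_STR_PTR)))
```

By my reading the two values are identical and the bytes are
all printable characters, which makes me unsure how far to
trust the argument values gdb prints for these frames. I may
well be misreading it, so corrections are welcome.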

This is similar to the following panic message recorded in
syslog:

Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] smbd/oplock.c:oplock_timeout_handler(351)
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(41)
Jun 13 12:57:29 uhti02 smbd[16322]: ===============================================================
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(42)
Jun 13 12:57:29 uhti02 smbd[16322]: INTERNAL ERROR: Signal 11 in pid 16322 (3.0.24-SerNet-SuSE)
Jun 13 12:57:29 uhti02 smbd[16322]: Please read the Trouble-Shooting section of the Samba3-HOWTO
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(44)
Jun 13 12:57:29 uhti02 smbd[16322]:
Jun 13 12:57:29 uhti02 smbd[16322]: From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(45)
Jun 13 12:57:29 uhti02 smbd[16322]: ===============================================================
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:smb_panic(1599)
Jun 13 12:57:29 uhti02 smbd[16322]: PANIC (pid 16322): internal error
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:log_stack_trace(1706)
Jun 13 12:57:29 uhti02 smbd[16322]: BACKTRACE: 14 stack frames:
Jun 13 12:57:29 uhti02 smbd[16322]: #0 /usr/sbin/smbd(log_stack_trace+0x22) [0x822b6fb]
Jun 13 12:57:29 uhti02 smbd[16322]: #1 /usr/sbin/smbd(smb_panic+0x6f) [0x822b59a]
Jun 13 12:57:29 uhti02 smbd[16322]: #2 /usr/sbin/smbd [0x8219b3f]
Jun 13 12:57:29 uhti02 smbd[16322]: #3 /usr/sbin/smbd [0x8219b50]
Jun 13 12:57:29 uhti02 smbd[16322]: #4 [0xffffe420]
Jun 13 12:57:29 uhti02 smbd[16322]: #5 /lib/tls/libc.so.6(vsnprintf+0xb6) [0x40285e76]
Jun 13 12:57:29 uhti02 smbd[16322]: #6 /usr/sbin/smbd(dbgtext+0x2e) [0x8219956]
Jun 13 12:57:29 uhti02 smbd[16322]: #7 /usr/sbin/smbd [0x825b360]
Jun 13 12:57:29 uhti02 smbd[16322]: #8 /usr/sbin/smbd(run_events+0x15f) [0x8242d7d]
Jun 13 12:57:29 uhti02 smbd[16322]: #9 /usr/sbin/smbd [0x80f2801]
Jun 13 12:57:29 uhti02 smbd[16322]: #10 /usr/sbin/smbd(smbd_process+0x10e) [0x80f4122]
Jun 13 12:57:29 uhti02 smbd[16322]: #11 /usr/sbin/smbd(main+0x946) [0x82beea9]
Jun 13 12:57:29 uhti02 smbd[16322]: #12 /lib/tls/libc.so.6(__libc_start_main+0xd0) [0x40240210]
Jun 13 12:57:29 uhti02 smbd[16322]: #13 /usr/sbin/smbd [0x808ceb1]
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:smb_panic(1607)
Jun 13 12:57:29 uhti02 smbd[16322]: smb_panic(): calling panic action [/bin/sleep 90000]

Versions:
Kernel: 2.6.5-7.97-bigsmp
smbd, nmbd, winbindd: Version 3.0.24-SerNet-SuSE

As I said earlier, this problem occurs intermittently, every
2-3 days, on 2 separate Samba installations, and when it
occurs Samba requires a restart to clear it.

Much appreciated.

Joe




----- Original Message Follows -----
> Hi Samba list,
>
> We're experiencing some issues with our Samba 3.0.24
> environments. Hopefully somebody can offer suggestions or
> guidance.
>
> A bit of background. We have 3 application environments,
> which consist of a Samba host providing file sharing
> services to 7 Windows application servers.
>
> These Samba hosts intermittently experience problems
> providing file sharing. So far we haven't established a
> pattern with the failures, so for now the best we can
> establish is that every couple of days a Samba host will
> experience an Internal Error (signal 11) in an smbd
> process. From that point onwards the smbd process will
> operate unreliably, such that Windows clients will
> generally not be able to connect to the share, file copies
> that were underway will abort with errors, etc. All this
> will require a restart of the Linux host to clear, and
> once restarted things are fine.
>
> All three environments are the same for hardware/OS and
> software. They operate independently of each other. All
> experience the same issue. Other than this issue we do not
> experience any other Samba problems; the file shares run
> without problems until a signal 11 occurs.
>
> - SuSE Enterprise Linux 9 (2.6.5-7.97-bigsmp)
> - Samba 3.0.24
> - /data (total 1TB, .5TB in use) - /dev/sdc1 type ext3
> (rw,acl,user_xattr)
>
> The signal 11 crashes appear to have started following our
> upgrading to Samba 3.0.24 in March 2007.
>
> Example message attached in signal_11.txt. I've attached
> these instead of placing inline as my webmail has fixed
> width formatting which messes up the syslog line - hope
> this is okay.
>
> Things we've tested:
>
> - fsck
> - testparm
> - Samba config changes:
>     kernel oplocks = no
>     oplocks = False
>     level2 oplocks = False
>
> I thought I'd preemptively post this to the mailing list to
> see if anyone has experienced similar issues. I will post
> some 'gdb smb PID' output once I'm able to catch it.
>
> Our suspicion is that this occurs under load, though we've
> not yet been able to reproduce the problem under testing.
> Upgrading to 3.0.25 is an option, although we'd like to do
> this once we've more clearly identified the cause and fix.
>
> Finally, an example of the volume of errors we're
> experiencing (from a single host) is attached in
> volume.txt.
>
> Happy to post other info.
>
> Kind regards
> Joe Murphy
> Info Systems Technical Team
> joe.murphy@clear.net.nz
>
>
> [Attachment: signal_11.txt]
> [Attachment: volume.txt]
