[Samba] had 3 kernel panics since upgrade from 3.0.21a to 3.0.25 and 3.0.25a on CentOS 4.4 - Samba

  1. [Samba] had 3 kernel panics since upgrade from 3.0.21a to 3.0.25 and 3.0.25a on CentOS 4.4



    Does anybody have any ideas on this? On our server that has been running
    'rock-solid' with no crashes we have now had 3 kernel panics that each
    appear to have been triggered by the newly upgraded samba daemon.

    We used to run samba 3.0.21a for 'years' with no crashes.

    On May 26 we upgraded to 3.0.25

    June 9 10:58:28 first crash kernel panic
    process involved according to log file smbd (see below)

    June 14 16:32:23 second crash kernel panic
    process involved according to log file smbd (see below)

    In the morning of June 15 we upgraded to 3.0.25a

    June 15 17:26:36 third crash kernel panic
    process involved according to log file smbd (see below)


    Some specs on our server.

    OS: CentOS 4.4
    kernel: 2.6.9-22.0.1.EL.1smp
    CPU: SMP Dual AMD Opteron(tm) Processor 246 2GHz (about 4000 bogomips)
    RAM: 2GB
    SWAP: 5.8GB
    users: peak at ~50-60 (varies; on average closer to 30)

    Here are the log files of the kernel panics. Is this a kernel bug
    triggered by a samba daemon, or a samba daemon bug that crashed the kernel?

    ******************** first crash ************************
    Jun 9 10:58:03 uk smbd[21513]: [2007/06/09 10:58:03, 0]
    smbd/service.c:make_connection_snum(928)
    Jun 9 10:58:03 uk smbd[21513]: Can't become connected user!
    Jun 9 10:58:05 10.37.2.139 SecurityCenter: N/A: The Security Center
    service has been stopped. It was prevented from running by a software
    group policy.
    Jun 9 10:58:10 10.37.2.139 W32Time: N/A: Time Provider NtpClient: This
    machine is configured to use the domain hierarchy to determine its time
    source, but the computer is joined to a Windows NT 4.0 domain. Windows
    NT 4.0 domain controllers do not have a time service and do not support
    domain hierarchy as a time source. NtpClient will attempt to use an
    alternate configured external time source if available. If an external
    time source is not configured or used for this computer, you may choose
    to disable the NtpClient.
    Jun 9 10:58:10 10.37.2.139 W32Time: N/A: The time provider NtpClient is
    configured to acquire time from one or more time sources, however none
    of the sources are accessible. NtpClient has no source of accurate time.
    Jun 9 10:58:20 10.37.2.139 E100B: N/A: Intel(R) PRO/100 VM Network
    Connection driver has been started
    Jun 9 10:58:28 uk kernel: ------------[ cut here ]------------
    Jun 9 10:58:28 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 9 10:58:28 uk kernel: invalid operand: 0000 [#1]
    Jun 9 10:58:28 uk kernel: SMP
    Jun 9 10:58:28 uk kernel: Modules linked in: nls_utf8 usb_storage vfat
    fat md5 ipv6 parport_pc lp parport tun sunrpc ipt_MASQUERADE ipt_TOS
    ipt_LOG iptable_filter iptable_mangle iptable_nat ip_conntrack ip_tables
    button battery ac ohci_hcd e1000 tg3 floppy st ext3 jbd dm_mod gdth
    aic79xx sata_sil libata sd_mod scsi_mod
    Jun 9 10:58:28 uk kernel: CPU: 0
    Jun 9 10:58:28 uk kernel: EIP: 0060:[] Not tainted VLI
    Jun 9 10:58:28 uk kernel: EFLAGS: 00010216 (2.6.9-22.0.1.EL.1omsmp)
    Jun 9 10:58:28 uk kernel: EIP is at vma_prio_tree_add+0x36/0x95
    Jun 9 10:58:28 uk kernel: eax: 00000009 ebx: e721c17c ecx: 00000000
    edx: 000000b3
    Jun 9 10:58:28 uk kernel: esi: f47ab85c edi: c293ba88 ebp: d0f28250
    esp: db80ef3c
    Jun 9 10:58:28 uk kernel: ds: 007b es: 007b ss: 0068
    Jun 9 10:58:28 uk kernel: Process smbd (pid: 21513, threadinfo=db80e000
    task=c269eef0)
    Jun 9 10:58:28 uk kernel: Stack: e721c17c f76b4c40 c014e1ee e721c17c
    000000fb 00000000 eaae6640 c014ed2e
    Jun 9 10:58:28 uk kernel: d0f28250 d0f28248 00000000 00000001
    00000000 c293b9d8 f76b4c40 000b4000
    Jun 9 10:58:28 uk kernel: b74e8000 d0f2822c d0f28250 d0f28248
    f76b4c40 f76b4c70 db80e000 eaae6640
    Jun 9 10:58:28 uk kernel: Call Trace:
    Jun 9 10:58:28 uk kernel: [] vma_link+0x9c/0xbc
    Jun 9 10:58:28 uk kernel: [] do_mmap_pgoff+0x50e/0x666
    Jun 9 10:58:28 uk kernel: [] sys_mmap2+0x7e/0xaf
    Jun 9 10:58:28 uk kernel: [] syscall_call+0x7/0xb
    Jun 9 10:58:28 uk kernel: Code: c3 39 ca 74 08 0f 0b 0f 02 64 4e 2e c0
    8b 43 08 2b 43 04 c1 e8 0c 8d 54 02 ff 8b 46 08 2b 46 04 c1 e8 0c 8d 44
    01 ff 39 c2 74 08 <0f> 0b 10 02 64 4e 2e c0 c7 43 34 00 00 00 00 83 7e
    34 00 c7 43
    Jun 9 10:58:28 uk kernel: <0>Fatal exception: panic in 5 seconds


    Jun 9 13:04:57 uk syslogd 1.4.1: restart (remote reception).
    Jun 9 13:04:57 uk syslog: syslogd startup succeeded
    Jun 9 13:04:57 uk kernel: klogd 1.4.1, log source = /proc/kmsg started.


    ******************** second crash ************************
    Jun 14 16:25:02 uk nmbd[14947]: [2007/06/14 16:25:02, 0]
    libsmb/nmblib.c:send_udp(791)
    Jun 14 16:25:02 uk nmbd[14947]: Packet send failed to 10.37.2.70(138)
    ERRNO=Operation not permitted
    Jun 14 16:25:16 uk crond(pam_unix)[15907]: session closed for user root
    Jun 14 16:25:33 uk clamd[10155]: SelfCheck: Database status OK.
    Jun 14 16:26:44 uk -- MARK --
    Jun 14 16:27:44 uk -- MARK --
    Jun 14 16:28:01 uk crond(pam_unix)[16009]: session opened for user root
    by (uid=0)
    Jun 14 16:28:01 uk crond[16010]: (root) CMD (ping -c 1 uucp.cid.net >
    /dev/null 2>&1;sleep 8;/usr/sbin/uucico -S mailhost)
    Jun 14 16:28:44 uk -- MARK --
    Jun 14 16:28:45 uk nmbd[14947]: [2007/06/14 16:28:45, 0]
    libsmb/nmblib.c:send_udp(791)
    Jun 14 16:28:45 uk nmbd[14947]: Packet send failed to 10.37.2.35(138)
    ERRNO=Operation not permitted
    Jun 14 16:29:44 uk -- MARK --
    Jun 14 16:30:01 uk crond(pam_unix)[16031]: session opened for user root
    by (uid=0)
    Jun 14 16:30:01 uk crond[16032]: (root) CMD (/usr/lib/sa/sa1 1 1)
    Jun 14 16:30:01 uk crond(pam_unix)[16033]: session opened for user root
    by (uid=0)
    Jun 14 16:30:01 uk crond[16035]: (root) CMD (/opt/sarcheck/bin/prst1)
    Jun 14 16:30:01 uk crond(pam_unix)[16034]: session opened for user root
    by (uid=0)
    Jun 14 16:30:01 uk crond[16037]: (root) CMD (ping -c 1 uucp.cid.net >
    /dev/null 2>&1;sleep 8;/usr/sbin/uucico -S mailhost)
    Jun 14 16:30:01 uk crond(pam_unix)[16031]: session closed for user root
    Jun 14 16:30:02 uk crond(pam_unix)[16033]: session closed for user root
    Jun 14 16:30:10 uk crond(pam_unix)[16034]: session closed for user root
    Jun 14 16:30:44 uk -- MARK --
    Jun 14 16:31:32 uk crond(pam_unix)[16009]: session closed for user root
    Jun 14 16:32:23 uk kernel: ------------[ cut here ]------------
    Jun 14 16:32:23 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 14 16:32:23 uk kernel: invalid operand: 0000 [#1]
    Jun 14 16:32:23 uk kernel: SMP
    Jun 14 16:32:23 uk kernel: Modules linked in: vfat fat md5 ipv6
    parport_pc lp parport tun sunrpc ipt_MASQUERADE ipt_TOS ipt_LOG
    iptable_filter iptable_mangle iptable_nat ip_conntrack ip_tables
    usb_storage button battery ac ohci_hcd e1000 tg3 floppy st ext3 jbd
    dm_mod gdth aic79xx sata_sil libata sd_mod scsi_mod
    Jun 14 16:32:23 uk kernel: CPU: 0
    Jun 14 16:32:23 uk kernel: EIP: 0060:[] Not tainted VLI
    Jun 14 16:32:23 uk kernel: EFLAGS: 00010212 (2.6.9-22.0.1.EL.1omsmp)
    Jun 14 16:32:23 uk kernel: EIP is at vma_prio_tree_add+0x36/0x95
    Jun 14 16:32:23 uk kernel: eax: 00000009 ebx: c8a05804 ecx: 00000000
    edx: 00000041
    Jun 14 16:32:23 uk kernel: esi: f76587ac edi: ec80e450 ebp: e3a1b358
    esp: cb136f3c
    Jun 14 16:32:23 uk kernel: ds: 007b es: 007b ss: 0068
    Jun 14 16:32:23 uk kernel: Process smbd (pid: 17852, threadinfo=cb136000
    task=d22a85b0)
    Jun 14 16:32:23 uk kernel: Stack: c8a05804 e9ae8300 c014e1ee c8a05804
    000000fb 00000000 caa99480 c014ed2e
    Jun 14 16:32:23 uk kernel: e3a1b358 e3a1b350 00000000 00000001
    00000000 ec80e3a0 e9ae8300 00042000
    Jun 14 16:32:23 uk kernel: b7867000 e3a1b334 e3a1b358 e3a1b350
    e9ae8300 e9ae8330 cb136000 caa99480
    Jun 14 16:32:23 uk kernel: Call Trace:
    Jun 14 16:32:23 uk kernel: [] vma_link+0x9c/0xbc
    Jun 14 16:32:23 uk kernel: [] do_mmap_pgoff+0x50e/0x666
    Jun 14 16:32:23 uk kernel: [] sys_mmap2+0x7e/0xaf
    Jun 14 16:32:23 uk kernel: [] syscall_call+0x7/0xb
    Jun 14 16:32:23 uk kernel: Code: c3 39 ca 74 08 0f 0b 0f 02 64 4e 2e c0
    8b 43 08 2b 43 04 c1 e8 0c 8d 54 02 ff 8b 46 08 2b 46 04 c1 e8 0c 8d 44
    01 ff 39 c2 74 08 <0f> 0b 10 02 64 4e 2e c0 c7 43 34 00 00 00 00 83 7e
    34 00 c7 43
    Jun 14 16:32:23 uk kernel: <0>Fatal exception: panic in 5 seconds


    Jun 14 17:11:46 uk syslogd 1.4.1: restart (remote reception).
    Jun 14 17:11:46 uk syslog: syslogd startup succeeded
    Jun 14 17:11:46 uk kernel: klogd 1.4.1, log source = /proc/kmsg started.

    ******************** third crash ************************
    Jun 15 17:26:36 uk kernel: ------------[ cut here ]------------
    Jun 15 17:26:36 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 15 17:26:36 uk kernel: invalid operand: 0000 [#1]
    Jun 15 17:26:36 uk kernel: SMP
    Jun 15 17:26:36 uk kernel: Modules linked in: vfat fat usb_storage md5
    ipv6 parport_pc lp parport tun sunrpc ipt_MASQUERADE ipt_TOS ipt_LOG
    iptable_filter iptable_mangle iptable_nat ip_conntrack ip_tables button
    battery ac ohci_hcd e1000 tg3 floppy st ext3 jbd dm_mod gdth aic79xx
    sata_sil libata sd_mod scsi_mod
    Jun 15 17:26:36 uk kernel: CPU: 0
    Jun 15 17:26:36 uk kernel: EIP: 0060:[] Not tainted VLI
    Jun 15 17:26:36 uk kernel: EFLAGS: 00010216 (2.6.9-22.0.1.EL.1omsmp)
    Jun 15 17:26:36 uk kernel: EIP is at vma_prio_tree_add+0x36/0x95
    Jun 15 17:26:36 uk kernel: eax: 00000009 ebx: f2cf2754 ecx: 00000000
    edx: 00000031
    Jun 15 17:26:36 uk kernel: esi: f649b124 edi: f6616cb0 ebp: daf1b3b0
    esp: c4093f3c
    Jun 15 17:26:36 uk kernel: ds: 007b es: 007b ss: 0068
    Jun 15 17:26:36 uk kernel: Process smbd (pid: 12530, threadinfo=c4093000
    task=f72bf1f0)
    Jun 15 17:26:36 uk kernel: Stack: f2cf2754 f07e2600 c014e1ee f2cf2754
    000000fb 00000000 f374ab00 c014ed2e
    Jun 15 17:26:36 uk kernel: daf1b3b0 daf1b3a8 00000000 00000001
    00000000 f6616c00 f07e2600 00032000
    Jun 15 17:26:36 uk kernel: b7bf6000 daf1b38c daf1b3b0 daf1b3a8
    f07e2600 f07e2630 c4093000 f374ab00
    Jun 15 17:26:36 uk kernel: Call Trace:
    Jun 15 17:26:36 uk kernel: [] vma_link+0x9c/0xbc
    Jun 15 17:26:36 uk kernel: [] do_mmap_pgoff+0x50e/0x666
    Jun 15 17:26:36 uk kernel: [] sys_mmap2+0x7e/0xaf
    Jun 15 17:26:36 uk kernel: [] syscall_call+0x7/0xb
    Jun 15 17:26:36 uk kernel: Code: c3 39 ca 74 08 0f 0b 0f 02 64 4e 2e c0
    8b 43 08 2b 43 04 c1 e8 0c 8d 54 02 ff 8b 46 08 2b 46 04 c1 e8 0c 8d 44
    01 ff 39 c2 74 08 <0f> 0b 10 02 64 4e 2e c0 c7 43 34 00 00 00 00 83 7e
    34 00 c7 43
    Jun 15 17:26:36 uk kernel: <0>Fatal exception: panic in 5 seconds


    Am I reading this right? The Process involved on each of these kernel
    panics is "Process smbd"?

    Jun 9 10:58:28 uk kernel: Process smbd (pid: 21513, threadinfo=db80e000
    task=c269eef0)

    Jun 14 16:32:23 uk kernel: Process smbd (pid: 17852, threadinfo=cb136000
    task=d22a85b0)

    Jun 15 17:26:36 uk kernel: Process smbd (pid: 12530, threadinfo=c4093000
    task=f72bf1f0)
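
    As a rough cross-check of the reading above, here is a minimal Python
    sketch that pulls the BUG location and the process name out of oops
    lines like the ones quoted in this post. The function name
    summarize_oopses and both regular expressions are illustrative
    assumptions based only on the log format shown here; other kernels may
    print these lines differently.

```python
import re

# Matches the "kernel BUG at <file>:<line>!" line of an oops report.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
# Matches the start of the "Process <name> (pid: <n>, ..." line.
PROC_RE = re.compile(r"Process (?P<name>\S+) \(pid: (?P<pid>\d+)")


def summarize_oopses(lines):
    """Return one (bug_location, process_name, pid) tuple per oops.

    Hypothetical helper: it pairs each BUG line with the first
    Process line that follows it.
    """
    results = []
    bug = None
    for line in lines:
        match = BUG_RE.search(line)
        if match:
            bug = f"{match.group('file')}:{match.group('line')}"
            continue
        match = PROC_RE.search(line)
        if match and bug is not None:
            results.append((bug, match.group("name"), int(match.group("pid"))))
            bug = None
    return results


# Lines copied from the three crash logs in this post.
sample = [
    "Jun 9 10:58:28 uk kernel: kernel BUG at mm/prio_tree.c:528!",
    "Jun 9 10:58:28 uk kernel: Process smbd (pid: 21513, threadinfo=db80e000",
    "Jun 14 16:32:23 uk kernel: kernel BUG at mm/prio_tree.c:528!",
    "Jun 14 16:32:23 uk kernel: Process smbd (pid: 17852, threadinfo=cb136000",
    "Jun 15 17:26:36 uk kernel: kernel BUG at mm/prio_tree.c:528!",
    "Jun 15 17:26:36 uk kernel: Process smbd (pid: 12530, threadinfo=c4093000",
]

for entry in summarize_oopses(sample):
    print(entry)
```

    Note that the "Process" line only names the task that happened to be
    running when the kernel hit the BUG check; on its own it does not show
    which side is at fault.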

    I am sorry if I point the finger at the wrong thing here. But it seems
    strange that a server starts kernel panicking in this 'consistent' way
    always showing the same process 'smbd' involved and combined with the
    fact that the samba rpm upgrade is the only thing that recently changed
    on this server.


    Or is the fault really a kernel bug as the log file entry suggests with
    "kernel BUG at mm/prio_tree.c:528!"

    Jun 9 10:58:28 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 9 10:58:28 uk kernel: invalid operand: 0000 [#1]
    Jun 9 10:58:28 uk kernel: SMP

    Jun 14 16:32:23 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 14 16:32:23 uk kernel: invalid operand: 0000 [#1]
    Jun 14 16:32:23 uk kernel: SMP

    Jun 15 17:26:36 uk kernel: kernel BUG at mm/prio_tree.c:528!
    Jun 15 17:26:36 uk kernel: invalid operand: 0000 [#1]
    Jun 15 17:26:36 uk kernel: SMP

    Any clever ideas? I will explore the redhat kernel list and see if there
    is a newer one maybe one from CentOS 4.5?

    Google gives me a number of hits dating back many months where the
    kernel BUG "kernel BUG at mm/prio_tree.c:528!" has been triggered with a
    variety of processes (some smbds - but also a few others)

    Many thanks for any pointers. Would be really great if I could tell
    people Monday morning when they come back to work, that we have found
    the culprit, or better still that we have even managed to fix it. Here's
    hoping.

    Regards,

    --
    Urs Rau

    --
    To unsubscribe from this list go to the following URL and read the
    instructions: https://lists.samba.org/mailman/listinfo/samba

  2. Re: [Samba] had 3 kernel panics since upgrade from 3.0.21a to 3.0.25 and 3.0.25a on CentOS 4.4


    Urs Rau wrote:

    > Any clever ideas? I will explore the redhat kernel list and see if there
    > is a newer one maybe one from CentOS 4.5?
    >


    I should have spent more time on this in the first place.

    I have found an entry in the redhat bugzilla, that looks like it might fit.

    There are two entries in redhat bugzilla for rhel 4 error "kernel BUG at
    mm/prio_tree.c"

    https://bugzilla.redhat.com/bugzilla....cgi?id=185472
    https://bugzilla.redhat.com/bugzilla....cgi?id=173981

    Bug 173981 was closed with an erratum issued in mid 2006 that upgrades
    the kernel to 2.6.9-34.EL (we are still at 2.6.9-22.0.1)

    http://rhn.redhat.com/errata/RHSA-2006-0132.html

    I have now temporarily upgraded our kernel to the latest centos 4.5 one
    2.6.9-55.EL.
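
    To make the version comparison explicit, here is a small Python sketch
    that checks whether a kernel release string is at or above the release
    named in the erratum. The helpers release_key and has_fix are
    hypothetical, and the comparison is a simplification that only looks at
    the leading numeric fields; it is not RPM's real version-comparison
    algorithm.

```python
import re


def release_key(release):
    """Return the leading numeric fields of a kernel release string.

    Assumption: everything from the first non-numeric field (such as
    "EL") onward is ignored.
    """
    numbers = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break
        numbers.append(int(part))
    return tuple(numbers)


def has_fix(running, fixed_in):
    """True if the running release is at or above the fixed release."""
    return release_key(running) >= release_key(fixed_in)


# Release strings taken from this thread.
print(has_fix("2.6.9-22.0.1.EL.1smp", "2.6.9-34.EL"))
print(has_fix("2.6.9-55.EL", "2.6.9-34.EL"))
```

    With these inputs the old kernel falls below the erratum release and
    the new CentOS 4.5 kernel is above it.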

    We will monitor this and report back, hopefully the crashes are really a
    kernel bug and not a samba bug and will now have been fixed by this
    upgrade. Sorry, but it seemed to point in the direction of smbd, at
    least at first glance.

    Will be keeping you posted if this changes again.


    --
    Urs Rau


  3. Re: [Samba] had 3 kernel panics since upgrade from 3.0.21a to 3.0.25 and 3.0.25a on CentOS 4.4

    On Sat, Jun 16, 2007 at 05:47:46PM +0100, Urs Rau wrote:
    > We will monitor this and report back, hopefully the crashes are really a
    > kernel bug and not a samba bug and will now have been fixed by this
    > upgrade. Sorry, but it seemed to point in the direction of smbd, at
    > least at first glance.


    Sorry to be blunt, but kernel crashes are almost always
    kernel problems. Samba partly runs as root and could in
    theory play dirty games that make the kernel crash, but I don't
    know of any code path that would do so.

    So a crash in the kernel is always its own business.

    Volker


