WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0xfe/0x17e() with tg3 network - Kernel

  1. WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0xfe/0x17e() with tg3 network

    I have duplicated this with kernels 2.6.27.2 and 2.6.27.5, no
    extra modules, tg3 Gbit networking. I have not yet tested
    earlier kernels to see if this has been around for a while.

    So far I have had this error happen 5 times (MTBF is maybe
    12 hours). 4 of the 5 times left the networking broken; one
    time things came back by themselves without a reboot. I believe
    in that case the hang was on traffic coming into the machine,
    whereas the other times it was on traffic going out of the machine.

    Unloading all of the network modules and reloading them did
    not correct the problem.

    Searching Google finds a couple of other people getting the
    same error, but they have different network chipsets (e1000
    and a rt811C chipset), which makes me think that there is
    something interacting badly with the network. Or does this
    error truly mean that the network chipset for some unknown reason
    locked itself up?
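
For context on that last question, here is a minimal sketch of the check that, as far as the 2.6.27-era dev_watchdog() in net/sched/sch_generic.c goes, leads to the "transmit timed out" message. It is a Python model, not kernel code; the TxState class, its field names, and the 5-second timeout in the example are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TxState:
    """Hypothetical snapshot of the state the watchdog looks at."""
    present: bool            # device is still present
    running: bool            # interface is up
    carrier_ok: bool         # link is up
    queue_stopped: bool      # driver has stopped a transmit queue
    last_tx_start: float     # time of the last transmit start, in seconds
    watchdog_timeout: float  # driver-chosen timeout, in seconds


def watchdog_would_fire(state: TxState, now: float) -> bool:
    """Return True when the modelled watchdog would report a timeout."""
    if not (state.present and state.running and state.carrier_ok):
        return False
    timed_out = now > state.last_tx_start + state.watchdog_timeout
    return bool(state.queue_stopped and timed_out)


if __name__ == "__main__":
    # Assumed numbers: queue stopped, last transmit at t=100s, 5s timeout.
    stuck = TxState(True, True, True, True, 100.0, 5.0)
    print(watchdog_would_fire(stuck, now=106.0))
```

Read this way, the warning says that a transmit queue stayed stopped past the driver's timeout. On its own it does not distinguish a locked-up chip from, say, a lost interrupt or a driver problem.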

    http://www.google.com/url?sa=U&start...4q_FEpk6oubxxg
    http://article.gmane.org/gmane.linux.network/110238

    The change I made recently was to upgrade my motherboard (old
    was e100 on a 100Mbit network, new is tg3 on a Gbit network;
    CPU and memory are the same; the chipset is an Intel 955
    vs the old one being an Intel 915).

    Autoneg is turned on all around; the Gbit switch is an
    8-port D-Link switch. The network seems to otherwise be working
    correctly.

    I did test the network under decent load, and the error did not
    appear to be any more likely under load. Typically the network
    is under very light load, 2-3 MB/second.

    The machine originally had 2 HT CPUs showing up. I turned off HT
    so that only one CPU was showing, but this did not change the error.

    I am first turning off all offload capabilities on tg3 to
    see if that changes anything.

    The next thing I am going to do is turn off the Gbit capability
    on the networking and see if that does anything.
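
The two experiments above could be scripted roughly as follows. This is only a sketch: it builds ethtool command lines without running them, and it assumes the interface is eth0, that the offloads of interest are rx, tx, sg and tso, and that the driver honours these settings. The helper names are made up for illustration. The advertise mask 0x00f selects the 10 and 100 Mbit modes (half and full duplex) and leaves out the 1000 Mbit ones.

```python
import shlex


def offload_off_cmd(iface, features=("rx", "tx", "sg", "tso")):
    """Build an 'ethtool -K' command that switches the given offloads off."""
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def no_gigabit_cmd(iface):
    """Build an 'ethtool -s' command that advertises only 10/100 Mbit modes."""
    # 0x001 | 0x002 | 0x004 | 0x008 = 10 half, 10 full, 100 half, 100 full
    return ["ethtool", "-s", iface, "autoneg", "on", "advertise", "0x00f"]


if __name__ == "__main__":
    # Print the commands for review; run them by hand (as root) if they look right.
    for cmd in (offload_off_cmd("eth0"), no_gigabit_cmd("eth0")):
        print(shlex.join(cmd))
```

Changing one thing at a time and waiting longer than the roughly 12 hour interval between failures keeps each result attributable to a single setting.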

    I also have a second tg3 port that is slightly different, so I may
    try that eventually.

    What else can I look at?

    Nov 11 00:44:39 computer kernel: ------------[ cut here ]------------
    Nov 11 00:44:39 computer kernel: WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0xfe/0x17e()
    Nov 11 00:44:39 computer kernel: NETDEV WATCHDOG: eth0 (tg3): transmit timed out
    Nov 11 00:44:39 computer kernel: Modules linked in: nfsd auth_rpcgss exportfs
    w83627ehf hwmon_vid hwmon nfs lockd nfs_acl sunrpc ipv6 xfs raid456 async_xor
    async_memcpy async_tx xor video output sbs sbshc battery ac lgdt330x cx88_dvb
    wm8775 cx88_vp3054_i2c cx25840 tuner_simple tuner_types tda9887 tda8290 tuner
    mt2131 s5h1409 snd_hda_intel snd_seq_dummy ivtv cx8800 snd_seq_oss cx88_alsa
    cx8802 cx88xx cx23885 snd_seq_midi_event snd_seq ir_common videodev v4l1_compat
    i2c_algo_bit cx2341x firewire_ohci iTCO_wdt snd_seq_device compat_ioctl32
    videobuf_dvb i2c_i801 firewire_core tveeprom floppy iTCO_vendor_support
    v4l2_common snd_pcm_oss dvb_core pcspkr tg3 sata_sil i2c_core btcx_risc
    videobuf_dma_sg crc_itu_t snd_mixer_oss libphy videobuf_core snd_pcm parport_pc
    parport snd_timer snd soundcore button snd_page_alloc sg dm_snapshot dm_zero
    dm_mirror dm_log dm_mod ahci ata_piix ata_generic libata sd_mod scsi_mod ext3
    jbd mbcache ehci_hcd ohci_hcd uhci_hcd [last unloaded: eeprom]
    Nov 11 00:44:39 computer kernel: Pid: 0, comm: swapper Not tainted 2.6.27.5 #2
    Nov 11 00:44:39 computer kernel: [] warn_slowpath+0x61/0x83
    Nov 11 00:44:39 computer kernel: [] usb_hcd_submit_urb+0x75c/0x811
    Nov 11 00:44:39 computer kernel: [] hiddev_hid_event+0x0/0x64
    Nov 11 00:44:39 computer kernel: [] hid_process_event+0x58/0x5f
    Nov 11 00:44:39 computer kernel: [] __next_cpu+0x12/0x21
    Nov 11 00:44:39 computer kernel: [] find_busiest_group+0x23e/0x672
    Nov 11 00:44:39 computer kernel: [] clocksource_get_next+0x39/0x3f
    Nov 11 00:44:39 computer kernel: [] update_wall_time+0x567/0x70c
    Nov 11 00:44:39 computer kernel: [] read_tsc+0x6/0x22
    Nov 11 00:44:39 computer kernel: [] getnstimeofday+0x37/0xc1
    Nov 11 00:44:39 computer kernel: [] uhci_scan_schedule+0x11b/0x6b0 [uhci_hcd]
    Nov 11 00:44:39 computer kernel: [] dev_watchdog+0xfe/0x17e
    Nov 11 00:44:39 computer kernel: [] __mod_timer+0x99/0xa3
    Nov 11 00:44:39 computer kernel: [] rh_timer_func+0x0/0x5
    Nov 11 00:44:39 computer kernel: [] usb_hcd_poll_rh_status+0x12b/0x133
    Nov 11 00:44:39 computer kernel: [] tick_dev_program_event+0x1e/0x81
    Nov 11 00:44:39 computer kernel: [] dev_watchdog+0x0/0x17e
    Nov 11 00:44:39 computer kernel: [] run_timer_softirq+0x10e/0x167
    Nov 11 00:44:39 computer kernel: [] dev_watchdog+0x0/0x17e
    Nov 11 00:44:39 computer kernel: [] __do_softirq+0x5d/0xc1
    Nov 11 00:44:39 computer kernel: [] do_softirq+0x32/0x36
    Nov 11 00:44:39 computer kernel: [] smp_apic_timer_interrupt+0x6e/0x79
    Nov 11 00:44:39 computer kernel: [] apic_timer_interrupt+0x28/0x30
    Nov 11 00:44:39 computer kernel: [] mwait_idle+0x32/0x38
    Nov 11 00:44:39 computer kernel: [] cpu_idle+0xbd/0xd5
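
To put a firmer number on the "maybe 12 hours" estimate, the watchdog lines can be pulled out of the system log and the gaps between them averaged. The sketch below assumes syslog lines in the format shown above, an English month abbreviation, and a caller-supplied year (the timestamps carry none); the function names are illustrative.

```python
import re
from datetime import datetime

# Matches e.g. "Nov 11 00:44:39 computer kernel: NETDEV WATCHDOG: eth0 (tg3): transmit timed out"
WATCHDOG_RE = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+transmit timed out"
)


def find_watchdog_events(lines, year=2008):
    """Return (timestamp, interface, driver) for each watchdog timeout line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            when = datetime.strptime(
                f"{year} {match.group('stamp')}", "%Y %b %d %H:%M:%S"
            )
            events.append((when, match.group("iface"), match.group("driver")))
    return events


def mean_gap_hours(events):
    """Average time between consecutive events in hours, or None if fewer than two."""
    if len(events) < 2:
        return None
    times = sorted(event[0] for event in events)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps) / 3600.0


if __name__ == "__main__":
    # The log path is an assumption; adjust it for the distribution in use.
    with open("/var/log/messages", errors="replace") as log:
        found = find_watchdog_events(log)
    print(len(found), "timeouts, mean gap (hours):", mean_gap_hours(found))
```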
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0xfe/0x17e() with tg3 network

    (netdev CC'ed)

    On Tue, 2008-11-11 at 03:48 -0600, Roger Heflin wrote:
    > I have duplicated this with kernels 2.6.27.2 and 2.6.27.5, no
    > extra modules, tg3 Gbit networking. I have not yet tested
    > earlier kernels to see if this has been around for a while.


    How do more recent kernels do?

