Realtek RTL8111/8168B IRQ clash with IDE driver?
Hi. I've got a brand-new system with a Gigabyte P35-DS4 motherboard,
which has an embedded Realtek RTL8111/8168B gigabit network controller.
I'm running Linux 2.6.23.14, freshly fetched from kernel.org a couple of
weeks ago.
The system was running perfectly ... until I decided to start using the
network. I have the same problem with both the Linux kernel's r8169
module and the r8168 driver from realtek.com.tw (loaded separately, one
at a time). The driver loads properly, the eth0 interface configures
properly, and all the networking functions operate correctly ... but
when I receive packets at the full 100Mbit/s rate from another machine
(both my eth0 and the other machine auto-negotiated to 100Mbit/s full
duplex) I see various errors suddenly pop up in the syslog:
sshd[4685]: error: channel 0: chan_read_failed for istate 1
sshd[4685]: error: channel 0: chan_read_failed for istate 3
last message repeated 20 times
kernel: hda: cdrom_pc_intr: The drive appears confused (ireason = 0x01). Trying to recover by ending request.
last message repeated 3 times
kernel: ide: failed opcode was: unknown
kernel: hda: drive not ready for command
kernel: hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
And so forth.
When either of the r8169/r8168 modules is loaded, it reports as follows
in the log (this example is from the regular kernel.org r8169 module):
kernel: 8169 Gigabit Ethernet driver 2.2LK loaded
kernel: ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 17 (level, low) -> IRQ 17
kernel: eth0: RTL8168b/8111b at 0xf8d1c000, 00:1a:4d:58:a3:54, XID 38000000 IRQ 17
A look at the IRQ 17 line in /proc/interrupts shows that the IDE driver
and the Realtek driver are both sharing IRQ 17:
# fgrep eth /proc/interrupts
17: 58077 4 98999 129160 IO-APIC-fasteoi ide0, eth0
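For anyone wanting to check every interrupt line at once rather than
grepping for a single driver, the following is a minimal Python sketch
that lists the IRQs with more than one device attached. The function name
shared_irqs is made up for illustration, and the CPU0..CPU3 header in the
sample is an assumption based on the four count columns shown above; only
the IRQ 17 line is taken from the post. The parsing is a heuristic (it
treats a comma in the device column as the sign of sharing), not a
definitive parser for every kernel's /proc/interrupts layout.

```python
def shared_irqs(text):
    """Return {irq: [device, ...]} for numbered IRQ lines listing several devices."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: one column per CPU
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip non-numeric rows such as NMI, LOC, ERR
        fields = rest.split()
        # What follows the per-CPU counts: controller/type name, then devices.
        tail = " ".join(fields[ncpu:])
        if "," not in tail:
            continue  # a single device (or none) on this line
        parts = [p.strip() for p in tail.split(",")]
        # The first piece still carries the controller name; keep its last word.
        parts[0] = parts[0].split()[-1]
        shared[int(label)] = parts
    return shared


# Sample input: the header is assumed, the IRQ 17 line is from the post.
SAMPLE = """\
 CPU0 CPU1 CPU2 CPU3
17: 58077 4 98999 129160 IO-APIC-fasteoi ide0, eth0
"""

for irq, devices in sorted(shared_irqs(SAMPLE).items()):
    print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

On a live system the same function could be fed the real file, for
example shared_irqs(open("/proc/interrupts").read()).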
Given the kernel messages about 'hda' - the DVD-ROM drive, which is the
only IDE device on the system (all my hard disk drives are SATA/AHCI) -
it seems to me that the Realtek driver is losing interrupts, or the
IDE driver is picking up the interrupts destined for the ethernet
device. But it's been a loooong time since I had to play with PC
hardware and interrupts ... I don't have a clue how IRQs are
(automatically?) assigned on a PCI bus these days, nor how to change
things.
Has anyone had this problem with the embedded Realtek RTL8168/8111
driver and hardware interrupt confusion with moderate to high network
activity?
How can I 'move' the Realtek device to another interrupt? Is there a
general 'what to do with messy interrupt conflicts on PCI busses' HOWTO
out there for a hardware novice?
Many thanks for any help ... I'm rather desperate - I thought this new
system was working fine until I started to use it for real over the
network! :-(
Regards,
Brad
Re: Realtek RTL8111/8168B IRQ clash with IDE driver?
Brad wrote:
> Hi. I've got a brand-new system with a Gigabyte P35-DS4 motherboard,
> which has an embedded Realtek RTL8111/8168B gigabit network controller.
> I'm running Linux 2.6.23.14, freshly fetched from kernel.org a couple of
> weeks ago.
>
> The system was running perfectly ... until I decided to start using the
> network. With both the Linux kernel's r8169 module and the r8168 driver
> from realtek.com.tw - separately loaded, one at a time - I have the same
> problem - the driver loads properly, the eth0 interface configures
> properly, all the networking functions operate correctly ... but when I
> receive packets at the full 100Mbit/s rate from another machine (both my
> eth0 and the other machine auto-negotiated to 100Mb/sec full duplex) I
> see various errors suddenly pop up in the syslog:
>
> sshd[4685]: error: channel 0: chan_read_failed for istate 1
> sshd[4685]: error: channel 0: chan_read_failed for istate 3
> last message repeated 20 times
> kernel: hda: cdrom_pc_intr: The drive appears confused (ireason = 0x01).
> Trying to recover by ending request.
> last message repeated 3 times
> kernel: ide: failed opcode was: unknown
> kernel: hda: drive not ready for command
> kernel: hda: status error: status=0x58 { DriveReady SeekComplete
> DataRequest }
>
> And so forth.
>
> When either of the r8169/r8168 modules are loaded they report as follows
> in the log (this example is the regular Linux (kernel.org) r8169
> module):
>
> kernel: 8169 Gigabit Ethernet driver 2.2LK loaded
> kernel: ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 17 (level, low) ->
> IRQ 17
> kernel: eth0: RTL8168b/8111b at 0xf8d1c000, 00:1a:4d:58:a3:54, XID
> 38000000 IRQ 17
>
> A look at the IRQ 17 line in /proc/interrupts shows that the IDE driver
> and the Realtek driver are both sharing IRQ 17:
>
> # fgrep eth /proc/interrupts
> 17: 58077 4 98999 129160 IO-APIC-fasteoi ide0, eth0
>
> Given the kernel messages about 'hda' - which is my sole IDE disk device
> on the system, the DVD-ROM drive (all my hard disk drives are SATA/AHCI)
> - it seems to me that the realtek driver is losing interrupts, or the
> IDE driver is picking up the interrupts destined for the ethernet
> device. But it's been a loooong time since I had to play with PC
> hardware and interrupts ... I don't have a clue how IRQs are
> (automatically?) assigned on a PCI bus these days, nor how to change
> things.
>
> Has anyone had this problem with the embedded Realtek RTL8168/8111
> driver and hardware interrupt confusion with moderate to high network
> activity?
>
> How can I 'move' the Realtek device to another interrupt? Is there a
> general 'what to do with messy interrupt conflicts on PCI busses' HOWTO
> out there for a hardware novice?
The problem is probably with your BIOS settings; make sure that your
BIOS knows that you are running a Plug-and-Play OS and set the "Reset
Interrupt Defaults" option if there is one.
Robert
Re: Realtek RTL8111/8168B IRQ clash with IDE driver?
Robert Harris wrote:
> Brad wrote:
>>
>> How can I 'move' the Realtek device to another interrupt? Is there a
>> general 'what to do with messy interrupt conflicts on PCI busses' HOWTO
>> out there for a hardware novice?
>
> The problem is probably with your BIOS settings; make sure that your
> BIOS knows that you are running a Plug-and-Play OS and set the "Reset
> Interrupt Defaults" option if there is one.
I've got a fairly recent version of the Award BIOS on this new machine;
BIOS version F7, BIOS date 09/07/2007. Unfortunately it doesn't seem
to have any options along those lines.
In a 'PnP/PCI Configuration' menu there are just two options to
set/unset:
PCI1 IRQ assignment
PCI2 IRQ assignment
Both are set to 'auto' and can be changed to a specific number.
But I can't see how changing either of these very general settings
to specific IRQ values will help change the assignment of the
particular IDE0 port or the Realtek controller.
I've tried a few kernel boot options - 'pci=routeirq', 'pci=noacpi' -
in trying to get the ide0 driver and Realtek driver to use
different IRQs, but they stubbornly keep using the same one. The
'pci=routeirq' option made them *both* shift from IRQ 17 to IRQ 20.
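To compare what each boot option actually did, one option is to read the
IRQ number the kernel exposes for each PCI device in sysfs and group the
devices by IRQ. The following is only a rough sketch: the function name
pci_irqs is made up for illustration, it assumes the kernel provides an
"irq" attribute under /sys/bus/pci/devices/<address>/, and it assumes a
value of 0 means no IRQ is assigned. The number shown there may also
differ from what a driver ends up using once it is loaded.

```python
import os


def pci_irqs(root="/sys/bus/pci/devices"):
    """Map IRQ number -> sorted list of PCI addresses reporting that IRQ."""
    by_irq = {}
    if not os.path.isdir(root):
        return by_irq  # no sysfs PCI tree here (not Linux, or not mounted)
    for addr in sorted(os.listdir(root)):
        path = os.path.join(root, addr, "irq")
        try:
            with open(path) as f:
                irq = int(f.read().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable for this device
        if irq == 0:
            continue  # assumed to mean "no IRQ assigned"
        by_irq.setdefault(irq, []).append(addr)
    return by_irq


# Print one line per IRQ and flag those reported by more than one device.
for irq, addrs in sorted(pci_irqs().items()):
    flag = "  <-- shared" if len(addrs) > 1 else ""
    print(f"IRQ {irq}: {', '.join(addrs)}{flag}")
```

Running this once per boot option and comparing the output shows whether
the address reported by the r8169 driver (0000:04:00.0 in the log above)
ever lands on an IRQ of its own.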
I've got no idea how Linux assigns/reads these IRQs for PCI hardware
so I'm just shooting in the dark. I don't even know if there's a
way to tell the Realtek controller - when I modprobe r8169 - to
just go and use something else. Things have changed a lot since
I had to manually go and change ISA IRQ values in the old days!
Arrgh.
Brad