Package: linux-image-2.6.26-1-amd64
Version: 2.6.26-*
Followup-For: Bug #493479


After originally filing this bug report here on the Debian BTS, I
performed a kernel bisection and took my findings to the LKML. About
3 weeks later, the problem had finally been correctly diagnosed:
changes between 2.6.25 and 2.6.26 had reordered the sequence of
function calls responsible for detecting PCI devices, and in my case
an overlap of the memory resource region for the HPET device was
causing the kernel to hang. This had not occurred with 2.6.25, but
the changes leading up to 2.6.26 (which apparently fixed a bug on
other hardware) created the problem on my hardware.
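
For illustration only, here is a minimal sketch of what "overlap"
means for memory resource regions. It is not kernel code and is not
taken from any of the patches discussed in this report. The region
names and the PCI address ranges are hypothetical; 0xFED00000 is only
the conventional HPET base address on x86, not a value read from my
hardware.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    name: str
    start: int  # first address of the region (inclusive)
    end: int    # last address of the region (inclusive)


def overlaps(a: Region, b: Region) -> bool:
    """Two inclusive ranges overlap if each starts before the other ends."""
    return a.start <= b.end and b.start <= a.end


def find_conflicts(regions: List[Region]) -> List[Tuple[str, str]]:
    """Return the name pairs of all regions whose address ranges overlap."""
    ordered = sorted(regions, key=lambda r: r.start)
    conflicts = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                # Sorted by start address: nothing later can overlap a.
                break
            conflicts.append((a.name, b.name))
    return conflicts


# Hypothetical layout: one PCI region claims the same addresses as the HPET.
regions = [
    Region("hpet", 0xFED00000, 0xFED003FF),
    Region("pci-bar-a", 0xFED00000, 0xFED00FFF),
    Region("pci-bar-b", 0xFEB00000, 0xFEBFFFFF),
]
```

With this made-up layout, find_conflicts(regions) reports the pair
("hpet", "pci-bar-a") and nothing else.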

My motherboard has an AMD SB600 southbridge, and some of the Linux
kernel team mistakenly believed that the problem was limited to that
chipset. I was able to discover (in a Google search) at least one
other person with completely different hardware -- Intel CPU +
motherboard combination -- who began experiencing hangs in the 2.6.26
kernel series at exactly the same SHA1 hash that I discovered when
carrying out my original bisection, so the changes between 2.6.25 and
2.6.26 affect more hardware combinations than my own.

After a lengthy and frustrating period of time trying to discover the
root cause of the hangs, patches were created that allowed me to build
a kernel that would not hang. These patches were primarily made by
Yinghai Lu and Ingo Molnar, and were against Ingo Molnar's tip/master
git tree. Unfortunately, the patches were created too late in the
development of 2.6.27: when Linus Torvalds saw the PCI code being
touched, and the potential problems that could create, he decided to
reject those changes for the 2.6.27 kernels and postpone them until
the 2.6.28 series.

I had hoped to provide a patch for the 2.6.26 kernels which Debian
will be using for its Lenny release, but my attention moved away from
these kernel issues until release candidates for 2.6.28 became
available. Only today did I return to this issue to check on its
status. Here is what I found:


1) A patch had been provided on Sep. 12 by Jordan Crouse (a kernel
developer employed by AMD, IIRC) which should have allowed any
2.6.26 or 2.6.27 kernel to boot on my hardware:

http://www.uwsg.indiana.edu/hypermai...09.1/1902.html

This patch is supposed to prevent the memory resource region of
the HPET device on SB600 southbridge motherboards from overlapping
with the resource regions of other PCI devices. I found that this
patch fails to make any difference on my hardware (with said
southbridge) for any 2.6.26 or 2.6.27 kernel.


2) Since the release candidates for 2.6.28 are now up to "rc3", I
decided to begin with "rc1". I found that kernel 2.6.28-rc1 (from
Torvalds' git tree) would hang during boot when initializing the
HPET device, which I took as a bad sign! Booting the kernel with
the "nohpet" parameter allowed the kernel to boot all the way to a
login prompt, only to hang at that point. (Unlike the hangs
experienced with 2.6.2[67] kernels, I was able to use the Magic
SysRq keys to sync, unmount, and reboot my filesystem in a nice
way, however.)


3) Fully expecting fallout from the mad rush of changes that go
into the first 2-week merge window, I checked out the
2.6.28-rc3 kernel from the Torvalds tree. Happily, this kernel
boots fine... and without the need for any special parameters.
The HPET device is working, and no hang occurs at the login
prompt. (I am submitting this BTS update using the very same
kernel, as can be seen below under "System Information"!)

I can now state that the 2.6.28 kernel series has resolved the
problems with hangs on my hardware.

It may now be possible to provide patches for 2.6.26 which will allow
it to boot without hanging on machines experiencing such problems.
Does the Debian Kernel Team have any interest in seeing such patches?
If so, I could start working on backporting the minimum set of changes
from 2.6.28 to 2.6.26 which cure the problem on my hardware.


-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.28-rc3.081104.fileserver.uvesafb (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-amd64 depends on:
ii debconf [debconf-2.0] 1.5.22 Debian configuration management sy
ii initramfs-tools [linux-initra 0.92j tools for generating an initramfs
ii module-init-tools 3.4-1 tools for managing Linux kernel mo

linux-image-2.6.26-1-amd64 recommends no packages.

Versions of packages linux-image-2.6.26-1-amd64 suggests:
ii grub 0.97-47 GRand Unified Bootloader (Legacy v
pn linux-doc-2.6.26 (no description available)


