motherboard failure? Confusing symptoms. - Hardware

  1. motherboard failure? Confusing symptoms.

    On my wife's machine (debian sarge), instead of the default gnome
    desktop manager, it switched itself to KDE, then the X server
    disappeared. Then # apt-show-versions says that neither KDE nor GNOME is
    installed, etc. I thought immediately of hard disk failure, but now I'm
    concerned about the motherboard itself, and need to figure out which.

    I am able to boot the SCSI hard disk to console. However, if I try to
    boot from a USB-key with debian installer on it, and make USB-key the
    sole boot device, it returns "Boot Failure: System Halted". If there are
    other boot device options in BIOS, the USB-key is skipped. If I try to
    boot a Debian installation CD, the same thing happens.

    I can boot a Knoppix disk and use it to mount partitions on the hard
    disk and access their files.

    When I do boot the hard disk to console, no errors are reported in dmesg
    or kern.log.

    If I try # dpkg-reconfigure xserver-xfree86, I'm told that
    xserver-xfree86 is broken or not fully installed. I can't do much about
    that because network access also broke down. # aptitude update can't
    access the mirror.

    Now I find that the problem with communications is deep. When I have the
    machine ping itself, I get "Network Unreachable". Well, such a ping does
    not mean much, and so I look at resolv.conf:

    # cat /etc/resolv.conf
    search gateway.2wire.net
    nameserver 192.168.1.254

    I've never seen anything like that (the ADSL modem is a 2Wire Gateway
    model). Not understanding it, I went on to discover that the machine
    does not see the onboard ethernet chip or the ethernet card. The
    # ifconfig -a command sees lo, but no eth*.

    So I look at dmesg: # dmesg | grep eth0, # dmesg | grep eth1. Nothing at
    all returned. In other words, the hardware is not seen during boot.

    Is there any way to decide (given no ability to install anything) if a)
    the hard disk is on its way out, b) the motherboard is flakey, or c) the
    problem is really in software? (I assume RAM is OK).

    --

    Haines Brown, KB1GRM




  2. Re: motherboard failure? Confusing symptoms.

    Haines Brown wrote:
    > On my wife's machine (debian sarge), instead of the default gnome
    > desktop manager, it switched itself to KDE, then the X server
    > disappeared.


    What was being done at the time this happened?

    What do the log files say?

    > I am able to boot the SCSI hard disk to console.
    > I can boot a Knoppix disk and use it to mount partitions on the hard
    > disk and access their files.
    > When I do boot the hard disk to console, no errors are reported in dmesg
    > or kern.log.


    Ok.

    > If I try # dpkg-reconfigure xserver-xfree86, I'm told that
    > xserver-xfree86 is broken or not fully installed.
    > I can't do much about that because network access also broke down.


    Does network access from the Knoppix disk work?
    Did it work before?

    > Now I find that the problem with communications is deep. When I have the
    > machine ping itself, I get "Network Unreachable".


    > I've never seen anything like that (the ADSL modem is a 2Wire Gateway
    > model). Not understanding it, I went on to discover that the machine
    > does not see the onboard ethernet chip or the ethernet card.


    Did it stop working during use, or following a reboot?

    Could this be a problem with the BIOS parameters?

    What were they set at when this was working?
    What are they set to now?

    > Is there any way to decide (given no ability to install anything) if a)
    > the hard disk is on its way out, b) the motherboard is flakey, or c) the
    > problem is really in software? (I assume RAM is OK).


    Whatever was happening at the time the computer stopped working may
    give a clue as to whether this is a hardware or software failure.

    Work on one problem at a time, and don't panic.

    Boot from the Knoppix disk, and back up the data.

    Test the machine using Knoppix (or Ubuntu).

    Use fsck to check and repair the filesystems.

    Then try fixing your network settings, so that you can ping the local
    machine.

    Next check that you can connect to the internet again.

    Next try to install the missing packages.

    Regards,

    Mark.

    --
    Mark Hobley,
    393 Quinton Road West,
    Quinton, BIRMINGHAM.
    B32 1QE.

  3. Re: motherboard failure? Confusing symptoms.

    > installed, etc. I thought immediately of hard disk failure, but now I'm
    > concerned about the motherboard itself, and need to figure out which.


    One other issue might be bad RAM, although I suspect the hard disk
    myself. Just how old is the machine?

  4. Re: motherboard failure? Confusing symptoms.

    Thanks for the reply.

    This machine is six years old (MB Intel D850MVL, CPU Pentium 4 1.9 GHz,
    RAM is 1 GB non-ECC RDRAM by Kingston).

    I'm vague about what happened at the time of the problem. I suspect my
    wife had problems with Evolution not working or hanging and I logged
    her back in, and found myself with the KDE desktop manager in lieu of the
    default GNOME, and subsequently I was stuck in console without any
    desktop manager at all. In fact now:

    # dpkg -l xfree86-common
    rc xfree86-common 4.5.0.dfsg.1-1 X Window System (XFree86) ...

    That is, while the configuration files are there, the X server has
    fallen into a black hole.
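
    For reference, the two letters at the start of a dpkg -l line are the
    desired state followed by the current state, so "rc" reads as "removal
    requested, only configuration files remain". A minimal sketch of a
    decoder follows. The name describe_dpkg_state is made up for
    illustration, and only the most common letters are covered.

    ```shell
    # describe_dpkg_state is a hypothetical helper, not part of dpkg.
    # Argument: the two-letter state from the first column of `dpkg -l`.
    describe_dpkg_state() {
        desired=$(printf '%s' "$1" | cut -c1)
        status=$(printf '%s' "$1" | cut -c2)
        case $desired in
            i) want="install" ;;
            r) want="remove" ;;
            p) want="purge" ;;
            h) want="hold" ;;
            *) want="unknown" ;;
        esac
        case $status in
            i) have="installed" ;;
            c) have="config-files only" ;;
            n) have="not installed" ;;
            U) have="unpacked" ;;
            *) have="other (see the dpkg-query manual page)" ;;
        esac
        printf 'desired: %s, status: %s\n' "$want" "$have"
    }

    # Example: describe_dpkg_state rc
    ```

    A package in the "rc" state has to be installed again before
    dpkg-reconfigure can do anything with it.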

    My first reaction was software, but found that I lost any communication
    in order to install packages (machine can't see interfaces - see below).

    The log files (dmesg or kern.log) did not report errors. But when it
    seemed my USB ports were dysfunctional, I began to look more closely at
    dmesg. I did # dmesg | grep for usbcore, eth0, eth1, usb, hub, ehci_hcd,
    and ohci_hcd, and got nothing for any of them!
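
    The repeated grep above can be done in one pass. This is only a sketch:
    count_driver_mentions is a hypothetical name, and it reads the log from
    standard input so that it works on live dmesg output or on a saved copy
    of kern.log.

    ```shell
    # count_driver_mentions is a hypothetical helper for illustration.
    # Reads a kernel log on stdin and prints, for each keyword given as an
    # argument, the number of log lines that mention it.
    count_driver_mentions() {
        log=$(cat)
        for pattern in "$@"; do
            # grep -c prints 0 when nothing matches; "|| true" hides its
            # non-zero exit status in that case.
            count=$(printf '%s\n' "$log" | grep -c -- "$pattern" || true)
            printf '%s: %s\n' "$pattern" "$count"
        done
    }

    # Example, as root on the affected machine:
    #   dmesg | count_driver_mentions usbcore eth0 eth1 usb hub ehci_hcd ohci_hcd
    ```

    A count of 0 for every keyword matches the symptom described here: the
    drivers never announced themselves during boot.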

    # cat /proc/pci shows that the USB Controller and Ethernet controller
    are seen. However, # lsmod shows no driver loaded for any ethernet
    controller (e100), and no usbcore, ohci_hcd or ehci_hcd drivers
    either. I can do # modprobe e100 and # modprobe usbcore, and
    those modules load, but modules for ohci_hcd and ehci_hcd can't be
    located. I would assume these are kernel modules that should show up
    automatically with boot.
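
    A small sketch of the same lsmod check, for anyone repeating it.
    report_modules is a hypothetical name. It expects lsmod output on
    standard input and relies only on lsmod printing one header line
    followed by the module name in the first column.

    ```shell
    # report_modules is a hypothetical helper for illustration.
    # Reads `lsmod` output on stdin; prints "loaded" or "missing" for each
    # module name given as an argument.
    report_modules() {
        loaded=$(awk 'NR > 1 { print $1 }')
        for module in "$@"; do
            if printf '%s\n' "$loaded" | grep -x -- "$module" >/dev/null; then
                printf '%s loaded\n' "$module"
            else
                printf '%s missing\n' "$module"
            fi
        done
    }

    # Example:
    #   lsmod | report_modules e100 usbcore ohci_hcd ehci_hcd
    ```

    Anything reported missing can then be tried by hand with modprobe, as
    above. On Debian, modules listed in /etc/modules are loaded at boot.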

    Don't know if the collapse of the interfaces preceded a reboot or only
    came after, but my sense is that I had a progressive problem that
    started before the first reboot.

    Not sure what your question is about a possible problem with BIOS
    parameters. I'm fairly comfortable with the BIOS boot setup. When I
    tried to boot a USB-key I only enabled USB Boot; Device Priority:
    Removable Device; Removable Device: USB Flash drive. When I tried to
    boot an Etch CD installation disk, I disabled everything except the 1st
    boot priority, which I set to ATA CD-ROM, and set ATAPI drive to Toshiba
    DVD-ROM. In both cases I got: Boot Failure: System Halted. I've not
    messed with any other BIOS settings.

    Oddly, I can boot the Knoppix cd-rom, and it can mount and access drives
    on the hard disk. No network connection, of course, for there's no eth0.

    When I boot from hard disk or from a boot floppy, dmesg only reports the
    error

    ATAPI device hdc: Error: Medium error - - sense key=0x03)
    ...
    The failed "Test Unit Ready" packet command was...
    hdc: packet command error... (DriveReady SeekComplete Error)

    I was going to suggest these errors do not seem to be a show-stopper,
    but now I can't boot the Knoppix disk either. However, I tried it again
    and it booted, and again and it didn't, and so it looks like a marginal
    and deteriorating situation.

    I've already backed up data, and so I'm not concerned with disk content
    (reinstalling is an option, if I could boot the installation CD-ROM).

    You suggested doing a fsck on the partitions. When I did e2fsck -p on /dev/sda1
    (ro) and on /dev/sda6 (unmounted), it passed both checks without a
    hitch.
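
    "Passed" can be checked from the exit status that e2fsck returns, which
    is a sum of flag values listed in its manual page (0 no errors, 1 errors
    corrected, 2 reboot needed, 4 errors left uncorrected, 8 operational
    error). A minimal sketch of a decoder follows. The function name
    describe_fsck_status is made up for illustration.

    ```shell
    # describe_fsck_status is a hypothetical helper for illustration.
    # Argument: the exit status of an e2fsck run.
    describe_fsck_status() {
        code=$1
        if [ "$code" -eq 0 ]; then
            echo "no errors"
            return 0
        fi
        # The status is a bit mask, so several lines may be printed.
        if [ $((code & 1)) -ne 0 ]; then echo "file system errors corrected"; fi
        if [ $((code & 2)) -ne 0 ]; then echo "system should be rebooted"; fi
        if [ $((code & 4)) -ne 0 ]; then echo "file system errors left uncorrected"; fi
        if [ $((code & 8)) -ne 0 ]; then echo "operational error"; fi
        if [ $((code & ~15)) -ne 0 ]; then echo "other flags set (see the e2fsck manual page)"; fi
        return 0
    }

    # Example, on an unmounted partition:
    #   e2fsck -p /dev/sda6; describe_fsck_status $?
    ```

    A clean file system check does not rule out a failing disk. It only
    says the file system structures are consistent.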

    You also suggest getting the network working, but I'm not sure how to do
    that if there's no eth0 interface. I can ping 127.0.0.1 (the kernel),
    but not my machine: 192.168.1.1: "Network is unreachable".

    An obvious place to look is my routing table, but # route shows that the
    routing table is empty. I do # cat /etc/network/interfaces and get:

    auto eth0
    iface eth0 inet dhcp

    # ifconfig -a shows only lo until after I do a modprobe for eth0, when
    eth0 also shows up. So more basic than network connectivity seems to be
    the reason why modules are not being loaded, not only with a boot, but
    more fundamentally when I try to boot a USB device and CDROM
    independently of a hard disk.

    --

    Haines Brown, KB1GRM




  5. Re: motherboard failure? Confusing symptoms.

    Haines Brown wrote:
    > This machine is six years old (MB Intel D850MVL, CPU Pentium 4 1.9 GHz,
    > RAM is 1 GB non-ECC RDRAM by Kingston).


    Many motherboards from that time had problems with the capacitors. Look at
    the capacitors on the motherboard: do they look bulging, or have they
    maybe even leaked? There is a good description of the problem at
    http://www.siliconchip.com.au/cms/A_30328/article.html

    I have seen this kind of capacitor problem cause different kinds of
    crashes and problems, however so far I haven't heard of any case where
    files went missing or configuration broke like it has in your case.

    regards Henrik
    --
    The address in the header is only to prevent spam. My real address is:
    hc3(at)poolhem.se Examples of addresses which go to spammers:
    root@localhost postmaster@localhost


  6. Re: motherboard failure? Confusing symptoms.

    Haines Brown wrote:
    > On my wife's machine (debian sarge)


    Is the default release set to "sarge" or to "stable"?

    cat /etc/apt/

    APT::Default-Release "sarge";

    If this says "stable", or the file does not exist, then the package
    manager may have removed files not selected for removal.

    http://markhobley.yi.org:9090/AnnoyDselectRemoval

    Regards,

    Mark.

    --
    Mark Hobley,
    393 Quinton Road West,
    Quinton, BIRMINGHAM.
    B32 1QE.

  7. Re: motherboard failure? Confusing symptoms.

    markhobley@hotpop.donottypethisbit.com (Mark Hobley) writes:

    > Haines Brown wrote:
    >> On my wife's machine (debian sarge)

    >
    > Is the default release set to "sarge" or to "stable"?
    >
    > cat /etc/apt/


    Mark, the sources.list refers to sarge, not testing, but you do remind
    me that just prior to the trouble I had done an # aptitude update, #
    aptitude upgrade.

    By doing # modprobe e100, # /etc/init.d/networking restart, I did take
    one step closer to recovering communications, for now I have a routing
    table:

    # route
    192.168.1.0 * ... eth0
    default home ... eth0

    And of course, # ping -b 192.168.1.0 now works. It seems that the
    machine is interfaced with internet, but not with itself. Not quite sure
    of the next step.
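
    One quick check at this point is whether the table has a default route
    and which interface it uses. The sketch below assumes the usual route
    output layout, with the interface name in the last column.
    default_route_iface is a hypothetical name.

    ```shell
    # default_route_iface is a hypothetical helper for illustration.
    # Reads `route` or `route -n` output on stdin, prints the interface of
    # the default route, and returns non-zero if there is none.
    default_route_iface() {
        awk '
            $1 == "default" || $1 == "0.0.0.0" { print $NF; found = 1; exit }
            END { if (!found) exit 1 }
        '
    }

    # Example:
    #   route -n | default_route_iface || echo "no default route"
    ```

    With the table shown above, this would print eth0.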

    --

    Haines Brown, KB1GRM




  8. Re: motherboard failure? Confusing symptoms.

    I've known about the capacitor problem for years, but never saw any
    photographs. Wow! Pretty dramatic.

    No, no bad capacitors that I can see.

    Let me ask a philosophical (i.e., not really answerable) question.

    The machine has in it: Intel D850MVL Motherboard
    Intel Pentium 4 1.9 GHz CPU
    1 GB non-ECC RDRAM, Rambus RIMM

    I've been through the exercise of replacing a broken MB on another
    machine, and it is hell getting something to accommodate the RAM and
    CPU. Do you think it would be easier and cheaper to buy an up-to-date
    motherboard, CPU and RAM of comparable performance and quality, or try
    to locate an Intel board to accommodate my CPU and RAM?

    --

    Haines Brown, KB1GRM




  9. Re: motherboard failure? Confusing symptoms.

    On Sat, 16 Feb 2008 23:33:23 GMT, Haines Brown wrote:
    >I've known about the capacitor problem for years, but never saw any
    >photographs. Wow! Pretty dramatic.


    >No, no bad capacitors that I can see.


    >Let me ask a philosophical (i.e., not really answerable) question.


    >The machine has in it: Intel D850MVL Motherboard
    > Intel Pentium 4 1.9 GHz CPU
    > 1 GB non-ECC RDRAM, Rambus RIMM


    >I've been through the exercise of replacing a broken MB on another
    >machine, and it is hell getting something to accommodate the RAM and
    >CPU. Do you think it would be easier and cheaper to buy an up-to-date
    >motherboard, CPU and RAM of comparable performance and quality, or try
    >to locate an Intel board to accommodate my CPU and RAM?


    Try to find one on eBay but don't pay more than $110.

    You can replace it all for about $140 and end up with DDR2 memory.

  10. Re: motherboard failure? Confusing symptoms.

    Haines Brown wrote:
    > By doing # modprobe e100, # /etc/init.d/networking restart, I did take
    > one step closer to recovering communications, for now I have a routing
    > table:
    >
    > # route
    > 192.168.1.0 * ... eth0
    > default home ... eth0
    >
    > And of course, # ping -b 192.168.1.0 now works. It seems that the
    > machine is interfaced with internet, but not with itself. Not quite sure
    > of the next step.


    Are the above two lines your entire routing table? On all systems that I
    have seen there has also been a line for loopback.

    The next step depends on your problem. Are you able to surf the web? If
    not, it could be a DNS problem if something is missing in /etc/resolv.conf.
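
    To see quickly what resolv.conf actually provides, the nameserver
    entries can be pulled out on their own. A minimal sketch follows.
    list_nameservers is a hypothetical name, and the file is read from
    standard input.

    ```shell
    # list_nameservers is a hypothetical helper for illustration.
    # Reads resolv.conf content on stdin and prints one nameserver
    # address per line.
    list_nameservers() {
        awk '$1 == "nameserver" { print $2 }'
    }

    # Example:
    #   list_nameservers < /etc/resolv.conf
    ```

    If this prints nothing, names cannot be resolved even when pinging or
    tracerouting a numeric address works.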

    Is home the correct name of your router/firewall? Are you able to
    traceroute an internet IP address like this:

    traceroute 64.233.183.147

    regards Henrik
    --
    The address in the header is only to prevent spam. My real address is:
    hc3(at)poolhem.se Examples of addresses which go to spammers:
    root@localhost postmaster@localhost


  11. Re: motherboard failure? Confusing symptoms.

    AZ Nomad writes:

    > On Sat, 16 Feb 2008 23:33:23 GMT, Haines Brown
    > wrote:


    >>The machine has in it: Intel D850MVL Motherboard
    >> Intel Pentium 4 1.9 GHz CPU
    >> 1 GB non-ECC RDRAM, Rambus RIMM

    >
    > Try to find one on ebay but don't pay more than $110.
    >
    > You can replace it all for about $140 and end up with DDR2 memory.


    Never occurred to me to buy on eBay, but you are right, I see this
    (originally fairly expensive) board and the CPU on eBay for very
    little. The RAM would be the big expense.

    When you say "replace it all" are you saying that I could get the same
    motherboard and CPU, but replace the memory with better DDR2 memory that
    would be cheaper than replacing the Rambus RDRAM? Or are you saying one
    could buy a new motherboard, CPU and DDR2 RAM for not much more than
    what it could cost to duplicate the old parts?

    --

    Haines Brown, KB1GRM




  12. Re: motherboard failure? Confusing symptoms.

    > replace the memory with better DDR2 memory that
    > would be cheaper than replacing the Rambus RDRAM?


    Didn't Rambus get sued for misrepresenting the abilities/speed of that RAM
    type?

  13. Re: motherboard failure? Confusing symptoms.

    Henrik,

    I think I'm making some progress. Getting the feeling that an # aptitude
    upgrade wiped out some programs, such as X server, GNOME, and perhaps
    enough to explain why I'm no longer loading some needed modules at boot.

    I'm now able to do a traceroute and an # aptitude update, and so I may
    be in a position to reconstruct the software damage, although I'm not
    sure how to reconstruct the base system, which is very possibly
    damaged. Now that I have something running, I might simply do an #
    apt-get dist-upgrade and then install what applications such as X server
    I need.

    I'd feel more comfortable about this if I was more confident I do not
    have a hardware problem, but since I've got data backed up, there's
    nothing to lose.

    You asked whether

    > # route
    > 192.168.1.0 * ... eth0
    > default home ... eth0


    were my complete routing table. Yes it is. Looks odd to me, but I never
    thought much of my wife's configuration.

    Traceroute gets out and generates a report, and I'm able to ping a domain
    name. Resolv.conf has:

    search gateway.2wire.net
    nameserver 192.168.1.254

    which also strikes me as unusual. The gateway.2wire is my wife's
    modem. The nameserver address is her machine's address.

    I can't test whether I can surf the web, since my X server no longer
    exists. I went to install links2 to surf from console. However, my system
    failed to install it:

    ...
    Removing libgstreamer-gconf0.8-0...
    /var/lib/dpkg/info/libgstreamer-gconf0.8-0.prerm: line 1: gconftool-2:
    command not found
    dpkg: error processing libgstreamer-gconf0.8-0 (--remove):
    subprocess pre-removal script returned error exit status 127

    I'm not familiar with an error like this, but rather than get tangled up
    in a lot of purges and installs, I will try to dist-upgrade.

    Incidentally, I put GNOME on my wife's system because I assumed that it
    would make her life easier, since she had no contact with computers at
    all. The disappearance of GNOME when I did an update/upgrade may account
    for the switch of her machine (until the X server died) to KDE. The
    error above smells like it might be related. She was frustrated with her
    O.O mail reader, and combined with the experience I now have, I'll just
    get rid of any desktop manager on her machine as I have always done with
    my own, and have her run mutt.
    --

    Haines Brown, KB1GRM




  14. Re: motherboard failure? Confusing symptoms.

    I have had similar booting problems trying to reboot a live system.
    Power down and the boot succeeds with no errors. Since the hard disk
    boots after a power down but not without one, suspicion falls on a low
    voltage condition. The problem could also be ROM memory that is not
    being erased without a complete power down. Nothing pointed to the
    hard drive in my case. Heat can cause erratic behavior -- try
    cleaning your motherboard fans.

    GNOME vs KDE is another issue entirely. Xterm is/was not compatible
    with GNOME. Trying to apt xterm removed hundreds of files. That may
    be the cause of your problems with upgrade. Fluxbox solved my window
    manager problems.

  15. Re: motherboard failure? Confusing symptoms.

    On Sat, 23 Feb 2008 01:49:22 -0800, budmaddock rearranged some electrons
    to say:

    >
    >
    > GNOME vs KDE is another issue entirely. Xterm is/was not compatible
    > with GNOME. Trying to apt xterm removed hundreds of files. That may be
    > the cause of your problems with upgrade. Fluxbox solved my window
    > manager problems.


    Xterm works just fine with GNOME on FC7. I'm running it right now.

