Computer Chess Club Archives


Subject: Re: offtopic: weird hardware problem?!

Author: Andrew Williams

Date: 01:39:34 09/06/03


On September 06, 2003 at 04:11:18, Rick Bischoff wrote:

>Hi all, sorry for the offtopic post, but I trust all of you more than Joe Schmoe
>at the local computer store.  Anywho, I have a very weird transient hardware
>error..
>
>System)
>AMD Athlon-XP 1700+, 512 MB PC-100 RAM in an ABit KR7A motherboard.
>Video card is GeForce3 Ti200
>Sound card is SoundBlaster Live!
>Ethernet card 1 is Intel Pro Express 100
>Ethernet card 2 is generic Linksys
>Hard drive is 40GB Western Digital
>
>It is currently running a fresh (as in installed yesterday after giving up on
>trying to install Gentoo (see below)) Windows XP install, fully patched.  All
>components seem to be working ok.
>
>Symptom 1)
>I was trying to install Gentoo Linux the other day and "gcc" would have random
>segmentation faults during the build process.  For example, during the kernel
>build, "make bzImage", gcc would report an internal error, segmentation fault.
>I would then run "make bzImage" again and it would compile the file it
>previously crashed on just fine.
>
>Symptom 2)
>Fritz 8 crashes during Full Analysis at random times.  It also crashed during a
>normal game after a fresh install.
>
>History--
>This problem had been occurring for, oh, say the last 6 months, but since my main
>machine is the G4, I just let this box sit here and be a router... But I want to
>start using this computer again.  So, I suspected the problem was with
>overheating, and I replaced the CPU fan with a stock fan from the computer store
>and added a case fan and also cleaned up tons of dust.  The problem is still
>here even though my CPU never goes above 51 celsius.
>
>I did successfully run the memtest program and it passed in 27 minutes, so I am
>pretty sure it is not a memory problem-- which leaves two things... The CPU is
>faulty or the hard drive is-- how do I determine which it is?  Money IS AN
>object to consider here, I would like to get it back up with minimal cost.


How new is your system? With my current system, from time to time I have found
that it would react adversely to dust on the motherboard. These errors tend to
be very intermittent.

One time I spent several weeks trying to track down an intermittent crash in PM.
First I thought it was a HW glitch. Then it started to happen more and more
frequently, but I could never pinpoint it (I was doing a lot of testing in games
rather than test sets). Eventually I found a position which would *always*
manifest the problem. Excellent. So for a few evenings I focused all my
attention on this position. It took about five minutes for the crash to happen,
so my next step was to find a recent older version and run that. Ten minutes
with no crash. I relaxed at this point, knowing that this bug could not escape a
rational process of elimination. So back to the new version and time to start
unrolling these changes. Take away the most recent change. Crash. Try the next
change. Crash. And the next changes. Crash. On and on and on over a very long
evening. Next evening I continued. Unroll a change, crash. Unroll the next
change, crash. Often these crashes were taking *less* than five minutes; by this
point they were taking between 30 and 90 seconds. Finally, on a whim, I tried the
older version again. Immediate crash.

It took a while, but the penny dropped eventually. I thought I had one variable,
namely the program source code which I was intending to change until it reached
the older version. But of course I had *two* variables, the other being *time*.
The solution was simple. I opened up my PC and pulled all the memory sticks out.
I used a miniature vacuum cleaner to clean up the inside. I used one of those
compressed air spray things to loosen up any more dust and used the vacuum
cleaner again. Then the compressed air again and the vacuum again. Then I put
the machine back together again, crossed my fingers and switched on. Fuck! Now
the damn thing won't even boot. I was so sure it was the memory. It *had* to be
the memory. Oh shit. The memory. I opened up the PC. *RE-INSERTED* the memory.
And tried again. This time, success. I tried the latest version for ten minutes,
no crash. Then I went through every new version I had created, giving each ten
minutes. Each passed the test.
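
For anyone who wants to guard against the same trap, here is a rough sketch in Python of the lesson above: whenever a candidate version crashes, re-run the known-good baseline as a control before blaming the code. Everything here is hypothetical and for illustration only: the function name `find_first_bad`, the `crashes` callback (which stands in for "run this version on the test position and report whether it crashed"), and the version labels are my own assumptions, not anything from PM.

```python
from typing import Callable, Optional, Sequence


def find_first_bad(
    versions: Sequence[str],
    crashes: Callable[[str], bool],
) -> Optional[str]:
    """Walk from a known-good baseline (versions[0]) towards the newest version.

    Returns the first version that crashes, or None if none of them do.
    Raises RuntimeError if the baseline itself crashes, either up front or
    when re-checked as a control, which suggests the fault is not in the
    code at all (hardware, environment, or simply time).
    """
    baseline = versions[0]

    # The whole process assumes the baseline is good, so verify that first.
    if crashes(baseline):
        raise RuntimeError("baseline crashes; suspect hardware or environment")

    for candidate in versions[1:]:
        if crashes(candidate):
            # Control run: the baseline passed earlier. If it fails now,
            # something other than the source code has changed.
            if crashes(baseline):
                raise RuntimeError(
                    "baseline now crashes too; suspect hardware or environment"
                )
            return candidate
    return None
```

With a deterministic bug this simply reports the first crashing version. With a fault that gets worse over time, the control run fails and the function raises instead of pointing at an innocent change.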

The conclusion of this tale: try cleaning the inside of your PC if it's not new.

Andrew



Last modified: Thu, 15 Apr 21 08:11:13 -0700
