Author: Keith Evans
Date: 09:24:23 07/03/03
On July 03, 2003 at 05:13:18, Aaron Gordon wrote:

>On July 03, 2003 at 04:26:09, Sune Fischer wrote:
>
>>On July 03, 2003 at 04:16:24, Aaron Gordon wrote:
>>
>>>On July 02, 2003 at 20:55:52, Keith Evans wrote:
>>>
>>>>On July 02, 2003 at 20:18:25, Sune Fischer wrote:
>>>>
>>>>>On July 02, 2003 at 19:37:46, Aaron Gordon wrote:
>>>>>
>>>>>>You can test how close they are to the limit. Please read:
>>>>>>http://www.talkchess.com/forums/1/message.html?304354
>>>>>
>>>>>You make it sound like you can state things with 100% certainty :)
>>>>>What you are doing is not exact science, it's more of an ad hoc "oh, it seems to be
>>>>>working fine" experiment, IMO.
>>>>>
>>>>>This may be sufficient in many cases; I can't say it ever worked for me 100%,
>>>>>though.
>>>>
>>>>He does not know the worst-case path through the chip, and hopes that it is
>>>>being exercised. The guys who wrote the BurnK7 program state that it is not a
>>>>sufficient test. Basically, if you run it and you have problems, then you know
>>>>that you have problems. But if you run it and you don't have noticeable
>>>>problems, then you may or may not have problems.
>>>>
>>>>For example, let's say that a certain ALU operation has a long delay due to the
>>>>number of combinatorial gates in the path. Maybe this is what determines the
>>>>maximum chip operating frequency. Well, if you don't test this one operation you
>>>>may think that the chip is fine because all of the other operations will work.
>>>>Now you raise the temperature or frequency and the other operations start
>>>>failing. So you think "wow, I was close to the edge", but in reality you were
>>>>over the edge and you just didn't know it.
>>>
>>>You can figure out how on-edge you are by doing the tests. Then, as I stated in
>>>my previous post, you can kick the voltage up, drop the CPU temp to 'average'
>>>levels, and clock back and get a 100% stable CPU. There are some production CPUs
>>>that can't run more than 5% over stock speed without producing the same
>>>instability as one of the pretested chips I have running on-edge. I, however, back
>>>off a good 10-15%; Intel (some P4-3.06s, for example) only backs off about 5%.
>>>This is too close for me. At least with my chips I know they're 100% stable. :)
>>
>>We may disagree on what instability is.
>>I think it is possible for a chip to malfunction long before it actually causes
>>a system crash, just like a piece of software can have many bugs that only
>>rarely show themselves.
>>If you don't somehow verify that _all_ of the CPU is operating perfectly, but only
>>focus on a few instructions, then the test is not sufficient IMO.
>>
>>How long would it take you to discover if 1 in a billion FPU operations are in
>>error because of OC'ing, when the rest of the chip is operating perfectly?
>>
>>-S.
>
>I know because the chip is backed off from the 'edge' enough. Even more so than
>some retail chips.

How are you so sure that you know where the edge is? This could be instruction- and data-dependent. If you don't exercise the worst-case timing path in your burn-in, then you can miss these failures. You keep failing to address this issue. Do you have a timing report from an insider at AMD? I don't know how AMD tests chips at speed. Here's an article which describes one method in which scan vectors can be used to test a chip at speed:

http://www.eedesign.com/design_library/da/t/OEG20030509S0046

You'll see that this article mentions the critical timing path and how you can find where it is.
This is obviously something that requires access to the design database. Do you have this information from an insider at AMD?
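
To make the rare-error point concrete, here is a rough sketch of the result-checking idea that burn-in programs like BurnK7 and Prime95 are built around (this is not their actual code, just the general approach; the workload and the iteration count are made up for illustration): run a fixed, deterministic computation over and over and compare every result bit-for-bit against the first run. On a correctly operating CPU every run must match exactly, so a single mismatch over hours of running means the chip produced a wrong answer even though the machine never crashed. And of course it only tells you something about the instructions it happens to exercise, which is exactly the coverage problem.

    /* Rough sketch of a result-checking stress loop. Compile with -lm.
     * The specific workload below is illustrative, not any real burn-in test. */
    #include <stdio.h>
    #include <math.h>

    /* volatile keeps the compiler from folding the loop into a constant,
     * so the FPU really executes the work on every call. */
    static double fpu_workload(void)
    {
        volatile double acc = 1.0;
        int i;
        for (i = 1; i <= 10000; i++)
            acc = acc * 1.0000001 + sin((double)i) * 1e-9;
        return acc;
    }

    int main(void)
    {
        /* The first run serves as the reference; if a later run disagrees,
         * one of them is wrong, and either way the chip is unreliable
         * at this clock and temperature. */
        const double reference = fpu_workload();
        unsigned long run, mismatches = 0;

        for (run = 0; run < 1000000UL; run++) {   /* in practice you run for hours */
            if (fpu_workload() != reference)      /* exact comparison is intentional */
                mismatches++;
        }
        printf("mismatches: %lu\n", mismatches);
        return mismatches != 0;
    }

Note that this only loads one path through the FPU. A chip could pass this forever and still be failing on some integer multiply or load/store pattern that the loop never touches, which is why "it ran my burn-in program overnight" is not the same as "it is 100% stable."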