Computer Chess Club Archives



Subject: Re: Some Crafty 16.19 results on my XP 2.44GHz

Author: Robert Hyatt

Date: 08:42:49 02/20/03



On February 20, 2003 at 09:53:28, Jeremiah Penery wrote:

>On February 19, 2003 at 18:12:03, Robert Hyatt wrote:
>
>>On February 19, 2003 at 17:29:28, Aaron Gordon wrote:
>>
>>>On February 19, 2003 at 16:30:14, Robert Hyatt wrote:
>>>
>>>>Sure there are.  And they take _years_ to run.  You have to run all four billion
>>>>possible
>>>>values thru every possible instruction, in every possible sequence, with every
>>>>possible
>>>>pipeline delay, at every possible temperature, with every possible voltage
>>>>variance with...
>>>>
>>>>You get the idea...
>>>
>>>No one tests this way, not AMD, Intel, IBM, etc.
>>
>>
>>Of course not.  The engineers look at the gate delays, add them up, and set the
>>clock rate
>>just longer than the longest max delay they found.
>
>That's absolutely not true.  Otherwise, why is it that chips from the same
>silicon wafer get separated into different speed bins?

Because the manufacturing process is not perfect.  The engineer knows to within
a few picoseconds what the chips will run at.  But then there are small
imperfections that make some gates switch slower, or with higher resistance,
than they are supposed to.

Doesn't negate a single thing I said, however, as they do _know_ what the chips
will do nominally.  And they weed out the exceptions, since the process will
_never_ be perfect.
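As a rough sketch of why binning happens even though the design target is known in advance: every die on the wafer shares the same nominal critical-path delay, and process variation scatters the actual delays around it.  The delay, variance, and speed-grade numbers below are invented purely for illustration, not real fab data.

```python
import random

# Toy model of speed binning.  Every die has the same design (same nominal
# critical-path delay); process variation makes each die's gates slightly
# faster or slower.  All numbers here are made up for illustration.

NOMINAL_DELAY_NS = 0.50   # nominal critical-path delay (a 2.0 GHz design)
SIGMA = 0.03              # assumed per-die variation, in ns

def bin_die(delay_ns):
    """Assign a die to the fastest hypothetical speed grade it can meet."""
    fmax_ghz = 1.0 / delay_ns
    for grade in (2.0, 1.8, 1.6):     # hypothetical speed grades, in GHz
        if fmax_ghz >= grade:
            return grade
    return None                        # fails every bin: scrap

random.seed(1)
wafer = [random.gauss(NOMINAL_DELAY_NS, SIGMA) for _ in range(1000)]
bins = {}
for delay in wafer:
    grade = bin_die(delay)
    bins[grade] = bins.get(grade, 0) + 1

for grade, count in sorted(bins.items(), key=lambda kv: str(kv[0])):
    print(grade, count)
```

With a small variance, almost everything lands in the top one or two bins, exactly as described above: the engineers' target is met by the bulk of the dies, and the tail of slower "exceptions" gets weeded out or sold at a lower grade.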


>
>Anyway, finding the longest gate delay sounds like the traveling salesman
>problem involving millions of cities.  It doesn't sound exactly practical to do
>that.

If you don't do it, you have _no_ idea what the clock frequency limit is.  This
is a standard part of any silicon compiler design tool.  You know where the
clock edge starts and where the signals travel, and it isn't hard to figure out
how long they will take to settle by counting the numbers (and types) of gates
along each path.

There are not an infinite number of pathways, btw... not even a large number.
I.e., each pipeline stage represents a pretty straightforward set of gate
delays.  That is where the engineers spend their time: trying to make every
pipeline stage settle in the same time, because the clock frequency has to be
set for the _longest_ delay overall.
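The "counting gates along each path" point can be sketched concretely: a combinational circuit is a directed acyclic graph, so finding the critical (longest-delay) path is a longest-path search solvable in time linear in gates plus wires, nothing like the traveling salesman problem.  The tiny circuit and picosecond delays below are hypothetical.

```python
from functools import lru_cache

# Sketch: critical-path delay of a combinational circuit.  Because the
# graph is acyclic, memoized recursion (equivalent to a topological-order
# sweep) finds the longest path in linear time.  Gate delays are made up.

# gate -> (delay in ps, list of gates feeding it)
circuit = {
    "in_a": (0, []), "in_b": (0, []),
    "xor1": (90, ["in_a", "in_b"]),
    "and1": (70, ["in_a", "in_b"]),
    "or1":  (80, ["xor1", "and1"]),
    "out":  (0, ["or1"]),
}

@lru_cache(maxsize=None)
def arrival(gate):
    """Latest time a signal can arrive at this gate's output, in ps."""
    delay, fanin = circuit[gate]
    return delay + max((arrival(g) for g in fanin), default=0)

critical_ps = arrival("out")          # the longest settling time
fmax_ghz = 1000.0 / critical_ps       # clock must be set for this path
print(critical_ps, round(fmax_ghz, 2))   # -> 170 5.88
```

The clock period is then set just longer than `critical_ps`, which is exactly the "set the clock rate just longer than the longest max delay" rule being argued about.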






>
>>You are doing the opposite, and you have to test it in a different way to see if
>>_your_
>>processor has faster gates.
>>
>>> It would be a waste of time to
>>>do so, too. You make the silicon and see what range it can do and you make the
>>>chips at or below the minimum clock speeds attainable by those chips.
>>
>>I don't believe anybody does this.  I believe they _know_ before the first die
>>is cut into
>>chips, how fast the thing should be able to run.  Because they know very
>>accurately how
>>fast each of the gates can switch, and how many there are in a single path.
>
>They can guess, but they still have to test to find out for sure.
>


Not for the majority of the chips.  Only for the exceptions that would normally
fail the basic test, but are discovered to run just fine at a slower clock
rate.  Most chips pass the QC tests with zero problems, at the clock rate the
engineers targeted.  If they can salvage a few of the failures that work at
lower rates, so much the better.  But those _are_ bad chips.





>>> For
>>>stability testing a Prime95/Memtest86 combo is all that's needed. If you want to
>>>take videocards into account (high AGP speed, etc) you just run 3DMark2001 or
>>>2003. Newer Nforce2 boards lock the AGP speed at 66mhz (or have it adjustable)
>>>so you don't have to worry about it (same with the PCI speed).
>>
>>Fine, but suppose there is _another_ instruction with a longer gate delay and
>>prime95
>>is not using it.  BSR for example.  Then all your testing shows that the thing
>>works but
>>it fails for Crafty.
>>
>>That has happened...
>>
>>Prime95 doesn't test _all_ instructions, with exceptions thrown in at random
>>points to further
>>stress things...
>
>You basically said you didn't know what Prime95 was a few posts ago, and now
>suddenly you know what kind of instruction mix it uses?  Impressive...
>

I didn't say anything of the kind.  It doesn't take a genius to figure out what
a prime number tester is doing.  I'm pretty certain it doesn't use BSF/BSR, nor
a _lot_ of other instructions.  If that impresses you, you must be easily
impressed.  :)
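For context, BSF/BSR are the x86 bit-scan instructions, and a bitboard engine like Crafty uses bit scans constantly to locate set bits in 64-bit masks, which is why a stress test that never issues them can miss a marginal chip.  A minimal sketch of their semantics, modeled in Python (the hardware instructions themselves are what the discussion is about):

```python
# BSF/BSR semantics modeled in Python.  A bitboard chess engine scans
# 64-bit masks for set bits on nearly every node it searches; compilers
# map such scans to the x86 BSF/BSR instructions.

def bsr(x):
    """Index of the highest set bit (x86 BSR semantics, x != 0)."""
    return x.bit_length() - 1

def bsf(x):
    """Index of the lowest set bit (x86 BSF semantics, x != 0)."""
    return (x & -x).bit_length() - 1

# e.g. a pawn bitboard with bits 8..15 set:
pawns = 0xFF00
print(bsf(pawns), bsr(pawns))   # -> 8 15
```

A prime-number stress test can run forever without ever exercising that particular datapath, which is the point being made: "passes Prime95" does not mean "every instruction settles in time."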




>>>>Possibly, but that's "business".  But they weren't producing 2.8's that could
>>>>run reliably at
>>>>3.06, which is the topic here...
>
>Intel has demoed chips that run far above the currently shipping 3.06GHz.  Why
>do you suppose they haven't released them?  _That_ is "business".  If Intel
>releases a 5GHz chip tomorrow, they'd sure knock everyone else out of the
>performance race, but they would lose a TON of money relative to the current
>business model.


That is _not_ the same idea.  The idea that a vendor purposely underclocks a
chip is ridiculous.  The idea that they don't release the next generation at a
faster clock rate until the current supply of slower chips is exhausted is not
contradictory at all.  Those are two totally different business practices; one
makes economic sense, the other makes zero sense.








>
>>>They could, in fact. They were clocking up to 3.2GHz consistently. I doubt we'll
>>>even see 3.2GHz for a while due to the heat those chips put off. As I mentioned
>>>before, a P4 3.06GHz is 110 watts.
>>>
>>>
>>>Also, I'm not sure if you're aware but temperature/voltage helps a lot with
>>>overclockability. If you get the CPU cold enough (-120C to be exact) you could
>>>effectively run 2x your max ambient air temp and it would NOT be considered
>>>overclocking. Here's a small graph I grabbed from Kryotech (makes Freon cpu
>>>coolers). ftp://speedycpu.dyndns.org/pub/cmoscool.gif
>>
>>You are mixing apples and oranges.  One result of overclocking is having to ramp
>>up the
>>supply voltage to shorten the switching times, which produces heat that has to
>>be removed
>>or the chip will be damaged.  Another result is cutting the base clock cycle
>>below the base
>>settling time for some odd pathway, so that particular operation doesn't always
>>produce
>>correct results.  Two different issues...
>>
>>Just because you can cool it _still_ doesn't mean it can switch fast enough.
>>
>>
>>
>>>
>>>Also when overclocking you need to use a bit of common sense. Lets say 1.6GHz is
>>>stable at 1.6 volts, 2.0GHz is stable at 1.75v and perhaps the upper limit of
>>>the theoretical chip I'm speaking of is, say.. 2.2GHz at 2.00v. If you test at
>>>2.00v and it failed in prime95 after 1 hour and they drop the MHz down to
>>>2.1GHz, you don't think it'll be completely stable? Of course it would be. I've
>>>seen some servers fail at prime95, NON overclocked. It's that sensitive. If you
>>>drop the clock speed THAT much (100mhz) from an already almost fully stable
>>>setup it will be completely stable. It's still overclocked, yes, but being so
>>>doesn't automatically warrant diepeveen hand-waving. :)
>>
>>No, but it is a risk.  As I said my first bad experience was a couple of years
>>ago with a
>>machine that passed all the overclocker tests around, but failed after 8-10
>>hours on a
>>particular data mining application...  When we dropped the clock back to
>>specification,
>>the problem did _not_ appear.
>
>If you had been 'lucky' enough to get your hands on a 1.13GHz P3 during the
>short time it was shipping, it would probably fail in a similar manner.  But
>somehow I doubt that would make you condemn Intel chips as completely unsafe.
>Why do you so condemn all overclocking?  The overclocking in your example was no
>doubt somewhat 'unsafe'.  However, contrary to your belief, there is some degree
>of safety in moderate overclocking.  I don't know that I'd trust many of the
>'super' overclocks, but I would absolutely trust a P4 3.06GHz running at 3.2GHz
>or something like that.


Feel free to do so.  I trust the engineers myself.  I have more than enough
debugging to do without having to deal with the hardware mangling results
sporadically as well.  Those are difficult-to-find problems, and I don't want
any part of them.

You can also go out and buy a two-cycle outboard motor rated at a 5700 RPM
redline and run it at 6200 RPM.  For a while.  At 5700 it will last 20 years.
At 6200 it might last 10 years, or 10 minutes.  Again, I go with the engineers
and their testing.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.