Computer Chess Club Archives


Subject: Re: Is SPEC a bad test organisation according to Hyatt?

Author: Gerd Isenberg

Date: 13:53:51 03/01/04


On March 01, 2004 at 16:24:16, Robert Hyatt wrote:

>On March 01, 2004 at 15:15:11, Vincent Diepeveen wrote:
>
>>On March 01, 2004 at 14:24:50, Robert Hyatt wrote:
>>
>>>On March 01, 2004 at 14:20:41, Robert Hyatt wrote:
>>>
>>>>On March 01, 2004 at 13:59:25, Eugene Nalimov wrote:
>>>>
>>>>>On March 01, 2004 at 13:49:38, Eugene Nalimov wrote:
>>>>>
>>>>>>On March 01, 2004 at 12:05:17, Vincent Diepeveen wrote:
>>>>>>
>>>>>>>On February 29, 2004 at 23:38:31, Robert Hyatt wrote:
>>>>>>>
>>>>>>>[snip]
>>>>>>>
>>>>>>>You qualify the testresults as done for SPEC as INVALID and INCORRECT?
>>>>>>>
>>>>>>>YES or NO?
>>>>>>>
>>>>>>>[bla bla removed]
>>>>>>
>>>>>>Had you stopped to drink vodka every morning?
>>>>>>
>>>>>>Please answer only YES or NO.
>>>>>>
>>>>>>[bla bla removed]
>>>>>
>>>>>So, my previous post pointed that there are questions for which you cannot
>>>>>answer "YES or NO".
>>>>>
>>>>>And here is *official* SPEC data for 1.3GHz K7 and 1.5GHz Itanium2:
>>>>>
>>>>>http://www.spec.org/cpu2000/results/res2001q4/cpu2000-20011008-01018.html
>>>>>http://www.spec.org/cpu2000/results/res2004q1/cpu2000-20040126-02775.html
>>>>>
>>>>>Thanks,
>>>>>Eugene
>>>>
>>>>
>>>>Please do not confuse discussions with Vincent by supplying real data.  Things
>>>>stay on a more equal footing if you just make up stuff and post it here.
>>>>
>>>><sarcasm off>
>>>>
>>>>:)
>>>
>>>
>>>For those that didn't look at the data, the 1.5ghz K7 compared to the 1.5ghz
>>>itanium shows a 50% faster speed on the Itanium.  IE the K7 took 127 seconds to
>>>run the test, the Itanium took 80.
>>>
>>>Why 1.5ghz K7?  Because Vincent was talking about "clock for clock" and Eugene
>>>chose to supply real data rather than barking up a hollow tree...
>>
>>Latest itanium compiler 1.5Ghz 6MB L3. Compiler used from 2004.
>>    base score : 1241
>>Note this is the HP compiler which hardly anyone uses. No one in government is.
>>They all use the way slower intel compiler. The supercomputers of the government
>>aren't HP ones. HP isn't delivering big enough systems.
>>
>>But even then. Let's compare this 6 instructions a cycle Itanium2 with crafty at
>>K7 a 32 bits doing 3 instructions a cycle max:
>
>Flap flap flap flap.  Flappety flap.  Flap.  Flap.  flap-flap-flap.
>
>No matter how much hand-flapping you do, you made the original statement.
>Eugene supplied data that showed that clock for clock, the Itanium was 50%
>faster with Crafty.  Nothing more, nothing less.  No flappety-flap either.
>
>
>
>
>
>>
>>http://www.spec.org/osg/cpu2000/results/res2003q2/cpu2000-20030505-02154.html
>>K7 at 2.2Ghz getting: 1324
>>
>>So after concluding that itanium is hell of a lot slower than K7, we can look to
>>the IPC.
>
>
>flappety-flap.
>
>Who cares.  You said something that was simply shown to be wrong.
>
>
>>
>>The 6 instructions a clock from the itanium2 @ 64 bits delivers :
>>  1241/1324 * 2.2Ghz/1.5Ghz = 37% faster speed
>>
>>So years of work at compiler still didn't improve much from my 33% statement
>>that the 4 instructions a clock 21264 1Ghz delivered a few years ago.
>>
>>So the move from 32 bits to 64 bits can not have contributed more than a few %
>>of speed to crafty.
>>
>>Best regards,
>>Vincent
>
>"can not have contributed -> flappety flap."
>
>One of these days, I will have access to an opteron where I can do a 32 bit and
>64 bit compile with _everything_ else constant.  Then we will _really_ know what
>32 -> 64 bit gives.  It will be more than "a few %".

I hope so ;-)

OTOH, how do you distinguish the gain from the additional registers, which help
even with 32-bit ints and may dramatically decrease stack bandwidth, from the
gain of pure 64-bit processing? Do you have an idea, from profiling or
estimation, of the ratio between the average number of executed 32-bit and
64-bit instructions inside Crafty's search and eval?

With some dependent bitboard operations you may introduce more register stalls
and probably make less dense use of the execution resources that run in
parallel within one processor, where 32-bit code could execute two independent
instructions in parallel. (intel64 with HT is coming ;-)

With shifts, especially generalized shifts, there is really a huge win. But with
store/load, simple logical/arithmetic instructions, and bitscan the win isn't
that huge, considering that a single bitboard is prepared and traversed
sequentially inside a register.
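
To make that concrete, here is a rough sketch (Python only for readability, not
anything taken from Crafty) of what a 32-bit target has to do per 64-bit
bitboard operation. All function names are hypothetical, and the model assumes
a bitboard is held as two 32-bit halves. The AND is just two independent
half-operations, while the shift needs a case split on the count and a carry
across the halves:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def split(bb):
    """Split a 64-bit bitboard into (low, high) 32-bit halves."""
    return bb & MASK32, (bb >> 32) & MASK32

def join(lo, hi):
    """Reassemble a 64-bit bitboard from its 32-bit halves."""
    return (hi << 32) | lo

def and64_via_32(a, b):
    """Logical AND: two half-operations with no dependency between them."""
    a_lo, a_hi = split(a)
    b_lo, b_hi = split(b)
    return join(a_lo & b_lo, a_hi & b_hi)

def shl64_via_32(bb, n):
    """Left shift by n (0..63) using only 32-bit-wide pieces."""
    lo, hi = split(bb)
    n &= 63
    if n == 0:
        return bb & MASK64
    if n >= 32:
        # Everything in the high half falls off; the low half moves up.
        return join(0, (lo << (n - 32)) & MASK32)
    # Bits leaving the top of the low half must be carried into the high half.
    new_hi = ((hi << n) | (lo >> (32 - n))) & MASK32
    new_lo = (lo << n) & MASK32
    return join(new_lo, new_hi)

def bitscan_forward_via_32(bb):
    """Index of the least significant set bit, testing the low half first."""
    lo, hi = split(bb)
    if lo:
        return (lo & -lo).bit_length() - 1
    if hi:
        return 32 + (hi & -hi).bit_length() - 1
    raise ValueError("empty bitboard")

def squares(bb):
    """Traverse one bitboard sequentially, clearing the lowest bit each step."""
    out = []
    while bb:
        out.append(bitscan_forward_via_32(bb))
        bb &= bb - 1
    return out
```

For example, squares(0x8000000000000001) gives [0, 63]. The split version of the
scan costs one extra test per step, whereas the split shift costs a branch plus
several extra operations, which is the asymmetry I mean.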

Cheers,
Gerd




Last modified: Thu, 15 Apr 21 08:11:13 -0700
