Computer Chess Club Archives

Subject: Re: Is SPEC a bad test organisation according to Hyatt?

Author: Robert Hyatt
Date: 19:18:21 03/01/04
On March 01, 2004 at 16:53:51, Gerd Isenberg wrote:

>On March 01, 2004 at 16:24:16, Robert Hyatt wrote:
>
>>On March 01, 2004 at 15:15:11, Vincent Diepeveen wrote:
>>
>>>On March 01, 2004 at 14:24:50, Robert Hyatt wrote:
>>>
>>>>On March 01, 2004 at 14:20:41, Robert Hyatt wrote:
>>>>
>>>>>On March 01, 2004 at 13:59:25, Eugene Nalimov wrote:
>>>>>
>>>>>>On March 01, 2004 at 13:49:38, Eugene Nalimov wrote:
>>>>>>
>>>>>>>On March 01, 2004 at 12:05:17, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>>>On February 29, 2004 at 23:38:31, Robert Hyatt wrote:
>>>>>>>>
>>>>>>>>[snip]
>>>>>>>>
>>>>>>>>Do you consider the test results produced for SPEC INVALID and INCORRECT?
>>>>>>>>
>>>>>>>>YES or NO?
>>>>>>>>
>>>>>>>>[bla bla removed]
>>>>>>>
>>>>>>>Have you stopped drinking vodka every morning?
>>>>>>>
>>>>>>>Please answer only YES or NO.
>>>>>>>
>>>>>>>[bla bla removed]
>>>>>>
>>>>>>So, my previous post pointed out that there are questions you cannot answer
>>>>>>with "YES or NO".
>>>>>>
>>>>>>And here is *official* SPEC data for 1.3GHz K7 and 1.5GHz Itanium2:
>>>>>>
>>>>>>http://www.spec.org/cpu2000/results/res2001q4/cpu2000-20011008-01018.html
>>>>>>http://www.spec.org/cpu2000/results/res2004q1/cpu2000-20040126-02775.html
>>>>>>
>>>>>>Thanks,
>>>>>>Eugene
>>>>>
>>>>>
>>>>>Please do not confuse discussions with Vincent by supplying real data.  Things
>>>>>stay on a more equal footing if you just make up stuff and post it here.
>>>>>
>>>>><sarcasm off>
>>>>>
>>>>>:)
>>>>
>>>>
>>>>For those that didn't look at the data, the 1.5ghz K7 compared to the 1.5ghz
>>>>Itanium shows the Itanium running about 50% faster.  I.e., the K7 took 127
>>>>seconds to run the test, and the Itanium took 80.
>>>>
>>>>Why 1.5ghz K7?  Because Vincent was talking about "clock for clock" and Eugene
>>>>chose to supply real data rather than barking up a hollow tree...
>>>
>>>Latest Itanium, 1.5GHz with 6MB L3, using a compiler from 2004.
>>>    base score : 1241
>>>Note this is the HP compiler, which hardly anyone uses. No one in government
>>>does. They all use the much slower Intel compiler. The government's
>>>supercomputers aren't HP ones. HP isn't delivering big enough systems.
>>>
>>>But even then. Let's compare this 6-instructions-per-cycle Itanium2 with Crafty
>>>on a 32-bit K7 that does 3 instructions per cycle at most:
>>
>>Flap flap flap flap.  Flappety flap.  Flap.  Flap.  flap-flap-flap.
>>
>>No matter how much hand-flapping you do, you made the original statement.
>>Eugene supplied data that showed that clock for clock, the Itanium was 50%
>>faster with Crafty.  Nothing more, nothing less.  No flappety-flap either.
>>
>>
>>
>>
>>
>>>
>>>http://www.spec.org/osg/cpu2000/results/res2003q2/cpu2000-20030505-02154.html
>>>K7 at 2.2Ghz getting: 1324
>>>
>>>So after concluding that the Itanium is a hell of a lot slower than the K7, we
>>>can look at the IPC.
>>
>>
>>flappety-flap.
>>
>>Who cares.  You said something that was simply shown to be wrong.
>>
>>
>>>
>>>The 6-instructions-per-clock Itanium2 at 64 bits delivers:
>>>  1241/1324 * 2.2GHz/1.5GHz = 1.37, i.e. 37% faster per clock
>>>
>>>So years of compiler work still haven't improved much on my statement of a few
>>>years ago that the 4-instructions-per-clock 1GHz 21264 delivered 33%.
>>>
>>>So the move from 32 bits to 64 bits can not have contributed more than a few %
>>>of speed to crafty.
>>>
>>>Best regards,
>>>Vincent
>>
>>"can not have contributed -> flappety flap."
>>
>>One of these days, I will have access to an opteron where I can do a 32 bit and
>>64 bit compile with _everything_ else constant.  Then we will _really_ know what
>>32 -> 64 bit gives.  It will be more than "a few %".
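
The two figures argued over above, the 127 vs. 80 second run times and the 1241 vs. 1324 scores at different clocks, each reduce to one ratio. A minimal sketch in C for rechecking them, assuming higher scores and lower times are better; the helper names are hypothetical and come from neither SPEC nor Crafty:

```c
#include <stdio.h>

/* Per-clock ratio of two benchmark scores (higher score = faster).
   Hypothetical helper for illustration only. */
double per_clock_ratio(double score_a, double ghz_a,
                       double score_b, double ghz_b)
{
    return (score_a / ghz_a) / (score_b / ghz_b);
}

/* Speedup derived from two run times (lower time = faster). */
double time_speedup(double slow_seconds, double fast_seconds)
{
    return slow_seconds / fast_seconds;
}

/* Print both figures using the numbers quoted in this thread. */
void print_thread_figures(void)
{
    printf("per-clock ratio: %.3f\n",
           per_clock_ratio(1241.0, 1.5, 1324.0, 2.2));
    printf("time speedup:    %.3f\n",
           time_speedup(127.0, 80.0));
}
```

Calling print_thread_figures() from main gives roughly 1.37 for the score-per-clock ratio and roughly 1.59 for the run-time ratio, so the two posters are quoting different measurements rather than disagreeing on arithmetic.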
>
>I hope so ;-)
>
>OTOH, how do you distinguish the gain from using additional registers, even for
>32-bit ints, which may dramatically decrease stack bandwidth, from the gain of
>pure 64-bit processing? Do you have an idea, by profiling or estimation, of the
>ratio between the average number of executed 32-bit and 64-bit instructions
>inside Crafty's search and eval?

For your first question, it is answerable, but I'm not going to answer it.  It
would take some time to twiddle with gcc to make it produce 64 bit code without
using the extra 8 registers.  Someone might want to try it, if their interest is
high enough, but not me.  I have fiddled with GCC before; it is not "pretty"
code.

Raw search does not involve much 64 bit stuff.  But move generation does, as do
the evaluation, SEE, etc...

I have no idea what the precise ratio is, of course...  Although again someone
_could_ produce some assembly from the C code and look.  If it were an
interesting question I would do so, but it isn't worth the amount of work it
would take, IMHO.
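
For anyone who does want to look, a small bitboard routine is enough to see the difference in generated code. The following is a minimal sketch and not Crafty's code: the Bitboard typedef, the a1 = bit 0 square layout, and both function names are assumptions made for illustration.

```c
#include <stdint.h>

/* One bit per square; bit 0 = a1, bit 63 = h8 (assumed layout). */
typedef uint64_t Bitboard;

/* Squares white pawns can reach with a single push:
   shift every pawn up one rank, then drop occupied targets. */
Bitboard white_pawn_pushes(Bitboard pawns, Bitboard occupied)
{
    return (pawns << 8) & ~occupied;
}

/* Count set bits by clearing the lowest set bit each iteration. */
int popcount64(Bitboard b)
{
    int n = 0;
    while (b) {
        b &= b - 1;
        n++;
    }
    return n;
}
```

Compiling this file with gcc -S, once with -m32 and once with -m64 on an x86 target, writes the assembly for each mode, and the 64-bit operations that get split into register pairs in the 32-bit build can then be counted by hand.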



>
>With some dependent bitboard operations you may introduce more register stalls
>and probably a lower density of use of all execution resources in parallel within
>one processor, where 32-bit code could do two independent instructions in parallel.
>(intel64 with HT is coming ;-)
>
>With shifts, especially generalized ones, there is really a huge win. But with
>store/load, simple logical/arithmetical instructions, and bitscan the win
>isn't that huge, considering sequentially preparing and traversing one single
>bitboard inside a register.
>
>Cheers,
>Gerd
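
The point about shifts can be made concrete by writing out what a 64-bit left shift costs when only 32-bit operations are available. This is a hedged sketch, taking "generalized" to mean a shift by a variable count, which is an assumption; Split64 and the three helpers are hypothetical names, and a real 32-bit compiler may emit a different sequence.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit machine. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} Split64;

Split64 split64(uint64_t x)
{
    Split64 s = { (uint32_t)x, (uint32_t)(x >> 32) };
    return s;
}

uint64_t join64(Split64 s)
{
    return ((uint64_t)s.hi << 32) | s.lo;
}

/* Left shift by n (0..63) using only 32-bit operations. */
Split64 shl_split(Split64 v, unsigned n)
{
    Split64 r;
    assert(n < 64);
    if (n == 0) {
        r = v;                                   /* nothing moves */
    } else if (n < 32) {
        r.hi = (v.hi << n) | (v.lo >> (32 - n)); /* bits carried lo -> hi */
        r.lo = v.lo << n;
    } else {
        r.hi = v.lo << (n - 32);                 /* low half lands in high half */
        r.lo = 0;
    }
    return r;
}
```

A variable count needs the branches (or equivalent masking) plus two shifts and an OR, where a 64-bit register needs a single shift; a plain AND, OR, or store simply doubles to two independent 32-bit operations, which fits the smaller win described above for those.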
Last modified: Thu, 15 Apr 21 08:11:13 -0700