Computer Chess Club Archives


Subject: Re: Odd hyperthreading behavior

Author: Robert Hyatt

Date: 06:48:40 10/06/03


On October 06, 2003 at 06:07:09, Tom Kerrigan wrote:

>On October 05, 2003 at 14:56:38, Vincent Diepeveen wrote:
>
>>On October 05, 2003 at 14:45:00, Robert Hyatt wrote:
>>
>>>On October 05, 2003 at 13:44:15, Vincent Diepeveen wrote:
>>>
>>>>On October 04, 2003 at 23:44:03, Robert Hyatt wrote:
>>>>
>>>>>On October 04, 2003 at 21:09:23, Vincent Diepeveen wrote:
>>>>>
>>>>>>On October 04, 2003 at 21:00:34, Tom Kerrigan wrote:
>>>>>>
>>>>>>>I had the chance to run my program on a dual P4 Xeon (with hyperthreading).
>>>>>>
>>>>>>Which OS, and which version and release number of the OS?
>>>>>>
>>>>>>That's pretty crucial.
>>>>>>
>>>>>>>First off, there have been some involved arguments about the design and
>>>>>>>performance of hyperthreading on this board in the past. I'd like to settle one
>>>>>>>argument, namely that single threaded programs do not slow down when
>>>>>>>hyperthreading is on. Actually, my program did slow down by 1.3% but I think
>>>>>>>this is marginal and easily attributed to the scheduler, not hyperthreading.
>>>>>>>
>>>>>>>The odd part is that hyperthreading DOES slow down my program when running 2
>>>>>>>threads. With HT off, my program searches 90% more NPS with a 2nd thread. With
>>>>>>>HT on, it only searches 53% more NPS. The idle time reported by each thread is
>>>>>>>low and the nodes are split evenly, so it seems both processors are slowed down
>>>>>>>equally. What must be happening is that HT is activated some (or all?) of the
>>>>>>>time while searching but I have no idea what might be activating it.
>>>>>>>
>>>>>>>Also odd is that HT seems to be decreasing the efficiency of the search. With HT
>>>>>>>off, my program's time-to-ply is 64% faster with 2 threads but with HT on, it's
>>>>>>>only 21% faster. The time-to-ply:NPS ratios are 0.86 and 0.79 respectively.
>>>>>>>
>>>>>>>Running 4 threads with HT on results in a 15% NPS/6% time-to-ply speedup over 2
>>>>>>>threads.
>>>>>>>
>>>>>>>In other words, there's no contest between running 2 threads (HT off) vs.
>>>>>>>running 4 threads (HT on). The former wins hands down for my program.
>>>>>>>
>>>>>>>-Tom
>>>>>>
>>>>>>So your program searches in parallel nowadays, and we are talking about a
>>>>>>chess program here?
>>>>>>
>>>>>>That doesn't change the fact that it is not easy to profit from HT.
>>>>>>
>>>>>>Basically, HT only seems to work well on Intel test machines.
>>>>>>
>>>>>>Those do HT a lot better than non-test machines.
>>>>>>
>>>>>>This is confirmed again at www.aceshardware.com
>>>>>>
>>>>>>A 25% speedup (in nodes per second) for Diep is just too much (single P4 EE
>>>>>>3.4GHz). I bet the production machines we can buy in the shops soon won't show
>>>>>>a 25% speedup on a single-CPU P4 EE 3.4GHz like aceshardware.com has tested.
>>>>>>Anyway, I kept the executable to prove my guess in the future, when the P4 EE
>>>>>>is released or when I can run on a P4 3.2GHz C (also showed a 25% speedup in
>>>>>>NPS thanks to HT for the current Diep version).
>>>>>>
>>>>>>best regards,
>>>>>>vincent
>>>>>
>>>>>
>>>>>Several have run this test with Crafty.  SMT on is 20-30% faster in NPS for
>>>>>my program, on my dual 2.8, which is not a "test machine".  Eugene posted
>>>>>similar numbers for a dual he has.  Others have also reproduced this with
>>>>>no problems.
>>>>
>>>>Not really. All the reports I saw here from non-Hyatt and non-Nalimov
>>>>machines show 10-15% for the same versions of Crafty.
>>>
>>>And 10%-15% is _drastically_ different than 20%, right?
>>>
>>>Learn some math.
>>>
>>>This varies significantly, on the same machine...
>>
>>You tested just 6 positions, so that renders your results pretty useless.
>>
>>The others tested around 30 positions.
>>
>>So even if we still take the average, it's closer to 10% than it is to 30%.
>>
>>Nalimov just said 30% without much proof.
>
>You sound all indignant, like Bob & Eugene are lying, but at the same time it
>seems clear to you that different tests yield different results. You can think
>the tests they ran were not representative but it's stupid to be upset over the
>actual #s they got.
>
>-Tom


The only numbers that matter are the numbers _Vincent_ produces.  All other
numbers are "rude" or "wrong" or "impossible" etc.

Surely you have recognized that, even when he is the _only_ person who
reports such numbers (one example:  Crafty on a dual produces _no_ speedup,
yet everybody who has tested has shown an average speedup of around 1.7,
except for my dual Xeon, which is not producing 2X NPS on a dual, when
it produces 2X with 2 processors and 4X with 4 processors on my quad boxes.)

Arguing with his "proofs" is hopeless...
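
For anyone who wants to check the arithmetic behind the figures Tom quotes above, here is a minimal Python sketch. It is not from any of the posts. It assumes "X% more NPS" and "X% faster time-to-ply" both mean a multiplier of 1 + X/100 over one thread, and that the 4-thread gains (15% NPS, 6% time-to-ply) compound on the 2-thread HT-on numbers. The function and variable names are made up for illustration.

```python
# Percent gains over a single thread, as quoted in Tom's post.
# Assumption: "90% more NPS" means a 1.90x multiplier, and likewise for time-to-ply.

def multiplier(percent_gain):
    """Convert a percent gain into a speedup multiplier."""
    return 1.0 + percent_gain / 100.0

def search_efficiency(nps_gain, ttp_gain):
    """Time-to-ply speedup divided by NPS speedup."""
    return multiplier(ttp_gain) / multiplier(nps_gain)

# 2 threads, HT off: 90% more NPS, 64% faster time-to-ply.
ht_off_2t = search_efficiency(nps_gain=90, ttp_gain=64)

# 2 threads, HT on: 53% more NPS, 21% faster time-to-ply.
ht_on_2t = search_efficiency(nps_gain=53, ttp_gain=21)

# 4 threads, HT on.
# Assumption: the 15% / 6% gains compound on the 2-thread HT-on figures.
nps_4t_ht_on = multiplier(53) * multiplier(15)
ttp_4t_ht_on = multiplier(21) * multiplier(6)

print(f"time-to-ply:NPS ratio, 2 threads HT off: {ht_off_2t:.2f}")
print(f"time-to-ply:NPS ratio, 2 threads HT on:  {ht_on_2t:.2f}")
print(f"NPS speedup over 1 thread, 4 threads HT on: {nps_4t_ht_on:.2f}x "
      f"(2 threads HT off: {multiplier(90):.2f}x)")
print(f"time-to-ply speedup over 1 thread, 4 threads HT on: {ttp_4t_ht_on:.2f}x "
      f"(2 threads HT off: {multiplier(64):.2f}x)")
```

Under those assumptions the two ratios round to the 0.86 and 0.79 Tom reports, and 4 threads with HT on stays below 2 threads with HT off on both measures, which matches his "no contest" remark.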




Last modified: Thu, 15 Apr 21 08:11:13 -0700
