Author: Robert Hyatt
Date: 10:05:19 11/03/02
On November 03, 2002 at 10:03:57, Vincent Diepeveen wrote:

>Who told you months ago to copy less data for a split, which reduces
>the time that something is locked :)
>

Nobody. I haven't changed what I copy in a split in 5 years now. So I have
_no_ idea what you are talking about. I copy just as much today as I copied
when the first SMP version was released. That is _easy_ to check if you have
old versions. Just check CopyToSMP() and CopyFromSMP() using diff.

You did say that GCP should copy less like I do, and then when I told you I
copy more than you think, you then said I copy too much. Hard to argue when
you take both sides of the argument as I am not going to take the "middle"..

My code works. Therefore... Q.E.D.

>
>On November 02, 2002 at 00:15:02, Robert Hyatt wrote:
>
>>On November 01, 2002 at 13:06:53, Vincent Diepeveen wrote:
>>
>>>On November 01, 2002 at 12:20:14, Robert Hyatt wrote:
>>>
>>>Feel free to ship a version of crafty that doesn't do spinlock
>>>or whatever you want to modify. I'll extensively test it for you
>>>at all P4s i can get my hands on...
>>
>>Feel free to download the version on the ftp machine. I have _already_ played
>>a bit with hyper-threading as I told you. We have two dual 2.2ghz P4's here
>>running XP. Hyper-threading _does_ work. While the O/S thinks it has two
>>cpus, it is not twice as fast. But it is definitely significantly faster,
>>which is the important point.
>>
>>I have seen hyper-threading produce results of 1.3X to 1.5X faster, depending
>>on what is run.
>>
>>Crafty can do better once the spinlocks are modified. But it doesn't do
>>_worse_ now... It does run faster with SMT on.
>>
>>>
>>>I would be really amazed if you get even 0.1% faster in nodes a
>>>second...
>>>
>>>...of course it must be a fair compare in contradiction to what
>>>intel shows.
>>>They do next comparison
>>>
>>> a) some feature called 'SMT' in the bios turned on
>>>    - just running 2 threads then
>>> b) turning it off
>>>    - also running 2 threads at it
>>>
>>>Like everyone who is not so naive we know that you also need
>>>to do next test:
>>>
>>> a) some feature called 'SMT' in the bios turned on
>>>    - just running 1 thread eating all system time
>>> b) turning it off
>>>    - also running 1 thread eating all system time
>>>
>>>There shouldn't be a speed difference between a and b of course.
>>>
>>>That verification step is missing.
>>>
>>>>On November 01, 2002 at 11:56:56, Vincent Diepeveen wrote:
>>>>
>>>>>On November 01, 2002 at 10:41:25, Robert Hyatt wrote:
>>>>>
>>>>>>On October 31, 2002 at 10:53:07, Vincent Diepeveen wrote:
>>>>>>
>>>>>>>On October 30, 2002 at 06:59:21, Terje Vagle wrote:
>>>>>>>
>>>>>>>>Hi all,
>>>>>>>>
>>>>>>>>The new cpu from intel will have a new function called
>>>>>>>>hyper-threading.
>>>>>>>>
>>>>>>>>This will make the operating system able to recognize the cpu as if it was
>>>>>>>>2 cpu's.
>>>>>>>>
>>>>>>>>Could the programs with smp-support make use of this?
>>>>>>>>
>>>>>>>>Regards,
>>>>>>>>
>>>>>>>>Terje Vagle
>>>>>>>
>>>>>>>No chessprograms cannot make use of that feature at all. It is sad but
>>>>>>>the truth. Hyperthreading is a cool thing for the future but the P4
>>>>>>>processor is a too small processor to allow hyperthreading from getting
>>>>>>>to work.
>>>>>>>
>>>>>>>Apart from that a major problem is that even if we have a great processor
>>>>>>>which really allows hyperthreading to be effective, that the threads
>>>>>>>run at unequal speeds.
>>>>>>>
>>>>>>>Hyper threading is supposed to work for 2 threads where 1 is a fast
>>>>>>>thread and the other is some kind of background thread eating little cpu
>>>>>>>time.
>>>>>>>
>>>>>>>In chessprograms having a second search thread which just runs now and
>>>>>>>then in the background is simply impossible to use.
>>>>>>
>>>>>>It is not impossible at all. The only problem was spinlocks and Eugene
>>>>>>posted a link to an Intel document that describes how to solve this problem.
>>>>>>
>>>>>>Given that solution, hyper-threading will work just fine since spinlocks
>>>>>>won't confuse the processor...
>>>>>>
>>>>>>It won't be 2x faster, but it will certainly be faster if you can run a second
>>>>>>thread while the first is blocked on a memory access...
>>>>>
>>>>>No it won't be 2 times faster. suppose you start crafty with 2 threads.
>>>>
>>>>I didn't say it would be _two_ times faster.
>>>>
>>>>I said it would be _faster_.
>>>>
>>>>And it will.
>>>>
>>>>>
>>>>>thread A starts search and has 1.e4,e5
>>>>>thread B starts and continues with 1.d4
>>>>>
>>>>>now when A is ready, B will still be busy with its own search space,
>>>>>and delay thread A time and again.
>>>>>
>>>>>that'll slow down incredible.
>>>>>
>>>>
>>>>Except that isn't how it works. The threads co-execute in an intermingled
>>>>way as one blocks for a memory read the other fills in the gap. It is
>>>>something like having 1.5 cpus... and it does work.
>>>>
>>>>>You'll be a lot slower than searching with a single thread!
>>>>>
>>>>
>>>>Not very likely...
>>>>
>>>>>Also note that there is just 8 KB data cache and just like
>>>>>40 registers to rename variables. then another 12KB tracecache.
>>>>>
>>>>>*both* threads are eating from that 8 KB and 12KB tracecache,
>>>>>that is an additional problem they 'overlook'.
>>>>>
>>>>
>>>>That is a problem on an SMP machine. But _both_ threads are executing
>>>>the _same_ code anyway... so that isn't a problem. At least for me.
>>>>
>>>>For you it is different because you are not using "shared everything" in
>>>>lightweight threads, so your results might be different. But all my threads
>>>>share the exact same executable instruction code...
>>>>
>>>>>As you can see from graphs.
>>>>>Usually SMT brings zero speedup.
>>>>
>>>>I have seen numbers around 1.3 up to 1.5... which is not to be
>>>>ignored.
>>>>
>>>>>
>>>>>Try crafty on a 2.4Ghz single cpu P4 or P4-Xeon please (northwood) or
>>>>>above. Not on a slower P4 or P4-Xeon. Of course we go for the latest
>>>>>hardware...
>>>>
>>>>Why does it matter? Hyper-Threading is Hyper-Threading, unless you are
>>>>going to start that memory speed nonsense. And, in fact, the faster the
>>>>processor vs memory speed, the better hyperthreading should perform. Just
>>>>like the greater the difference in processor speed vs disk speed, the better
>>>>normal operating systems do at running multiple processes.
>>>>
>>>>>
>>>>>Just try it like i tried at Jan Louwman's 2.4Ghz P4s and 2.53Ghz P4s.
>>>>
>>>>That says it all. "Like I tried it". As if that is a comprehensive and
>>>>exhaustive testing?
>>>>
>>>>>
>>>>>I can't measure *any* speedup *anyhow*.
>>>>>
>>>>
>>>>Why am I not surprised???
>>>>
>>>>>Also theoretically i see major problems for the P4 chip even if you
>>>>>have software which could theoretically profit.
>>>>
>>>>"theoretically".
>>>>
>>>>:)
>>>>
>>>>:)
>>>>
>>>>:)
>>>>
>>>>Theory from someone that doesn't know theory.
>>>>
>>>>:)
>>>>
>>>>:)
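
For readers following the spinlock point in the quoted thread: the Intel document is not reproduced here, so the following is only a sketch of what the fix presumably looks like. It assumes the recommendation is the usual one for hyper-threaded processors, namely to put a PAUSE instruction inside the spin-wait loop so the waiting logical CPU stops competing with its sibling for execution resources. The names spin_t, spin_lock() and cpu_relax() are hypothetical and are not taken from Crafty, and __builtin_ia32_pause() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* CPU hint for spin-wait loops. On x86 (GCC/Clang assumed) this emits the
   PAUSE instruction; on other targets it compiles to nothing. */
static inline void cpu_relax(void) {
#if defined(__i386__) || defined(__x86_64__)
  __builtin_ia32_pause();
#endif
}

/* Hypothetical lock type for illustration only; not Crafty's lock. */
typedef struct {
  atomic_int held; /* 0 = free, 1 = taken */
} spin_t;

static inline void spin_init(spin_t *l) { atomic_init(&l->held, 0); }

/* One attempt to take the lock; returns true on success. */
static inline bool spin_trylock(spin_t *l) {
  return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

static inline void spin_lock(spin_t *l) {
  while (!spin_trylock(l)) {
    /* Wait with plain loads instead of repeated atomic exchanges, and
       pause on every pass so the sibling logical CPU can make progress. */
    while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
      cpu_relax();
  }
}

static inline void spin_unlock(spin_t *l) {
  /* Sanity check: only a held lock may be released. */
  assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
  atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

A caller would wrap the shared-data update (for example the copy done at a split) between spin_lock() and spin_unlock(). The inner read-only loop is a design choice, not something the thread states: it keeps the waiting thread from hammering the cache line with writes while it spins.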
Last modified: Thu, 15 Apr 21 08:11:13 -0700