Author: Vincent Diepeveen
Date: 15:53:53 02/25/03
On February 25, 2003 at 13:28:43, Matt Taylor wrote:

hello are you trying to measure the quantum of a thread in your code?

how does this measure how fast a thread gets signalled *anyhow* with
WaitForSingleObject?

>On February 25, 2003 at 07:44:23, Vincent Diepeveen wrote:
>
>>On February 23, 2003 at 01:38:55, Matt Taylor wrote:
>>
>>DIEP is spinning and locking way way less than Crafty. Note that
>>it is pretty hard to do without spinning under linux.
>>
>>The runqueue fires at 100Hz in linux. So the latency for a thread that doesn't
>>search and normally is doing all kind of stuff is around 10ms under linux.
>
>Yes, Windows NT is 7.5 ms, and any OS that strives to do better is going to
>waste a lot of time in the scheduler.
>
>Spin waits are nearly useless on a single-processor machine. I don't know what
>you are doing, but a spin wait never occurs in an application on a
>single-processor machine when the code is written correctly. Since the chess
>engine has no extra threads, there will never be another engine thread that has
>the spin lock. The lock will never actually spin -- the thread can always
>acquire the lock because it's always free (unless you have a bug).
>
>>For crafty 10ms latency is too much to wait for a thread to get fired for sure.
>>
>>I guess you didn't try to figure out what the cost of it is, otherwise you would
>>not write such unprofessional comments like below.
>
>My comment had nothing to do with Crafty vs. Diep. It had everything to do with
>comments you made a few months ago about how the Xeon 2.8 GHz was not available
>when Bob had one on his desk. I can understand them not being available in
>Europe, but you didn't say that. You kept asserting that they didn't exist.
>
>I'd wager most people who read that thread thought it was pretty funny as I did.
>
>>In DIEP under linux i do not idle either. Of course for me 10ms is too expensive
>>too. Instead i generate a bunch of attacktables instead an idle process doesn't
>>hammer at the same cache line like crafty does.
>>
>>It speeds DIEP up 20% (in nodes a second) at 32 processors when i do not take
>>the 10ms penalty but go for doing something with the registers without hurting
>>shared cache lines (so just local allocated stuff).
>
>Ok, but that's unnecessary. A spin wait is a short-duration lock. Crafty gets
>the same speedup without having to go do something else while waiting for the
>lock.
>
>>Under windows the runqueue fires at 500Hz, so that's 2ms latency. Still a lot,
>>but a lot less than 10ms latency. Today i go test what the effect of that is for
>>DIEP. I have no dual Xeon to my avail at the moment to test it though. Must do
>>with a dual K7 and dual P3 and see what generating 600 attacktables (about 0.5
>>ms at the dual k7) just in local ram is going to give versus using
>>WaitForSingleObject.
>
>No. On NT it theoretically fires every 7.5 ms (133 Hz). On Win9x, it can fire as
>slow as 20 ms (50 Hz). I measured the time on Windows XP Professional just now
>and I got 15 ms. I am inclined to think this is the best XP Professional gets.
>Server versions may use different timeslice values, but I don't have a copy to
>test with.
>
>Code follows at the end of this message, please cut it when replying. Oh -- and
>I recommend -never- programming like that. It's not bad for 20 minutes of work
>including some debugging and a fix for SMP, but it can do really nasty things to
>your system such as not being able to get into task manager to terminate it...
>
>Too bad Windows's scheduler isn't fair.
>
>>So for processes that let threads idle instead of letting them spin, that is a
>>complete pathetic idea for realtime environments.
><snip>
>
>Realtime has nothing to do with it. Spin locks can be used in real-time
>programs. The idea behind a spin lock is that it is a -short- wait, probably
>shorter than the time required to transition into kernel mode. Spin locks are
>used all over SMP kernels, particularly in drivers which are as close to
>real-time as the PC architecture usually comes.
>
>In a single processor system, it is a dumb idea as you pointed out, but I don't
>think that's news to Bob, and that's not news to me. I haven't even been
>programming for 20 years, and he's been doing parallel research for that long.
>
>-Matt
>
>>>>Did you make the necessary changes to spinlocks and spinwaits???
>>>
>>>Sorry, can't resist a good laugh!
>>>
>>>"No, they're not out yet!"
>>>
>>>:-)
>>>
>>>-Matt
>
><-- cut here -->
>#include <windows.h>
>#include <stdio.h>
>#include <conio.h>
>
>typedef unsigned __int64 uint64;
>
>DWORD WINAPI IdleThread(LPVOID lpParam);
>
>int main(void)
>{
>    uint64 clkspeed, dclocks, dbound, freq, dtime;
>    SYSTEM_INFO sysInfo;
>    HANDLE hThread[2]; // adjust for SMP system
>    DWORD dwTID[2];
>
>    QueryPerformanceFrequency((LARGE_INTEGER *) &freq);
>
>    _asm
>    {
>        lea eax, dtime
>        push eax
>        push 1000
>        push eax
>        call DWORD PTR [QueryPerformanceCounter]
>        rdtsc
>        mov esi, eax
>        mov edi, edx
>        call DWORD PTR [Sleep]
>        rdtsc
>        sub eax, esi
>        sbb edx, edi
>        mov esi, DWORD PTR [dtime]
>        mov edi, DWORD PTR [dtime+4]
>        mov DWORD PTR [dclocks], eax
>        mov DWORD PTR [dclocks+4], edx
>        call DWORD PTR [QueryPerformanceCounter]
>        sub DWORD PTR [dtime], esi
>        sbb DWORD PTR [dtime+4], edi
>    }
>
>    clkspeed = (uint64)((double) dclocks * (double) freq / (double) dtime);
>    dbound = clkspeed / 1000;
>
>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>
>    GetSystemInfo(&sysInfo);
>    for(int i = 0; i < sysInfo.dwNumberOfProcessors; i++)
>        hThread[i] = CreateThread(NULL, 4096, IdleThread, NULL, 0, &dwTID[i]);
>
>    while(!kbhit())
>    {
>        _asm
>        {
>            push 0
>            push 0
>
>            call DWORD PTR [Sleep]
>
>            rdtsc
>            mov DWORD PTR [dclocks], eax
>            mov DWORD PTR [dclocks+4], edx
>
>            call DWORD PTR [Sleep]
>
>TimeSliceLoop:
>            pause
>            rdtsc
>            sub eax, DWORD PTR [dbound]
>            sbb edx, DWORD PTR [dbound+4]
>            sub edx, DWORD PTR [dclocks+4]
>            ja TimeSliceElapsed
>            sub eax, DWORD PTR [dclocks]
>            jna TimeSliceLoop
>
>TimeSliceElapsed:
>            sbb edx, 0
>            add eax, DWORD PTR [dbound]
>            adc edx, DWORD PTR [dbound+4]
>            mov DWORD PTR [dclocks], eax
>            mov DWORD PTR [dclocks+4], edx
>        }
>
>        printf("Timeslice was: %d msec\n", (int)((dclocks * 1000) / clkspeed));
>    }
>
>    for(int i = 0; i < sysInfo.dwNumberOfProcessors; i++)
>        TerminateThread(hThread[i], 0);
>
>    return 0;
>}
>
>DWORD WINAPI IdleThread(LPVOID lpParam)
>{
>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>    while(1);
>
>    return 0;
>}
Last modified: Thu, 15 Apr 21 08:11:13 -0700