Author: Vincent Diepeveen
Date: 15:53:53 02/25/03
On February 25, 2003 at 13:28:43, Matt Taylor wrote:
Hello, are you trying to measure the quantum of a thread in your code?
How does this measure how fast a thread gets signalled *anyhow* with
WaitForSingleObject?
>On February 25, 2003 at 07:44:23, Vincent Diepeveen wrote:
>
>>On February 23, 2003 at 01:38:55, Matt Taylor wrote:
>>
>>DIEP is spinning and locking way way less than Crafty. Note that
>>it is pretty hard to do without spinning under linux.
>>
>>The runqueue fires at 100Hz in linux. So the latency for a thread that doesn't
>>search and normally is doing all kinds of stuff is around 10ms under linux.
>
>Yes, Windows NT is 7.5 ms, and any OS that strives to do better is going to
>waste a lot of time in the scheduler.
>
>Spin waits are nearly useless on a single-processor machine. I don't know what
>you are doing, but a spin wait never occurs in an application on a
>single-processor machine when the code is written correctly. Since the chess
>engine has no extra threads, there will never be another engine thread that has
>the spin lock. The lock will never actually spin -- the thread can always
>acquire the lock because it's always free (unless you have a bug).
>
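
To make the point about uncontended locks concrete, here is a minimal test-and-set spin lock in portable C11. It is a sketch only: the names SpinLock, spin_init, spin_trylock, spin_lock and spin_unlock are made up for illustration and are not taken from Crafty or DIEP. spin_lock returns how many attempts failed before the lock was taken, so the amount of spinning is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration type; not code from Crafty or DIEP. */
typedef struct {
    atomic_flag held;   /* set while some thread owns the lock */
} SpinLock;

static void spin_init(SpinLock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* One attempt: true if we took the lock, false if it was already held. */
static bool spin_trylock(SpinLock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is free; returns the number of failed attempts. */
static unsigned long spin_lock(SpinLock *l)
{
    unsigned long spins = 0;
    while (!spin_trylock(l))
        spins++;
    return spins;
}

static void spin_unlock(SpinLock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With only one thread using the lock, spin_lock on a free lock returns 0: the lock is taken on the first attempt and the loop body never runs, which is the single-processor case described in the quoted paragraph.
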
>>For crafty 10ms latency is too much to wait for a thread to get fired for sure.
>>
>>I guess you didn't try to figure out what the cost of it is, otherwise you would
>>not write such unprofessional comments as the one below.
>
>My comment had nothing to do with Crafty vs. Diep. It had everything to do with
>comments you made a few months ago about how the Xeon 2.8 GHz was not available
>when Bob had one on his desk. I can understand them not being available in
>Europe, but you didn't say that. You kept asserting that they didn't exist.
>
>I'd wager most people who read that thread thought it was pretty funny as I did.
>
>>In DIEP under linux i do not idle either. Of course for me 10ms is too expensive
>>too. Instead i generate a bunch of attacktables, so that an idle process doesn't
>>hammer at the same cache line like crafty does.
>>
>>It speeds DIEP up 20% (in nodes a second) at 32 processors when i do not take
>>the 10ms penalty but go for doing something with the registers without hurting
>>shared cache lines (so just local allocated stuff).
>
>Ok, but that's unnecessary. A spin wait is a short-duration lock. Crafty gets
>the same speedup without having to go do something else while waiting for the
>lock.
>
>>Under windows the runqueue fires at 500Hz, so that's 2ms latency. Still a lot,
>>but a lot less than 10ms latency. Today i go test what the effect of that is for
>>DIEP. I have no dual Xeon available at the moment to test it though. Must do
>>with a dual K7 and dual P3 and see what generating 600 attacktables (about 0.5
>>ms at the dual k7) just in local ram is going to give versus using
>>WaitForSingleObject.
>
>No. On NT it theoretically fires every 7.5 ms (133 Hz). On Win9x, it can fire as
>slowly as every 20 ms (50 Hz). I measured the time on Windows XP Professional just now
>and I got 15 ms. I am inclined to think this is the best XP Professional gets.
>Server versions may use different timeslice values, but I don't have a copy to
>test with.
>
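
The tick rates and latencies quoted back and forth here are related by period = 1 / frequency. A small sketch of that arithmetic in C follows; tick_period_us is a hypothetical helper written for this illustration, not an operating-system call.

```c
#include <assert.h>

/* Period of a periodic scheduler tick in microseconds, rounded to nearest. */
static long tick_period_us(long hz)
{
    assert(hz > 0);
    return (1000000L + hz / 2) / hz;
}
```

tick_period_us(100) gives 10000 us (10 ms), tick_period_us(500) gives 2000 us (2 ms), tick_period_us(50) gives 20000 us (20 ms), and tick_period_us(133) gives 7519 us, which is roughly the 7.5 ms quoted for NT.
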
>Code follows at the end of this message, please cut it when replying. Oh -- and
>I recommend -never- programming like that. It's not bad for 20 minutes of work
>including some debugging and a fix for SMP, but it can do really nasty things to
>your system such as not being able to get into task manager to terminate it...
>
>Too bad Windows's scheduler isn't fair.
>
>>So letting threads idle instead of letting them spin is a completely
>>pathetic idea for realtime environments.
><snip>
>
>Realtime has nothing to do with it. Spin locks can be used in real-time
>programs. The idea behind a spin lock is that it is a -short- wait, probably
>shorter than the time required to transition into kernel mode. Spin locks are
>used all over SMP kernels, particularly in drivers which are as close to
>real-time as the PC architecture usually comes.
>
>In a single processor system, it is a dumb idea as you pointed out, but I don't
>think that's news to Bob, and that's not news to me. I haven't even been
>programming for 20 years, and he's been doing parallel research for that long.
>
>-Matt
>
>>>>Did you make the necessary changes to spinlocks and spinwaits???
>>>
>>>Sorry, can't resist a good laugh!
>>>
>>>"No, they're not out yet!"
>>>
>>>:-)
>>>
>>>-Matt
>
><-- cut here -->
>#include <windows.h>
>#include <stdio.h>
>#include <conio.h>
>
>// Builds with 32-bit (x86) MSVC only: uses __int64 and inline _asm.
>typedef unsigned __int64 uint64;
>
>#define MAX_SPINNERS 64 // upper bound on spinner threads (one per CPU)
>
>DWORD WINAPI IdleThread(LPVOID lpParam);
>
>int main(void)
>{
>    uint64 clkspeed, dclocks, dbound, freq, dtime;
>    SYSTEM_INFO sysInfo;
>    HANDLE hThread[MAX_SPINNERS];
>    DWORD dwTID[MAX_SPINNERS];
>    DWORD nThreads, i;
>
>    QueryPerformanceFrequency((LARGE_INTEGER *) &freq);
>
>    // Calibrate the TSC: count cycles (dclocks) and performance-counter
>    // ticks (dtime) across a Sleep(1000).
>    _asm
>    {
>        lea eax, dtime
>        push eax                 // argument for the second QueryPerformanceCounter
>        push 1000                // argument for Sleep
>        push eax                 // argument for the first QueryPerformanceCounter
>        call DWORD PTR [QueryPerformanceCounter]
>        rdtsc
>        mov esi, eax
>        mov edi, edx             // edi:esi = TSC before the sleep
>        call DWORD PTR [Sleep]
>        rdtsc
>        sub eax, esi
>        sbb edx, edi             // edx:eax = cycles spent sleeping
>        mov esi, DWORD PTR [dtime]
>        mov edi, DWORD PTR [dtime+4]
>        mov DWORD PTR [dclocks], eax
>        mov DWORD PTR [dclocks+4], edx
>        call DWORD PTR [QueryPerformanceCounter]
>        sub DWORD PTR [dtime], esi
>        sbb DWORD PTR [dtime+4], edi   // dtime = counter ticks elapsed
>    }
>
>    clkspeed = (uint64)((double) dclocks * (double) freq / (double) dtime);
>    dbound = clkspeed / 1000;    // cycles per millisecond
>
>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>
>    // One spinning thread per processor, never indexing past the arrays.
>    GetSystemInfo(&sysInfo);
>    nThreads = sysInfo.dwNumberOfProcessors;
>    if(nThreads > MAX_SPINNERS)
>        nThreads = MAX_SPINNERS;
>    for(i = 0; i < nThreads; i++)
>        hThread[i] = CreateThread(NULL, 4096, IdleThread, NULL, 0, &dwTID[i]);
>
>    while(!_kbhit())
>    {
>        _asm
>        {
>            push 0                   // argument for the second Sleep(0)
>            push 0                   // argument for the first Sleep(0)
>
>            call DWORD PTR [Sleep]   // yield, so timing starts when we run again
>
>            rdtsc
>            mov DWORD PTR [dclocks], eax
>            mov DWORD PTR [dclocks+4], edx   // dclocks = start timestamp
>
>            call DWORD PTR [Sleep]   // yield to the spinner threads
>
>        TimeSliceLoop:
>            pause
>            rdtsc
>            sub eax, DWORD PTR [dclocks]
>            sbb edx, DWORD PTR [dclocks+4]   // edx:eax = cycles since start
>            cmp edx, DWORD PTR [dbound+4]
>            ja TimeSliceElapsed
>            jb TimeSliceLoop
>            cmp eax, DWORD PTR [dbound]
>            jna TimeSliceLoop                // spin until more than 1 ms has passed
>
>        TimeSliceElapsed:
>            mov DWORD PTR [dclocks], eax
>            mov DWORD PTR [dclocks+4], edx   // dclocks = elapsed cycles
>        }
>
>        printf("Timeslice was: %d msec\n", (int)((dclocks * 1000) / clkspeed));
>    }
>
>    // The spinners never return on their own, so they are killed here.
>    for(i = 0; i < nThreads; i++)
>        TerminateThread(hThread[i], 0);
>
>    return 0;
>}
>
>DWORD WINAPI IdleThread(LPVOID lpParam)
>{
>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>    while(1);                    // burn the whole timeslice, every time
>
>    return 0;
>}
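
The arithmetic around the two _asm blocks can be pulled out as plain C and checked in isolation. This is a sketch under the assumption that dclocks is a TSC cycle count, dtime is the number of QueryPerformanceCounter ticks over the same interval, and freq is the value from QueryPerformanceFrequency; the function names cycles_per_second and cycles_to_msec are made up here.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated TSC rate: cycles counted, scaled by counter ticks per second
   over counter ticks elapsed (same formula as clkspeed in the quoted code). */
static uint64_t cycles_per_second(uint64_t dclocks, uint64_t freq, uint64_t dtime)
{
    assert(dtime > 0);
    return (uint64_t)((double)dclocks * (double)freq / (double)dtime);
}

/* Convert an elapsed cycle count to whole milliseconds (truncating). */
static uint64_t cycles_to_msec(uint64_t cycles, uint64_t clkspeed)
{
    assert(clkspeed > 0);
    return cycles * 1000u / clkspeed;
}
```

For example, 2,000,000,000 cycles counted over 1,000,000 ticks of a 1 MHz counter is a 2 GHz clock, and 30,000,000 elapsed cycles at that rate converts to 15 msec.
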