Author: Robert Hyatt
Date: 12:30:25 02/26/03
On February 25, 2003 at 16:23:05, Vincent Diepeveen wrote:
>On February 25, 2003 at 13:28:43, Matt Taylor wrote:
>
>I asked an M$ kernel team member and he told me newer NT kernels have 2ms latency
>to wake up a process. If you measure 7.5ms, that surprises me quite a bit.
>I was told it was 2ms for the latest NT kernels.
Vincent, you need to read before asking questions. "2ms to wake up a process" is a
_far_ different thing than what you suggested earlier. That is a measure of how long it
takes to start executing a process once it has been flagged as "ready". It includes the
time to suspend the current (presumably lower-priority) process and then context-switch
to the new process.
In chess, this is meaningless, because the processor in question is _idle_. That is
why I use spinlocks and spinwaits: I have _zero_ latency, which is what I want...
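To make the idle-processor point concrete, here is a minimal sketch of a spin lock and a spin wait using C11 atomics. Every name in it (demo_spinlock, demo_spin_lock, demo_spin_wait and so on) is made up for illustration; it is not Crafty's actual code, only the general technique. The waiter never calls into the kernel, so the scheduler's wake-up time does not enter into it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; an illustration, not Crafty's code. */
typedef struct {
    atomic_bool held;
} demo_spinlock;

static void demo_spin_init(demo_spinlock *l) {
    atomic_init(&l->held, false);
}

/* Try once; returns true if the lock was free and is now ours. */
static bool demo_spin_trylock(demo_spinlock *l) {
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Acquire by looping in user mode until the flag is ours. No kernel
   call is made, so no scheduler wake-up latency is involved. */
static void demo_spin_lock(demo_spinlock *l) {
    while (atomic_exchange_explicit(&l->held, true, memory_order_acquire)) {
        /* Read-only inner loop: poll with plain loads between attempts. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed)) {
        }
    }
}

static void demo_spin_unlock(demo_spinlock *l) {
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}

/* Spin wait: an otherwise idle thread polls a flag until work is posted. */
static void demo_spin_wait(atomic_bool *work_ready) {
    while (!atomic_load_explicit(work_ready, memory_order_acquire)) {
    }
}
```

The price is that the waiting processor burns cycles the whole time, which only makes sense when that processor has nothing else to run.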
>
>My own tests show that the scheduler from windows at NT is about 2-3 times
>faster than the latency that it gets under linux. Of course it is possible that
>it ain't 10ms under windows but like 21, i didn't test absolute speeds. i just
>tested relative speeds ;)
Right. Tested how? I'm sure it was an _accurate_ test...
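For reference, one way to make such a test reproducible is to time how late a sleeping thread actually resumes. The following is only a sketch under assumptions: it uses the POSIX calls nanosleep and clock_gettime, the demo_ names are hypothetical, and it measures sleep overshoot on whatever machine runs it, not either of the tests discussed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static long long demo_now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for request_ns and return how much later than requested the
   thread resumed, in nanoseconds. */
static long long demo_wake_overshoot_ns(long long request_ns) {
    assert(request_ns >= 0);
    struct timespec req = {
        .tv_sec = (time_t)(request_ns / 1000000000LL),
        .tv_nsec = (long)(request_ns % 1000000000LL)
    };
    long long start = demo_now_ns();
    nanosleep(&req, NULL);
    return demo_now_ns() - start - request_ns;
}

/* Largest overshoot seen over a number of sleeps; never below zero. */
static long long demo_worst_overshoot_ns(long long request_ns, int samples) {
    long long worst = 0;
    for (int i = 0; i < samples; i++) {
        long long overshoot = demo_wake_overshoot_ns(request_ns);
        if (overshoot > worst)
            worst = overshoot;
    }
    return worst;
}
```

Reporting the absolute overshoot, the requested sleep, and the number of samples is what lets someone else check the numbers.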
>
>>On February 25, 2003 at 07:44:23, Vincent Diepeveen wrote:
>>
>>>On February 23, 2003 at 01:38:55, Matt Taylor wrote:
>>>
>>>DIEP is spinning and locking way way less than Crafty. Note that
>>>it is pretty hard to do without spinning under linux.
>>>
>>>The runqueue fires at 100Hz in linux. So the latency for a thread that doesn't
>>>search and normally is doing all kind of stuff is around 10ms under linux.
>>
>>Yes, Windows NT is 7.5 ms, and any OS that strives to do better is going to
>>waste a lot of time in the scheduler.
>>
>>Spin waits are nearly useless on a single-processor machine. I don't know what
>>you are doing, but a spin wait never occurs in an application on a
>>single-processor machine when the code is written correctly. Since the chess
>>engine has no extra threads, there will never be another engine thread that has
>>the spin lock. The lock will never actually spin -- the thread can always
>>acquire the lock because it's always free (unless you have a bug).
>>
>>>For crafty 10ms latency is too much to wait for a thread to get fired for sure.
>>>
>>>I guess you didn't try to figure out what the cost of it is, otherwise you would
>>>not write such unprofessional comments like below.
>>
>>My comment had nothing to do with Crafty vs. Diep. It had everything to do with
>>comments you made a few months ago about how the Xeon 2.8 GHz was not available
>>when Bob had one on his desk. I can understand them not being available in
>>Europe, but you didn't say that. You kept asserting that they didn't exist.
>>
>>I'd wager most people who read that thread thought it was pretty funny as I did.
>>
>>>In DIEP under linux i do not idle either. Of course for me 10ms is too expensive
>>>too. Instead i generate a bunch of attacktables, so an idle process doesn't
>>>hammer at the same cache line like crafty does.
>>>
>>>It speeds DIEP up 20% (in nodes a second) at 32 processors when i do not take
>>>the 10ms penalty but go for doing something with the registers without hurting
>>>shared cache lines (so just local allocated stuff).
>>
>>Ok, but that's unnecessary. A spin wait is a short-duration lock. Crafty gets
>>the same speedup without having to go do something else while waiting for the
>>lock.
>>
>>>Under windows the runqueue fires at 500Hz, so that's 2ms latency. Still a lot,
>>>but a lot less than 10ms latency. Today i go test what the effect of that is for
>>>DIEP. I have no dual Xeon to my avail at the moment to test it though. Must do
>>>with a dual K7 and dual P3 and see what generating 600 attacktables (about 0.5
>>>ms at the dual k7) just in local ram is going to give versus using
>>>WaitForSingleObject.
>>
>>No. On NT it theoretically fires every 7.5 ms (133 Hz). On Win9x, it can fire as
>>slow as 20 ms (50 Hz). I measured the time on Windows XP Professional just now
>>and I got 15 ms. I am inclined to think this is the best XP Professional gets.
>>Server versions may use different timeslice values, but I don't have a copy to
>>test with.
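The tick figures traded above are all the same conversion between a tick rate and the interval between ticks. A small sketch of that arithmetic, with hypothetical function names:

```c
#include <assert.h>

/* Interval between scheduler ticks in milliseconds, given the rate in Hz. */
static double demo_tick_period_ms(double tick_hz) {
    assert(tick_hz > 0.0);
    return 1000.0 / tick_hz;
}

/* Inverse: the tick rate in Hz implied by a measured interval in ms. */
static double demo_tick_rate_hz(double period_ms) {
    assert(period_ms > 0.0);
    return 1000.0 / period_ms;
}
```

With these, 100 Hz is 10 ms, 500 Hz is 2 ms, 50 Hz is 20 ms, and 133 Hz is about 7.52 ms, which is the 7.5 ms quoted for NT. The 15 ms measured on XP Professional would correspond to roughly 67 Hz.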
>
>I do not know whether he meant SERVER version or PROFESSIONAL version for the
>2ms wake up time.
>
>>Code follows at the end of this message, please cut it when replying. Oh -- and
>>I recommend -never- programming like that. It's not bad for 20 minutes of work
>>including some debugging and a fix for SMP, but it can do really nasty things to
>>your system such as not being able to get into task manager to terminate it...
>>
>>Too bad Windows's scheduler isn't fair.
>>
>>>So for processes that let threads idle instead of letting them spin, that is a
>>>complete pathetic idea for realtime environments.
>><snip>
>>
>>Realtime has nothing to do with it. Spin locks can be used in real-time
>>programs. The idea behind a spin lock is that it is a -short- wait, probably
>>shorter than the time required to transition into kernel mode. Spin locks are
>
>Anything that needs kernel functions to let your process search on is bad simply
>nowadays. Kernels really are outdated in some ways.
>
>>used all over SMP kernels, particularly in drivers which are as close to
>>real-time as the PC architecture usually comes.
>>
>>In a single processor system, it is a dumb idea as you pointed out, but I don't
>>think that's news to Bob, and that's not news to me. I haven't even been
>>programming for 20 years, and he's been doing parallel research for that long.
>
>In fact, on supercomputers it is far dumber to let stuff idle than on single-cpu
>systems. Of course you use up less 'testing cpu clock ticks time', or whatever
>they call it. But you are simply slower.
>
>20% slower at 32 processors is a lot... ...chessprograms split a lot each
>second.
>
>>-Matt
>>
>>>>>Did you make the necessary changes to spinlocks and spinwaits???
>>>>
>>>>Sorry, can't resist a good laugh!
>>>>
>>>>"No, they're not out yet!"
>>>>
>>>>:-)
>>>>
>>>>-Matt
>>
>><-- cut here -->
>>#include <windows.h>
>>#include <stdio.h>
>>#include <conio.h>
>>
>>typedef unsigned __int64 uint64;
>>
>>DWORD WINAPI IdleThread(LPVOID lpParam);
>>
>>int main(void)
>>{
>>    uint64 clkspeed, dclocks, dbound, freq, dtime;
>>    SYSTEM_INFO sysInfo;
>>    HANDLE hThread[32]; // one slot per CPU; a 32-bit affinity mask allows at most 32
>>    DWORD dwTID[32];
>>
>>    QueryPerformanceFrequency((LARGE_INTEGER *) &freq);
>>
>>    // Calibrate: count TSC clocks and performance-counter ticks across Sleep(1000).
>>    _asm
>>    {
>>        lea eax, dtime
>>        push eax            ; argument for the second QueryPerformanceCounter call
>>        push 1000           ; argument for Sleep
>>        push eax            ; argument for the first QueryPerformanceCounter call
>>        call DWORD PTR [QueryPerformanceCounter]
>>        rdtsc
>>        mov esi, eax
>>        mov edi, edx
>>        call DWORD PTR [Sleep]
>>        rdtsc
>>        sub eax, esi
>>        sbb edx, edi
>>        mov esi, DWORD PTR [dtime]
>>        mov edi, DWORD PTR [dtime+4]
>>        mov DWORD PTR [dclocks], eax
>>        mov DWORD PTR [dclocks+4], edx
>>        call DWORD PTR [QueryPerformanceCounter]
>>        sub DWORD PTR [dtime], esi
>>        sbb DWORD PTR [dtime+4], edi
>>    }
>>
>>    // TSC clocks per second, then clocks per millisecond.
>>    clkspeed = (uint64)((double) dclocks * (double) freq / (double) dtime);
>>    dbound = clkspeed / 1000;
>>
>>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>>
>>    // One busy thread per processor to compete with this thread.
>>    GetSystemInfo(&sysInfo);
>>    for(int i = 0; i < sysInfo.dwNumberOfProcessors; i++)
>>        hThread[i] = CreateThread(NULL, 4096, IdleThread, NULL, 0, &dwTID[i]);
>>
>>    while(!kbhit())
>>    {
>>        _asm
>>        {
>>            push 0          ; arguments for the two Sleep(0) calls below
>>            push 0
>>
>>            call DWORD PTR [Sleep]
>>
>>            rdtsc
>>            mov DWORD PTR [dclocks], eax
>>            mov DWORD PTR [dclocks+4], edx
>>
>>            call DWORD PTR [Sleep]
>>
>>TimeSliceLoop:
>>            pause
>>            rdtsc
>>            sub eax, DWORD PTR [dbound]
>>            sbb edx, DWORD PTR [dbound+4]
>>            sub edx, DWORD PTR [dclocks+4]
>>            ja TimeSliceElapsed
>>            sub eax, DWORD PTR [dclocks]
>>            jna TimeSliceLoop
>>
>>TimeSliceElapsed:
>>            sbb edx, 0
>>            add eax, DWORD PTR [dbound]
>>            adc edx, DWORD PTR [dbound+4]
>>            mov DWORD PTR [dclocks], eax
>>            mov DWORD PTR [dclocks+4], edx
>>        }
>>
>>        printf("Timeslice was: %d msec\n", (int)((dclocks * 1000) / clkspeed));
>>    }
>>
>>    for(int i = 0; i < sysInfo.dwNumberOfProcessors; i++)
>>        TerminateThread(hThread[i], 0);
>>
>>    return 0;
>>}
>>
>>DWORD WINAPI IdleThread(LPVOID lpParam)
>>{
>>    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>>    while(1);
>>
>>    return 0;
>>}
Last modified: Thu, 15 Apr 21 08:11:13 -0700