Computer Chess Club Archives


Subject: bugged code

Author: Vincent Diepeveen

Date: 15:53:53 02/25/03


On February 25, 2003 at 13:28:43, Matt Taylor wrote:

Hello, are you trying to measure the quantum of a thread in your code?

How does this measure how fast a thread gets signalled *anyhow* with
WaitForSingleObject?

>On February 25, 2003 at 07:44:23, Vincent Diepeveen wrote:
>
>>On February 23, 2003 at 01:38:55, Matt Taylor wrote:
>>
>>DIEP is spinning and locking way, way less than Crafty. Note that
>>it is pretty hard to do without spinning under Linux.
>>
>>The runqueue fires at 100 Hz in Linux. So the latency for a thread that doesn't
>>search and normally is doing all kinds of stuff is around 10 ms under Linux.
>
>Yes, Windows NT is 7.5 ms, and any OS that strives to do better is going to
>waste a lot of time in the scheduler.
>
>Spin waits are nearly useless on a single-processor machine. I don't know what
>you are doing, but a spin wait never occurs in an application on a
>single-processor machine when the code is written correctly. Since the chess
>engine has no extra threads, there will never be another engine thread that has
>the spin lock. The lock will never actually spin -- the thread can always
>acquire the lock because it's always free (unless you have a bug).
>
>>For crafty 10ms latency is too much to wait for a thread to get fired for sure.
>>
>>I guess you didn't try to figure out what the cost of it is, otherwise you would
>>not write such unprofessional comments as the ones below.
>
>My comment had nothing to do with Crafty vs. Diep. It had everything to do with
>comments you made a few months ago about how the Xeon 2.8 GHz was not available
>when Bob had one on his desk. I can understand them not being available in
>Europe, but you didn't say that. You kept asserting that they didn't exist.
>
>I'd wager most people who read that thread thought it was pretty funny as I did.
>
>>In DIEP under Linux I do not idle either. Of course for me 10 ms is too expensive
>>too. Instead I generate a bunch of attack tables, so that an idle process doesn't
>>hammer at the same cache line like Crafty does.
>>
>>It speeds DIEP up 20% (in nodes a second) at 32 processors when i do not take
>>the 10ms penalty but go for doing something with the registers without hurting
>>shared cache lines (so just local allocated stuff).
>
>Ok, but that's unnecessary. A spin wait is a short-duration lock. Crafty gets
>the same speedup without having to go do something else while waiting for the
>lock.
>
>>Under Windows the runqueue fires at 500 Hz, so that's 2 ms latency. Still a lot,
>>but a lot less than 10 ms latency. Today I will test what the effect of that is for
>>DIEP. I have no dual Xeon available at the moment to test it though. I must make do
>>with a dual K7 and dual P3 and see what generating 600 attack tables (about 0.5
>>ms on the dual K7) just in local RAM is going to give versus using
>>WaitForSingleObject.
>
>No. On NT it theoretically fires every 7.5 ms (133 Hz). On Win9x, it can fire as
>slow as 20 ms (50 Hz). I measured the time on Windows XP Professional just now
>and I got 15 ms. I am inclined to think this is the best XP Professional gets.
>Server versions may use different timeslice values, but I don't have a copy to
>test with.
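
The rates and latencies traded in this exchange are tied together by period = 1000 / rate. A minimal C sketch of that arithmetic, assuming a fixed tick rate; tick_period_ms is a hypothetical helper name, not something taken from DIEP, Crafty, or any OS API:

```c
#include <assert.h>

/* Scheduler tick period in milliseconds for a tick rate given in Hz.
   Hypothetical helper, for illustration only. */
static double tick_period_ms(double hz)
{
    assert(hz > 0.0);      /* a zero or negative rate has no period */
    return 1000.0 / hz;
}
```

With this, 100 Hz gives 10 ms, 500 Hz gives 2 ms, 50 Hz gives 20 ms, and 133 Hz gives roughly 7.5 ms. A measured 15 ms would correspond to about 67 Hz.
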
>
>Code follows at the end of this message, please cut it when replying. Oh -- and
>I recommend -never- programming like that. It's not bad for 20 minutes of work
>including some debugging and a fix for SMP, but it can do really nasty things to
>your system such as not being able to get into task manager to terminate it...
>
>Too bad Windows's scheduler isn't fair.
>
>>So letting threads idle instead of letting them spin is a completely
>>pathetic idea for realtime environments.
><snip>
>
>Realtime has nothing to do with it. Spin locks can be used in real-time
>programs. The idea behind a spin lock is that it is a -short- wait, probably
>shorter than the time required to transition into kernel mode. Spin locks are
>used all over SMP kernels, particularly in drivers which are as close to
>real-time as the PC architecture usually comes.
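
For readers who have not written one, a spin lock of the kind described here can be sketched in portable C11 roughly as follows. This is only an illustration using atomic_flag; the demo_spin_* names are made up for this sketch, and it is not the lock used by Crafty, DIEP, or any kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical minimal spin lock built on a C11 atomic_flag. */
typedef struct { atomic_flag held; } demo_spinlock;

static void demo_spin_init(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);            /* start unlocked */
}

/* Returns 1 if the lock was free and is now ours, 0 if someone holds it. */
static int demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait in user mode until the lock is acquired; no kernel transition. */
static void demo_spin_lock(demo_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin: the holder is expected to release very soon */
    }
}

static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

When the lock is uncontended, demo_spin_lock succeeds on the first test-and-set and never loops, which is the single-threaded case described earlier in the quoted message.
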
>
>In a single processor system, it is a dumb idea as you pointed out, but I don't
>think that's news to Bob, and that's not news to me. I haven't even been
>programming for 20 years, and he's been doing parallel research for that long.
>
>-Matt
>
>>>>Did you make the necessary changes to spinlocks and spinwaits???
>>>
>>>Sorry, can't resist a good laugh!
>>>
>>>"No, they're not out yet!"
>>>
>>>:-)
>>>
>>>-Matt
>
><-- cut here -->
>#include <windows.h>
>#include <stdio.h>
>#include <conio.h>
>
>typedef unsigned __int64 uint64;
>
>DWORD WINAPI IdleThread(LPVOID lpParam);
>
>int main(void)
>{
>	uint64 clkspeed, dclocks, dbound, freq, dtime;
>	SYSTEM_INFO sysInfo;
>	HANDLE hThread[2]; // adjust for SMP system
>	DWORD dwTID[2];
>
>	QueryPerformanceFrequency((LARGE_INTEGER *) &freq);
>
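>	_asm
>	{
>		lea	eax, dtime	; eax = &dtime
>		push	eax		; argument for the second QueryPerformanceCounter call
>		push	1000		; argument for Sleep: 1000 ms
>		push	eax		; argument for the first QueryPerformanceCounter call
>		call	DWORD PTR [QueryPerformanceCounter]	; dtime = starting count
>		rdtsc			; edx:eax = starting TSC
>		mov	esi, eax
>		mov	edi, edx
>		call	DWORD PTR [Sleep]
>		rdtsc			; edx:eax = ending TSC
>		sub	eax, esi	; edx:eax = TSC ticks spent in Sleep
>		sbb	edx, edi
>		mov	esi, DWORD PTR [dtime]		; edi:esi = starting count
>		mov	edi, DWORD PTR [dtime+4]
>		mov	DWORD PTR [dclocks], eax
>		mov	DWORD PTR [dclocks+4], edx
>		call	DWORD PTR [QueryPerformanceCounter]	; dtime = ending count
>		sub	DWORD PTR [dtime], esi		; dtime = counts spent in Sleep
>		sbb	DWORD PTR [dtime+4], edi
>	}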
>
>	clkspeed = (uint64)((double) dclocks * (double) freq / (double) dtime);
>	dbound = clkspeed / 1000;
>
>	SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>
>	GetSystemInfo(&sysInfo);
>	// hThread[] and dwTID[] hold two entries, so start at most two spinner threads
>	for(DWORD i = 0; i < sysInfo.dwNumberOfProcessors && i < 2; i++)
>		hThread[i] = CreateThread(NULL, 4096, IdleThread, NULL, 0, &dwTID[i]);
>
>	while(!kbhit())
>	{
>		_asm
>		{
>			push	0
>			push	0
>
>			call	DWORD PTR [Sleep]
>
>			rdtsc
>			mov		DWORD PTR [dclocks], eax
>			mov		DWORD PTR [dclocks+4], edx
>
>			call	DWORD PTR [Sleep]
>
>TimeSliceLoop:
>			pause
>			rdtsc
>			; edx:eax = ticks elapsed since the first Sleep(0) returned
>			sub		eax, DWORD PTR [dclocks]
>			sbb		edx, DWORD PTR [dclocks+4]
>			; keep spinning until at least dbound ticks (1 ms) have elapsed
>			mov		esi, eax
>			mov		edi, edx
>			sub		esi, DWORD PTR [dbound]
>			sbb		edi, DWORD PTR [dbound+4]
>			jb		TimeSliceLoop
>
>			; store the elapsed tick count for the printf below
>			mov		DWORD PTR [dclocks], eax
>			mov		DWORD PTR [dclocks+4], edx
>		}
>
>		printf("Timeslice was: %d msec\n", (int)((dclocks * 1000) / clkspeed));
>	}
>
>	for(DWORD i = 0; i < sysInfo.dwNumberOfProcessors && i < 2; i++)
>		TerminateThread(hThread[i], 0);
>
>	return 0;
>}
>
>DWORD WINAPI IdleThread(LPVOID lpParam)
>{
>	SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
>	while(1);
>
>	return 0;
>}
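
The 64-bit tick bookkeeping that the listing does by hand with sub/sbb register pairs can also be written in plain C, which makes the carry between the low and high words harder to get wrong. A hedged sketch, assuming the timestamp counter is read into a uint64_t by some other means; elapsed_ticks and ticks_to_ms are hypothetical names, not part of the posted program:

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two timestamp-counter readings. Unsigned 64-bit
   subtraction propagates the borrow from the low word to the high word,
   which the assembly does manually with sub followed by sbb. */
static uint64_t elapsed_ticks(uint64_t start, uint64_t now)
{
    return now - start;
}

/* Convert a tick count to whole milliseconds given the counter frequency
   in ticks per second. The multiply can overflow for very large counts. */
static uint64_t ticks_to_ms(uint64_t ticks, uint64_t ticks_per_sec)
{
    assert(ticks_per_sec > 0);
    return ticks * 1000u / ticks_per_sec;
}
```

For example, a reading pair that straddles the 32-bit boundary, such as 0xFFFFFFFF and 0x100000001, yields 2 ticks, and 30,000,000 ticks at 2 GHz converts to 15 ms.
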



Last modified: Thu, 15 Apr 21 08:11:13 -0700