Author: Vincent Diepeveen
Date: 06:07:31 02/26/03
On February 25, 2003 at 21:14:06, Matt Taylor wrote:

>On February 25, 2003 at 16:23:05, Vincent Diepeveen wrote:
>
>>On February 25, 2003 at 13:28:43, Matt Taylor wrote:
>>
>>i asked m$ kernel team member and he told me newer NT kernels have 2ms latency
>>to wake up a process. If you measure 7.5ms somehow that surprises me quite some.
>>I was told it was 2ms for latest NT kernels.
>
>7.5 ms is the documented value. I can't remember whether that was hacked out or
>documented in MSDN. The NT source is available for academic purpose if you're
>willing to sign an NDA.

I never sign for NDA's like that.

>>My own tests show that the scheduler from windows at NT is about 2-3 times
>>faster than the latency that it gets under linux. Of course it is possible that
>>it ain't 10ms under windows but like 21, i didn't test absolute speeds. i just
>>tested relative speeds ;)
>
>Again, Windows XP Professional is 15 ms for me.

You test the wrong thing dude.

>>>On February 25, 2003 at 07:44:23, Vincent Diepeveen wrote:
>>>
>>>>On February 23, 2003 at 01:38:55, Matt Taylor wrote:
>>>>
>>>>DIEP is spinning and locking way way less than Crafty. Note that
>>>>it is pretty hard to do without spinning under linux.
>>>>
>>>>The runqueue fires at 100Hz in linux. So the latency for a thread that doesn't
>>>>search and normally is doing all kind of stuff is around 10ms under linux.
>>>
>>>Yes, Windows NT is 7.5 ms, and any OS that strives to do better is going to
>>>waste a lot of time in the scheduler.
>>>
>>>Spin waits are nearly useless on a single-processor machine. I don't know what
>>>you are doing, but a spin wait never occurs in an application on a
>>>single-processor machine when the code is written correctly. Since the chess
>>>engine has no extra threads, there will never be another engine thread that has
>>>the spin lock. The lock will never actually spin -- the thread can always
>>>acquire the lock because it's always free (unless you have a bug).
>>>
>>>>For crafty 10ms latency is too much to wait for a thread to get fired for sure.
>>>>
>>>>I guess you didn't try to figure out what the cost of it is, otherwise you would
>>>>not write such unprofessional comments like below.
>>>
>>>My comment had nothing to do with Crafty vs. Diep. It had everything to do with
>>>comments you made a few months ago about how the Xeon 2.8 GHz was not available
>>>when Bob had one on his desk. I can understand them not being available in
>>>Europe, but you didn't say that. You kept asserting that they didn't exist.
>>>
>>>I'd wager most people who read that thread thought it was pretty funny as I did.
>>>
>>>>In DIEP under linux i do not idle either. Of course for me 10ms is too expensive
>>>>too. Instead i generate a bunch of attacktables instead an idle process doesn't
>>>>hammer at the same cache line like crafty does.
>>>>
>>>>It speeds DIEP up 20% (in nodes a second) at 32 processors when i do not take
>>>>the 10ms penalty but go for doing something with the registers without hurting
>>>>shared cache lines (so just local allocated stuff).
>>>
>>>Ok, but that's unnecessary. A spin wait is a short-duration lock. Crafty gets
>>>the same speedup without having to go do something else while waiting for the
>>>lock.
>>>
>>>>Under windows the runqueue fires at 500Hz, so that's 2ms latency. Still a lot,
>>>>but a lot less than 10ms latency. Today i go test what the effect of that is for
>>>>DIEP. I have no dual Xeon to my avail at the moment to test it though. Must do
>>>>with a dual K7 and dual P3 and see what generating 600 attacktables (about 0.5
>>>>ms at the dual k7) just in local ram is going to give versus using
>>>>WaitForSingleObject.
>>>
>>>No. On NT it theoretically fires every 7.5 ms (133 Hz). On Win9x, it can fire as
>>>slow as 20 ms (50 Hz). I measured the time on Windows XP Professional just now
>>>and I got 15 ms. I am inclined to think this is the best XP Professional gets.
>>>Server versions may use different timeslice values, but I don't have a copy to
>>>test with.
>>
>>I do not know whether he meant SERVER version or PROFESSIONAL version for the
>>2ms wake up time.
>
>2 ms is pretty short. It is possible he meant 2 ms...but I don't think any PC or
>server uses 2 ms timeslices.
>
>>>Code follows at the end of this message, please cut it when replying. Oh -- and
>>>I recommend -never- programming like that. It's not bad for 20 minutes of work
>>>including some debugging and a fix for SMP, but it can do really nasty things to
>>>your system such as not being able to get into task manager to terminate it...
>>>
>>>Too bad Windows's scheduler isn't fair.
>>>
>>>>So for processes that let threads idle instead of letting them spin, that is a
>>>>complete pathetic idea for realtime environments.
>>><snip>
>>>
>>>Realtime has nothing to do with it. Spin locks can be used in real-time
>>>programs. The idea behind a spin lock is that it is a -short- wait, probably
>>>shorter than the time required to transition into kernel mode. Spin locks are
>>
>>Anything that needs kernel functions to let your process search on is bad simply
>>nowadays. Kernels really are outdated in some ways.
>
>Crafty has its own spin lock code. It does not use the OS for it. This is why
>Bob had to modify Crafty for HT. If the OS provided the spin locks, the OS would
>have to be modified, not Crafty.
>
>>>used all over SMP kernels, particularly in drivers which are as close to
>>>real-time as the PC architecture usually comes.
>>>
>>>In a single processor system, it is a dumb idea as you pointed out, but I don't
>>>think that's news to Bob, and that's not news to me. I haven't even been
>>>programming for 20 years, and he's been doing parallel research for that long.
>>
>>In fact in supercomputers it is far dumber to let stuff idle than in single cpu
>>systems. Of course you use up less 'testing cpu clock ticks time'. or whatever
>>they call it. But you are slower simply.
>>
>>20% slower at 32 processors is a lot... ...chessprograms split a lot each
>>second.
><snip>
>
>A few cycles is a penalty gladly paid. If Bob doesn't pay that in Crafty, the OS
>will pay it for him after overhead on -top- of that. Unless you can avoid race
>conditions, you have to employ some sort of synchronization. The spin lock is
>the best tool for this job because it wastes the least amount of time. It's
>protecting data that isn't locked for very long.
>
>-Matt
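
For readers following the spin-wait argument quoted above (a short lock taken entirely in user space, which never actually spins when nobody else holds it), here is a minimal C11 sketch. It is an illustration only and not Crafty's or DIEP's lock code, which is not shown in this post. The type `demo_spinlock` and all `demo_*` functions are hypothetical names, and the spin counter is there only to make the uncontended case visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space spin lock built on a C11 atomic_flag. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

/* One attempt only: returns true if the lock was free and is now ours. */
bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired, with no kernel call involved.
 * Returns how many failed attempts were made before success. */
unsigned long demo_spin_lock(demo_spinlock *l)
{
    unsigned long spins = 0;
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ++spins;
    return spins;
}

/* Release the lock so the next test-and-set succeeds. */
void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Single-threaded walk-through: a free lock is taken on the first try. */
unsigned long demo_uncontended(void)
{
    demo_spinlock l = { ATOMIC_FLAG_INIT };
    unsigned long spins = demo_spin_lock(&l);  /* lock is free: no spinning */
    assert(!demo_spin_trylock(&l));            /* already held: attempt fails */
    demo_spin_unlock(&l);
    assert(demo_spin_trylock(&l));             /* released: attempt succeeds */
    demo_spin_unlock(&l);
    return spins;
}
```

With a single thread, `demo_uncontended()` reports zero failed attempts, which is the single-processor case described in the quoted text. With several threads on an SMP machine, the loop in `demo_spin_lock` is where a waiter would burn cycles instead of sleeping until the next scheduler tick.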
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.