Computer Chess Club Archives




Subject: Re: Parallel algorithms in chess programming

Author: Bas Hamstra

Date: 18:13:20 04/16/01


On April 16, 2001 at 18:45:09, Vincent Diepeveen wrote:

>On April 16, 2001 at 18:15:52, Dieter Buerssner wrote:
>>In a different discussion, Vincent wrote the following:
>>>It is not difficult to implement the form of parallelism used by
>>>Rudolf. It was invented by a Frenchman who couldn't spell a word of
>>>English and who wrote an impossible article for the JICCA (did anyone
>>>proofread it at the time? I'm pretty sure they didn't get his parallel
>>>idea). When I read the article I was pathetically laughing at it,
>>>actually, as I also didn't get the Frenchman's idea. But it appears
>>>that everyone who can make a chess program work under Win2000 can also
>>>get his program working in parallel within an afternoon. Then some
>>>debugging, and a day later it works cool.
>>I'd be very interested in this algorithm that can be implemented in an
>>afternoon :-)
>>Could you elaborate on this?
>Answer is: YES...
>OK, I'll give a small practical description here. For source code to
>implement some features, or for in-depth questions, email me.
>There are several forms of parallelism, and among the shared-memory
>parallelisms a very important distinction is:
>  a) dividing the search over the different processors
>  b) completely independent searches which get sped up by
>     a side effect
>Nearly all papers are about a. Of course it took a Frenchman
>to find a cheapo algorithm for b.
>God knows who proofread that article of the Frenchman's; probably the
>only proofreader was Jaap v/d Herik, and he probably only corrected the
>English spelling, as most likely the Frenchman couldn't spell English
>at all.
>Anyway, luckily Rudolf Huber managed to read it.
>The idea is pretty straightforward. Just start your program, which we
>will call P1, as process 1.
>It is completely the same as normal: no hard changes.
>Secondly, we start a second program P2. This is nearly the exact same
>program as P1, except it can't do I/O. This program is only analysing,
>for example.
>To not let P2 always analyze the root position, we feed it the moves
>made by P1. Understand me well: we are talking about moves made in the
>GAME, not in the search.
>So that's pretty little communication from P1 to P2. It is probably
>also smart to send it the moves P1 expects, and new-game commands.
>But that's about all you need to pass from P1 to P2.
>Now where is the speedup?
>Well, in the description so far there is no speedup at all.
>The speedup comes from a small side effect: both P1 and P2 share the
>same hashtable. So sometimes P1 will be sped up by transposition-table
>cutoffs which P2 has neatly stored for it in the hashtable.
>Even the most deterministic program in the world will be sped up
>bigtime by this, as within a few microseconds the two processes will
>already no longer be searching and evaluating the same positions at
>the exact same time.
>Expected speedup of this algorithm should be around 1.6.
>What do you need to modify in your engine to get it to work?
>  a) let P1 start P2 (initially you might want to use
>     _spawnl(); later, when it works OK, CreateProcess()).
>     Let P2 follow everything P1 is doing, and do not
>     let P2 ask for keyboard input.
>     Even not shipping predicted moves is more than OK: just
>     filling the hashtable is all P2 needs to do.
>  b) create a shared hashtable and lock it before you fill it
>To say more about locking the hashtable and shipping moves to the
>other process: just like the hashtable, you can share another piece
>of memory to do all this.
>A very important thing for shared memory is to know how volatile
>variables work.
>If I have a global variable:
>  int a;
>and process P1 says:
>  a = 1;
>and process P2 says somewhat later:
>  a = 2;
>then the two processes may, for quite some time, see different values
>when the variable is referenced, because each processor can keep its
>own copy in a register.
>If I do, however:
>  volatile int a;
>then the compiler always LOADS the variable from memory.
>So a variable which indicates whether a piece of code is locked
>HAS TO BE volatile:
>  volatile int lock;
>To lock a variable, browse Crafty a bit; somewhere like chess.h or a
>similar file you'll find the MSVC definitions of lock() and unlock().
>Note that the system calls for locking sometimes compile down to the
>same assembly as the macros in chess.h.
>There are some pitfalls you might run into. If you can't figure
>something out, don't HESITATE to ask others who have already
>experienced parallel problems; otherwise solutions might take months!
>When it works, the tips to improve the speedup have to do with the
>hashtable:
>  a) use several probes in the hashtable! I'm using 8 probes,
>     perhaps too many for you, but if it is 1 or 2 now, try 4 probes
>     with this algorithm!
>  b) there is no need to lock the hashtable if you use a small trick
>     as used by Crafty. Note this trick requires that half of the
>     entry is filled with the hashkey, so it will probably cost more
>     bytes per entry, which for me is a major obstacle.
>  c) look for smart approaches to overwriting entries. There are
>     zillions of ways to do it. But don't use just 1 bit; that works
>     awfully for this algorithm during a game. 1 bit to decide
>     overwrites is really very little. Use 10 bits or so, simply
>     store gamedepth+searchdepth, and replace based on that.
>>BTW, in Paderborn Roland Pfister also told me that he knows this from
>>Rudolf Huber, and he even started to explain it to me. Somehow we (or
>>I) got distracted, and I cannot remember the essential things.
>>What I remember is that the time-consuming work of making your
>>search/evaluation routines free of all those global variables is not
>>needed.

- Do you really get 1.6? Because Roland Pfister told me he got 1.2 with this.
- Maybe ETC (enhanced transposition cutoffs) will also help here? At lower
depths, trying to play all moves to get a cutoff?



Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.