Author: Vincent Diepeveen
Date: 15:45:09 04/16/01
Go up one level in this thread
On April 16, 2001 at 18:15:52, Dieter Buerssner wrote: >In a different discussion, Vincent wrote the following: > >>It is not difficult to implement the form of parallellism as used by >>Rudolf. Invented by a frenchman who couldn't spell a word english and >>who wrote an impossible article for JICCA (did anyone proofread it at >>the time as i'm pretty sure they didn't get his parallel idea?). >> >>At the time when i read the article i was pathetically laughing about it >>actually as i also didn't get the idea of the frenchman. But it appears >>everyone who can make a chessprogram work under win2000 can also get >>within an afternoon his program parallel to work. Then some debugging >>and a day later it works cool. > >I'd be very interested in this algorithm, that can be implemented at an >afternoon :-) > >Could you point elaborate on this. Answer is: YES... Ok i'll give a small practical description here. For source code to implement some features or in depth questions email me. There are several forms of parallellism and from the shared memory parallellisms a very important difference is: a) dividing the search over the different processors b) completely independant searches which get speeded up by a side effect Nearly all papers are about a. Of course it took a frenchman to find a cheapo algorithm for b. God knows who proofread that article of the frenchman, probably the proofreader was only Jaap v/d Herik and he probably only corrected spelling of english as most likely the frenchman couldn't spell english at all. Anyway, Rudolf Huber managed to read it luckily. The idea is pretty straightforward. Just start your program which we will call P1 from process 1. It is completely the same as normal. No hard changes. Secondly we start a second program P2. This is nearly the exact same program as P2 except it can't do i/o. This program is only analysing for example. To not let P2 always analyze the root position we give moves made by P1 to P2. Understand me well we talk about moves made in the GAME, not in the search. So that's pretty little communication from P1 to P2. Probably smart is also to give it expected moves by P1 and newgame commands. But that's about all you need to give from P1 to P2. Now where is the speedup? Well in the description as above so far there is no speedup at all. The speedup comes from a small side effect. Both P1 and P2 share the hashtable! So sometimes P1 will be speeded up by transpositiontable cutoffs which P2 has neatly stored for it in the hashtable. Even the most deterministic program in the world will be speeded up bigtime by this, as after a few microseconds already both processes will not search and evaluate the same positions at the exact same time. Expected speedup of this algorithm should be around 1.6 What do you need to modify to your engine to get it to work? Steps: a) let p1 start P2 (initially you might want to use _spawnl(), later when it works ok CreateProcess() ) Let P2 follow all things P1 is doing and also not let P2 ask for keyboard input. Even not shipping predicted moves is more than ok. just filling hashtable is all what P2 needs to do. b) create a shared hashtable and lock it before you fill it to talk more about locking hashtable and shipping moves to the other process: just like sharing hashtable you can share another piece of memory to do all this. Very important thing for shared memory is to know how volatile variables work. if i have a global variable that is called: int a; And process P1 says: a = 1; And process P2 says some later: a = 2; then it might be that both processes have for quite some time a different value in both processors when it is referenced. If i do however volatile int a; then the compiler always LOADS the variable from memory. So a variable which is indicating whether a piece of code is locked HAS TO BE volatile. volatile int lock; To lock a variable browse crafty a bit and probably somewhere like chess.h or some similar file you'll find definition for msvc for lock() and unlock. Note that sometimes system commands for locking go to the same assembly as in chess.h. There are some pitfalls which you might run in, if you can't figure something out then don's HESITATE to ask others who already experienced parallel problems, otherwise solutions might take months! When it works, tips to improve speedup have to do with hashtable: a) use several probes in the hashtable! I'm using 8 probes, perhaps too much for you, but if it is 1 or 2 now, try 4 probes with this algorithm! b) there is no need to lock the hashtable if you use a small trick as used by crafty. Note this trick requires that half of the entry is filled with hashkey, so it probably will eat more bytes an entry which for me is a major obstacle. c) search for smart overwrite approaches of entries. there are zillions of ways to do it. But don't use just 1 bit, that works awful for this algorithm during a game. 1 bit to overwrite is really very little. Use a bit or 10 and simply store gamedepth+searchdepth or so and replace on that. >BTW. In Paderborn, Roland Pfister also told me, that he knows this from Rudolf >Huber, and he even started to explain it to me. Somhow, we (or me) got >distracted, and I cannot remember the essential things. > >What I remember is, that the time consuming work, of making your >search/evaluation routines free from all those global variables is not needed. > >Regards, >Dieter
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.