Author: Enrique Irazoqui
Date: 15:17:49 12/13/97
On December 13, 1997 at 16:50:22, Don Dailey wrote:

>I have an idea for generating a positional problem set to
>measure our chess programs against. This will take some
>cooperation and should involve many programs.
>
> 1. Start with (n) positions from Grandmaster games and make
>    a note of the move the grandmaster played.
>
> 2. Run each participating program for some length of time (t)
>    on each position. Note the move the program chooses at
>    the end of the allotted time.
>
> 3. Determine which positions there is a "consensus" on. The
>    best move must be agreed on by all programs.
>
> 4. Throw out the "easy" ones. Consider this our problem set.
>
>
>For this to work, t (time) should be pretty high. The rationale
>is that all our programs are of Grandmaster strength, given enough
>time to think. If all the "grandmasters" in our sample (including
>the human who played the game) agree on the best move, the evidence
>is fairly strong that the move is indeed the best move and the noise
>will be quite low.
>
>Since all the programs will be capable of solving every problem,
>the issue will be how quickly the "solutions" are found. A
>scoring system should be determined in advance that we can use
>to rate our programs based on the problem set that is eventually
>chosen.
>
>The easy problems should be filtered away. If every program
>chooses the right move very quickly, we probably should not
>consider this position worthy of including in our problem set.
>The goal is to have a single "best" move, but the move should
>not be trivial.
>
>I have no idea whether this technique will produce a useful
>positional problem set. But I would be willing to prepare
>the initial FEN positions from a set of random positions I
>prepared from master games. These positions are completely
>random and were culled from one of the CDROM databases I have.
>
>Here is what I would need from each participant:
>
> a) The time your program first chose and kept the move that
>    was its final choice.
>    It might also be useful to know
>    if the program "wavered" on earlier iterations: did it
>    change its mind a few times?
>
> b) The program you are using and the hardware you are running
>    on. Probably we should adjust for hardware and choose
>    our run times based on something pretty standard, like a
>    pentium pro 200. The exact time and hardware is not
>    critically important, but it should meet some minimum
>    requirement so as to not degrade the test.
>
>I would like some feedback on this from you guys. Do you think
>it is worth pursuing? Will it produce a useful set? Who is
>interested in participating? Do you have some suggestions or
>improvements?
>
>If enough people want to try this, I have 1000 FEN positions
>with grandmaster moves attached to them. I can run Cilkchess
>through these 1000 positions (we have access to lots of hardware
>here) and post the results of all 1000 positions, along with
>a reduced set reflecting those positions Cilkchess "consents"
>to being in the set. I suspect this will cut the set down a lot.
>
>I will suggest that each problem is run for 30 minutes on each
>machine of at least pentium pro 200 performance (or you can
>adjust for lesser hardware.) I am expecting that after Cilkchess
>performs the first pass, there will only be a fraction of the
>original 1000 positions left and everyone will be willing to
>run 30 minutes on each one.
>
>Once this step is done we may have a useful positional problem
>set. Even if we do not, perhaps we will learn something!
>
>The participants would be the actual programmers whenever possible
>but we can get the data from any source, and I know there are a
>lot of enthusiastic chess program owners who might be interested
>in contributing the test time.
>
>I will wait for feedback before proceeding.
>
>-- Don

I had a very similar idea a few months ago. As a result I have a positional test of 24 problems, all from great grandmaster games and all commented by grandmasters.
I discarded positions whose solutions were not found by any program. The main difference from your proposal is that instead of 30 minutes I used only 5, more or less the average amount of time in games played at 40:2. The test has been posted on CCR under the name "CCR Test", with results on a P200MMX/64. Below I post the global results of some programs and the test set. Please tell me what you think of it.

Rebel 8/9    17
Hiarcs 6     16
Mchess 6     16
Rebel 6      16
Mchess 7     15
Fritz 5      13
Genius 5     12
Shredder 1   11
Fritz 4      10
CM5K         10

24 positions:

2br2k1/5pp1/1p3q1p/2pBpP2/2P1P3/P6P/2Q3P1/5RK1 w - - ; Qa4
r1b2rk1/1p3pp1/pn3n1p/2q1p3/P3P3/2N2N2/BP1RQPPP/3R2K1 w - - ; Qe3
4rbk1/ppp2ppp/2qprn2/8/3QP3/1PN1R3/PBP2PPP/4R1K1 w - - ; h3
r2qk2r/4bppp/p1bp1n2/1p2pP2/4P3/1BNQ4/PPP3PP/R1B2RK1 w qk - ; Bg5
r1bq2k1/4n1b1/p2p2p1/Pp1Pp2p/1N2P3/1NR3PP/1P3QB1/4K3 w - - ; Qb6
r4rk1/ppq1bppp/3pbn2/4p3/4PP2/2N1B3/PPP1B1PP/R3QRK1 w - - ; f5
4r1k1/1p1rq1pp/2p1p3/p1P1Ppb1/P2P4/1PNR1QP1/6KP/3R4 w - - ; Nb1
2rq2rk/1p1bbp1p/p1Nppp2/8/4PP2/2N2B2/PPP3PP/R2Q1R1K b - - ; bxc6
r1bqnrk1/p4ppp/1pnpp3/2p5/2PPP3/P1PBB3/4NPPP/R2Q1RK1 b - - ; Na5
2r3k1/1q2ppbp/p5p1/1p1P4/4P3/3Q3P/P2B1PP1/2R3K1 w - - ; Rc6
1r4k1/p1rqn1p1/Ppn1p2p/1B1pPp2/1P1P1P2/2R1QN2/6PP/2R3K1 w - - ; R1c2
r3r1k1/pp1q1pp1/2bb1nnp/3pp3/8/1P1PNNP1/PBR1PPBP/Q1R3K1 w - - ; d4
1r1r2k1/p4pp1/2bppq1p/2p5/2P5/1P1BP3/P1QR1PPP/3R2K1 b - - ; Qe5
r4rn1/pp3pkp/3q2p1/1b1pN3/3P4/5N2/PP1Q1PPP/2R1R1K1 w - - ; Qa5
2r2rk1/1p1qppbp/1n1n2p1/pN1P4/P7/BP4PP/4QPB1/2R2RK1 w - - ; Bxd6
rn1qk2r/p2pppbp/1p3np1/8/2PN4/2N3P1/PP2PPKP/R1BQ1R2 b qk - ; Qc8
5k2/1p1r1pp1/p1pp2p1/4q3/2P1P1Q1/4R2P/PP3PP1/6K1 b - - ; Ke8
r2r2k1/ppp1qppp/4pn2/8/2P1PP2/1P6/P1B1Q1PP/3R1R1K b - - ; e5
8/4k3/4bpp1/7p/1p2p3/4P1PB/1P2P1KP/8 b - - ; f5
2r1nrk1/p2q1ppp/bp1p4/n1pPp3/P1P1P3/2PBB1N1/4QPPP/R4RK1 w - - ; f4
r2q1rk1/4bppp/p2p4/2pP4/3pP3/3Q4/PP1B1PPP/R3R1K1 w - - ; b4
2r2rk1/1p1bq3/p3p2p/3pPpp1/1P1Q4/P7/2P2PPP/2R1RBK1 b - - ; Bb5
r1b2r1k/pp2q1pp/2p2p2/2p1n2N/4P3/1PNP2QP/1PP2RP1/5RK1 w - - ; Nd1
bn6/1q4n1/1p1p1kp1/2pPp1pp/1PP1P1P1/3N1P1P/4B1K1/2Q2N2 w - - ; h4

Enrique
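
For anyone who wants to run the list above through a program, here is a minimal Python sketch of how the "position ; move" lines could be read and how a global score like the ones above might be counted. It is only an illustration, not the procedure actually used for the CCR Test: the names `Problem`, `parse_problems` and `score`, the shape of the `answers` log (position mapped to the chosen move and the time in seconds at which the program first chose and kept it), and the sample answers are all assumptions. The 300-second default simply mirrors the 5 minutes mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    fen: str        # position as written in the post (no move counters)
    best_move: str  # the move given after the semicolon


def parse_problems(text):
    """Parse lines of the form '<position> ; <move>'."""
    problems = []
    for line in text.splitlines():
        if ";" not in line:
            continue  # skip blank lines and anything that is not a test entry
        fen, move = line.split(";", 1)
        problems.append(Problem(fen.strip(), move.strip()))
    return problems


def score(problems, answers, time_limit=300.0):
    """Count problems whose listed move was chosen within time_limit seconds.

    answers is a hypothetical log: position -> (move, seconds), where seconds
    is the time at which the program first chose and kept that move.
    """
    solved = 0
    for p in problems:
        move, seconds = answers.get(p.fen, (None, float("inf")))
        if move == p.best_move and seconds <= time_limit:
            solved += 1
    return solved


# Two entries copied from the list; the answers below are made up.
sample = """\
2br2k1/5pp1/1p3q1p/2pBpP2/2P1P3/P6P/2Q3P1/5RK1 w - - ; Qa4
2r3k1/1q2ppbp/p5p1/1p1P4/4P3/3Q3P/P2B1PP1/2R3K1 w - - ; Rc6
"""
problems = parse_problems(sample)
answers = {
    problems[0].fen: ("Qa4", 42.0),   # found in time
    problems[1].fen: ("Rc6", 400.0),  # right move, but after the limit
}
print(score(problems, answers))
```

The comparison is a plain string match, so the program's move would have to be written in the same notation as the list (for example `R1c2` rather than `Rc1c2`).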
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.