Author: Mathieu Pagé
Date: 00:39:00 02/21/04
On February 20, 2004 at 14:36:28, Jaime Benito de Valle Ruiz wrote:

>Most people keep using test sets to find out how "good" their engines are doing;
>this gives us a rough idea of their strength. Of course, real games are better
>for this purpose.
>
>Tom Romstad just posted a message asking for opinions regarding a particular
>static evaluation, and many others answered by giving the score given by their
>engines. I find this most interesting:
>Although there is no such thing as a "perfect score" for a position, and the play
>style of an engine strongly depends on this value, I'm sure most here will
>agree that there must be a sensible range of values to be considered reasonable.
>Surely, these upper and lower bounds could be set tighter or wider depending on
>the nature of the position.
>I could be wrong, but I would expect many engines with similar strength to give
>scores within a reasonable range for a full list of static test positions.
>
>If I (or anyone else) provide you with a list of positions, would you be
>interested in providing the static values that your engine gets for each
>position? If your engine can read EPD files, adapting the code to read each
>position and write another file with the static scores should be fairly
>straightforward.
>I'm sure this information could be extremely useful for many to find potential
>flaws in their engines using an easy automatic process.
>
>We could compile one or more of these static tests (similar to EPDs) and suggest
>a range of values for each position based on the ones given by the strongest
>engines.
>
>Example: (with fictitious scores)
>----------------------------------
>
>Test File:
>
>3r2k1/3b2p1/1q2p2p/4P3/3NQ1P1/bP5P/PpP1N3/1K1R4 w - -; id "Test 0001"
>r2k1b1r/2p2ppp/p3q3/2PpN3/Q2Pn3/4B3/PP3PPP/1R2R1K1 b - -; id "Test 0002"
>1r2r2k/1bq3p1/p1p1Bp1p/1p3Q2/3PP3/1PnN2P1/5P1P/R3R1K1 b - -; id "Test 0003"
>......
>
>Output (3 engines):
>
>       Engine A  Engine B  Engine C  Range
>
>0001:  +0.42     +0.12     +0.52     [ 0.12 ,  0.52]
>0002:  +3.00     +2.83     +3.42     [ 2.83 ,  3.42]
>0003:  -1.23     -0.88     -1.24     [-1.24 , -0.88]
>.....
>
>Let me know if you're interested.
>Regards,
>
>  Jaime

I have mixed feelings about this. It would surely help a lot of people (including me) to improve their evaluation functions, but on the other hand I think it would make our engines less original. There are already too many techniques that we all implement the same way as everyone else: everyone does alpha-beta (AB), null move, and so on. I think that if we used this kind of test suite to tweak our evaluation functions, we would end up with even less original engines.

I hope you see what I mean.

Mathieu P.
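For readers who want to try the comparison Jaime describes, here is a minimal sketch of the "Range" column from his example. It assumes each engine's static scores have already been collected into a dictionary keyed by position id. The function names `score_ranges` and `outliers`, the `margin` parameter, and the scores in `mine` are hypothetical and are not part of any engine or of the original proposal.

```python
def score_ranges(scores_by_engine):
    """Return {position_id: (lowest, highest)} over all reference engines."""
    ranges = {}
    for engine_scores in scores_by_engine.values():
        for pos_id, score in engine_scores.items():
            lo, hi = ranges.get(pos_id, (score, score))
            ranges[pos_id] = (min(lo, score), max(hi, score))
    return ranges


def outliers(my_scores, ranges, margin=0.0):
    """List (position_id, score, lo, hi) for scores outside [lo - margin, hi + margin]."""
    flagged = []
    for pos_id, score in sorted(my_scores.items()):
        if pos_id not in ranges:
            continue  # no reference data for this position
        lo, hi = ranges[pos_id]
        if score < lo - margin or score > hi + margin:
            flagged.append((pos_id, score, lo, hi))
    return flagged


# Fictitious scores copied from the example table in the quoted post.
reference = {
    "Engine A": {"0001": 0.42, "0002": 3.00, "0003": -1.23},
    "Engine B": {"0001": 0.12, "0002": 2.83, "0003": -0.88},
    "Engine C": {"0001": 0.52, "0002": 3.42, "0003": -1.24},
}
ranges = score_ranges(reference)

# Hypothetical static scores from an engine under test (pawn units).
mine = {"0001": 0.30, "0002": 1.90, "0003": -1.10}
for pos_id, score, lo, hi in outliers(mine, ranges, margin=0.25):
    print(f"{pos_id}: {score:+.2f} is outside [{lo:+.2f}, {hi:+.2f}]")
```

A nonzero `margin` leaves some room for an engine to disagree with the reference engines before a position is flagged, which goes some way toward the originality concern raised above: only evaluations far outside the range get reported.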
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.