Author: Vasik Rajlich
Date: 10:02:17 04/13/04
On April 13, 2004 at 12:27:17, Anthony Cozzie wrote:

>On April 13, 2004 at 11:51:51, Omid David Tabibi wrote:
>
>>On April 13, 2004 at 11:24:22, Vasik Rajlich wrote:
>>
>>>On April 13, 2004 at 09:40:41, Omid David Tabibi wrote:
>>>
>>>>On April 12, 2004 at 14:45:28, Christophe Theron wrote:
>>>>
>>>>>On April 12, 2004 at 07:50:47, Tord Romstad wrote:
>>>>>
>>>>>>On April 12, 2004 at 00:09:48, Christophe Theron wrote:
>>>>>>
>>>>>>>On April 11, 2004 at 13:52:59, Tom Likens wrote:
>>>>>>>
>>>>>>>>On April 10, 2004 at 21:53:17, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On April 10, 2004 at 15:55:17, Tom Likens wrote:
>>>>>>>>>>
>>>>>>>>>>I'm not sure where I come down on bitboard vs. non-bitboard
>>>>>>>>>>architectures. My engine is a bitboard engine, but that doesn't
>>>>>>>>>>necessarily mean that the next one will be bitboard based.
>>>>>>>>>>
>>>>>>>>>>I don't believe, though, that the fact that no bitboarder has topped
>>>>>>>>>>the SSDF list constitutes any kind of proof. My strong suspicion is
>>>>>>>>>>that if all the top commercial programmers converted over to
>>>>>>>>>>bitboards tomorrow (yourself included), *eventually* their new
>>>>>>>>>>engines would again rise to the top of the SSDF. I'm beginning to
>>>>>>>>>>suspect that creating a strong (i.e. world-class) engine involves a
>>>>>>>>>>helluva lot more than the basic data representation; instead it
>>>>>>>>>>involves...
>>>>>>>>>>
>>>>>>>>>>1. 24/7 dedication
>>>>>>>>>>2. A *real* way to measure progress
>>>>>>>>>>3. A selective search strategy that works 99.99999% of the time
>>>>>>>>>>4. Attention to about 2^64 minor details
>>>>>>>>>>5. A failed marriage (okay, maybe this is extreme but you see the point)
>>>>>>>>>>
>>>>>>>>>>regards,
>>>>>>>>>>--tom
>>>>>>>>>
>>>>>>>>>Number 5 (or something close) was the reason why Tiger made such
>>>>>>>>>progress between 1997 and 1999. :)
>>>>>>>>>
>>>>>>>>>Number 2, seriously, is worth spending several months on.
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>This has been my main focus over the past few weeks. It's become
>>>>>>>>readily apparent to me that the improvement slope from here on up is
>>>>>>>>much steeper, and I'd rather not waste my time implementing features
>>>>>>>>that I can't properly test.
>>>>>>>>
>>>>>>>>regards,
>>>>>>>>--tom
>>>>>>>
>>>>>>>That's the secret of real professional chess programmers.
>>>>>>
>>>>>>Of course you don't want to reveal your secrets, but it would be
>>>>>>interesting if you could answer the following question:
>>>>>>
>>>>>>Assume that you make a change to your engine which improves the playing
>>>>>>strength by about 10 Elo points. How many hours of CPU time do you need
>>>>>>before you are sure that the change was an improvement?
>>>>>>
>>>>>>Tord
>>>>>
>>>>>I would say approximately one week, and I would not even be really sure
>>>>>it is an improvement. We are talking about a 1.5% improvement in winning
>>>>>percentage here; it's below the statistical noise of a
>>>>>several-hundred-game match if you want 95% reliability!
>>>>>
>>>>>And unfortunately a 10 Elo point improvement is becoming rare for me.
>>>>>Most of the changes I try make the program weaker, and many changes do
>>>>>not provide any measurable improvement!
>>>>>
>>>>>That's why going without a strong test methodology is totally out of the
>>>>>question if you are serious about chess programming.
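
To put numbers on Christophe's point: under the standard logistic Elo model,
a 10-point edge is an expected score of about 51.4%, which is the 1.5% he
mentions. Here is a back-of-the-envelope sketch in C of how many games that
edge needs to clear 95% confidence (the one-sided z-test and the 0.5
per-game standard deviation are simplifying assumptions of mine; with draws
the true deviation is lower, but the order of magnitude stands):

    /* Rough estimate: games needed before a 10 Elo gain beats noise.
     * Assumes the logistic Elo model and a one-sided z-test; the
     * per-game standard deviation of 0.5 is an assumed worst case. */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double elo   = 10.0;
        double p     = 1.0 / (1.0 + pow(10.0, -elo / 400.0)); /* expected score */
        double delta = p - 0.5;     /* edge over an even match           */
        double sigma = 0.5;         /* per-game std dev (assumption)     */
        double z     = 1.645;       /* one-sided 95% confidence          */
        double games = pow(z * sigma / delta, 2.0);

        printf("expected score %.4f, edge %.2f%%\n", p, 100.0 * delta);
        printf("games needed for 95%%: about %.0f\n", games);
        return 0;
    }

This prints roughly 3300 games, which is exactly why a several-hundred-game
match is still inside the noise.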
>>>>
>>>>Devoting a whole week with 5 computers working 24/7 is a luxury few can
>>>>afford. During the past two years I have developed Falcon from a 2200
>>>>engine to the 2700+ engine it currently is, all on one humble P3 733MHz
>>>>machine.
>>>>
>>>>In order to reach the 2700 level, the search should already be good
>>>>enough. But beyond that level, it is mostly the evaluation that matters.
>>>>Since the Graz WCCC, I have been spending almost all my time working on
>>>>the evaluation function. The work on search has been limited to modifying
>>>>one pruning rule here, one extension there, etc. But again, beyond 2700,
>>>>it is evaluation that matters. And I fully agree with Vincent on that.
>>>>
>>>>It is almost impossible to test a single evaluation change to see whether
>>>>it improved the strength. If you change the evaluation of knight outposts
>>>>by a few centipawns, good luck testing it... In those cases you have to
>>>>rely heavily on your feelings and chess knowledge, and then, after making
>>>>many changes, test them as a whole to see if they improved the strength.
>>>>Just my two cents.
>>>
>>>Omid,
>>>
>>>I'm curious, how many NPS does Falcon do? (Of course, give hardware.)
>>
>>Falcon's NPS is about the same as Shredder's on one processor.
>>
>>>I take it from the above that your search is essentially null-move based
>>>(possibly except near the leaves).
>>
>>Modified verified null-move pruning, plus another custom forward pruning,
>>and various extensions.
>>
>>>I have the theory that there are three viable approaches to making a top
>>>engine:
>>>
>>>1) ultra-high NPS, brute force (i.e. null move, some stuff at & below tips)
>>>2) ultra-high NPS, selective
>>>3) moderate NPS, ultra-selective
>>>
>>>Somehow, moderate NPS brute force doesn't make much sense to me.
>>>
>>>Of course, practice should drive theory, so there might be room in the
>>>above for a #4. :-)
>>
>>Various approaches are possible. Speaking for myself, up to the 2700 level
>>the main strength increase came from improved pruning techniques. But after
>>reaching that level, most of the problems will be with evaluation, not
>>search.
>>
>>I don't think ultra-high NPS with good selectivity is enough to win a WCCC
>>title. What is the point of searching 18 plies just to apply a primitive
>>evaluation function?
>
>That's what Fritz and Junior do. (Except I would substitute "cheap" for
>"primitive".) However, T ~ C^2, where T is the time spent and C is the
>complexity of the eval :)
>
>anthony

I suspect the "eval" guys like Omid & Vincent would argue just the opposite:
T ~ (C^1) * (3^D), where C is the number of eval terms and D is the search
depth. Frans Morsch and Amir Ban would come back with: rating(C, D+1) >
rating(3*C, D). In other words, with an effective branching factor of about
3, one extra ply costs as much as tripling the eval, and they would bet on
the ply.

I've concluded that an expensive eval can only be justified in an engine
which also uses it to guide a very selective search. My Search() now has 34
arguments (27 input, 7 output), and my Eval() has 3 outputs.

Of course, theories are cheap. Let's see who is left standing when the dust
settles. :-)

Vas

>
>>>Vas
>>>
>>>>
>>>>>Even with a good test methodology, chess programming is still an art: in
>>>>>many cases you have to decide with your feelings, because the raw data
>>>>>does not give you a definite answer.
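
An aside for readers who haven't seen it: the "verified null-move pruning"
Omid mentions above is published (Tabibi & Netanyahu, ICGA Journal 2002),
and a minimal sketch of the base idea follows. This is only the published
algorithm, not Falcon's modified version, and the helper functions are
stand-ins for the usual engine plumbing:

    #define INF 1000000
    #define R   3               /* null-move depth reduction */

    /* assumed engine plumbing, not part of the technique itself */
    int  quiesce(int alpha, int beta);
    int  in_check(void);
    void make_null_move(void);
    void unmake_null_move(void);
    int  generate_moves(int *moves);
    void make_move(int m);
    void unmake_move(int m);

    int search(int alpha, int beta, int depth, int verify)
    {
        int moves[256], n, i, value;
        int best = -INF, fail_high = 0;

        if (depth <= 0)
            return quiesce(alpha, beta);

        /* null move: hand the opponent a free tempo */
        if (!in_check()) {
            make_null_move();
            value = -search(-beta, -beta + 1, depth - R - 1, verify);
            unmake_null_move();
            if (value >= beta) {
                if (verify) {
                    depth--;       /* don't trust the cutoff yet: keep  */
                    verify = 0;    /* searching one ply shallower, with */
                    fail_high = 1; /* verification off in this subtree  */
                } else
                    return value;  /* already verified: cut off         */
            }
        }

    research:
        n = generate_moves(moves);
        for (i = 0; i < n; i++) {
            make_move(moves[i]);
            value = -search(-beta, -alpha, depth - 1, verify);
            unmake_move(moves[i]);
            if (value > best) {
                best = value;
                if (best > alpha) alpha = best;
                if (best >= beta) return best;
            }
        }

        /* the null move promised a cutoff but the reduced search
         * disagreed: restore the ply and re-search with verification */
        if (fail_high && best < beta) {
            depth++;
            fail_high = 0;
            verify = 1;
            goto research;
        }
        return best;
    }

The point of the verify flag is that a plain R = 3 null move is too
optimistic in zugzwang-prone positions; the one-ply verification re-search
buys the safety back at a small fraction of the cost.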
>>>>>
>>>>>Now of course there are small improvements that I do not even need to
>>>>>test for a long time: if I find a way to make my program 10% faster
>>>>>without changing the shape of the tree, then all I need to do is run
>>>>>some safety tests that will only look at the number of nodes searched on
>>>>>a large set of positions and compare it to the last stable version.
>>>>>
>>>>>But that does not happen often! :)
>>>>>
>>>>>    Christophe
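
Christophe's safety test is easy to automate: a pure speedup must not change
the shape of the tree, so per-position node counts are a fingerprint of the
search. A sketch of such a check (the file names and the one-count-per-line
format are made up for the example; each build would dump its counts over
the same position suite at a fixed depth):

    #include <stdio.h>

    int main(void)
    {
        FILE *old_f = fopen("nodes_stable.txt", "r");
        FILE *new_f = fopen("nodes_patch.txt", "r");
        if (!old_f || !new_f) { perror("fopen"); return 1; }

        long old_n, new_n, pos = 0, diffs = 0;
        long long old_total = 0, new_total = 0;

        while (fscanf(old_f, "%ld", &old_n) == 1 &&
               fscanf(new_f, "%ld", &new_n) == 1) {
            pos++;
            old_total += old_n;
            new_total += new_n;
            if (old_n != new_n) {   /* tree shape changed here */
                diffs++;
                printf("position %ld: %ld -> %ld nodes\n", pos, old_n, new_n);
            }
        }
        printf("%ld positions, %ld with different trees\n", pos, diffs);
        printf("total nodes: %lld -> %lld\n", old_total, new_total);
        fclose(old_f); fclose(new_f);
        return diffs ? 2 : 0;       /* nonzero exit if the tree changed */
    }

Any position where the counts differ means the "speedup" actually changed
the search, and the change has to go back through the slow strength testing
instead.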