Computer Chess Club Archives



Subject: Re: Dr. Enriques Problem Set

Author: Christophe Theron

Date: 12:52:31 01/02/00



On January 02, 2000 at 10:36:40, Enrique Irazoqui wrote:

>On January 01, 2000 at 18:26:05, Christophe Theron wrote:
>
>>On December 31, 1999 at 07:50:56, Enrique Irazoqui wrote:
>>
>>>On December 30, 1999 at 22:51:14, John Warfield wrote:
>>>
>>>>
>>>>
>>>>  My question is simple curiosity: is it really possible for this so-called
>>>>hidden test of Dr. Enrique's to accurately predict how a program will perform
>>>>on the SSDF? I find this difficult to believe; there seem to be a lot of
>>>>variables to deal with. How would a simple test set predict precisely how
>>>>Fritz 6 or Tiger will score? I am open to being educated here. If this test
>>>>really exists, I would love to get my hands on it. So, Dr. Enrique, if you
>>>>read this, please send me the test, or let me know when it will be
>>>>available. Thanks.
>>>
>>>I am open to being educated too. :)
>>>
>>>This test exists and by now has 133 positions, all tactical, unambiguous, and
>>>not included in any previous test, therefore not cooked. The fact that so far
>>>it shows results very similar to the SSDF list came as a complete surprise to
>>>me. I don't trust positional tests, and what I wanted to get out of my
>>>tactical suite when I started building it was the difference between a
>>>tactical test and the SSDF list. I thought that with this I could see the
>>>value of the non-tactical stuff in a program. After running this test with
>>>some 30 programs, I was very, very surprised to see that the ratings obtained
>>>with a tactical test and with comp-comp games are basically the same, at
>>>least so far.
>>>
>>>As I said in other posts, any programmer can come up with a version of his
>>>program optimized for tactics, and such a program would do better in a test
>>>than in games. But since I test released, commercial programs tuned for real
>>>life and not for tests, my test is not being fooled.
>>>
>>>So far it works, but... I ran this test with Junior 6 and Shredder 4, and in
>>>my opinion both programs scored less well than they should have, according to
>>>what I see when they play, and I trust what I see better than any test,
>>>including mine. I am extremely curious to see what the ratings of J6 and S4
>>>will be in the SSDF list. In case there is a big difference with my test, it
>>>will be interesting to know why these two programs are the only ones so far
>>>to do better in games than in a tactical test. Maybe, after all, my initial
>>>purpose will work out and we will be able to see this difference between
>>>tactical and non-tactical (call it positional, strategic, whatever, but
>>>without a direct impact on the speed-up of the search). Explaining it will be
>>>difficult, at least for me.
>>>
>>>(I hope this post is not too messy. While writing it I am installing things
>>>on the new computer.)
>>>
>>>I got the following results from the latest programs:
>>>
>>>              Test               SSDF scale
>>>RT             12                   2695
>>>T12-dos         0                   2683
>>>CM6K          -10                   2673
>>>N732          -20                   2663
>>>F532          -21                   2662
>>>F6a           -22                   2661
>>>H732          -32                   2651
>>>J6            -53                   2630
>>>J5            -58                   2625
>>>S4            -69                   2614
>>>
>>>Enrique
>>
>>
>>I think your test shows something I have believed for a while: positional
>>and tactical abilities are not separate entities.
>>
>>Improving the "positional" skills of a program also improves its "tactical"
>>abilities. A program with better positional understanding can also solve
>>combinations faster, for various reasons:
>>1) it spends less time hesitating between two inferior moves before finding a
>>third move (which is the key move)
>>2) with better knowledge a program can "sniff" a great combination one or two
>>plies deeper (I have seen CM4000 do this rather often)
>>
>>The opposite is also true: a program that is better at tactics can look like a
>>program with superior knowledge. If you play the same program at ply depth N
>>against itself at ply depth N+1, the first one looks as if it knew nothing
>>about chess. It will be badly beaten, most of the time for what a human player
>>would recognize as "positional" reasons. But in fact there is exactly the same
>>amount of knowledge in both opponents!
>>
>>
>>However, I'm still surprised that your test is so accurate. I think that's
>>because all the top chess programs are very similar in terms of the chess
>>knowledge they have, or because the tradeoff involved in adding new chess
>>knowledge leads to a balance between search and knowledge.
>>
>>So programmers have to break this balance by finding a new concept that goes
>>beyond the usual tactical/positional dilemma, which in fact is an ILLUSION.
>
>It is not an illusion for humans, and this may be the source of the confusion.
>People see a positionally ugly move played by a program and deduce that
>programs are stupid. Not so long ago someone posted a position easily
>understandable by us but that programs were unable to solve, concluding that
>these idiotic programs didn't play chess. The fact that we cannot instantly
>see a mate in 77 like programs do at times, or that we lose to these idiots,
>didn't seem to bother anyone, as if chips and neurons had to work the same
>way. We call "positional" the knowledge we must resort to in order to avoid
>our slow search, so positional play is tactics by other means. Put it this
>way: if we could calculate 200 moves in advance, we wouldn't need any
>positional knowledge. We search slowly and we need knowledge; chips search
>faster and they get a good deal of knowledge during the search. Now, it is
>very natural for humans to label as ugly or stupid things done in a way that
>is strange to us. Then you get "human-like" considerations, a program's
>"positional superiority" over other programs and so on, which seem fallacies
>to me. It is a different thing to say that there are programs that play in a
>way I enjoy and programs that don't; that is subjective and not necessarily
>transferable to other people. But strength is strength, objective whether in
>neurons or chips, and in programs it seems to be directly proportional to the
>speed of search.
>
>These are the latest results I got with my test. They keep looking remarkably
>similar to the SSDF list. It can't be just random.
>
>             Test            SSDF scale
>RT             12               2695
>T12-dos         0               2683
>CM6K          -10               2673
>N732          -20               2663
>F532          -21               2662
>F6            -22               2661
>H732          -32               2651
>T1175         -37               2646
>N99a          -43               2640
>J6            -53               2630
>J5            -58               2625
>S4            -69               2614
>R9            -80               2603
>C1701         -98               2585
>M8           -117               2566
>G6           -124               2559
>
>Enrique
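The SSDF-scale column in the tables above is just a fixed linear shift of the test column: every entry equals the test differential plus 2683, the SSDF rating of the zero point (T12-dos). A minimal sketch of that conversion, assuming only what the tables show (the function and names are my illustration, not Enrique's actual procedure):

```python
# Sketch: map a program's test-score differential (relative to the
# T12-dos baseline, which sits at 0) onto the SSDF scale by adding
# the baseline's SSDF rating. The offset 2683 is read directly from
# the tables in the post; to_ssdf_scale() is a hypothetical helper.

BASELINE_SSDF = 2683  # SSDF rating of the zero point (T12-dos)

def to_ssdf_scale(test_differential):
    """Convert a test-score differential into an SSDF-scale rating."""
    return BASELINE_SSDF + test_differential

# A few rows from the table, as (program, differential) pairs.
results = {"RT": 12, "CM6K": -10, "J6": -53, "G6": -124}

for program, diff in results.items():
    print(f"{program:>5}: {to_ssdf_scale(diff)}")
```

Checking a couple of rows: RT at +12 gives 2695 and G6 at -124 gives 2559, matching the table.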


Remarkable, indeed. It is a very good thing that you are publishing it before
the next SSDF list, so people will be convinced of the accuracy of your test,
as I was convinced before the last SSDF list was published.

We will have at least one new program in the next list, Junior 6, and I just
can't wait to see if its SSDF rating matches yours.

If it is so, it's rather disappointing for Amir. But other testers say that
Junior 6 is probably 40 Elo points above Junior 5, so the comparison with your
rating is going to be interesting.

I believe that you are right, and the differences between your list and the SSDF
list could help to discover which programs are superior to others in terms of
knowledge.

Hiarcs was supposed to have a lot of knowledge (just because it is a "slow"
searcher, the usual fairytale), but your list tends to show that this program
does not have much more knowledge than the others, or that this knowledge does
not give it any superiority.

I don't know if it is so, but your list at least provides some experimental
data. That's interesting.


    Christophe




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.