Computer Chess Club Archives



Subject: Re: Dr. Enriques Problem Set

Author: Enrique Irazoqui

Date: 07:36:40 01/02/00



On January 01, 2000 at 18:26:05, Christophe Theron wrote:

>On December 31, 1999 at 07:50:56, Enrique Irazoqui wrote:
>
>>On December 30, 1999 at 22:51:14, John Warfield wrote:
>>
>>>
>>>
>>>  My question is simple curiosity: is it really possible for this so-called
>>>hidden test of Dr. Enrique's to accurately predict how a program will perform on
>>>the SSDF? I find this difficult to believe; there seem to be a lot of variables
>>>to deal with. How would a simple test set predict precisely how Fritz 6 or Tiger
>>>will score? I am open to being educated here. If this test really exists I would
>>>love to get my hands on it. So, Dr. Enrique, if you read this please send me the
>>>test, or let me know when it will be available. Thanks
>>
>>I am open to being educated too. :)
>>
>>This test exists and by now has 133 positions, all tactical, unambiguous, not
>>included before in any test, therefore not cooked. The fact that so far it shows
>>results very similar to the SSDF list came as a complete surprise to me. I don't
>>trust positional tests, and what I wanted to get out of my tactical suite when I
>>started building it was the difference between a tactical test and the SSDF
>>list. I thought that with this I could see the value of non-tactical stuff in a
>>program. After running this test with some 30 programs, I was very, very
>>surprised to see that ratings obtained with a tactical test and comp-comp games
>>are basically the same, at least so far.
>>
>>As I said in other posts, any programmer can come up with a version of his
>>program optimized for tactics, and such a program would do better in a test than
>>in games. But since I test released, commercial programs tuned for real life and
>>not for tests, my test is not being fooled.
>>
>>So far it works, but... I ran this test with Junior 6 and Shredder 4, and in my
>>opinion both programs scored less well than they should have, judging by what I
>>see when they play, and I trust what I see more than any test, including mine. I
>>am extremely curious to see what the ratings of J6 and S4 will be in the SSDF
>>list. If there is a big difference from my test, it will be interesting to know
>>why these two programs are the only ones so far to do better in games than in a
>>tactical test. Maybe, after all, my initial purpose will work and we will be
>>able to see this difference between tactical and non-tactical (call it
>>positional, strategic, whatever, but without a direct impact on the speed of the
>>search). Explaining this will be difficult, at least for me.
>>
>>(I hope this post is not too messy. While writing it I am installing things on
>>the new computer.)
>>
>>I got the following results for the latest programs:
>>
>>              Test               SSDF scale
>>RT             12                   2695
>>T12-dos         0                   2683
>>CM6K          -10                   2673
>>N732          -20                   2663
>>F532          -21                   2662
>>F6a           -22                   2661
>>H732          -32                   2651
>>J6            -53                   2630
>>J5            -58                   2625
>>S4            -69                   2614
>>
>>Enrique
>
>
>I think your test shows something I have believed for a while: positional and
>tactical abilities are not separate entities.
>
>Improving the "positional" skills of a program also improves its "tactical"
>abilities. A program with better positional understanding can also solve
>combinations faster, for various reasons:
>1) it spends less time hesitating between two inferior moves before finding a
>third move (which is the key move)
>2) with better knowledge a program can "sniff" a great combination one or two
>plies deeper (I have seen CM4000 do this rather often)
>
>The opposite is also true: a program that is better at tactics can look like a
>program with superior knowledge. If you play the same program at ply depth N
>against itself at ply depth N+1, the first one looks as if it knew nothing
>about chess. It will be badly beaten, most of the time for what a human player
>would recognize as "positional" reasons. But in fact there is exactly the same
>amount of knowledge in both opponents!
>
>
>However, I'm still surprised that your test is so accurate. I think that's
>because all the top chess programs are very similar in terms of the chess
>knowledge they have. Or because the tradeoff involved in adding new chess
>knowledge leads to a balance between search and knowledge.
>
>So programmers have to break this balance by finding a new concept that goes
>beyond the usual tactical/positional dilemma, which in fact is an ILLUSION.

It is not an illusion for humans, and this may be the source of confusion.
People see a positionally ugly move played by a program and deduce that programs
are stupid. Not so long ago someone posted a position easily understandable by
us but that programs were unable to solve, concluding that these idiotic
programs didn't play chess. The fact that we cannot instantly see a mate in 77
like programs sometimes do, or that we lose to these idiots, didn't seem to
matter, as if chips and neurons had to work the same way. We call "positional"
the knowledge we must resort to in order to make up for our slow search, so
positional play is tactics by other means. Put it this way: if we could
calculate 200 moves in advance, we wouldn't need any positional knowledge. We
search slowly and need knowledge; chips search faster and get a good deal of
knowledge during the search.

Now, it is very natural for humans to label as ugly or stupid things done in a
way strange to us. Then you get "human-like" considerations, one program's
"positional superiority" over another and so on, which seem like fallacies to
me. It is different to say that there are programs that play in a way I enjoy
and programs that don't; this is subjective and not necessarily transferable to
other people. But strength is strength, and it is objective, neurons or chips;
in programs it seems to be directly proportional to the speed of the search.

These are the latest results I got with my test. They keep looking remarkably
similar to the SSDF list. It can't be just random.

             Test            SSDF scale
RT             12               2695
T12-dos         0               2683
CM6K          -10               2673
N732          -20               2663
F532          -21               2662
F6            -22               2661
H732          -32               2651
T1175         -37               2646
N99a          -43               2640
J6            -53               2630
J5            -58               2625
S4            -69               2614
R9            -80               2603
C1701         -98               2585
M8           -117               2566
G6           -124               2559
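
For anyone who wants to see the arithmetic behind the "SSDF scale" column: it is
simply the relative test score added to a 2683 baseline, the rating of T12-dos,
which scores 0 on the test. A minimal Python sketch of that conversion follows
(the labels and numbers are copied from the table above; the function name and
baseline constant are mine, chosen only for illustration):

    # Relative test scores, with T12-dos as the 0 baseline (copied from the table).
    test_scores = {
        "RT": 12, "T12-dos": 0, "CM6K": -10, "N732": -20, "F532": -21,
        "F6": -22, "H732": -32, "T1175": -37, "N99a": -43, "J6": -53,
        "J5": -58, "S4": -69, "R9": -80, "C1701": -98, "M8": -117, "G6": -124,
    }

    BASELINE = 2683  # SSDF rating assigned to the 0-score baseline (T12-dos)

    def to_ssdf_scale(score: int) -> int:
        """Map a relative test score onto the SSDF scale: one test point = one Elo point."""
        return BASELINE + score

    for name, score in test_scores.items():
        print(f"{name:8} {score:5}   {to_ssdf_scale(score)}")

Since one test point maps onto one Elo point, differences in test score read
directly as predicted rating differences, which is what makes the comparison
with the published SSDF list so direct.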

Enrique

>    Christophe


