Computer Chess Club Archives


Subject: Re: Rybka 1.01 Beta 13 b First impressions

Author: Albert Silver

Date: 19:29:36 01/29/06




>>>>>I am 99.9% sure that ultrasolid is either better than solid in all of the Betas,
>>>>>or worse than solid in all of the Betas.
>>>>
>>>>Perhaps a 4th opponent is necessary. In my testing, UltraSolid scores a bit
>>>>better with Fritz 9, about the same against Hiarcs 10, and worse against Fruit
>>>>2.2. Worse enough that it kills whatever it gains from Fritz 9 and then some, so
>>>>that after the 150 games of the 3 matches it actually does a fraction worse.
>>>>This has happened twice though. Maybe I'll try one of its most difficult
>>>>opponents in my previous testing: Gambit Fruit 4bx, and see what happens.
>>>>
>
>This is tricky. I would be surprised (but not shocked) if such an effect
>existed.
>
>However, the thing to keep in mind is that by looking for such patterns, you
>effectively "dilute" your data.
>
>If you play enough games, and entertain enough hypotheses, then some of them
>will be true by accident.

Well, look at it this way, I tested different settings, including the default,
and got this:

The default settings against Hiarcs 10 (Hypermodern), Fritz 9, and Fruit 2.2 scored:

1   Rybka 1.01 Beta 12 32-bit  2900  +22/-11/=17 61.00%   30.5/50
2   Hiarcs 10                  2850  +11/-22/=17 39.00%   19.5/50

1   Rybka 1.01 Beta 12 32-bit  2900  +19/-15/=16 54.00%   27.0/50
2   Fritz 9                    2820  +15/-19/=16 46.00%   23.0/50

1   Rybka 1.01 Beta 12 32-bit  2900  +25/-10/=15 65.00%   32.5/50
2   Fruit 2.2                  2850  +10/-25/=15 35.00%   17.5/50

Total: 90 / 150

I then tested:

Improving Position     = Slightly Optimistic,
Deteriorating Position = Much More Pessimistic

1   Rybka 1.01 Beta 12 32-bit  2900  +27/-7/=16  70.00%   35.0/50
2   Hiarcs 10                  2850  +7/-27/=16  30.00%   15.0/50

1   Rybka 1.01 Beta 12 32-bit  2900  +20/-15/=15 55.00%   27.5/50
2   Fritz 9                    2820  +15/-20/=15 45.00%   22.5/50

1   Rybka 1.01 Beta 12 32-bit  2900  +25/-11/=14 64.00%   32.0/50
2   Fruit 2.2                  2850  +11/-25/=14 36.00%   18.0/50

Total: 94.5 / 150

A better score but only in one match, even if by quite a bit. Is it a fluke, due
in part to the fast time control?
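
One rough way to put a number on the "fluke" question is to compare the gap between the two totals with the statistical noise expected in 150 games. The sketch below is only an illustration: it assumes every game is an independent draw from a fixed win/draw/loss distribution (which ignores opening repeats and opponent differences), and the helper name match_stats is made up for this example. The W/L/D counts are summed from the tables above.

```python
import math

def match_stats(wins, losses, draws):
    """Return (points, standard error of the points total) for a set of games.

    Assumption: games are independent with identical outcome probabilities.
    """
    n = wins + losses + draws
    points = wins + 0.5 * draws
    mean = points / n
    # Per-game variance of a result worth 1, 0.5 or 0 points.
    var = (wins * 1.0 + draws * 0.25) / n - mean ** 2
    return points, math.sqrt(var * n)

# W/L/D summed over the three 50-game matches in each run.
default = match_stats(22 + 19 + 25, 11 + 15 + 10, 17 + 16 + 15)
modified = match_stats(27 + 20 + 25, 7 + 15 + 11, 16 + 15 + 14)

diff = modified[0] - default[0]
# Standard error of the difference between two independent totals.
se_diff = math.hypot(default[1], modified[1])

print(f"default:    {default[0]:.1f} +/- {default[1]:.1f} points")
print(f"modified:   {modified[0]:.1f} +/- {modified[1]:.1f} points")
print(f"difference: {diff:.1f} points = {diff / se_diff:.2f} standard errors")
```

Under those assumptions the 4.5-point gap comes out at less than one standard error of the difference, so on these 150 games alone it cannot be told apart from chance.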

I then added UltraSolid to the above, noting that I had tested it once
before and it had done better against Fritz 9, but worse (than default) against
Fruit 2.2.

Improving Position     = Slightly Optimistic,
Deteriorating Position = Much More Pessimistic


1   Rybka 1.01 Beta 13 32-bit  2900  +26/-11/=13 65.00%   32.5/50
2   Hiarcs 10                  2850  +11/-26/=13 35.00%   17.5/50

1   Rybka 1.01 Beta 13 32-bit  2900  +26/-14/=10 62.00%   31.0/50
2   Fritz 9                    2820  +14/-26/=10 38.00%   19.0/50

1   Rybka 1.01 Beta 13 32-bit  2900  +25/-15/=10 60.00%   30.0/50
2   Fruit 2.2                  2850  +15/-25/=10 40.00%   20.0/50

Total: 93.5 / 150

As you can see, it is unclear whether UltraSolid is simply worse against Fruit 2.2
but better overall, or whether it just isn't better. Since Toga and Fruit are such
close kin, I tend to think that Hurd's results confirm the lack of
suitability of UltraSolid against Fruit and Co. The question still remains as to
whether or not it is a somewhat isolated phenomenon. In fact, even my changes
only really showed up in one match, so they too bear further investigation.

One thing is clear, and that is that testing against one opponent is risky, no
matter how strong. Even as strong as Rybka.
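
To illustrate how easily a single match can stand out by accident, here is a small Monte Carlo sketch. It plays two *identical* settings against three opponents, 50 games each, and counts how often at least one of the three match pairs differs by 4.5 points or more (the size of the Hiarcs 10 gap above, 30.5 vs 35.0). The win and draw rates are assumed round numbers close to the default totals, not measured values, and the function names are invented for this example.

```python
import random

def play_match(rng, games=50, p_win=0.44, p_draw=0.32):
    """Points scored in one simulated match; the rates are assumptions."""
    points = 0.0
    for _ in range(games):
        r = rng.random()
        if r < p_win:
            points += 1.0
        elif r < p_win + p_draw:
            points += 0.5
    return points

def chance_of_spurious_gap(trials=5000, opponents=3, gap=4.5, seed=1):
    """Fraction of trials in which two identical settings differ by at
    least `gap` points in at least one of the matches."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(abs(play_match(rng) - play_match(rng)) >= gap
               for _ in range(opponents)):
            hits += 1
    return hits / trials

print(f"chance of at least one such gap: {chance_of_spurious_gap():.2f}")
```

The exact figure depends on the assumed rates and on the seed, but it gives a feel for how often a "one match only" improvement can appear when nothing has actually changed.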

>Of course, once you identify a particularly promising hypothesis, you can test
>it further and get an honest result.

That is the whole idea of course! :-)

                        Albert

>
>Anyway, it's late so I probably am not making much sense :)
>
>Vas
>
>>>>
>>>>>
>>>Ok I'll do the same. I hear what Vasik says and he is the author. I bet you can
>>>sense a "but" coming on; have a look at this:
>>>
>>>http://www.talkchess.com/forums/1/message.html?481879
>>>
>>>That was Beta 12; now let's look at Beta 13b.
>>
>>I'll be testing with Beta 13 and not 13b though. The reason is that I already
>>have the results of the default settings of Beta 12/13, as well as others. There
>>is no reason to presume that the parameter will work better with the hash change
>>he made compared to others. Otherwise I'd have to re-run the default settings
>>of 13b as well.
>>
>>                                         Albert




Last modified: Thu, 15 Apr 21 08:11:13 -0700
