Author: Eelco de Groot
Date: 02:55:31 08/17/01
On August 16, 2001 at 19:21:57, Les Fernandez wrote:

>On August 16, 2001 at 18:53:46, Eelco de Groot wrote:
>
>>On August 16, 2001 at 16:31:46, Dann Corbit wrote:
>>
>>>On August 16, 2001 at 16:01:11, Nino wrote:
>>>[snip]
>>>>>People get bored of the project and drop off. We are addressing this
>>>>>situation, and there should be something fairly revolutionary to announce
>>>>>soon. We will announce it when it is ready.
>>>>
>>>>Dann, can you indicate about when to expect this announcement?
>>>
>>>Don't know for sure, sometime soon.
>>>
>>>>Also, Dann, how much experimentation have you done with these rotations?
>>>
>>>Maybe 10 hours total. I have not discovered any problems in my searches, but
>>>I did see a few strange things. For instance, the rotated solution will
>>>sometimes be different from the one that a chess engine will find when left
>>>on its own to analyze the rotated position. However, if you analyze the
>>>rotated solution, it will have about the same score as the original problem.
>>>It seems that chess engine evaluations are, for the most part, not fully
>>>symmetric.
>>>
>>>>Has anyone else taken advantage of this concept?
>>>
>>>I doubt it.
>>>[snip]
>>
>>Nino, maybe you could ask Harald Faber, who also ran some of the well-known
>>tests reversed to see, among other things, whether programs would do worse
>>because they had been tuned to known test positions. He, too, sometimes got
>>surprisingly different results. Harald e-mailed me some files back then, and
>>looking at the results we thought it may have been because of asymmetries.
>
>This is certainly a possibility!
>
>>Stefan Meyer-Kahlen had just won yet another of his world titles, and Harald
>>asked him for his opinion about the strange time differences; Stefan said
>>about the same thing about asymmetries.
>>
>>An example might be a program that always starts looking in the same corner
>>of the board when it generates its move lists; that alone would be enough to
>>create differences. Once such differences are just enough to cause a good
>>move to be found one ply later, you already have a big difference in
>>solution time. It's a bit of chaos theory too, I thought: small differences
>>can grow larger if a program prunes (cuts off) a part of the tree containing
>>a good move that would have been found with slightly different move
>>ordering. Like the butterfly creating climate changes, or something like
>>that.
>
>What you say about differences based on where the chess engine starts
>looking, although it may be important, was not what I understood Nino to be
>talking about. What he was saying is that one can take any chess position and
>generate 3 additional permutations of it, which would in fact increase the
>number of positions. The fact that some engines have this asymmetry
>condition, or build their move lists in different ways, does not make the 3
>new positions any less relevant or any less good than the original position's
>evaluations. Please excuse me if I have interpreted your paragraph wrongly.
>
>>If anybody already has these kinds of results, I think there is more that
>>you can do with them. Perhaps Les Fernandez thought about something like
>>this too?
>
>Eelco, can you elaborate on what you mean here?

Hi Les,

Thanks for your reaction! Yes, I totally agree with you that the mirrored
positions are just as good as the original ones! Maybe even better, because
some programs may accidentally already have games in their databases or
learn files which contain the original positions, and even programmers who
want to do testing sometimes forget to disable those already learnt results.
So obscure positions have an advantage for testing.
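To make the four permutations concrete: each position can be mirrored
left-right and/or color-flipped, giving the original plus 3 variants. A
minimal Python sketch of both transforms on FEN strings (my own illustration,
untested against any engine; note the left-right mirror is only sound when
neither side can still castle, since the kings end up on the d-file):

    def mirror_lr(fen):
        # Reverse the files of every rank: the a-file becomes the h-file.
        board, stm, castling, ep, half, full = fen.split()
        board = "/".join(rank[::-1] for rank in board.split("/"))
        if ep != "-":  # flip the en-passant file as well
            ep = "abcdefgh"[7 - "abcdefgh".index(ep[0])] + ep[1]
        return " ".join([board, stm, castling, ep, half, full])

    def color_flip(fen):
        # Swap piece colors, flip the board top-to-bottom, and give the
        # move to the other side; an en-passant square moves from rank 3
        # to rank 6 and vice versa.
        board, stm, castling, ep, half, full = fen.split()
        board = "/".join(r.swapcase() for r in reversed(board.split("/")))
        stm = "b" if stm == "w" else "w"
        castling = castling.swapcase() if castling != "-" else "-"
        if ep != "-":
            ep = ep[0] + ("6" if ep[1] == "3" else "3")
        return " ".join([board, stm, castling, ep, half, full])

    fen = "8/2k5/8/8/5R2/8/2K5/8 w - - 0 1"   # a made-up test position
    variants = [fen, mirror_lr(fen), color_flip(fen),
                color_flip(mirror_lr(fen))]

All four variants should be exactly as hard for a perfectly symmetric
engine, which is what makes any time differences between them informative.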
What I was mainly referring to was the paragraph quoted further below.
Because the new positions should in theory be just as difficult as the old
ones, if there is still a difference in the time to find a solution, you
could regard that as a random effect. The "totally random" is a bit
debatable, because computations are always only semi-random; that is the main
reason why it might not work exactly as planned. But the theory was that any
new set of positions in which just a few test positions are replaced by their
mirrored versions is just as good as the old one. So with N positions in the
set you can get 2^N test results by taking AAAA, AAAB, AABA, ..., AABB, ...,
ABBB, ..., BBBB and treating them all as "real" results. They all lie between
the set with the quickest total solution time and the set with the slowest.
Now you can put all that into a statistics program and you have a means of
saying whether two different versions of a program perform statistically
differently on a test, which is otherwise not possible. Not possible, that
is, unless you disregard the "randomness" introduced because a program may
accidentally be very good at the original test, for instance by accidentally
trying the best moves first. I hope I have made my somewhat shaky theorizing
a little clearer now? Not sure I understand it exactly myself anymore...
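In code, the 2^N idea might look something like this small Python sketch (my
own illustration with made-up timings; with many positions you would randomly
sample the combinations rather than enumerate all 2^N of them):

    import itertools, statistics

    def virtual_totals(pairs):
        # pairs[i] = (solution time on original position i,
        #             solution time on its mirrored twin).
        # Every choice of one time per pair is one "virtual" test run,
        # so N pairs yield 2^N total solution times, all lying between
        # the fastest and the slowest possible set.
        return [sum(choice) for choice in itertools.product(*pairs)]

    # Hypothetical solution times in seconds for a 4-position test:
    pairs = [(12.4, 15.1), (3.2, 2.9), (40.0, 55.3), (7.7, 7.5)]
    totals = virtual_totals(pairs)
    print("mean %.1fs  stdev %.1fs  range %.1f-%.1fs"
          % (statistics.mean(totals), statistics.stdev(totals),
             min(totals), max(totals)))

The spread of those virtual totals is the error margin: if the distributions
for two engine versions barely overlap, the difference between them is more
than the mirroring "noise".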
Best regards,
Eelco

PS: Dann, thanks for the positions, but what I had in mind were more the
exact solution times to find the key move, and those don't show up in the PV,
unless I am mistaken?

>>If you have the results for the mirrored positions as well, then with a
>>permutation program you can generate all the virtual results AAAAABAAA,
>>BBBAABBBA, etc. And with an "off-the-shelf" calculator, suddenly for every
>>test result you can also generate statistical error margins, histograms,
>>standard deviations, and such.
>>
>>For an experimental program undergoing testing you could look for
>>"outliers": in those positions having very different time results when
>>mirrored, there might also be something wrong in the program. A bug could be
>>causing a systematic rather than a statistical aberration. But, because of
>>the "butterfly effects", it might not always be that simple to find the
>>actual causes and bugs?
>
>Good point, Eelco! This type of data may be able to indicate to the developer
>that there is something possibly wrong in the logic. Of course, this all
>depends on what the developer had hoped to achieve in the logic he/she tried
>to incorporate.
>
>>Big error margins can also tell you that, because of random fluctuations,
>>you would just have to do more actual testing to get more accurate results.
>>
>>All this more or less in theory; I can already hear Christophe Theron saying
>>that you would have to test it, because over 90% of such ideas just don't
>>work out in practice as anticipated, and mirrored BS tests just give you
>>"bovine statistics"?!
>
>As far as the possibility of this idea not working in practice: I find it
>hard to dispute that, given a chess position which has already been evaluated
>to a certain depth and rotated into 3 new positions, as long as you maintain
>your symmetry changes throughout the evaluation parts (pv, bm, etc.), the 3
>new positions and their evaluations should be as good as the initial
>evaluation.
>
>Les
>
>>Eelco