Computer Chess Club Archives


Subject: Re: CM8K another question

Author: John Merlino

Date: 11:35:07 11/26/00



On November 26, 2000 at 14:15:21, Luis E. Alvarado wrote:

>Maybe John Merlino can describe with some detail how the CM8K ratings of the
>personalities were derived using Computer vs. Computer games and USCF survey data.
>I would appreciate that ... Luis

I'll give as much detail as I can recall....

As soon as it was known that a new engine was going to be put into CM8K, a test
was organized with as many USCF rated humans as the development team could scare
up quickly. It turned out that 105 players signed up, and only a few of those did
not return their results. So, say an even 100 players.

As soon as a test program was ready, the test was mailed out to all players. It
involved playing 6 games (3 as White and 3 as Black) against each of 5 different
personalities. The personalities were chosen based on their ratings in CM7K, as
the best "anchor points" to allow for reasonable accuracy. These turned out to
be spaced about 500 rating points apart, because of the wide rating range
included in the game so that every player can have a decent challenge. So, there
were five opponents:

Chessmaster for the 2500 opponent
Josh Age 9 for the 2000 opponent
Willow for the 1500 opponent (a bit of a stretch with this one)
Sonja for the 1000 opponent
Skippy for the 500 opponent

So, 100 people playing 30 games each gave the team around 3000 human vs.
computer games. But this only gave "accurate" ratings for 5 of the opponents in
the game, and there were almost another 140 to go!
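
The post does not say how the human results were turned into ratings. As a rough
illustration only, one common approach is an Elo-style "performance rating": find
the rating at which the expected score against the rated opponents equals the
score actually achieved. The sketch below assumes the standard Elo expected-score
formula; the function names and the sample results are made up for the example
and are not from the CM8K test.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score for a player against one opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def performance_rating(results, lo=0.0, hi=4000.0, tol=0.01):
    """Estimate a rating from (opponent_rating, score) pairs by bisection.

    score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss, from the point
    of view of the personality being rated. A perfect or zero score has no
    finite solution, so the result simply runs to the hi or lo bound.
    """
    actual = sum(score for _, score in results)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in results)
        if expected < actual:
            lo = mid  # guess too low: it would have scored less than observed
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical data: three wins and one loss against 1500-rated humans.
sample = [(1500, 1.0), (1500, 1.0), (1500, 1.0), (1500, 0.0)]
print(round(performance_rating(sample)))  # about 1691
```

With roughly 600 games per anchor personality (100 players times 6 games), an
estimate of this kind would be fairly stable, which fits the idea of treating
those five ratings as fixed points.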

So, once the ratings were fixed for the "anchor personalities", all of the other
personalities were put into a large number of comp vs. comp tournaments. I think
there ended up being just under 13,000 games played, as it was desired that each
personality would have at least, or close to, 100 games played to provide a
reasonable rating.
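
To make the idea of "anchored" ratings concrete, here is a minimal sketch of one
way comp vs. comp results could be fitted around fixed anchors. It is not the
procedure the CM8K team used (the post does not describe it); it assumes the
standard Elo expected-score formula and a simple iterative adjustment, and the
names fit_anchored, step, and rounds, as well as the players and results, are
hypothetical.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score for a player against one opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def fit_anchored(games, anchors, start=1500.0, rounds=500, step=200.0):
    """Fit ratings to game results while keeping anchor ratings fixed.

    games   -- list of (player_a, player_b, score_a), score_a in {1, 0.5, 0}
    anchors -- dict of player -> fixed rating (never adjusted)
    Every other player starts at `start` and is nudged each round by the
    average gap between its actual and expected score per game.
    """
    ratings = dict(anchors)
    counts = {}
    for a, b, _ in games:
        ratings.setdefault(a, start)
        ratings.setdefault(b, start)
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1

    for _ in range(rounds):
        surplus = {p: 0.0 for p in ratings}
        for a, b, score_a in games:
            exp_a = expected_score(ratings[a], ratings[b])
            surplus[a] += score_a - exp_a
            surplus[b] += exp_a - score_a  # b's view: (1 - score_a) - (1 - exp_a)
        for p in ratings:
            if p not in anchors and counts.get(p):
                ratings[p] += step * surplus[p] / counts[p]
    return ratings


# Hypothetical example: "A" is an anchor fixed at 2000. "X" splits two games
# with A, and "Y" draws twice with X, so both should settle near 2000.
games = [("X", "A", 1.0), ("X", "A", 0.0), ("Y", "X", 0.5), ("Y", "X", 0.5)]
for name, rating in sorted(fit_anchored(games, {"A": 2000.0}).items()):
    print(name, round(rating))
```

Note that "Y" never plays the anchor directly; its rating is tied to reality
only through "X". That is why more games per personality, and more anchors
spread across the range, should help.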

Hence, with more anchor points available (only Chessmaster's rating was anchored
before CM8K), the ratings are believed to be considerably more accurate.
Obviously, if your one anchor point to reality is at the top end of the scale,
the farther away from that anchor point you get, the farther away from reality
you get. In actuality, though, it turned out to be more of a bell curve of
"non-reality", as the personalities rated 1200-2000 were the most inaccurate in
CM6K and CM7K. Many of the personalities in this range were rated over 100
points too high; some were over 200 points too high. Others, though, were either
within 10 points or were even a little low.

No SETTINGS for any personalities were changed from CM7K to CM8K except for a
few of the Josh personalities. Those had to be as accurate as possible, and the
new engine forced a modification of their strength parameter.

I hope that was detail enough. If not, feel free to ask any further questions.

jm





Last modified: Thu, 15 Apr 21 08:11:13 -0700
