Computer Chess Club Archives



Subject: Re: Qualifier to my previous statement

Author: KarinsDad

Date: 14:13:04 01/29/99


On January 29, 1999 at 16:10:56, Dann Corbit wrote:

>On January 29, 1999 at 15:48:15, KarinsDad wrote:
>[snip]
>>The topic is: How can we find out if programs (not humans) have been getting
>>better? Of course, someone could write a better opening book and a better
>>tablebase. No question about it. And yes, this does happen with the commercial
>>programs.
>>
>>But if you want to compare engine strength versus engine strength to determine
>>if the program got better and not compare whether some database has better data
>>in it, then you should use the same databases.
>I disagree.  What you are testing here is engine strength versus engine
>strength.  That is one particular measure of the ability of a chess engine.  But
>that same engine will play *better* with an excellent opening book.  And *better
>again* with an excellent endgame tablebase.  So that is only a subset of a
>program's ability to play the game of chess.
>

Sorry Dann, but you do not make sense. My statement was "if you want to compare
engine strength versus engine strength to determine if the program got better
and not compare whether some database has better data in it, then you should use
the same databases" and you respond with "What you are testing here is engine
strength versus engine strength". Of course it is. That's what I said.

Yes, I agree. There is more to a program than just its engine. But what I am
saying is that you should compare "apples to apples" and have only one variable
different per test. You could test CM5000 and CM6000 with the same opening books
and same tablebases (if CM uses tablebases) for a thousand games. You could then
test CM6000 using CM5000 opening book vs. CM6000 using CM6000 opening book for a
thousand games to see if the opening book improved.

As long as you only change one variable from the control per test, then you are
measuring something real. You can find out if CM6000 is stronger due to a better
opening book or a stronger engine or both.
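To make the "one variable per test" idea concrete, here is a minimal sketch in Python. The Config fields, the "shared tablebases" placeholder, and the helper names are all made up for illustration; nothing here reflects how CM5000 or CM6000 are actually configured.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Config:
    """One side of a test match (hypothetical fields, for illustration)."""
    engine: str
    book: str
    tablebase: str


def changed_variables(control, variant):
    """Return the names of the fields that differ between the two sides."""
    return [f.name for f in fields(Config)
            if getattr(control, f.name) != getattr(variant, f.name)]


def is_controlled(control, variant):
    """A test is controlled when exactly one variable differs."""
    return len(changed_variables(control, variant)) == 1


SHARED_TB = "shared tablebases"  # placeholder: same tablebase setup on both sides

# Same book on both sides, so only the engine differs.
engine_test = (Config("CM5000", "CM5000 book", SHARED_TB),
               Config("CM6000", "CM5000 book", SHARED_TB))
# Same engine on both sides, so only the book differs.
book_test = (Config("CM6000", "CM5000 book", SHARED_TB),
             Config("CM6000", "CM6000 book", SHARED_TB))
# Out-of-the-box comparison: engine and book both differ.
retail_test = (Config("CM5000", "CM5000 book", SHARED_TB),
               Config("CM6000", "CM6000 book", SHARED_TB))

for name, (control, variant) in [("engine", engine_test),
                                 ("book", book_test),
                                 ("retail", retail_test)]:
    print(name, changed_variables(control, variant),
          is_controlled(control, variant))
```

The out-of-the-box match changes two variables at once, so its result alone cannot say which one was responsible.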

There are other factors to consider. What if CM6000 switched C compilers to
run better on Pentium IIs, but now runs worse on Pentiums? Does that mean
that CM6000 is better or worse? It depends on what type of hardware you run it on.

>>If I could take a 1200 rated playing program and give it an opening database of
>>all moves out to 100 moves for each side (a very large database on the order
>>of 10^320 positions) and a tablebase which can handle all positions with 12
>>pieces on a side (another extremely large database which I cannot even guess how
>>to calculate), then I would have a program that would never lose to Deep Blue
>>since it would never use its search engine for anything other than looking up
>>data out of databases.
>Well, if you could produce such a remarkable database system, then your program
>would have that ability.  It does not matter where the answers come from.
>Should we get annoyed because the computer did not have to think about it but
>instead did a simple lookup?  A chess program is a black box.  Into it go board
>positions and out of it come board positions.  How it generated the positions is
>not relevant in determining the strength of the program.  If we remove all
>database entries, then we measure only engine strength.
>

Well, sort of. What if I play 2000 French games and 2000 King's Indian games
against a computer, and I win all of the French games and it wins all of the
King's Indians? Who is stronger? Are we the same strength? What if we each win
every game we play as White? Who is stronger then?

The problem with ratings in general is that they make an approximation based on
results, but do not break those results down into categories (except by time
control).
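Here is a rough Python sketch of that point, using the 2000 French / 2000 King's Indian numbers from above. The helper names are my own, and the rating conversion is just the standard Elo logistic formula, not any particular rating list's procedure.

```python
import math


def score(wins, draws, losses):
    """Fraction of available points scored (a draw counts as half a point)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def implied_elo_diff(p):
    """Rating difference implied by score fraction p under the Elo formula."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return 400.0 * math.log10(p / (1.0 - p))


# (wins, draws, losses) from my side of the board, per opening.
results = {
    "French": (2000, 0, 0),
    "King's Indian": (0, 0, 2000),
}

# Lump everything together, as a single rating effectively does.
totals = [sum(column) for column in zip(*results.values())]
overall = score(*totals)

# Break the same results down by opening.
per_opening = {name: score(*r) for name, r in results.items()}

print("overall score:", overall)
print("implied rating difference:", implied_elo_diff(overall))
print("per opening:", per_opening)
```

Lumped together, the two sides look exactly equal; broken down by opening, each is unbeatable in one category and helpless in the other.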

This is the reason GMs prepare for tournaments. They try to get their opponents
into disadvantageous territory. They do not care about the ratings of their
opponents; they care about their strengths and weaknesses. Hence the trend
toward learning programs. They do not learn chess knowledge; they learn which
variations within an opening book lead to wins and losses for their particular
engine (again, it is the engine that is important).
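As a toy illustration of that kind of book learning, here is a minimal Python sketch. It is an assumption about how such a scheme could work, not a description of any actual program: the BookLearner class, its method names, and the "one virtual draw" smoothing are all invented for the example.

```python
from collections import defaultdict


class BookLearner:
    """Tracks results per book line and prefers lines that scored well."""

    def __init__(self):
        # line name -> [points scored, games played]
        self.stats = defaultdict(lambda: [0.0, 0])

    def record(self, line, points):
        """Record one game: 1.0 for a win, 0.5 for a draw, 0.0 for a loss."""
        entry = self.stats[line]
        entry[0] += points
        entry[1] += 1

    def expected(self, line):
        """Smoothed score: one virtual draw, so unplayed lines rate 0.5."""
        points, games = self.stats.get(line, (0.0, 0))
        return (points + 0.5) / (games + 1)

    def choose(self, candidates):
        """Pick the candidate line with the best smoothed score so far."""
        return max(candidates, key=self.expected)


learner = BookLearner()
learner.record("French", 0.0)    # this engine lost twice in the French
learner.record("French", 0.0)
learner.record("Sicilian", 1.0)  # and won once in the Sicilian

print(learner.choose(["French", "Sicilian", "Caro-Kann"]))
```

Nothing in the table is chess knowledge. It only records which lines have worked for this particular engine, which is why the learned preferences would not necessarily transfer to a different engine.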

>>Would CM6000 be stronger than CM5000 with a stronger opening database? Most
>>likely. Is it a fair test to compare CM5000 with CM6000 with them both using the
>>same opening database? Of course. That's the point. If CM6000 has an inferior
>>engine to CM5000, but has a far superior opening book, it could still win
>>games due to being in a superior position out of the opening.
>No more fair or less fair than testing with different database or endgame
>tablebase systems or whatever.  You should describe in the test the full nature
>of the variables of the experiment, but if you want to find out how well a
>program plays chess, you do not remove the data it normally has at its disposal.
> It will play far worse than it is capable of.
>

Agreed. The question I am trying to answer is: Have they gotten better? You are
looking at whether they have gotten better overall. Fair enough. I am looking at
whether the engines (and hence the algorithms) are getting better, or if it is
merely a matter of better databases. The best way to answer that is not just to
run CM5000 versus CM6000, but rather to test each component in isolation.

>>The difference between humans and programs is that the opening book of a human
>>is an integral part of him whereas this is not the case with a program. A
>>program can use any opening book (in the appropriate format) or none at all. You
>>cannot compare the two.
>A human's opening book also changes over time.  You can learn new openings or
>you can forget how to use an opening you have not used for a while.  You can
>also have holes in your opening book just like a computer.
>Furthermore, *my* opening book is a microscopic fraction of the opening book of
>a GM. Is it fair to have us play against each other when his opening book is
>much larger?  Of course it is.  If I want a bigger {internal} opening book, I
>should study more.  And if I simply lack the capability to gather an opening
>book the size of Seirawan or Karpov or whoever, then tough -- I just have to get
>along with what I am capable of mustering.

You know, most of these discussions are debates about semantics. People here tend
to treat computers and humans as alike when it suits them and then turn around
and treat them as different at other times.

The two are really quite dissimilar machines playing the same game.

For example: computers do not need "insufficient losing chances" types of rules.
These rules were added for humans with human feelings. Either you win within
time, or you draw, or you lose.

A delayed clock is designed for a human, not a computer.

Computers do not need rules on writing down the moves; they can do it
(effectively) effortlessly.

Computers make logs. Humans making logs during a game is considered cheating.

Computers look up exact moves in a database. The choices in a given opening book
will not change (assuming no learning and no intervention). Humans can play the
same opening for 20 years and suddenly make a blunder in the opening.

Humans get tired and sick. Computers do not.

Humans will (most often) stop playing if a hurricane threatens. Computers will
not.

There are many differences between the two. Whenever you make an analogy for one
based on the other, there is a tendency for it to be semantics and nothing more.

KarinsDad



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.