Computer Chess Club Archives


Subject: Re: Is Hardware Over-valued? Why is Diep doing so Poorly?

Author: Robert Hyatt

Date: 08:57:05 07/11/02



On July 11, 2002 at 09:03:20, Rolf Tueschen wrote:

>On July 10, 2002 at 11:04:29, Robert Hyatt wrote:
>
>>Here is what happened.
>>
>>During the early Summer, Bert Gower and I were playing nightly games using
>>Cray Blitz running on a VAX vs. several different micros.  We were running
>>about 100 nodes per second and doing 5 ply searches.  We were doing this to
>>get ready for the 1985 ACM computer chess tournament in Denver, and we knew
>>I was moving to Birmingham in August to start work on my Ph.D.
>>
>>We noticed that we were generally tactically stronger than any of the micros
>>we were playing against, and in tactical situations, we won easily.  But on
>>occasion, we would create a "hole" in our pawn structure that would hurt us
>>in the endgame where the opponent could infiltrate off-side and cause trouble
>>since we were both doing pretty shallow searches.
>>
>>To fix these holes, I added two lines of code into the pawn evaluation loop
>>in Cray Blitz and further testing showed that we had "closed the door" on the
>>pawn hole problem and we lost no more games due to that problem.
>
>Just to make this clear: this historical anecdote is a masterpiece and should be
>told to every chess programmer. At the same time, your story shows the main
>weakness of the past decades of computer chess programming. Until this very day!
>I just wrote a new article on SCW (Thorsten Czub Forum) about the game between
>JUNIOR and SJENG in Maastricht, a terrible mess for chess, only possible with
>the modern books and cookings, because no machine could have any idea about the
>coming endgame. But this endgame, for this specific variation after the sac of
>the exchange on e4 with the best continuation on both sides, is simply won for
>White. See the game in the databases with a real master as White. To play this
>anyhow - is for me personally, excuse me, in non-crime vocabulary, "confidence
>tricksing" on chess. "Tricksing" also on the spectators world-wide who might
>think that they attend master chess, which is not at all the case. A computer
>wouldn't even dare to play the Marshall Gambit without exclamation marks in its
>books! You know, SJENG can play just to the exchange on e4 but is unable to see
>the difference between the bad Qd7 and the better Qh6. That is what I was
>talking about. There is simply no consistency in the play of the machines with
>today's book-hypocrisy, of course.
>
>The question, from an interdisciplinary standpoint, is why you did _not_ at the
>time reflect on possibilities to find programming "tricks" closely related to
>chess itself. Why did you think that with a comparably more "stupid" or "magic"
>approach you could better win against comparably less strong machines? In other
>terms: why didn't you reflect on real solutions, coming from chess, to find the
>right moves, also better later in the endgame?

You are making comments without having "been there".  The issue is this:

If a program doesn't have some particular evaluation term, and that fact can
be used in the majority of games to beat that program, then something has to
be done, because that particular term is missing and therefore wrong 100%
of the time.  Doing _anything_ to help is a good idea, even if the "new term"
is only right 50% of the time.  Where you used to lose 100% of those games,
now you only lose 50%.  A significant gain.

On the other hand, if you add something that is wrong 5% of the time, and it
causes you to lose 5% of the games you play, while if you don't use that term
you only lose 1% of the games you play, then the new term is not productive and
should be canned.
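
To put numbers on that reasoning, here is a tiny C sketch.  The percentages
are made up purely for illustration; the only point is the comparison:

/* Made-up numbers, purely for illustration: keep an eval term only if the
   fraction of games it loses you is smaller than the fraction you lose
   without it. */
#include <stdio.h>

int main(void) {
    double loss_without_term = 0.50;  /* hypothetical: missing term costs half the games */
    double loss_with_term    = 0.05;  /* hypothetical: term is wrong and costs 5% of games */

    if (loss_with_term < loss_without_term)
        printf("keep the term: %.0f%% lost vs %.0f%% lost without it\n",
               100.0 * loss_with_term, 100.0 * loss_without_term);
    else
        printf("drop the term: it costs more games than it saves\n");
    return 0;
}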

Neither of those applied to the change we made in 1985.  We didn't make a
change to "cook another program" as we were playing several different programs
plus humans at the weekly chess club meetings, and we noticed this weakness
from time to time.  And after adding the new pawn hole eval term, it
_definitely_ played far better in those positions, whether against humans or
against computers.  But running on a VAX.  Not a Cray.
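
I won't try to reproduce the actual two lines here, but a "pawn hole"
penalty inside a pawn evaluation loop might look roughly like the sketch
below.  The board layout, the names, and the penalty value are all
hypothetical, not the real Cray Blitz code:

/* Hypothetical sketch only -- not the Cray Blitz code.  A "hole" is a
   square that no friendly pawn can ever defend, because every pawn on an
   adjacent file already stands at or beyond that rank.  Blockers and the
   opponent's pieces are ignored to keep the sketch short.
   pawn_on[file][rank] is nonzero where a white pawn stands. */

#define PAWN_HOLE_PENALTY 12    /* centipawns, a made-up value */

int white_pawn_holes(const int pawn_on[8][8]) {
    int score = 0;
    for (int file = 0; file < 8; file++) {
        for (int rank = 2; rank <= 4; rank++) {      /* squares in front of our camp */
            int defendable = 0;
            for (int df = -1; df <= 1; df += 2) {    /* the two adjacent files */
                int f = file + df;
                if (f < 0 || f > 7) continue;
                for (int r = 1; r < rank && !defendable; r++)
                    if (pawn_on[f][r])               /* a pawn behind can still advance */
                        defendable = 1;
            }
            if (!defendable)
                score -= PAWN_HOLE_PENALTY;          /* this square is a hole */
        }
    }
    return score;
}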



>Or this.
>Would you have changed your code this way also against equally strong or
>stronger (just joking) machines? Or was it all a consequence of your own
>"strength" in chess? In just another variation. Why didn't you try to find
>technical mirrors of real chess, chess of its best possibilities? Of course your
>solution was something from chess but not reaally sound. BTW what do you mean
>with testing? Against other machines or human players?

Both.  And the solution _did_ come from "real chess".  In fact, it came
from GM Julio Kaplan who worked with Harry Nelson regularly in tuning Cray
Blitz.

>
>[Just added: why did you never feel uncomfortable with programming tricks said
>to be successful _only_ 98% of the time?


See above.  If the trick works 98% of the time, then that is a 2% error rate.
What is the error rate _without_ the trick?  If it is > 2%, then the trick is
a "winner" overall.  Yes, it would be nice to have it work 100% of the time,
and in fact many eval terms do just that.  But it isn't always possible due
to time constraints within the search.  And a 2% error rate is better than any
error rate above 2%.





>I mean, what about the 2% that go wrong with deterministic
>certainty? For me such a procedure is _wrong_.

That's because you aren't a chess player.  As a human, I do this _all_
the time.  I often don't "know" and just "take a chance".  The only thing
I can compute with 100% confidence is a forced mate.

> A good human player will take
>advantage of such "holes" in your system. Please tell me if it's typical
>learning-by-doing because your opponents were machines and not human players...]


As I said, we were playing _both_ while doing this "tuning".

>
>
>>
>>When we got to Denver we did not play very well and lost 2 of 4 games, with
>>the new HiTech machine winning the tournament.  I was neck-deep in my first
>>year of Ph.D. classes and just thought "seems that several have 'caught up'
>>with us" and let it go at that.
>>
>>The next year, at the WCCC tournament in Cologne, we won easily in the first
>>round, but lost in the second to a program that really had no business beating
>>us (we were running on an 8 cpu Cray I believe).
>
>The same question as above. Did you concentrate your code on chess, or just on
>mastering your machine? ;-)



I have no idea what that means...



>
>
>>After losing in round 2, we
>>started looking to see what was wrong.  It was playing _very_ passively with
>>its pawns and its score would climb as the opponent pushed his pawns.  On a
>>"whim" I commented out the two lines of code dealing with pawn holes.  Suddenly
>>it started finding the moves it used to find.  In games where it played horribly
>>passively, it suddenly started to play aggressively as it used to.
>>
>>We left those two lines of code out, and we easily won the next three rounds,
>>including beating HiTech in the final round in a convincing way.  A way so
>>convincing that Berliner accused us of cheating because "no computer would
>>play some of those moves."
>>
>>A change that worked _great_ at 100 nodes per second and five ply searches,
>>turned out to be terrible at 9-10 plies.  After a lot of study, it turned out
>>that the pawn hole code was covering a hole that other eval terms were handling
>>with the deeper searches.  Which basically made holes twice as bad at 10 plies
>>as they were at 5 plies.  Removing two lines completely changed the character
>>of the program, and it also _clearly_ shows that an eval for a shallow search
>>can be quite different than what is needed for a significantly deeper search.
>>
>
>Just to explain Vincent's difficulties! You had foreseen it before the
>tournament in Maastricht. Give us more details about the dependencies between
>hardware and resulting qualitative differences.


I don't know how to be very specific.  Each machine offers particular
architectural features that can be used to play chess.  I.e., the Cray is a
vector machine and likes to work with arrays of data in parallel.  It also
has a _huge_ number of user-accessible registers (roughly 150, not counting
the 8 vector registers that hold 128 words each).  Parallel machines offer
another performance advantage.  But with that advantage comes some architectural
quirks the engineers had to include to make the design (a) feasible;  (b)
affordable;  (c) buildable;  (d) scalable;  (e) etc.  And a program has to
take advantage of the good features while avoiding the problems caused by the
bad features.  This takes a great deal of time to "get right".  It took us
_years_ to develop vectorizable algorithms for move generation, attack
detection, etc.  More time to code in assembly so we could use all those
registers to avoid memory traffic.  More time to handle the SMP features on
the Cray but not on other machines (very efficient semaphores, shared registers
between cpus, etc.)  This is _never_ a 2-week process.  It is more a process
measured in terms of years.
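
I can't post the Cray assembly here, but here is a toy C comparison just to
show the flavor of the "arrays of data" style a vector machine rewards.
Nothing in it is actual Cray Blitz code; it only contrasts a branchy
per-square loop with a branch-free sweep that a vectorizing compiler can map
onto vector registers:

/* Toy illustration only -- not Cray Blitz code.  Both functions count the
   squares that are occupied AND attacked, given two 64-entry arrays of
   0/1 flags.  The first takes a data-dependent branch per square; the
   second is a pure multiply-and-add over the whole array, the kind of
   loop a vectorizing compiler can turn into vector operations. */

int count_scalar(const int occupied[64], const int attacked[64]) {
    int n = 0;
    for (int sq = 0; sq < 64; sq++) {
        if (occupied[sq] && attacked[sq])     /* branch on every square */
            n++;
    }
    return n;
}

int count_vectorizable(const int occupied[64], const int attacked[64]) {
    int n = 0;
    for (int sq = 0; sq < 64; sq++)
        n += occupied[sq] * attacked[sq];     /* no branches in the loop body */
    return n;
}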








>
>>
>>What if you only do very shallow searches.  So shallow that you can't under-
>>stand some simple positional idea.  You add eval code to handle that.  Then
>>you search far deeper on different hardware and you both see the consequence
>>of the positional idea thru search, _and_ you see it thru eval.  A "double
>>dose".  If you tune the positional term down so that it doesn't double-deal you
>>on fast hardware, it is ineffective on slow hardware.  If you turn it up so that
>>you don't make the positional mistake on slow hardware, you might find yourself
>>paranoid about the problem on fast hardware since you see twice the penalty.
>>That is what happened to us.  Could it be avoided?  No idea.  But it _clearly_
>>can happen.  It _did_ happen for one well-known program...  Mine...
>
>The Hsu team at IBM did avoid it by introducing GM Benjamin into the testing.
>IMO he had to check for nonsense and contradictions in the play in relation to the
>books.

Apples and oranges.  The DB guys _never_ tried to tune their program while
running at 1/1000th the speed of the real machine.  That was the problem we
fell into.  They had access to their machine all the time and could test in
any way they wanted.  We had little access to the Cray except right before
an annual computer chess tournament, so all our development was _forced_
onto a VAX, which was far slower.




>
>>
>>
>>If you play on ICC long enough, you learn that you have to "turn up your king
>>safety" to survive against IM/GM players in bullet/blitz games.  Otherwise they
>>attack you before you know you are being attacked.  You then discover that this
>>makes you very passive in long time controls.  And you have to turn it back down
>>or else you play so passively you get squeezed to death.  Happens to most
>>everyone that plays there.  You get the right "setting" for say 1 0 bullet,
>>then at game/60 minutes you become way too passive.  Or vice-versa...
>
>I hope that you finally agree that Kasparov was far from trying such subtle
>methods in 1997 against DEEP BLUE II. Moreover, he wasn't even able to try, simply
>because he didn't know the machine's performance at the time.

Kasparov was strong enough to discover the problems, if they existed.  He did
it in match 1.  But not in match 2.  Which suggests that the problems in match
2 were _very_ difficult for anyone to "pick out".  If the "surprise" wasn't a
factor in match 1, it seems dishonest to suddenly claim it was an issue in match
2.  The only difference in the two matches was the "outcome".





>
>I know that you commented already, but this thread showed us that most people
>still have no understanding of the real connections between hardware and
>software. For some, FRITZ is already as strong as DB II.

Some believe pigs will one day fly, too...


>
>Rolf Tueschen


