Computer Chess Club Archives

Subject: Re: Set the Record straight again, Bob - - -

Author: Robert Hyatt
Date: 06:57:33 01/27/04
On January 27, 2004 at 03:37:23, Drexel,Michael wrote:

>On January 26, 2004 at 22:09:22, Robert Hyatt wrote:
>
>>On January 26, 2004 at 17:00:07, Drexel,Michael wrote:
>>
>>>On January 26, 2004 at 11:49:28, Robert Hyatt wrote:
>>>
>>>>On January 26, 2004 at 11:24:58, Drexel,Michael wrote:
>>>>
>>>>>On January 26, 2004 at 09:33:41, Robert Hyatt wrote:
>>>>>
>>>>>>On January 26, 2004 at 02:14:39, ALI MIRAFZALI wrote:
>>>>>>
>>>>>>>On January 25, 2004 at 21:38:27, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On January 25, 2004 at 20:04:16, Rolf Tueschen wrote:
>>>>>>>>
>>>>>>>>>- - in a famous German forum the kids are on the streets and they shout:
>>>>>>>>>
>>>>>>>>>These old-fashioned Cray Blitz and Deep Blue monuments won't be "disqualified"
>>>>>>>>>by their authors with actualized Elo numbers.
>>>>>>>>>
>>>>>>>>>Is that true? Would these legends lose badly against today's elite of
>>>>>>>>>computerchess programs?
>>>>>>>>>
>>>>>>>>>I'm waiting!
>>>>>>>>>
>>>>>>>>>Rolf
>>>>>>>>
>>>>>>>>
>>>>>>>>I don't believe _any_ of them would "lose badly".  Any "super-program" from deep
>>>>>>>>thought through Cray Blitz would be a very tough opponent for today's programs.
>>>>>>>>However, hardware is beginning to catch up.  Someone just pointed out on a chess
>>>>>>>>server last night that this quad opteron system I have is about the same speed
>>>>>>>>as the Cray T90 I ran on in 1995, in terms of raw nodes per second (6-7M back
>>>>>>>>then, 7-8M typically on the quad opteron).  So it is now probable that Crafty
>>>>>>>>could actually win a match from Cray Blitz on a T90 with 32 CPUs, assuming I use
>>>>>>>>the quad opteron.  My quad xeon 700 got ripped by the same machine a couple of
>>>>>>>>years back, however, so it would still be dangerous.
>>>>>>>>
>>>>>>>>I can't say much about how it would compare to other commercial programs, as I
>>>>>>>>didn't run those tests; I had very little test time to play with on the T90.
>>>>>>>>
>>>>>>>>The superiority of today's programs over the super-computers of 1995 is mainly
>>>>>>>>mythical, IMHO.  I suspect the games would be a _lot_ more interesting than some
>>>>>>>>would believe.  Of course, there is little chance to test such a hypothesis
>>>>>>>>since most old programs are long-retired, and such hardware is not readily
>>>>>>>>available today.
>>>>>>>I disagree. Deep Blue would get slaughtered by today's top commercial programs.
>>>>>>
>>>>>>Fine.  That is an opinion I don't agree with.  But since there is no way to test
>>>>>>the hypothesis, it is not worth the long argument.
>>>>>>
>>>>>>>It is known that standards in the mid-nineties were not very high compared to
>>>>>>>today. I think you overestimate nodes per second for some reason. For instance,
>>>>>>>Chess Tiger on Palm has a respectable SSDF rating of 2101 while searching only
>>>>>>>about 200 positions per second on the Palm. A decade ago, at such a low NPS, it
>>>>>>>was inconceivable to get such a rating.
>>>>>>
>>>>>>
>>>>>>You _do_ know that deep thought, _not_ deep blue, but deep thought produced
>>>>>>a 2650+ performance over 25 consecutive 40 moves in 2 hours games against GM
>>>>>>competition to win the last second Fredkin prize?
>>>>>
>>>>>It was pretty weak nevertheless. It was not at all near 2650 level.
>>>>>Many of those GMs didn't know anything about its significant weaknesses.
>>>>>Deep Thought II played in Hannover in 1991 against several German IMs and GMs.
>>>>>It scored 3.5/7.
>>>>>That was not even close to a 2650 performance.
>>>>
>>>>And your point would be?  I have played games with versions of my program
>>>>that were simply terrible, due to bugs and so forth.  They had their share.
>>>>
>>>>But the Fredkin prize left little doubt, IMHO.  25 _consecutive_ games played
>>>>over a year+, counting only games against GM players, games that were at least
>>>>40 moves in two hours + 20 moves in one hour.  Whatever happened in various
>>>>things they did, you can _not_ play 25 consecutive games against varied GM
>>>>players and pull off a 2650 without being _good_.  Or do you really believe that
>>>>a weak human player could do that also?  I don't...
>>>
>>>Assume you could play a large number of games against strong human opponents
>>>with
>>>1. Crafty 19.09 on the Quad Opteron
>>>2. Chess Genius 1.5 on a Pocket PC (400 MHz)
>>>
>>>The humans don't get any information about program and hardware.
>>>
>>>I would expect both programs to score well. Maybe Crafty's performance would be
>>>50 Elo better.
>>
>>I would expect it to be much worse, myself.  GM players are not to be laughed at
>>for their tactics.  I believe that they can quickly "feel" how well tactically
>>the program is playing and adjust on-the-fly if they feel it is not very
>>tactically aware...
>
>In a single game? Impossible
>This might be true if they would play quite a few games in a row against the
>same program.

You might say impossible, but I have seen it happen.  One case in point was a
couple of years ago, I had a bug that on rare occasions would leave a Crafty
process running after a game, even though a new instance of Crafty was fired off
by xboard.  8 threads does not work well with the assumptions I have made in my
parallel search, and crafty was playing very weakly due to some 3-4-5 ply
searches.  I watched a GM start a game, and by move 25 he was playing _way_ more
aggressively than normal and he won, as he did in the next couple of games.  I
looked to see what was wrong, found the extra crafty running and killed it.  In
the next game he eased up on the wild stuff once the game was underway.

I view it as trying to push an unknown object around.  You can quickly tell how
hard it is pushing back and adjust your strategy...


>
>>
>>>
>>>If you match both computers Chess Genius would lose badly of course.
>>>
>>>Performances against humans are a weak argument.
>>>The initial posting was about computer-computer games. That is something
>>>entirely different.
>>
>>One thing is pretty sure, a program that plays pretty equally with another
>>program when they are pitted against GM players is _not_ going to smash the
>>other program.  It might _beat_ it, but not _smash_ it.  Particularly when we
>>also have lots of data about how well DT did against computers up through 1994.
>>
>
>Assume you would match an old Crafty version on the 200 MHz machine with Crafty
>19.10 on the Quad.
>Now that would be a pretty one-sided match. I would call this a smash.
>But Crafty already scored well against GM opponents back in 1996.

Not the same thing.  Current Crafty would score _significantly_ better against
the same pool of GM players.  And it would score significantly better against
the old crafty/hardware.  That was my point.  I can't imagine a circumstance
where two programs play humans and produce similar results but play each other
and the result is way lop-sided.  One could certainly be more anti-human, but it
is not going to get killed by the other program if they both play GM players at
a similar result level.
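
[A rough numeric sketch of this argument, assuming the standard Elo logistic
model and assuming that performance against a common GM pool carries over to
head-to-head play, which is exactly the assumption being debated in this
thread. The function name and the 2650/2600 ratings are hypothetical, chosen
only for illustration.]

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) of player A against player B under the
    standard Elo logistic model with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical example: two programs whose performance ratings against the
# same pool of GM opponents differ by 50 Elo points.
stronger, weaker = 2650.0, 2600.0
print(f"expected score for the stronger program: "
      f"{expected_score(stronger, weaker):.3f}")
```

Under that model a 50-point gap gives the stronger side an expected score of
roughly 57%: a win on balance, but not a smash. Whether ratings earned against
humans really transfer to computer-computer games is the open question.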

>
>>
>>
>>
>>>
>>>>
>>>>
>>>>>
>>>>>[D] 1r1q1rk1/pp1b1ppp/3p4/2pBpP2/P2nP3/2NP2P1/1PP2R1P/R1Q3K1 b - - 0 15
>>>>>
>>>>>Any strong program that does not avoid Bc6 in some ms?
>>>>
>>>>
>>>>Again, the point would be?  Pick _any_ program and look at 10 of its games. And
>>>>pose the question "any program that would avoid this terrible move?"  And you
>>>>will get a dozen answers, including answers from programs that have no chance at
>>>>all of beating the program in question...
>>>
>>>Well, King safety is very important. The most important factor in chess of
>>>course.
>>>Deep Thought would IMHO lose badly to Shredder 8 or Deep Junior 8 on a mere 2 GHz
>>>PC.
>>
>>:)
>>
>>Remember that Hsu talked about the DB king safety advantage over the commercial
>>programs of 1996/1997?  He _specifically_ mentioned that their king safety was
>>not up to the standard of DB and that led to the really bad results that all
>>programs were reported to have against them in their lab, against a crippled
>>processor version of DB, including yours truly...
>>
>
>Have you ever seen one of those Deep Junior games where it spotted a decisive
>King vulnerability 5 full moves before the opponent saw the 0.00 score (and
>after another 3 full moves a losing score)?

Yes, and I have seen games where it didn't see what was coming until it was too
late, as well.  It's a fine line for balancing something and it can fall either
way...  I can recall at least one game within the last year against Crafty,
where Crafty was losing at around -3.5, but Junior (some copy on ICC) played a
move and my score rose to 0.00 very quickly.  And later Junior's dropped to 0.00
as it was definitely a deep repetition draw...  That happens...  both ways...


>
>>
>>
>>
>>
>>>
>>>Deep Blue might be the sole exception.
>>>
>>>From a human's point of view, today's top programs are clearly stronger
>>>strategically.
>>>That should be more important than Deep Blue's possible tactical superiority.
>>>
>>>Michael
>>>
>>
>>Except no one has said Deep Blue was strategically inferior either. The second
>>game was uniformly praised (game 2 match 2) as the best strategic game ever
>>played by a computer, and of such high quality as to be as good as any GM game
>>ever played, strategically.
>>
>
>This game is highly overrated.
>The opening variation was ideal for any computer program.
>Deep Blue just made normal moves that improved its position.
>Black had no chance to get dynamic counterplay.
>At the end Deep Blue even almost spoiled it.
>
>Michael

I'll gently remind you it is the _only_ match against computers that K has lost.
 :)  That's not a bad accomplishment under any circumstances...





>>
>>
>>>>
>>>>
>>>>>
>>>>>[Event "Hanover"]
>>>>>[Site "Hanover"]
>>>>>[Date "1991.??.??"]
>>>>>[Round "7"]
>>>>>[White "Tischbierek, Raj"]
>>>>>[Black "Deep Thought II"]
>>>>>[Result "1-0"]
>>>>>[ECO "B23"]
>>>>>[PlyCount "44"]
>>>>>[EventDate "1991.05.??"]
>>>>>
>>>>>1. e4 c5 2. Nc3 Nc6 3. Nge2 e5 4. Nd5 d6 5. Nec3 Nge7 6. Bc4 Nxd5 7. Bxd5 Be7
>>>>>8. d3 Nd4 9. O-O Bh4 10. f4 O-O 11. f5 Rb8 12. a4 Bd7 13. g3 Bg5 14. Rf2 Bxc1
>>>>>15. Qxc1 Bc6 16. f6 gxf6 17. Qh6 Qb6 18. Qxf6 Be8 19. Raf1 Qxb2 20. Qg5+ Kh8
>>>>>21. Nd1 Qb4 22. c3 Qa3 1-0
>>>>>
>>>>>[D] r1r3k1/1q1n1p1p/p2Q2pb/3Rp3/P1p1P3/1P3P2/5BPP/4KB1R w K - 0 22
>>>>>
>>>>>How long does it take Crafty to avoid 22.bxc4?
>>>>
>>>>Perhaps more important:  How long would it take Crafty to play all the other
>>>>_good_ moves they played there?
>>>>
>>>>
>>>>
>>>>>
>>>>>[Event "Hanover"]
>>>>>[Site "Hanover"]
>>>>>[Date "1991.??.??"]
>>>>>[Round "6"]
>>>>>[White "Deep Thought II"]
>>>>>[Black "Wahls, Matthias"]
>>>>>[Result "0-1"]
>>>>>[ECO "E86"]
>>>>>[PlyCount "56"]
>>>>>[EventDate "1991.05.??"]
>>>>>
>>>>>1. d4 d6 2. c4 g6 3. Nc3 Bg7 4. e4 Nf6 5. f3 O-O 6. Be3 e5 7. Nge2 c6 8. Qd2
>>>>>Nbd7 9. d5 cxd5 10. Nxd5 Nxd5 11. Qxd5 Nb6 12. Qb5 Bh6 13. Bf2 Be6 14. Nc3 Qc7
>>>>>15. b3 Nd7 16. Qb4 a6 17. Rd1 Rfc8 18. Nd5 Bxd5 19. Rxd5 b5 20. a4 bxc4 21.
>>>>>Qxd6 Qb7 22. bxc4 Bf8 23. Qxd7 Qb4+ 24. Rd2 Rd8 25. Qxd8 Rxd8 26. Be3 Bc5 27.
>>>>>Bg5 Rd6 28. Ke2 Rxd2+ 0-1
>>>>>
>>>>>Michael
>>>>>
>>>>>>So thinking that its
>>>>>>successor, which was 100x faster, would lose badly to today's programs is simply
>>>>>>logic that I can't follow.  There is absolutely _no_ basis to make such a wild
>>>>>>leap of faith.
>>>>>>
>>>>>>For the record, Cray Blitz, in 1980, had a USCF rating of 2300.  Running exactly
>>>>>>one thousand nodes per second.  Be careful of what you write if you are not sure
>>>>>>of your facts.  In this case you are simply wrong.