Author: blass uri
Date: 00:57:44 02/24/00
On February 23, 2000 at 23:50:44, Andrew Dados wrote:

>On February 23, 2000 at 17:49:20, Bruce Moreland wrote:
>
>>On February 23, 2000 at 15:01:14, Bertil Eklund wrote:
>>
>>>On February 23, 2000 at 12:33:50, Bruce Moreland wrote:
>>>
>>>>On February 23, 2000 at 11:08:43, blass uri wrote:
>>>>
>>>>>shredder2 was not tested on the fast hardware because the ssdf always use fast
>>>>>hardware for new programs and old hardware for old programs.
>>>>
>>>>Has anyone considered that this might be a major source of error, perhaps rating
>>>>inflation?
>>>
>>>Why? Any suggestions of what to do instead. Do you think humans should refuse to
>>>play opponents rated 200 elo higher or lower.
>>
>>There is a very major assumption buried in the Swedish list, the assumption that
>>these ratings have some correlation with ratings on the human list.
>>
>>The very best way to make the ratings correlate with the human list would be to
>>have the programs play against a variety of humans.
>>
>>Instead the games are played exclusively between machines.
>>
>>If you speed up a program's hardware, the program will become stronger against
>>other computers, you have ample evidence of this. But it is not a foregone
>>conclusion that the program will become the same amount stronger against humans.
>>
>>The programs don't differ that much from each other, and it is possible that
>>when you increase hardware, you allow the faster player to superset the slower
>>one. It sees the same stuff, just better and faster. What is the result of
>>this? I don't know, but it is possible that it is more extreme than should be
>>expected.
>
>I can see some evidence supporting your point.
> When playing new version of my proggy versus old one, I usually run
>'nunn-style' match over set of 20-40 positions (which gives 40-80 games). Then
>newer version makes use of new implemented knowledge which older does not have.
>Last match like that ended in score suggesting about 150 rating points strength
>difference between versions. Yet my programs ICC average rating didn't move
>much if at all...
>
>-Andrew-

I suspect that the Nunn match inflates the difference in rating.

Did you try a similar match from fixed positions taken from practical games of
your program (after 10 moves)?

Uri
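
For readers following the numbers: a figure like "about 150 rating points" is usually obtained by converting a match score into a rating difference with the standard logistic Elo formula. The sketch below assumes that formula. The function names and the 56/80 score are hypothetical, chosen only for illustration, since the actual match score is not given in the thread.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for the side rated elo_diff points higher,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score: float) -> float:
    """Invert the logistic Elo model: rating difference implied by a score.

    score is the fraction of points won (wins + draws / 2, divided by games)
    and must lie strictly between 0 and 1.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Hypothetical match: 80 games, new version scores 56 points.
    games = 80
    points = 56.0
    diff = elo_diff_from_score(points / games)
    print(f"score {points / games:.3f} -> about {diff:.0f} Elo")
```

Under this model a 150 point gap corresponds to an expected score of roughly 70%, so a match result near that level is what would produce such an estimate. The conversion says nothing about whether the starting positions are representative of practical play, which is the question raised above.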