Author: Bruce Moreland
Date: 01:16:03 03/05/00
On March 04, 2000 at 23:36:58, Eugene Nalimov wrote:

>On March 04, 2000 at 23:17:45, Robert Hyatt wrote:
>
>>On March 04, 2000 at 21:43:04, Tom Kerrigan wrote:
>>
>>>On March 04, 2000 at 20:27:38, Robert Hyatt wrote:
>>>
>>>>On March 04, 2000 at 15:48:16, Tom Kerrigan wrote:
>>>>
>>>>>On March 04, 2000 at 09:34:13, Robert Hyatt wrote:
>>>>>
>>>>>>>So it makes me wonder... if you made the Pentium's L2 cache as fast as the
>>>>>>>PII's, would it achieve parity again? Seems likely to me.
>>>>>>It would help... but without register renaming, it becomes difficult to feed
>>>>>>two pipes for long sequences of instructions. I think the p6 would still keep
>>>>>>a significant edge, but better cache would narrow the gap...
>>>>>
>>>>>Is there a section of Crafty that will run in 16k?
>>>>>
>>>>>You could do some comparisons with that.
>>>>>
>>>>>-Tom
>>>>
>>>>
>>>>not that I can think of. IE even the MakeMove() loop in perft requires a good
>>>>bit of data...
>>>
>>>In that case, I don't think it's possible to use Crafty to compare the processor
>>>cores. The TSCP benchmarks give much more accurate data in that regard.
>>>
>>>-Tom
>>
>>
>>Only for small programs. What about programs with larger cache footprints?
>>
>>IE I don't think _either_ TSCP or Crafty is the right benchmark. The _right_
>>benchmark is the program that is important to that buyer... For simple
>>programs, the old P5 core runs well if both pipes can be fed by the compiler.
>>Which means no register jams occur in the program. For more complex programs,
>>the renaming logic in the P6 avoids many register jams/spills and does much
>>better keeping both pipes filled.
>>
>>I am surprised any program is faster on a P5 than on a P6, equal clock speeds,
>>however.
>
>P6 doesn't like:
>(1) 16-bit code - loading of new value into segment register is *very* slow,
>(2) Playing with halves of the registers (e.g. when you are trying to use AL and
>AH simultaneously). When it sees the second instruction before the first one is
>retired it stalls the entire pipeline and restarts it, losing ~10 CPU clocks.
>
>Maybe more, that is just from my memory. But any of that would be sufficient for
>some programs to run slower on P6 than on P5.
>
>Eugene

At the WMCCC in Jakarta, Fritz was on a P5 and I heard that the reason was that it was faster on that than on the P6.

bruce
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.