Author: Robert Hyatt
Date: 16:46:00 06/10/98
On June 10, 1998 at 16:56:58, Eugene Nalimov wrote:

>Sorry Bob,
>
>But if you'll look in the other Intel manual - AP-526,
>"Optimization for Intel's 32-Bit Processors" (available
>for download from Intel web site), section 2.4.4 (Branch
>Target Buffer for Pentium Pro Processor), you can read
>
>"The penalty for mispredicted branches is at least 9
>cycles (the length of the In-Order Issue Pipeline) of
>lost instruction fetch, plus additional time spent
>waiting for the mispredicted branch instruction to
>become the oldest instruction in the machine and retire.
>This penalty is non-deterministic, dependant upon
>execution circumstances, but experimentation shows that
>this is nominally a total of 10-15 cycles".
>
>Also, if you remember, when Crafty still used assembly
>code, and I removed branches in several routines
>(including, BTW, FirstOne), Crafty became 1.5% faster.
>Code size decreased very slightly, there was still cost
>of function call (not at execution time - it greatly
>complicated optimizer/register allocation work, with
>only 7 registers available), so I think that main
>speedup came from removing of mispredicted branches.
>
>Eugene
>

No argument from me there, because your comments don't affect what I told Vincent... branches will not slow his program 400%, ever. 25% was the absolute upper limit I would predict, assuming his branches are so evenly split between taken and not-taken that prediction fails every time. 9 clocks is less than a cache miss...

The 1.5% I remember.. but remember, that is nowhere near the 400% speedup Vincent mentioned, "if branch prediction was better". That won't ever happen...
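For anyone following the thread who has not seen what "removing branches" looks like in practice, here is a minimal sketch in C. It is not Crafty's code and not Eugene's FirstOne change; the function names max_branchy, max_branchless and self_check are made up for illustration. It only shows the general idea of trading a conditional jump, which can be mispredicted, for a few arithmetic instructions. Whether this is faster depends on the compiler and the CPU, since many compilers already turn the branchy form into a conditional move.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: straightforward version. A compiler may emit a
   conditional jump here, which the CPU has to predict. */
static int32_t max_branchy(int32_t a, int32_t b) {
    if (a > b) {
        return a;
    }
    return b;
}

/* Hypothetical example: branch-free version. The comparison yields 1 or 0;
   negating it gives an all-ones or all-zeros mask (two's complement
   assumed), which then selects a or b without a jump. */
static int32_t max_branchless(int32_t a, int32_t b) {
    int32_t mask = -(int32_t)(a > b);   /* -1 if a > b, else 0 */
    return (a & mask) | (b & ~mask);
}

/* Checks that both versions agree on a handful of values, including the
   extremes of the 32-bit range. */
static void self_check(void) {
    const int32_t v[] = { INT32_MIN, -9, -5, 0, 4, 7, INT32_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            assert(max_branchy(v[i], v[j]) == max_branchless(v[i], v[j]));
        }
    }
}
```

Calling self_check() once at startup confirms the two versions return the same result. Any speed difference would have to be measured on the target machine, in the same spirit as the 1.5% figure quoted above.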
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.