Author: Matt Taylor
Date: 11:13:47 12/25/02
Go up one level in this thread
On December 25, 2002 at 04:17:34, Dave Gomboc wrote: >On December 23, 2002 at 21:33:50, Matt Taylor wrote: > >>On December 23, 2002 at 21:29:08, Arshad F. Syed wrote: >> >>>Is there anyone here who has actually converted their C code to C++ while >>>keeping the NPS the same or even managed to improve upon it? Sune did mention >>>that he did so only at the cost of 15% degradation in speed. Why do people even >>>bother wasting their time with C++ for a chess program when it inevitably leads >>>to taking a hit in NPS? Regardless, of all the advice to simply inline the >>>functions, I have yet to see a living example of chess program code which >>>performed faster in C++. >>> >>>Regards, >>>Arshad >> >>The biggest hit you take in C++ is virtual functions, and inevitably people use >>them. The virtual functions are very damaging due to the fact that they're >>implemented with a function pointer, and at the machine level it translates into >>an indirect call. The indirect calls branch mispredict because the CPU can't >>figure out ahead-of-time where you're calling to, and branch mispredicts are >>expensive. > >Please excuse my ignorance, but where is the branch? It's an indirect jump, >isn't it? There's nothing optional about it, it must be followed. Is it that I >am not familiar enough with the terminology? e.g. I am thinking that by "branch >prediction" you may be including any operation that moves the program counter in >any way other than advancing sequentially to the next instruction. > >In any case, I'm confident that if you have a serious inheritance hierarchy (as >opposed to a toy example) you're not going to lose any sleep over the virtual >function call. Many if-then-elses aren't efficient either. > >Dave A branch is anything that changes the flow of execution from the pc++ scheme. An unconditional jump/call is still a branch. The problem that superscalar processors have is that they're executing multiple things simultaneously, and often to predict the target of a branch, you need to have completed execution of instructions almost immediately prior. Consider the example I gave in a previous post: mov esi, [ecx+SomeFuncPtr] call esi call esi This is an indirect call. The machine loads the function pointer into the esi register, and then the contents of that register are used as the target of the unconditional branch. The problem is that the machine doesn't get around to storing into esi until a cycle or so before the call, and the pipeline is some 20+ cycles deep (depending on processor). After it decodes the call instruction, it needs to know where to go next, but it has to stall for those 20 cycles until it knows what gets written to esi. Static prediction assumes that it's not going to branch (in this case). Obviously we know that it does, but rather than stall the pipeline for those 20+ cycles until they know WHERE it goes to, they choose simply to continue decode at the wrong place. (If you make a small function call, this is actually marginally advantageous over doing nothing since you've already got the return address in the cache.) In the case of an if/then/else, some implementations will work on BOTH branches so you don't lose as much. Either way, the if/then/else can predict right, and static prediction can NEVER get it right the first time. -Matt
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.