Computer Chess Club Archives




Subject: Re: C and C++ --- NPS

Author: Matt Taylor

Date: 19:28:16 12/25/02

Go up one level in this thread

On December 25, 2002 at 17:08:27, Gerd Isenberg wrote:

>On December 25, 2002 at 14:13:47, Matt Taylor wrote:
>>On December 25, 2002 at 04:17:34, Dave Gomboc wrote:
>>>On December 23, 2002 at 21:33:50, Matt Taylor wrote:
>>>>On December 23, 2002 at 21:29:08, Arshad F. Syed wrote:
>>>>>Is there anyone here who has actually converted their C code to C++ while
>>>>>keeping the NPS the same or even managed to improve upon it? Sune did mention
>>>>>that he did so only at the cost of 15% degradation in speed. Why do people even
>>>>>bother wasting their time with C++ for a chess program when it inevitably leads
>>>>>to taking a hit in NPS? Regardless, of all the advice to simply inline the
>>>>>functions, I have yet to see a living example of chess program code which
>>>>>performed faster in C++.
>>>>The biggest hit you take in C++ is virtual functions, and inevitably people use
>>>>them. The virtual functions are very damaging due to the fact that they're
>>>>implemented with a function pointer, and at the machine level it translates into
>>>>an indirect call. The indirect calls branch mispredict because the CPU can't
>>>>figure out ahead-of-time where you're calling to, and branch mispredicts are
>>>Please excuse my ignorance, but where is the branch?  It's an indirect jump,
>>>isn't it?  There's nothing optional about it, it must be followed.  Is it that I
>>>am not familiar enough with the terminology?  e.g. I am thinking that by "branch
>>>prediction" you may be including any operation that moves the program counter in
>>>any way other than advancing sequentially to the next instruction.
>>>In any case, I'm confident that if you have a serious inheritance hierarchy (as
>>>opposed to a toy example) you're not going to lose any sleep over the virtual
>>>function call.  Many if-then-elses aren't efficient either.
>>A branch is anything that changes the flow of execution from the pc++ scheme. An
>>unconditional jump/call is still a branch.
>>The problem that superscalar processors have is that they're executing multiple
>>things simultaneously, and often to predict the target of a branch, you need to
>>have completed execution of instructions almost immediately prior.
>>Consider the example I gave in a previous post:
>>mov    esi, [ecx+SomeFuncPtr]
>>call   esi
>>call   esi
>>This is an indirect call. The machine loads the function pointer into the esi
>>register, and then the contents of that register are used as the target of the
>>unconditional branch. The problem is that the machine doesn't get around to
>>storing into esi until a cycle or so before the call, and the pipeline is some
>>20+ cycles deep (depending on processor). After it decodes the call instruction,
>>it needs to know where to go next, but it has to stall for those 20 cycles until
>>it knows what gets written to esi.
>>Static prediction assumes that it's not going to branch (in this case).
>>Obviously we know that it does, but rather than stall the pipeline for those 20+
>>cycles until they know WHERE it goes to, they choose simply to continue decode
>>at the wrong place. (If you make a small function call, this is actually
>>marginally advantageous over doing nothing since you've already got the return
>>address in the cache.)
>Hi Matt,
>I think i have it now, good explanation.
>But how about:
>mov    esi, [ecx+SomeFuncPtr]
>// getting and pushing several parameter
>call   esi
>Aren't the compiler aware to load the register as early as possible to avoid
>those stalls and mispredictions? I hope they are.
>Btw. is the branch prediction technique a kind of counter productive for
>hyperthreading, decoding instructions that are likely to be mispredicted, rather
>than giving another hyperthead a chance?
>>In the case of an if/then/else, some implementations will work on BOTH branches
>>so you don't lose as much. Either way, the if/then/else can predict right, and
>>static prediction can NEVER get it right the first time.

Presumably it won't even try to guess where the indirect call goes except by
using the BTB. This is the impression that I got from the manuals, and the
manuals are extremely vague about it.

You know, it is interesting that you point out Hyperthreading. HT could solve a
lot of branching issues simply by not worrying about them -- the same mentality
of Sparc, but a different answer. If you hit a branch and you're not sure where
it will go, you can work on something else until you can be sure. However, in
the here and now, I would presume that you are correct.


This page took 0.03 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.