Computer Chess Club Archives




Subject: Re: C and C++ --- NPS

Author: Robert Hyatt

Date: 21:54:57 12/25/02

Go up one level in this thread

On December 25, 2002 at 17:08:27, Gerd Isenberg wrote:

>On December 25, 2002 at 14:13:47, Matt Taylor wrote:
>>On December 25, 2002 at 04:17:34, Dave Gomboc wrote:
>>>On December 23, 2002 at 21:33:50, Matt Taylor wrote:
>>>>On December 23, 2002 at 21:29:08, Arshad F. Syed wrote:
>>>>>Is there anyone here who has actually converted their C code to C++ while
>>>>>keeping the NPS the same or even managed to improve upon it? Sune did mention
>>>>>that he did so only at the cost of 15% degradation in speed. Why do people even
>>>>>bother wasting their time with C++ for a chess program when it inevitably leads
>>>>>to taking a hit in NPS? Regardless, of all the advice to simply inline the
>>>>>functions, I have yet to see a living example of chess program code which
>>>>>performed faster in C++.
>>>>The biggest hit you take in C++ is virtual functions, and inevitably people use
>>>>them. The virtual functions are very damaging due to the fact that they're
>>>>implemented with a function pointer, and at the machine level it translates into
>>>>an indirect call. The indirect calls branch mispredict because the CPU can't
>>>>figure out ahead-of-time where you're calling to, and branch mispredicts are
>>>Please excuse my ignorance, but where is the branch?  It's an indirect jump,
>>>isn't it?  There's nothing optional about it, it must be followed.  Is it that I
>>>am not familiar enough with the terminology?  e.g. I am thinking that by "branch
>>>prediction" you may be including any operation that moves the program counter in
>>>any way other than advancing sequentially to the next instruction.
>>>In any case, I'm confident that if you have a serious inheritance hierarchy (as
>>>opposed to a toy example) you're not going to lose any sleep over the virtual
>>>function call.  Many if-then-elses aren't efficient either.
>>A branch is anything that changes the flow of execution from the pc++ scheme. An
>>unconditional jump/call is still a branch.
>>The problem that superscalar processors have is that they're executing multiple
>>things simultaneously, and often to predict the target of a branch, you need to
>>have completed execution of instructions almost immediately prior.
>>Consider the example I gave in a previous post:
>>mov    esi, [ecx+SomeFuncPtr]
>>call   esi
>>call   esi
>>This is an indirect call. The machine loads the function pointer into the esi
>>register, and then the contents of that register are used as the target of the
>>unconditional branch. The problem is that the machine doesn't get around to
>>storing into esi until a cycle or so before the call, and the pipeline is some
>>20+ cycles deep (depending on processor). After it decodes the call instruction,
>>it needs to know where to go next, but it has to stall for those 20 cycles until
>>it knows what gets written to esi.
>>Static prediction assumes that it's not going to branch (in this case).
>>Obviously we know that it does, but rather than stall the pipeline for those 20+
>>cycles until they know WHERE it goes to, they choose simply to continue decode
>>at the wrong place. (If you make a small function call, this is actually
>>marginally advantageous over doing nothing since you've already got the return
>>address in the cache.)
>Hi Matt,
>I think i have it now, good explanation.
>But how about:
>mov    esi, [ecx+SomeFuncPtr]
>// getting and pushing several parameter
>call   esi
>Aren't the compiler aware to load the register as early as possible to avoid
>those stalls and mispredictions? I hope they are.

Any good compiler will do that.  It is called "instruction lifting" and
the optimizer has a pass called "the instruction scheduler" that is responsible
for doing this...

>Btw. is the branch prediction technique a kind of counter productive for
>hyperthreading, decoding instructions that are likely to be mispredicted, rather
>than giving another hyperthead a chance?

In a way, perhaps.  But only if the two threads are somehow interfering with
each other.

>>In the case of an if/then/else, some implementations will work on BOTH branches
>>so you don't lose as much. Either way, the if/then/else can predict right, and
>>static prediction can NEVER get it right the first time.

This page took 0.01 seconds to execute

Last modified: Thu, 07 Jul 11 08:48:38 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.