Author: Matt Taylor
Date: 00:03:03 02/16/03
On February 15, 2003 at 21:28:39, Tom Kerrigan wrote:

>On February 13, 2003 at 19:40:45, Matt Taylor wrote:
>
>>You're not getting it. Logic on the processor for static branch prediction is
>>80% accurate because auxiliary information available to the compiler is thrown
>>out. Consider the following loop:
>>for(i = 0; i < 1000; i++)
>>  do_something();
>
>You're the one who's not getting it if you think processors have logic for
>static branch prediction (hint: processors do dynamic prediction) or if you
>think these are the kinds of branches that matter for execution or compilation.
>(Any branch prediction scheme would predict your branch with 99.9% accuracy.)

Doesn't matter what you call it. AMD seems to think my Athlon has static branch prediction. I'm not sure why you disagree.

No, this is not the typical scenario. I didn't feel it was necessary to list all sorts of loops and show how the compiler can predict them for the sake of argument. Intel C already does a lot of this, and they've only scratched the surface.

>>>Sure, you can avoid having an actual branch instruction. I'm asking you to think
>>>deeper. How does that make the processor go any faster?
>>
>>No branch mispredict = no penalty. Not always possible, but it works well for
>>short functions such as abs, min, and max. If it were not so, cmov would be a
>>near useless instruction.
>
>That's true, and I forgot about that reason, I guess because branches are only
>mispredicted 5% of the time. The reason why predication would be used more
>aggressively on an in-order chip (i.e., why it's a big deal on IA-64) is because
>it allows post-condition instructions to be issued without dependencies.

The cmov instruction still saves leagues. Dynamic prediction works very well on predictable branches. Not all branches are predictable.

>>>No, more like 12 results and in only one case does the Itanium 2 outperform the
>>>P4. And I think I've done a very good job explaining why Crafty runs faster on
>>>the I2 than the P4.
>>
>>The speed of gcc and perl are rather irrelevant to Chess, aren't they?
>
>They are if they better represent computer chess than Crafty does. I'd bet most
>chess programs out there don't use bitboards (i.e., 64 bit operations) or use
>bitboards less than Crafty. Bitboards are almost certainly the reason why Crafty
>performs well on I2 vs. the P4.
>
>-Tom

Perhaps it is, perhaps it isn't. Athlon is much more efficient with 64-bit operations than Pentium 4 is, and the Athlon isn't pulling ahead by huge strides (in Crafty).

-Matt
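To illustrate the point about short functions such as abs, min, and max, here is a minimal sketch in C of how they can be written without a conditional jump. The function names are hypothetical and not from this thread. The code selects the result with a bit mask instead of a cmov instruction; whether a given compiler turns this (or a plain ternary) into cmov, a mask sequence, or a branch is up to the compiler and target, so treat it as an illustration of the idea rather than a guaranteed result.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; names are not from the thread. */

/* mask is all ones when a < b and all zeros otherwise, so the
   expression picks a or b without a data-dependent jump. */
static int32_t min_branchless(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a < b);
    return (a & mask) | (b & ~mask);
}

/* Same selection trick with the comparison reversed. */
static int32_t max_branchless(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a > b);
    return (a & mask) | (b & ~mask);
}

/* For negative x, mask is -1: (x ^ -1) - (-1) equals ~x + 1, i.e. -x.
   For non-negative x, mask is 0 and x is returned unchanged.
   Assumption: x is not INT32_MIN, whose negation overflows. */
static int32_t abs_branchless(int32_t x)
{
    int32_t mask = -(int32_t)(x < 0);
    return (x ^ mask) - mask;
}
```

Because no branch is taken on the data, there is nothing to mispredict, which is the "no branch mispredict = no penalty" argument made above. The trade-off is that both inputs are always evaluated, so this only pays off for short operations like these.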
Last modified: Thu, 15 Apr 21 08:11:13 -0700