Computer Chess Club Archives




Subject: Re: CZX IA-64 Instruction Re: Will the Itanium have a BSF or BSR instruction

Author: Brian Richardson

Date: 12:02:30 08/20/00


On August 20, 2000 at 14:58:59, Brian Richardson wrote:

>On August 17, 2000 at 23:54:33, Eugene Nalimov wrote:
>>(1) I already posted that a shift by a variable amount has a huge latency on
>>the Itanium, so there would be no win -- at least no win in terms of clock
>>cycles (yes, the function would probably be smaller, but not faster). And you'll
>>need assembly for that -- the code I posted is 100% C.
>>(2) I wrote the "original" x86 code (x86.s), after that somebody converted it
>>from Linux-on-x86-asm to everybody-else-on-x86-asm and moved it into vcinline.h.
>>Also, FirstOne()/LastOne() were rewritten, as P6/PII/PIII have fast BSR/BSF
>>instructions, so it's beneficial to use them (on the original Pentium they were
>>terribly slow).
>>On August 17, 2000 at 17:23:45, Brian Richardson wrote:
>>>Eugene:  First thank you for your work on EGTBs.  Second, thanks for looking
>>>into the IA-64 coding issues.  Had you considered the IA-64 instruction that
>>>finds the first non-zero byte in operands of various sizes, and then using that
>>>to index an 8-bit array?  Perhaps it is too slow vs your method, which may exploit
>>>more parallelism.
>>>PS  Did you write vcinline.h for Crafty?
>Thank you for your reply.
>I was just wondering if you had considered the Compute Zero Index (CZX)
>instruction to find the first non-zero byte and then using that byte to index
>an array preset with first/last bit positions...

PS  I do not know if the CZX instruction incurs the same shift latency you were
referring to above, or perhaps a subsequent byte load using the result of the
CZX index does...
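
A minimal C sketch of the byte-then-table idea asked about above, for illustration only. The byte search is a plain loop standing in for whatever the hardware instruction would do, since CZX cannot be expressed in standard C. The names lsb_in_byte, init_lsb_table and first_one_bytewise are hypothetical, and the bit numbering (0 = least significant bit, 64 = empty board) is an assumption, not necessarily the convention FirstOne() uses in Crafty.

```c
#include <stdint.h>

/* Hypothetical lookup table: lsb_in_byte[b] is the index (0..7) of the
   lowest set bit in byte value b; entry 0 holds the sentinel 8. */
static unsigned char lsb_in_byte[256];

/* Fill the table once at startup. */
static void init_lsb_table(void)
{
    int b, bit;

    lsb_in_byte[0] = 8;                 /* no bit set */
    for (b = 1; b < 256; b++) {
        bit = 0;
        while (!(b & (1 << bit)))       /* b != 0, so this terminates */
            bit++;
        lsb_in_byte[b] = (unsigned char)bit;
    }
}

/* Assumed convention: returns the index of the lowest set bit
   (0 = least significant), or 64 if bb is zero. */
static int first_one_bytewise(uint64_t bb)
{
    int byte;

    /* Stand-in for the "find first non-zero byte" step. */
    for (byte = 0; byte < 8; byte++) {
        unsigned v = (unsigned)(bb >> (8 * byte)) & 0xFFu;
        if (v != 0)
            /* Dependent table load: the address depends on the byte found. */
            return 8 * byte + lsb_in_byte[v];
    }
    return 64;
}
```

Note that even in this form the byte extraction is a shift by a variable amount unless the loop is fully unrolled, and the table load depends on the extracted byte, so both costs raised in the PS above would still need to be measured on the actual hardware.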


Last modified: Thu, 07 Jul 11 08:48:38 -0700
