Author: Brian Richardson
Date: 11:58:59 08/20/00
On August 17, 2000 at 23:54:33, Eugene Nalimov wrote:

>(1) I already posted that a shift by a variable amount has a huge latency on
>the Itanium, so there would be no win -- at least no win in terms of clock
>cycles (yes, the function would probably be smaller, but not faster). And you'd
>need assembly for that -- the code I posted is 100% C.
>
>(2) I wrote the "original" x86 code (x86.s); after that somebody converted it
>from Linux-on-x86-asm to everybody-else-on-x86-asm and moved it into vcinline.h.
>Also, FirstOne()/LastOne() were rewritten, as P6/PII/PIII have fast BSR/BSF
>instructions, so it's beneficial to use them (on the original Pentium they were
>terribly slow).
>
>Eugene
>
>On August 17, 2000 at 17:23:45, Brian Richardson wrote:
>
>>Eugene: First, thank you for your work on EGTBs. Second, thanks for looking
>>into the IA-64 coding issues. Had you considered the IA-64 instruction that
>>finds the first non-zero byte in operands of various sizes, and then using that
>>to index an 8-bit array? Perhaps it is too slow vs. your method, which may exploit
>>more parallelism.
>>
>>Brian
>>
>>PS Did you write vcinline.h for Crafty?

Thank you for your reply. I was just wondering if you had considered the Compute Zero Index (CZX) instruction to find the first non-zero byte, and then using that byte to index a lookup array preset with first/last bits...

Brian
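
To make the "find the byte, then look it up" idea concrete, here is a rough sketch in portable C. It does not use CZX or any other IA-64 instruction: an ordinary loop stands in for the byte search, and the names (first_bit, last_bit, init_bit_tables, lsb_index, msb_index) are made up for illustration and are not taken from Crafty or vcinline.h. Bit 0 is assumed to be the least significant bit, which may not match the numbering FirstOne()/LastOne() use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-entry tables: for each non-zero byte value, the index
   (0..7) of its lowest and highest set bit. Entry 0 is never consulted. */
static unsigned char first_bit[256];
static unsigned char last_bit[256];

/* Fill both tables once at startup. */
void init_bit_tables(void) {
    for (int v = 1; v < 256; v++) {
        int lo = 0, hi = 7;
        while (!((v >> lo) & 1)) lo++;   /* lowest set bit of v  */
        while (!((v >> hi) & 1)) hi--;   /* highest set bit of v */
        first_bit[v] = (unsigned char)lo;
        last_bit[v]  = (unsigned char)hi;
    }
}

/* Index of the least significant set bit (bit 0 = LSB). bb must be non-zero. */
int lsb_index(uint64_t bb) {
    assert(bb != 0);
    int byte = 0;
    /* Plain loop standing in for a hardware "find the interesting byte" step. */
    while (((bb >> (8 * byte)) & 0xFF) == 0) byte++;
    return 8 * byte + first_bit[(bb >> (8 * byte)) & 0xFF];
}

/* Index of the most significant set bit (bit 0 = LSB). bb must be non-zero. */
int msb_index(uint64_t bb) {
    assert(bb != 0);
    int byte = 7;
    while (((bb >> (8 * byte)) & 0xFF) == 0) byte--;
    return 8 * byte + last_bit[(bb >> (8 * byte)) & 0xFF];
}
```

After calling init_bit_tables() once, lsb_index(bb) and msb_index(bb) can be called on any non-zero 64-bit value. Note that extracting the selected byte needs a shift by a variable amount, which is exactly the operation Eugene says has a huge latency on the Itanium, so this only shows the shape of the idea and says nothing about whether it would be faster.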
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.