Author: Gerd Isenberg
Date: 10:04:58 10/31/03
On October 31, 2003 at 11:43:28, Charles Roberson wrote:

>On October 31, 2003 at 11:30:13, Robert Hyatt wrote:
>
>>On October 31, 2003 at 11:08:43, Charles Roberson wrote:
>>
>>>>> Algebra allows factoring out the 100/largest and creating a loop invariant
>>>>> constant. However, largest can be very big and we are using integer
>>>>> arithmetic. Thus, a large largest can make 100/largest = 0.
>>>>
>>>>
>>>>make 100/largest a float. let number stay an integer. You end up with
>>>>an integer result that will be what you expect.
>>>
>>> A good suggestion, but I was trying to avoid floating point arithmetic.
>>> Are you suggesting that FP multiplication is faster than my stated
>>> algorithm?
>>
>>Not particularly. I was just stating that FP is not that bad on today's
>>hardware. The FP multiply will be done in parallel with other loop work,
>>so it might not cost a thing, since the FP unit is completely separate.
>>
>
> I thought about superscalar parallelization. My thinking was it doesn't
> help. My thoughts are:
>
>   for ()
>   {  FP mult
>      FP Store (may not happen)
>      FP convert to int
>      Int Store
>   }
> Since the Int Store is (Write after read) dependent on all the FP work,
> where is the parallel processing?

Some implicit loop unrolling. For example, the FP multiply, convert and store
may be executed out of order with the loop-counter update, the check of the
repeat condition and the prediction of the branch inside the loop, so the
processor can already start the next multiply.

For explicit (partial) loop unrolling, e.g. with AMD64's SSE floating-point
arithmetic (up to four 32-bit floats in one 128-bit XMM register), have a
look at:

Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors
Page 177, 7.2 Loop Unrolling

Gerd
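To make the discussion concrete, here is a minimal sketch in plain C of the two variants, assuming the loop in question computes number * 100 / largest for every element of an array. The function names scale_int and scale_fp_unrolled, the array parameters, and the choice of double for the factor are my own assumptions for illustration and do not come from the thread. The unrolling is done by hand in scalar code, with no SSE intrinsics; whether a compiler maps it to XMM registers depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>

/* Integer-only baseline (hypothetical name): multiply before dividing,
   so the result is not lost the way number * (100 / largest) would be
   when largest > 100 and 100 / largest truncates to 0. */
void scale_int(const int *in, int *out, size_t n, int largest)
{
    assert(largest > 0);
    for (size_t i = 0; i < n; ++i)
        out[i] = (int)(((long long)in[i] * 100) / largest);
}

/* FP variant (hypothetical name): 100/largest is hoisted out of the loop
   as a floating-point constant, and the body is unrolled by four.
   The four statements are independent of each other, so nothing forces
   one multiply/convert/store chain to wait for another. */
void scale_fp_unrolled(const int *in, int *out, size_t n, int largest)
{
    assert(largest > 0);
    const double factor = 100.0 / (double)largest; /* loop invariant */
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* The (int) cast truncates toward zero. */
        out[i]     = (int)(in[i]     * factor);
        out[i + 1] = (int)(in[i + 1] * factor);
        out[i + 2] = (int)(in[i + 2] * factor);
        out[i + 3] = (int)(in[i + 3] * factor);
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i)
        out[i] = (int)(in[i] * factor);
}
```

One caveat: when 100/largest is not exactly representable, the truncated FP result can differ by one from the integer result for some inputs, so the two functions should not be assumed interchangeable without checking the value ranges that actually occur.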
Last modified: Thu, 15 Apr 21 08:11:13 -0700