Author: Gerd Isenberg
Date: 23:52:26 08/23/03
On August 24, 2003 at 02:13:33, Johan de Koning wrote:

>On August 23, 2003 at 09:40:45, Gerd Isenberg wrote:
>
>>On August 23, 2003 at 04:21:28, Johan de Koning wrote:
>>
>>>On August 23, 2003 at 03:45:09, Johan de Koning wrote:
>>>
>>>> ... 1 extra line in main() can
>>>>easily change the runtime by 1 or 2% (for reasons I haven't fathomed yet).
>>>
>>>I mean: I do understand it depends on code alignment.
>>>I can imagine the instruction pipeline feeds at very high speed from an "open"
>>>cache line. I can also imagine it is rather complicated to have more than 1
>>>cache line "open". But I can't imagine why I get random results.
>>>
>>>/**/ for( i = 0; i < top; i++ ) sum += i;
>>>;;;; more: add, inc, cmp, jl more
>>>
>>>This loop usually executes in 2 cycles. But depending on the alignment I
>>>sometimes get 2.667 or 4 or 4.5 cycles. Isn't that weird?!
>>>
>>>... Johan
>>
>>Hi Johan,
>>
>>Ok, your loop body is about 10 bytes.
>>If I look at the AMD Athlon™ Processor
>>x86 Code Optimization Guide, it becomes clearer.
>>I guess P4 is similar.
>
>Thanks, it does indeed make things clearer. Still not 100% though.
>
>In the meanwhile I did some thorough testing (using an assembler! :-).
>It did reveal that 16-byte boundaries are critical, not cache lines.
>Unfortunately I couldn't reproduce fractional cycles. Maybe I recall them from
>cache latency tests. If I do manage to see them again I will report ASAP.
>
>>Page 49
>>
>>4 Instruction Decoding Optimizations
>>...
>>
>>Overview
>
>I do agree with Chris that adapting to a machine on a (sub)cycle level is not
>the way to develop a better algorithm. However, I'm currently contemplating a
>change of data representation. I'd better be ready for the latest and the next
>generations of CPU and it seems I have some catching up to do. So I find this
>mighty interesting stuff.
>
><lazy mode>
>Is this stuff available on-line?
>Preferably in plain text format.
></lazy mode>
>
>... Johan

PDF - Adobe Acrobat Reader Required

Athlon:
AMD Athlon™ Processor x86 Code Optimization Guide:
http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_739_2983,00.html

Opteron:
AMD64 Optimization Guide: Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors
http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_9044,00.html

Gerd
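
Johan's finding that 16-byte boundaries matter, combined with the estimate of a roughly 10-byte loop body, can be illustrated with a little address arithmetic. What follows is only a sketch under a simplifying assumption: that code is fetched in aligned 16-byte windows. The names `FETCH_WINDOW`, `windows_touched` and `padding_to_align` are made up for illustration and do not come from the thread or from the AMD guide. The sketch says nothing about actual cycle counts; it only shows when a block of code straddles a boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: code is fetched in aligned 16-byte windows. */
enum { FETCH_WINDOW = 16 };

/* Number of aligned 16-byte windows touched by a block of code that
   starts at byte address `start` and is `length` bytes long. */
size_t windows_touched(size_t start, size_t length)
{
    assert(length > 0);
    size_t first = start / FETCH_WINDOW;                /* window of first byte */
    size_t last  = (start + length - 1) / FETCH_WINDOW; /* window of last byte  */
    return last - first + 1;
}

/* Padding bytes needed to push `start` up to the next 16-byte boundary
   (0 if it is already aligned). */
size_t padding_to_align(size_t start)
{
    return (FETCH_WINDOW - start % FETCH_WINDOW) % FETCH_WINDOW;
}
```

Under this model, a 10-byte loop that starts at offset 0 through 6 within a window fits in a single window, while one starting at offset 7 through 15 spills into a second window. That is the kind of shift one extra line in main() could plausibly cause, though whether it explains the measured timings is not established here.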
Last modified: Thu, 15 Apr 21 08:11:13 -0700