Computer Chess Club Archives



Subject: Re: copy cost

Author: Gerd Isenberg

Date: 23:52:26 08/23/03


On August 24, 2003 at 02:13:33, Johan de Koning wrote:

>On August 23, 2003 at 09:40:45, Gerd Isenberg wrote:
>
>>On August 23, 2003 at 04:21:28, Johan de Koning wrote:
>>
>>>On August 23, 2003 at 03:45:09, Johan de Koning wrote:
>>>
>>>> ... 1 extra line in main() can
>>>>easily change the runtime by 1 or 2% (for reasons I haven't fathomed yet).
>>>
>>>I mean: I do understand it depends on code alignment.
>>>I can imagine the instruction pipeline feeds at very high speed from an "open"
>>>cache line. I can also imagine it is rather complicated to have more than 1
>>>cache line "open". But I can't imagine why I get random results.
>>>
>>>/**/ for( i = 0; i < top; i++ ) sum += i;
>>>;;;; more: add, inc, cmp, jl more
>>>
>>>This loop usually executes in 2 cycles. But depending on the alignment I get
>>>sometimes 2.667 or 4 or 4.5 cycles. Isn't that weird?!
>>>
>>>... Johan
>>
>>Hi Johan,
>>
>>Ok, your loop body is about 10 bytes.
>>If I look at the AMD Athlon™ Processor
>>x86 Code Optimization Guide, it becomes clearer.
>>I guess P4 is similar.
>
>Thanks, it does indeed make things clearer. Still not 100% though.
>
>In the meanwhile I did some thorough testing (using an assembler! :-).
>It did reveal that 16-byte boundaries are critical, not cache lines.
>Unfortunately I couldn't reproduce fractional cycles. Maybe I recall them from
>cache latency tests. If I do manage to see them again I will report ASAP.
>
>>Page 49
>>
>>4 Instruction Decoding Optimizations
>>...
>>
>>Overview
>
>I do agree with Chris that adapting to a machine at the (sub)cycle level is not
>the way to develop a better algorithm. However, I'm currently contemplating a change of
>data representation. I'd better be ready for the latest and the next generations
>of CPU and it seems I have some catching up to do. So I find this mighty
>interesting stuff.
>
><lazy mode>
>Is this stuff available on-line?
>Preferably in plain text format.
></lazy mode>
>
>... Johan

PDF - Adobe Acrobat Reader Required

Athlon:

AMD Athlon™ Processor x86 Code Optimization Guide:

http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_739_2983,00.html


Opteron:

AMD64 Optimization Guide:
Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors

http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_9044,00.html

Gerd



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.