Computer Chess Club Archives


Subject: Cool docs

Author: Johan de Koning

Date: 00:31:41 08/26/03


On August 24, 2003 at 02:52:26, Gerd Isenberg wrote:

>On August 24, 2003 at 02:13:33, Johan de Koning wrote:
>
>>On August 23, 2003 at 09:40:45, Gerd Isenberg wrote:
>>
>>>On August 23, 2003 at 04:21:28, Johan de Koning wrote:
>>>
>>>>On August 23, 2003 at 03:45:09, Johan de Koning wrote:
>>>>
>>>>> ... 1 extra line in main() can
>>>>>easily change the runtime by 1 or 2% (for reasons I haven't fathomed yet).
>>>>
>>>>I mean: I do understand it depends on code alignment.
>>>>I can imagine the instruction pipeline feeds at very high speed from an "open"
>>>>cache line. I can also imagine it is rather complicated to have more than 1
>>>>cache line "open". But I can't imagine why I get random results.
>>>>
>>>>/**/ for( i = 0; i < top; i++ ) sum += i;
>>>>;;;; more: add, inc, cmp, jl more
>>>>
>>>>This loop usually executes in 2 cycles. But depending on the alignment I get
>>>>sometimes 2.667 or 4 or 4.5 cycles. Isn't that weird?!
>>>>
>>>>... Johan
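
One way to get per-iteration numbers for a loop like the one quoted above is a small C harness. This is only a sketch, not Johan's actual test setup: `sum_below`, `ns_per_iteration` and `cycles_per_iteration` are hypothetical names, and the clock frequency passed in is an assumption the caller must supply. The `volatile` accumulator keeps the compiler from folding the loop into a closed form, but it also forces memory traffic, so the generated code will not be the plain add/inc/cmp/jl sequence; reproducing that exactly would need hand-written assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* The loop under discussion: adds 0 + 1 + ... + (top - 1). */
static uint64_t sum_below(uint64_t top)
{
    volatile uint64_t sum = 0;  /* volatile: stop the compiler from removing the loop */
    for (uint64_t i = 0; i < top; i++)
        sum += i;
    return sum;
}

/* Average CPU time per loop iteration in nanoseconds, measured with clock(). */
static double ns_per_iteration(uint64_t top)
{
    assert(top > 0);
    clock_t t0 = clock();
    (void)sum_below(top);
    clock_t t1 = clock();
    return 1e9 * (double)(t1 - t0) / (double)CLOCKS_PER_SEC / (double)top;
}

/* Convert ns per iteration to cycles, given an ASSUMED core clock in GHz. */
static double cycles_per_iteration(double ns, double assumed_ghz)
{
    return ns * assumed_ghz;
}
```

With a large `top` (say 100 million) the timer resolution stops mattering, and `cycles_per_iteration(ns_per_iteration(top), ghz)` gives a rough figure to compare across builds with different code alignment.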
>>>
>>>Hi Johan,
>>>
>>>Ok, your loop body is about 10 bytes.
>>>If I look at the AMD Athlon™ Processor
>>>x86 Code Optimization Guide, it becomes clearer.
>>>I guess P4 is similar.
>>
>>Thanks, it does indeed make things clearer. Still not 100% though.
>>
>>In the meanwhile I did some thorough testing (using an assembler! :-).
>>It did reveal that 16-byte boundaries are critical, not cache lines.
>>Unfortunately I couldn't reproduce fractional cycles. Maybe I recall them from
>>cache latency tests. If I do manage to see them again I will report ASAP.
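
The 16-byte observation can be turned into a quick check of whether a given placement of the loop straddles a boundary. The following is a minimal sketch: the 16-byte window size comes from the tests described above and the 10-byte loop length from Gerd's estimate, while the helper names `windows_touched` and `straddles_boundary` are made up for illustration. It says nothing about how many cycles a straddle actually costs on any particular CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: code is fetched/decoded in aligned 16-byte windows. */
#define FETCH_WINDOW 16u

/* Number of aligned 16-byte windows touched by `len` bytes of code at `start`. */
static unsigned windows_touched(uintptr_t start, size_t len)
{
    assert(len > 0);
    uintptr_t first = start / FETCH_WINDOW;
    uintptr_t last  = (start + len - 1) / FETCH_WINDOW;
    return (unsigned)(last - first + 1);
}

/* 1 if the code span crosses at least one 16-byte boundary, else 0. */
static int straddles_boundary(uintptr_t start, size_t len)
{
    return windows_touched(start, len) > 1;
}
```

For a 10-byte loop, only start offsets 0 through 6 within a window keep the whole body inside one window; any offset from 7 to 15 makes it straddle, which is why one extra line elsewhere in the program can move the loop from one case to the other.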
>>
>>>Page 49
>>>
>>>4 Instruction Decoding Optimizations
>>>...
>>>
>>>Overview
>>
>>I do agree with Chris that adapting to a machine at (sub)cycle level is not the way
>>to develop a better algorithm. However, I'm currently contemplating a change of
>>data representation. I'd better be ready for the latest and the next generations
>>of CPU and it seems I have some catching up to do. So I find this mighty
>>interesting stuff.
>>
>><lazy mode>
>>Is this stuff available on-line?
>>Preferably in plain text format.
>></lazy mode>
>>
>>... Johan
>
>PDF - Adobe Acrobat Reader Required
>
>Athlon:
>
>AMD Athlon™ Processor x86 Code Optimization Guide:
>
>http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_739_2983,00.html
>
>
>Opteron:
>
>AMD64 Optimization Guide:
>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors
>
>http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_9044,00.html

Thanks!
That saved me a lot of browsing. And more importantly, sharing sources is an
excellent way to facilitate future discussions.

These docs make me feel all 1993 again.
Hence the need for me to do some catching up. :-)

... Johan



Last modified: Thu, 15 Apr 21 08:11:13 -0700
