Author: Matt Taylor
Date: 02:39:16 12/08/02
On December 07, 2002 at 23:40:38, Walter Faxon wrote:

>On December 07, 2002 at 14:14:32, Matt Taylor wrote:
>
>>On December 06, 2002 at 22:50:40, Walter Faxon wrote:
>>
>>>On December 06, 2002 at 05:33:42, Matt Taylor wrote:
>>>
>>><snip>
>>>>
>>>>It's also a tad strange that the code loads dl and then copies edx into eax.
>>>>It would be more direct to simply store the table value in eax.
>>>>
>>><snip>
>>>
>>>I get the feeling that, for the compiler in question at least, once it decides
>>>that a register is going to be used as an address or offset, it loses or ignores
>>>its knowledge of the register's bitwise mapping. It preps edx to receive the
>>>byte in dl, uses eax to load dl, then copies the whole thing to eax. If it used
>>>eax to write to al directly, the compiler would think it still needs to mask out
>>>the (already zeroed) rest of eax afterwards. So it does it this way because the
>>>reg-reg copy is faster and the edx prep can be overlapped with other work.
>>>Anyway, that's a possible explanation. One would need detailed knowledge of the
>>>compiler to know for sure. (And don't get mad at the compiler writers; writing
>>>good compilers is hard work!)
>>>
>>>-- Walter
>>
>>Yeah, but the x86 architecture has a movzx instruction for that very purpose.
>>AMD manuals actually advise that it is faster to use movzx than the equivalent
>>sequence...
>>
>>And yeah, I know compiler writing is very difficult. I have actually been
>>working on one for various reasons. The difference is that I have a human
>>optimizer. :-)
>>
>>Actually I was working on an optimizer that takes machine code and produces more
>>optimal machine code (which is what I will spit my compiler output through when
>>it's done).
>>
>>-Matt
>
>
>Man, it's clear I gotta get a PIV asm book (or the AMD equivalent). All I got
>is an old 486 manual and that says movzx is slower than mov, 3 clocks to 1.
>
>A machine code optimizer is terrifically general. Let us know when you're done;
>a lot of us will be interested.
>
>-- Walter

Yeah, movzx -used- to be slower. You won't find timing data past the Pentium. The most comprehensive timing reference I've ever seen was documented by a company called Quantasm. I can't find them on the web anymore and can only presume they've gone out of business. However, I managed to find those old docs and upload them to my website:

http://my.fit.edu/~mtaylor/opcode_i.html "Integer" ops (ALU/system)
http://my.fit.edu/~mtaylor/opcode_f.html FPU ops

I dug up links to the latest manuals in case you were interested. Personally, I prefer the Intel manuals as I find them easier to navigate. Intel Volume 2 has an exhaustive instruction listing. The AMD manuals contain certain subsets (system, general, FPU/3DNow, MMX/SSE) in different manuals.

Optimization references are a bit harder to come by. The only stuff I have is specific to the Pentium 2/3. I might also have some early Athlon optimization docs lying around somewhere...

AMD x86-64 manuals:
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_4699_875%5e7044,00.html?redir=CPX801

Intel P4 manuals:
http://developer.intel.com/design/pentium4/manuals/

FYI you can order a hard copy of the x86-64 manuals from AMD's website for free last I checked. Most of the application stuff is identical to current x86 architecture. The only differences I can think of offhand besides the 64-bitness are the REX encodings, RIP-relative addressing, and registers r8-r15 and xmm8-xmm15.

-Matt
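
For anyone who wants to see the register behaviour spelled out, here is a rough Python model of the sequences discussed in this thread. The actual compiler output was snipped, so the instruction sequences in the comments are my reconstruction from the description above, not the real listing. The function names and the table are made up for illustration.

```python
# Toy model of 32-bit register semantics for a byte table lookup.
# The instruction sequences in the comments are assumptions reconstructed
# from the thread's description; the original listing was snipped.

MASK32 = 0xFFFFFFFF

def load_via_edx(table, eax):
    """Model of: xor edx,edx / mov dl,[table+eax] / mov eax,edx"""
    edx = 0                                  # prep edx: clear all 32 bits
    edx = (edx & 0xFFFFFF00) | table[eax]    # write only the low byte (dl)
    return edx & MASK32                      # copy the whole register to eax

def load_via_movzx(table, eax):
    """Model of: movzx eax, byte ptr [table+eax]"""
    return table[eax] & 0xFF                 # zero-extend the byte to 32 bits

def load_al_only(table, eax):
    """Model of: mov al,[table+eax] with no masking afterwards"""
    return (eax & 0xFFFFFF00) | table[eax]   # bits 8-31 of eax are left as-is

# Hypothetical lookup table, larger than 256 entries so that an index
# can have nonzero upper bits.
TABLE = bytes((i * 7) & 0xFF for i in range(0x2000))

if __name__ == "__main__":
    for index in (0x12, 0xFF, 0x1234):
        print(hex(index),
              hex(load_via_edx(TABLE, index)),
              hex(load_via_movzx(TABLE, index)),
              hex(load_al_only(TABLE, index)))
```

The edx sequence and movzx always agree. The bare write to al only agrees when the upper 24 bits of eax are already zero, which is the case Walter describes.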
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.