Computer Chess Club Archives



Subject: Re: IA-64 vs OOOE (attn Taylor, Hyatt)

Author: Matt Taylor

Date: 15:44:37 02/13/03



On February 13, 2003 at 14:59:18, Tom Kerrigan wrote:

>Well, I guess I'm wrong about P6 cmov then. Somebody told me that it was broken
>down into uops that were the equivalent of doing a branch (i.e., one of the uops
>was a branch) because the decision to add cmov to the P6 came late, and the
>ability to do predicated (sorry, conditional) moves in the datapath would have
>been too big a change.
>
>Although so far it sounds like nobody has actually done a test to see if cmov is
>faster, slower, or equal to a branch. You have your timing numbers (apparently
>from a manual) and Bob sort of vaguely recalls running something on a Pentium
>3...
>
>-Tom

Uh, no. The manual says 2 u-ops for reg-reg and 3 u-ops for reg-mem. The figures
I posted were measured on my 350 MHz Pentium II, using the code that follows:

#include <stdio.h>

int main(void)
{
	int i, base, lat, thpt;

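	/* Run 16 passes; only the measurements from the final pass are printed. */
	for(i = 0; i < 16; i++)
	{
		/* Baseline: timing overhead with no cmov instructions.
		   volatile keeps the compiler from hoisting or dropping the asm. */
		asm volatile(
		    "cpuid\n\t"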
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "movl	%%eax, %%edi\n\t"
		    "movl	%%ecx, %%ecx\n\t"
		    "movl	%%edx, %%edx\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "subl	%%edi, %%eax"
		    : "=a" (base)
		    :
		    : "%ebx", "%ecx", "%edx", "%edi");

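		/* Throughput: 32 cmovs that all read %edi, cycling through
		   four destination registers. */
		asm volatile(
		    "cpuid\n\t"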
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "movl	%%eax, %%edi\n\t"
		    "movl	%%ecx, %%ecx\n\t"
		    "movl	%%edx, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cmovnz	%%edi, %%eax\n\t"
		    "cmovnz	%%edi, %%ebx\n\t"
		    "cmovnz	%%edi, %%ecx\n\t"
		    "cmovnz	%%edi, %%edx\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "subl	%%edi, %%eax"
		    : "=a" (thpt)
		    :
		    : "%ebx", "%ecx", "%edx", "%edi");

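		/* Latency: 32 cmovs in a single dependency chain; each one
		   reads the register written by the previous one. */
		asm volatile(
		    "cpuid\n\t"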
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "movl	%%eax, %%edi\n\t"
		    "movl	%%ecx, %%ecx\n\t"
		    "movl	%%edx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cmovnz	%%edx, %%eax\n\t"
		    "cmovnz	%%eax, %%ebx\n\t"
		    "cmovnz	%%ebx, %%ecx\n\t"
		    "cmovnz	%%ecx, %%edx\n\t"
		    "cpuid\n\t"
		    "rdtsc\n\t"
		    "subl	%%edi, %%eax"
		    : "=a" (lat)
		    :
		    : "%ebx", "%ecx", "%edx", "%edi");
	}

	printf("base: %d clocks\n", base);
	printf("throughput: %d\n", thpt - base);
	printf("latency: %d\n", lat - base);

	return 0;
}

Output:
base: 119 clocks
throughput: 62
latency: 65

-Matt




Last modified: Thu, 15 Apr 21 08:11:13 -0700
