Author: Eugene Nalimov
Date: 10:45:49 02/18/03
On February 18, 2003 at 13:20:38, Tom Kerrigan wrote:

>On February 16, 2003 at 00:04:34, Eugene Nalimov wrote:
>
>>On February 15, 2003 at 21:37:19, Tom Kerrigan wrote:
>>
>>>On February 13, 2003 at 15:46:26, Eugene Nalimov wrote:
>>>
>>>>Some time ago I wrote in the (lost) thread: please compare Itanium2 not with
>>>>P4/Athlon/etc., but with server CPUs. I.e. with 4-way Xeons, Power4, etc.
>>>
>>>I don't know how I got dragged into talking about x86 at all. Everybody seems to
>>>assume that I want to prove that the P4 is better than the I2 in all cases no
>>>matter what. The comparison I'm interested in is Opteron vs. I2...
>>>
>>>>benchmarks it resembles real-world server-like code a most: it's (relatively)
>>>>large, execution time not spent in several functions but heavily spread across
>>>>lot of functions, lot of loops across pointer chains, lot of calls,
>>>>unpredictable branches, etc.
>>>
>>>In other words, it's suited to chips with lots of cache or low latency memory
>>>and low branch mispredict penalties. Opteron addresses all of these issues.
>>
>>Why it should be better than Power4+?
>
>Higher clock speeds (if not initially, shortly thereafter) and lower branch
>mispredict penalties. (12 vs 17 stage pipeline.) Also, is the POWER4's memory
>controller on-die or just in-package? If it's not on-die, that's a big memory
>latency advantage for x86-64...

(1) Why should x86-64 have a higher clock speed? The 1,450MHz Power4+ has been shipping for several months already. The only possibility I see is that IBM will not clock Power4+ aggressively, mainly because RAS is more important for server CPUs than pure speed. But then I doubt that AMD would clock the server version of x86-64 aggressively, either.

(2) You yourself wrote "lots of cache or low latency memory". Power4+ has so much cache that I believe it would more than compensate for the lower memory latency of x86-64.

Returning to the original point: I still don't see why x86-64 should be better than Power4+, and on server-like code the current Itanium2 is faster (and, BTW, cheaper) than the higher-clocked Power4+. So probably for such code in-order execution is not much worse than OoO (as you suggested)? Or probably not worse at all?

Of course for now it's all waving hands in the air. We don't have enough data, because there are no aggressively designed in-order desktop CPUs. I pointed out that for server applications a good in-order design is not worse than a much more mature OoO design with a faster clock speed and 10-30x more cache, and we should ask: "does a chess program more closely resemble a server application or a desktop application?"

Let's wait till the x86-64 launch. But next-generation Itanium CPUs will be shipping by that moment, too :-)

Thanks,
Eugene

>-Tom
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.