Computer Chess Club Archives


Subject: Re: Some opteron results for Crafty

Author: Robert Hyatt

Date: 13:08:50 11/26/03


On November 26, 2003 at 15:36:58, Dann Corbit wrote:

>On November 26, 2003 at 15:25:10, Robert Hyatt wrote:
>
>>I have been working both with Eugene and AMD.  The following bench run is
>>on a quad 1.8GHz Opteron, 8 gigs of RAM.  The only "option" I have set is
>>"mt=4".  There is _no_ assembly code in this version, pure C only.  I am
>>looking at updating the asm to 64 bit but that will take some time and
>>studying.
>>
>>Meanwhile:
>>
>>Crafty v19.6 (1 cpus)
>>
>>White(1): mt=4
>>max threads set to 4
>>White(1): bench
>>Running benchmark. . .
>>......
>>Total nodes: 105863114
>>Raw nodes per second: 5881284
>>Total elapsed time: 18
>>SMP time-to-ply measurement: 35.555556
>>
>>This is using gcc, although I am not sure whether it is producing 64 bit
>>or 32 bit code at the moment.  However, 5.8M nps is not bad.  About 1M less
>>than Eugene's MSVC numbers.  I will look into the 64 bit stuff more to see if
>>gcc is producing real opteron assembly or not...  And I will study the
>>PGO options although the last time I tried them on GCC the compiler promptly
>>crashed. :)
>>
>>Note that the above is with default hash and everything, no endgame tables,
>>no opening book, etc...
>
>Could we see the numbers for 1,2,3 threads active also?
>I would be interested to see how it scales.


Sure.

one processor:

White(1): bench
Running benchmark. . .
......
Total nodes: 100409437
Raw nodes per second: 1498648
Total elapsed time: 67
SMP time-to-ply measurement: 9.552239


two processors:

max threads set to 2
White(1): bench
Running benchmark. . .
......
Total nodes: 99562452
Raw nodes per second: 3017044
Total elapsed time: 33
SMP time-to-ply measurement: 19.393939

three processors:

max threads set to 3
White(1): bench
Running benchmark. . .
......
Total nodes: 102543114
Raw nodes per second: 4458396
Total elapsed time: 23
SMP time-to-ply measurement: 27.826087

four processors:

max threads set to 4
White(1): bench
Running benchmark. . .
......
Total nodes: 102606915
Raw nodes per second: 5700384
Total elapsed time: 18
SMP time-to-ply measurement: 35.555556
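
As a rough way to read the scaling out of the four runs above, here is a small
sketch that computes speedup relative to the one-processor run. The figures are
copied from the bench output; the names RUNS and scaling are made up for this
illustration and are not part of Crafty. Elapsed time is only reported in whole
seconds, so the time-based speedup is coarse.

```python
# Benchmark figures copied from the four runs above:
# threads -> (total nodes, raw nodes per second, elapsed seconds)
RUNS = {
    1: (100409437, 1498648, 67),
    2: (99562452, 3017044, 33),
    3: (102543114, 4458396, 23),
    4: (102606915, 5700384, 18),
}


def scaling(runs):
    """Return {threads: (nps_speedup, time_speedup, nps_efficiency)}.

    All values are relative to the 1-thread run. nps_efficiency is the
    NPS speedup divided by the thread count (1.0 would be perfect scaling).
    """
    _, base_nps, base_time = runs[1]
    table = {}
    for threads, (_, nps, elapsed) in sorted(runs.items()):
        nps_speedup = nps / base_nps
        time_speedup = base_time / elapsed
        table[threads] = (nps_speedup, time_speedup, nps_speedup / threads)
    return table


if __name__ == "__main__":
    for threads, (nps_x, time_x, eff) in scaling(RUNS).items():
        print(f"{threads} threads: NPS x{nps_x:.2f}, "
              f"time x{time_x:.2f}, NPS efficiency {eff:.0%}")
```

On these numbers the raw NPS goes up by roughly 2.0x, 3.0x and 3.8x for two,
three and four threads, while the elapsed time improves by about 2.0x, 2.9x and
3.7x. Incidentally, every "SMP time-to-ply measurement" value printed above
equals 640 divided by the elapsed seconds (640/18 = 35.555556), so it carries
the same information as the elapsed time. That is an observation from the
printed output, not a statement about how Crafty computes it.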


Let me note here that this is not a very NUMA-aware implementation,
nowhere near as good as what we did (Eugene and I) for Windows.  I
am going to look at the Linux NUMA library tonight and work on getting
some of those features in, which should push performance up further.

This is way better than 19.4, but it is not "all there" yet.  Note also
that there is no assembly language of any kind in this version; it is pure
C.  I plan on rectifying that _soon_.  :)



