Author: Simon Waters
Date: 16:21:20 10/30/00
On October 30, 2000 at 14:56:02, Simon Waters wrote:

>I'd actually changed gnuchess to use the same algorithm for PopCnt as Crafty
>before looking at the assembler - based on the excellent bit-twiddling paper from
>RICE - although it doesn't seem to have any performance benefit over the
>algorithms you've already donated to gnuchess (At least I seem to have gained
>about 9% but can't be sure where it is coming from - my poor linux box was
>floating around run level 2 trying to produce some more consistent numbers).

Found most of my 9% - and learnt a lesson in keeping code tight.

When I applied the new "lookup table free" nbits and leadz to the latest copy, I forgot to cut the lookup table declaration and initialisation from the code (I was only testing that it was faster 8-()

Even when the table isn't used anywhere (i.e. the lookup tables are not even initialized, just declared), GCC 2.95.2 allocates a great chunk of memory, and then presumably the CPU cache is cluttered with junk, or it has to access more pages, and so it goes slower.

Hmm, don't you just love assembler, oops I mean C....

I vaguely remember the Cray YMP family being the opposite: certain memory operations could be tuned to access diverse parts of solid state storage to stop the CPUs blocking on the (painfully slow *8-) Cray solid state memory devices.

Still, junk's junk - it has got to go. I wonder if the remaining missing few percent are down to the compiler not removing comments, or the few remaining unused variables.

Can anyone recommend a free lint, or has C moved on?

Any tips on optimising CPU cache performance, or is it all too architecture dependent?

Thanks for the help, all those that responded.

Simon
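For anyone wondering what "lookup table free" means here, the following is a minimal sketch only, not the actual gnuchess or Crafty code. It borrows the names nbits and leadz from the post, but the bodies are an assumption: a common shift-and-mask population count, plus a leading-zero count built on top of it. The uint64_t board type and the bitops_selfcheck helper are hypothetical, added just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Population count without a lookup table: sum bits in parallel
 * within 2-bit, 4-bit and 8-bit fields, then add the eight bytes
 * together with a multiply. (Assumed technique, not the gnuchess source.) */
static inline int nbits(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (int)((x * 0x0101010101010101ULL) >> 56);
}

/* Leading-zero count: copy the highest set bit into every lower
 * position, then count the ones. Returns 64 for an empty board. */
static inline int leadz(uint64_t x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x |= x >> 32;
    return 64 - nbits(x);
}

/* Hypothetical helper: a few sanity checks on both routines. */
static inline void bitops_selfcheck(void)
{
    assert(nbits(0) == 0);
    assert(nbits(0xFFULL) == 8);
    assert(leadz(0) == 64);
    assert(leadz(1) == 63);
    assert(leadz(1ULL << 63) == 0);
}
```

Neither routine touches memory beyond its argument, so there is no table left to declare, initialise, or forget to delete.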
Last modified: Thu, 15 Apr 21 08:11:13 -0700