Author: Robert Hyatt
Date: 09:23:32 09/28/02
On September 27, 2002 at 23:29:58, Anthony Cozzie wrote:

>Recently, I profiled my chess engine, and one function in particular stood out.
>The transposition probe function takes about 7% of the CPU time, or about 350
>cycles/call. All it does is access the transposition table, but the random
>nature of the accesses means that it usually misses in the cache AND the TLB,
>thus requiring 2 memory accesses at 100+ cycles each.
>
>In my engine, the search function generates the next move, makes the next move,
>checks if it is legal, checks if the opponent is in check, and recurses, so
>there are two calls to is_check() between when the transposition key is
>available and when the key is used. I tried inserting a prefetch instruction [I
>run an Athlon] with absolutely no effect. I even tried following the prefetch
>with a long loop to make SURE it would have enough time to access the memory,
>with no results. Lastly I tried a MOV instruction, also with no result. Am I
>just doing something wrong here?
>
>Has anyone else tried something similar with better results?

You are basically stuck in memory-latency land, and there is little you can do. I doubt it is a TLB issue, but that depends on whether your O/S uses 4KB or 4MB pages. If the TLB is getting crushed, though, that is more damaging than the cache issue, because each memory reference then takes three accesses: two to map the address and one to fetch the data.
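For anyone who wants to experiment with this, here is a minimal sketch in C of the arrangement described in the quoted post: issue a prefetch as soon as the hash key is known, do the other work, then probe. Nothing here comes from either engine. The entry layout and the names TTEntry, TTable, tt_init, tt_prefetch, tt_store, tt_probe and pages_spanned are all hypothetical, and __builtin_prefetch is a GCC/Clang builtin that is only a hint, so the hardware is free to ignore it. The pages_spanned helper just does the page-count arithmetic behind the 4KB versus 4MB remark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 16-byte entry; a real engine's layout will differ. */
typedef struct {
    uint64_t key;    /* full hash key, used to verify a hit */
    int16_t  score;
    uint16_t move;
    uint8_t  depth;
    uint8_t  flags;
    uint16_t pad;
} TTEntry;

typedef struct {
    TTEntry *entries;
    size_t   mask;   /* entry count - 1; the count must be a power of two */
} TTable;

/* Allocate a zeroed table. Returns 0 on success, -1 on allocation failure. */
int tt_init(TTable *tt, size_t count) {
    assert(count != 0 && (count & (count - 1)) == 0);
    tt->entries = calloc(count, sizeof(TTEntry));
    if (tt->entries == NULL) return -1;
    tt->mask = count - 1;
    return 0;
}

/* Map a hash key to its slot by masking off the low bits. */
static inline TTEntry *tt_slot(const TTable *tt, uint64_t key) {
    return &tt->entries[key & tt->mask];
}

/* Hint that the slot for this key will be read soon. Call it as early as
   the key is known, e.g. right after making the move. It is only a hint. */
void tt_prefetch(const TTable *tt, uint64_t key) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(tt_slot(tt, key), 0, 3);  /* read, keep in all cache levels */
#else
    (void)tt; (void)key;                         /* no portable equivalent */
#endif
}

/* Always-replace store, kept deliberately simple. */
void tt_store(TTable *tt, uint64_t key, int16_t score, uint8_t depth) {
    TTEntry *e = tt_slot(tt, key);
    e->key = key;
    e->score = score;
    e->depth = depth;
}

/* Return the entry if the stored key matches, else NULL.
   Note: key 0 would match an empty slot in this simplified sketch. */
const TTEntry *tt_probe(const TTable *tt, uint64_t key) {
    const TTEntry *e = tt_slot(tt, key);
    return e->key == key ? e : NULL;
}

/* Number of pages needed to cover a table of the given size (rounded up). */
size_t pages_spanned(size_t table_bytes, size_t page_bytes) {
    return (table_bytes + page_bytes - 1) / page_bytes;
}
```

In the search this would be used as tt_prefetch(&tt, key) right after the move is made, then the legality and check tests, then tt_probe(&tt, key). As for the page-size point: a 64MB table spans 16384 pages at 4KB but only 16 pages at 4MB, so with small pages a random probe is far more likely to need an address translation that is not in the TLB. Whether the prefetch helps at all on a given CPU is something only measurement will tell you.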
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.