Computer Chess Club Archives

Search

Terms
Messages

Subject: Re: Fritz's Tablebase Initialisation

Author: Robert Hyatt
Date: 18:14:44 05/05/04
On May 05, 2004 at 20:01:48, Marc Bourzutschky wrote:

>On May 05, 2004 at 19:34:10, Robert Hyatt wrote:
>
>>On May 05, 2004 at 18:53:11, Marc Bourzutschky wrote:
>>
>>>On May 05, 2004 at 17:53:50, Robert Hyatt wrote:
>>>
>>>>On May 05, 2004 at 17:06:49, Marc Bourzutschky wrote:
>>>>
>>>>>On May 05, 2004 at 16:35:10, Robert Hyatt wrote:
>>>>>
>>>>>>On May 05, 2004 at 16:29:52, Marc Bourzutschky wrote:
>>>>>>
>>>>>>>On May 05, 2004 at 15:36:23, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>On May 05, 2004 at 14:41:45, Marc Bourzutschky wrote:
>>>>>>>>
>>>>>>>>>On May 05, 2004 at 13:51:12, Robert Hyatt wrote:
>>>>>>>>>
>>>>>>>>>>On May 05, 2004 at 13:25:18, Marc Bourzutschky wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 05, 2004 at 11:55:07, Robert Hyatt wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On May 05, 2004 at 09:37:14, Marc Bourzutschky wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>On May 05, 2004 at 09:14:50, Mike Hood wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>On May 05, 2004 at 08:12:44, Vincent Diepeveen wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>On May 05, 2004 at 07:47:57, Mike Hood wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>I just let Filemon run while loading Fritz 8 to see why it takes so long. I was
>>>>>>>>>>>>>>>>shocked to see that during the initialisation Fritz tries to open every possible
>>>>>>>>>>>>>>>>tablebase. For instance...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Open kpk.nbw.emd -- good, it's there
>>>>>>>>>>>>>>>>Open kpknbw.emd -- file not found
>>>>>>>>>>>>>>>>Open kpk_nbw.emd -- file not found
>>>>>>>>>>>>>>>>Open kpk_nbw_emd -- file not found (I never knew this format was valid)
>>>>>>>>>>>>>>>>Open kpk.nbw -- file not found
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>And the same five accesses for the nbb file.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Why carry on with the other three after finding the first tablebase? But it gets
>>>>>>>>>>>>>>>>even wilder when it comes to the 6-piece tablebases. All 365 possible tablebase
>>>>>>>>>>>>>>>>pairs in all possible formats are accessed, even though I don't have any on my
>>>>>>>>>>>>>>>>disk. Thousands of "file not found" results. Just one example, to show how
>>>>>>>>>>>>>>>>ludicrous it is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>First Fritz tries to open krbnkp.nbw.emd, krbnkpnbw.emd, krbnkp_nbw.emd and
>>>>>>>>>>>>>>>>krbnkp.nbw.emd. Almost the same as before, except Fritz is assuming 6-piece
>>>>>>>>>>>>>>>>tablebases are compressed. But then Fritz tries to open krbnkp.0.nbw.emd,
>>>>>>>>>>>>>>>>krbnkp.0_nbw.emd, krbnkp.0nbw.emd and krbnkp.0_nbw_emd. Then krbnkp.1.nbw.emd,
>>>>>>>>>>>>>>>>etc... and krbnkp.2.nbw.emd... and all the way through to krbnkp.g.nbw.emd. That
>>>>>>>>>>>>>>>>means 136 disk accesses for a tablebase that I don't have! And that's only one
>>>>>>>>>>>>>>>>tablebase out of 365.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Wouldn't it be much easier just to scan the tablebase directory and only open
>>>>>>>>>>>>>>>>the files that actually exist?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Both nalimov and i do this in a similar way.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>If you are willing to write code for this that works faster and works both for
>>>>>>>>>>>>>>>windows and *nix, then i will be real happy to use it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Best Regards,
>>>>>>>>>>>>>>>Vincent
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Thanks for the info, Vincent. I assumed the initialization code had been written
>>>>>>>>>>>>>>by Chessbase, not by Eugene.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>My math was a bit off in my original post, but after looking at Filemon's log I
>>>>>>>>>>>>>>can give the exact figure: Fritz attempts to access 33647 non-existent tablebase
>>>>>>>>>>>>>>files. And please... you can't tell me that if the file krbnkp.0.nbw.emd doesn't
>>>>>>>>>>>>>>exist it still makes sense to look for krbnkp.1.nbw.emd, krbnkp.2.nbw.emd, etc
>>>>>>>>>>>>>>all the way to krbnkp.g.nbw.emd. That's a waste of processor time on any
>>>>>>>>>>>>>>operating system.
>>>>>>>>>>>>>
>>>>>>>>>>>>>As this is only done once during initialization it is not such a big deal.  IMHO
>>>>>>>>>>>>>a more serious nuisance is that all available endgames on the paths are
>>>>>>>>>>>>>initialized even though they may never be used.  As a fair amount of memory is
>>>>>>>>>>>>>taken up by each endgame that is initialized this is a serious inefficiency.
>>>>>>>>>>>>>I'm surprised that Fritz and Co. have not implemented a scheme where an endgame
>>>>>>>>>>>>>is only initialized when it is actually required.
>>>>>>>>>>>>>
>>>>>>>>>>>>>-Marc
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Would you _really_ want to wait until you have a few seconds left, with no
>>>>>>>>>>>>increment, then start opening files, malloc()'ing  buffers, setting up the
>>>>>>>>>>>>decompression stuff?  Oops.  flag just fell.
>>>>>>>>>>>>
>>>>>>>>>>>>:)
>>>>>>>>>>>
>>>>>>>>>>>I actually do just that and it takes a small fraction of a second to initialize
>>>>>>>>>>>one tablebase.  For me at least the likelyhood of this being an issue is
>>>>>>>>>>>miniscule compared to the amount of memory I can save.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>What will you use it for?  And if memory is tight, that "fraction of a second"
>>>>>>>>>>will grow as you might have to page out an inactive process to make room...
>>>>>>>>>
>>>>>>>>>For one, I can use the extra memory for hash tables and other things I might be
>>>>>>>>>doing on my computer.
>>>>>>>>
>>>>>>>>We are not on the same page apparently.
>>>>>>>>
>>>>>>>>Your idea won't work.  Because inside the search, when you _do_ need to access a
>>>>>>>>table, you need to malloc() memory for the decompression indices and read them
>>>>>>>>in.  That takes time.  If you used all of memory for hashing you just introduced
>>>>>>>>significant paging overhead that is going to slow you further.  If you want to
>>>>>>>>dynamically reduce the size of the hash table inside the search, forget about
>>>>>>>>it.  That adds so much overhead it isn't thinkable...
>>>>>>>>
>>>>>>>
>>>>>>>I already have it working!  Of course, if you allocate all free memory to hash
>>>>>>>tables initially you might run into the problem you describe.  But even that can
>>>>>>>be easily fixed by maintaining an index cache, just like there is already a
>>>>>>>cache for the tablebase values themselves.
>>>>>>
>>>>>>
>>>>>>Either you use more memory for hash, as you said you would, and run into paging
>>>>>>when you start probing tables, or you don't use more memory for hash, which
>>>>>>means you "reserve" the necessary EGTB memory but don't use it until needed.
>>>>>>When you do need it, you will lose games on time if they are short time
>>>>>>controls.
>>>>>>
>>>>>
>>>>>The point is that for practical purpose the number of tablebases actually needed
>>>>>is way less than the total number possible.
>>>>>
>>>>>>One way or another, the current approach is the correct one.  Why do you think
>>>>>>_everybody_ is doing it that way???
>>>>>
>>>>>First of all, the current approach is not incorrect, just memory inefficient.
>>>>>Not a big deal if you only use 5-man tables, but a bit more of a deal once all
>>>>>the 6-man tables are there.  Also not a big deal if you have lots of memory to
>>>>>burn.  Second of all, _everybody_ is just Eugene Nalimov himself, since people
>>>>>just copy his code.
>>>>
>>>>Not quite.  Steven Edwards wrote the first egtb code I used.  It did the same
>>>>although there were no decompression indices since he didn't support on-the-fly
>>>>decompression.  Bruce Moreland also did tables and I used his for a bit as well,
>>>>and he also opened them up front to see what was present.
>>>>
>>>>
>>>>>  I'm sure Eugene would agree that with a suitable index
>>>>>cache one can eliminate loading all the tablebase stuff on startup, with a
>>>>>miniscule chance of this leading to losing games on time.
>>>>
>>>>
>>>>I don't see how.  I play 1 0 games regularly.  No time to start opening files
>>>>and loading decompression indices with a second or two left total...
>>>
>>>But you are already opening files and reading from disk because the position
>>>itself will not be in memory!
>>
>>Perhaps.  There is a cache in the EGTB code.  And a file-system cache on top of
>>that...
>>
>>>  The only difference is that your _first_ access
>>>of a tablebase will take twice as long,
>>
>>
>>It might be much longer.  You have to malloc a bunch of memory.  Just set up
>>something with 2 pawns and see what happens to how many tables get probed...
>>starting with 8 pieces and 2-3 pawns on the board can produce _huge_ I/O rates.
>
>But these I/O rates are multiple probes to the same ending(s).

No.  Notice I said two or three _pawns_.  That can cause you to hit on 50
different tables (or more).  All the promotion cases before and after trading
enough pieces to get to the tables...


>
>With an index cache there will be exactly 0 malloc calls at run time, and one
>malloc on initialization.  In addition, there will be almost no cache misses,
>because I bet that even if you run 1 0 games continuously for one month much
>fewer than 10% of the 6-man endings will ever get probed, so making the index
>cache size 10% of the total index space will be sufficient.



You keep one decompression index?  forget it.  Just pick any 6 piece ending
where there is one pawn.  Now you have just _doubled_ your I/O bottleneck as you
promote to 5 different pieces, load 5 different decompression indices, etc...

Add pawns and maybe 8-9 pieces total and see how many different tables get
probed in a single search.  I could supply data if you want...





>
>>All in very fast time control games.  All will use memory that had better be
>>free when it is needed.
>>
>>I am using < 300 megs of RAM for all the decompression indices...  I'd much
>>rather malloc() and fill that at initialization time that deep in a game with
>>little time left...
>>
>
>Once you have all the tablebases you'll probably need more like 500 meg for
>indices.  As I said, if memory is not an issue there is no point in the extra
>work.  But for me memory was very much an issue.

Then it would seem to me to be _very_ risky to try to use a machine that is not
up to the demand of the algorithm...  IE memory is tight, forget the 6-piece
endings in the first place.  512 megs of ram is under one hundred bucks today...
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.