Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Fritz's Tablebase Initialisation

Author: Marc Bourzutschky

Date: 17:01:48 05/05/04

Go up one level in this thread


On May 05, 2004 at 19:34:10, Robert Hyatt wrote:

>On May 05, 2004 at 18:53:11, Marc Bourzutschky wrote:
>
>>On May 05, 2004 at 17:53:50, Robert Hyatt wrote:
>>
>>>On May 05, 2004 at 17:06:49, Marc Bourzutschky wrote:
>>>
>>>>On May 05, 2004 at 16:35:10, Robert Hyatt wrote:
>>>>
>>>>>On May 05, 2004 at 16:29:52, Marc Bourzutschky wrote:
>>>>>
>>>>>>On May 05, 2004 at 15:36:23, Robert Hyatt wrote:
>>>>>>
>>>>>>>On May 05, 2004 at 14:41:45, Marc Bourzutschky wrote:
>>>>>>>
>>>>>>>>On May 05, 2004 at 13:51:12, Robert Hyatt wrote:
>>>>>>>>
>>>>>>>>>On May 05, 2004 at 13:25:18, Marc Bourzutschky wrote:
>>>>>>>>>
>>>>>>>>>>On May 05, 2004 at 11:55:07, Robert Hyatt wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 05, 2004 at 09:37:14, Marc Bourzutschky wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On May 05, 2004 at 09:14:50, Mike Hood wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>On May 05, 2004 at 08:12:44, Vincent Diepeveen wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>On May 05, 2004 at 07:47:57, Mike Hood wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>I just let Filemon run while loading Fritz 8 to see why it takes so long. I was
>>>>>>>>>>>>>>>shocked to see that during the initialisation Fritz tries to open every possible
>>>>>>>>>>>>>>>tablebase. For instance...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Open kpk.nbw.emd -- good, it's there
>>>>>>>>>>>>>>>Open kpknbw.emd -- file not found
>>>>>>>>>>>>>>>Open kpk_nbw.emd -- file not found
>>>>>>>>>>>>>>>Open kpk_nbw_emd -- file not found (I never knew this format was valid)
>>>>>>>>>>>>>>>Open kpk.nbw -- file not found
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>And the same five accesses for the nbb file.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Why carry on with the other three after finding the first tablebase? But it gets
>>>>>>>>>>>>>>>even wilder when it comes to the 6-piece tablebases. All 365 possible tablebase
>>>>>>>>>>>>>>>pairs in all possible formats are accessed, even though I don't have any on my
>>>>>>>>>>>>>>>disk. Thousands of "file not found" results. Just one example, to show how
>>>>>>>>>>>>>>>ludicrous it is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>First Fritz tries to open krbnkp.nbw.emd, krbnkpnbw.emd, krbnkp_nbw.emd and
>>>>>>>>>>>>>>>krbnkp.nbw.emd. Almost the same as before, except Fritz is assuming 6-piece
>>>>>>>>>>>>>>>tablebases are compressed. But then Fritz tries to open krbnkp.0.nbw.emd,
>>>>>>>>>>>>>>>krbnkp.0_nbw.emd, krbnkp.0nbw.emd and krbnkp.0_nbw_emd. Then krbnkp.1.nbw.emd,
>>>>>>>>>>>>>>>etc... and krbnkp.2.nbw.emd... and all the way through to krbnkp.g.nbw.emd. That
>>>>>>>>>>>>>>>means 136 disk accesses for a tablebase that I don't have! And that's only one
>>>>>>>>>>>>>>>tablebase out of 365.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Wouldn't it be much easier just to scan the tablebase directory and only open
>>>>>>>>>>>>>>>the files that actually exist?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Both nalimov and i do this in a similar way.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>If you are willing to write code for this that works faster and works both for
>>>>>>>>>>>>>>windows and *nix, then i will be real happy to use it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Best Regards,
>>>>>>>>>>>>>>Vincent
>>>>>>>>>>>>>
>>>>>>>>>>>>>Thanks for the info, Vincent. I assumed the initialization code had been written
>>>>>>>>>>>>>by Chessbase, not by Eugene.
>>>>>>>>>>>>>
>>>>>>>>>>>>>My math was a bit off in my original post, but after looking at Filemon's log I
>>>>>>>>>>>>>can give the exact figure: Fritz attempts to access 33647 non-existent tablebase
>>>>>>>>>>>>>files. And please... you can't tell me that if the file krbnkp.0.nbw.emd doesn't
>>>>>>>>>>>>>exist it still makes sense to look for krbnkp.1.nbw.emd, krbnkp.2.nbw.emd, etc
>>>>>>>>>>>>>all the way to krbnkp.g.nbw.emd. That's a waste of processor time on any
>>>>>>>>>>>>>operating system.
>>>>>>>>>>>>
>>>>>>>>>>>>As this is only done once during initialization it is not such a big deal.  IMHO
>>>>>>>>>>>>a more serious nuisance is that all available endgames on the paths are
>>>>>>>>>>>>initialized even though they may never be used.  As a fair amount of memory is
>>>>>>>>>>>>taken up by each endgame that is initialized this is a serious inefficiency.
>>>>>>>>>>>>I'm surprised that Fritz and Co. have not implemented a scheme where an endgame
>>>>>>>>>>>>is only initialized when it is actually required.
>>>>>>>>>>>>
>>>>>>>>>>>>-Marc
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Would you _really_ want to wait until you have a few seconds left, with no
>>>>>>>>>>>increment, then start opening files, malloc()'ing  buffers, setting up the
>>>>>>>>>>>decompression stuff?  Oops.  flag just fell.
>>>>>>>>>>>
>>>>>>>>>>>:)
>>>>>>>>>>
>>>>>>>>>>I actually do just that and it takes a small fraction of a second to initialize
>>>>>>>>>>one tablebase.  For me at least the likelyhood of this being an issue is
>>>>>>>>>>miniscule compared to the amount of memory I can save.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>What will you use it for?  And if memory is tight, that "fraction of a second"
>>>>>>>>>will grow as you might have to page out an inactive process to make room...
>>>>>>>>
>>>>>>>>For one, I can use the extra memory for hash tables and other things I might be
>>>>>>>>doing on my computer.
>>>>>>>
>>>>>>>We are not on the same page apparently.
>>>>>>>
>>>>>>>Your idea won't work.  Because inside the search, when you _do_ need to access a
>>>>>>>table, you need to malloc() memory for the decompression indices and read them
>>>>>>>in.  That takes time.  If you used all of memory for hashing you just introduced
>>>>>>>significant paging overhead that is going to slow you further.  If you want to
>>>>>>>dynamically reduce the size of the hash table inside the search, forget about
>>>>>>>it.  That adds so much overhead it isn't thinkable...
>>>>>>>
>>>>>>
>>>>>>I already have it working!  Of course, if you allocate all free memory to hash
>>>>>>tables initially you might run into the problem you describe.  But even that can
>>>>>>be easily fixed by maintaining an index cache, just like there is already a
>>>>>>cache for the tablebase values themselves.
>>>>>
>>>>>
>>>>>Either you use more memory for hash, as you said you would, and run into paging
>>>>>when you start probing tables, or you don't use more memory for hash, which
>>>>>means you "reserve" the necessary EGTB memory but don't use it until needed.
>>>>>When you do need it, you will lose games on time if they are short time
>>>>>controls.
>>>>>
>>>>
>>>>The point is that for practical purpose the number of tablebases actually needed
>>>>is way less than the total number possible.
>>>>
>>>>>One way or another, the current approach is the correct one.  Why do you think
>>>>>_everybody_ is doing it that way???
>>>>
>>>>First of all, the current approach is not incorrect, just memory inefficient.
>>>>Not a big deal if you only use 5-man tables, but a bit more of a deal once all
>>>>the 6-man tables are there.  Also not a big deal if you have lots of memory to
>>>>burn.  Second of all, _everybody_ is just Eugene Nalimov himself, since people
>>>>just copy his code.
>>>
>>>Not quite.  Steven Edwards wrote the first egtb code I used.  It did the same
>>>although there were no decompression indices since he didn't support on-the-fly
>>>decompression.  Bruce Moreland also did tables and I used his for a bit as well,
>>>and he also opened them up front to see what was present.
>>>
>>>
>>>>  I'm sure Eugene would agree that with a suitable index
>>>>cache one can eliminate loading all the tablebase stuff on startup, with a
>>>>miniscule chance of this leading to losing games on time.
>>>
>>>
>>>I don't see how.  I play 1 0 games regularly.  No time to start opening files
>>>and loading decompression indices with a second or two left total...
>>
>>But you are already opening files and reading from disk because the position
>>itself will not be in memory!
>
>Perhaps.  There is a cache in the EGTB code.  And a file-system cache on top of
>that...
>
>>  The only difference is that your _first_ access
>>of a tablebase will take twice as long,
>
>
>It might be much longer.  You have to malloc a bunch of memory.  Just set up
>something with 2 pawns and see what happens to how many tables get probed...
>starting with 8 pieces and 2-3 pawns on the board can produce _huge_ I/O rates.

But these I/O rates are multiple probes to the same ending(s).

With an index cache there will be exactly 0 malloc calls at run time, and one
malloc on initialization.  In addition, there will be almost no cache misses,
because I bet that even if you run 1 0 games continuously for one month much
fewer than 10% of the 6-man endings will ever get probed, so making the index
cache size 10% of the total index space will be sufficient.

>All in very fast time control games.  All will use memory that had better be
>free when it is needed.
>
>I am using < 300 megs of RAM for all the decompression indices...  I'd much
>rather malloc() and fill that at initialization time that deep in a game with
>little time left...
>

Once you have all the tablebases you'll probably need more like 500 meg for
indices.  As I said, if memory is not an issue there is no point in the extra
work.  But for me memory was very much an issue.




This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.