Computer Chess Club Archives

Search

Terms

Messages

Subject: Parallelism on Single CPU machine?

Author: Roberto Waldteufel

Date: 03:25:26 12/26/98

On December 24, 1998 at 19:17:42, Dave Gomboc wrote:

>On December 24, 1998 at 10:17:41, Robert Hyatt wrote:
>
>>On December 24, 1998 at 06:07:17, Roberto Waldteufel wrote:
>>
>>>
>>>On December 24, 1998 at 01:40:45, Eugene Nalimov wrote:
>>>
>>>>The idea is interesting, but I doubt it can be used. For
>>>>example, I don't have FAT drive - all my drives are NTFS;
>>>>I also know persons who use HPFS at Windows NT. Also, there
>>>>are several FAT formats - e.g. FAT16 and FAT32. Possible
>>>>more to come when Windows 2000 will be released... I don't
>>>>want to compete with MS in supporting new disk formats.
>>>
>>>Aha - this is a pity. Still, how many different formats can there be? I suppose
>>>it might be possible to cater for some presently common FAT types, and resort to
>>>a default normal disc read if the program encounters a format that it did not
>>>recognise, which would allow for support of newer formats to be added later on,
>>>but it certainly makes the whole approach rather problematic. Perhaps as a
>>>safeguard the program could run a test at start-up time where it looks up data
>>>using both methods to ensure that it is reading the same data?
>>>
>>>>
>>>>I don't know what percentage of time is spend inside the
>>>>operating system on non-disk-reading tasks, but I think
>>>>that it's small enough. SQL Server was speed up by only
>>>>several percents when moving from the OS partitions to
>>>>the raw partitions, and I'd bet that yours speedup will
>>>>be even smaller.
>>>>
>>>>OS keeps parts of FAT (or NTFS disk allocation table)
>>>>in it's disk cache. You'll keep data on data location
>>>>directly in the chess program. This way you'll save several
>>>>hundred instructions, maybe several thousand, not more. After
>>>>that you'll wait disk drive (that is rotated with 4,500-
>>>>10,000RPM) for the data. That will take tents of thousand
>>>>instructions.
>>>>
>>>I had a feeling there were going to be more problems to face :-(
>>>However, I think that the disc waiting time might be made use of for a limited
>>>amount of parallelism in the search algorithm, ie while waiting for the
>>>tablebase data to become available from the disc, the program can perhaps be
>>>doing something useful like retracting the move that lead to the look-up
>>>position and generating/making the next move from the ply above. If we are
>>>talking time enough for tens of thousands of instructions, then relatively CPU
>>>intensive tasks might be performed while access takes place.
>>>
>>
>>
>>you can already do this under linux at least, and probably in windows.  In
>>linux, we call this "non-blocking I/O" where I can do a read when I want, but
>>my process isn't blocked at that point.  Later I can either ask for a signal to
>>be sent to my process when the I/O has been completed, or when I reach the point
>>where I need the data I can block there waiting on the I/O to finish.
>>
>>It doesn't help much.  The fastest disk I know of has a 5ms average access
>>time.  the xeon/400mhz processor executes a couple of instructions every 2.5
>>nanoseconds, which translates into 2.5 *million* instructions every millisecond,
>>or (in the case of my IBM 10K drives) 12.5 *million* instructions that I could
>>execute while waiting on that seek/read to complete.  I don't have that many
>>instructions to execute, unfortunately...  Most programs do maybe 2,000
>>instructions or so per node.  not 2 million...  so there is really nothing to
>>do while waiting on the I/O...
>
>Windows NT supports OVERLAPPED_IO (non-blocking), but Windows 95 and 98 do not.
>
>Dave Gomboc

I can't help myself from licking my chops at the thought of all those CPU
instructions. Suppose we use threads under Windows to simulate multiple CPU's on
a single CPU computer. I guess that entails some non-trivial overhead, but on
the plus side, when we hit a tablebase node, assuming we have non-blocked disk
access capability (ie we either have NT or else have a VxD to provide such a
facility under Win95 or Win98), we could use the Windows Sleep API call to
simply give a time slice back to Windows while the physical disk access actually
takes place. This time slice could then be used by the other active threads,
thereby gaining a large number of otherwise wasted CPU instructions.

Now the question is whether the saved time at tablebase nodes would be enough to
pay for any inherant overhead in the simulated parallel search. Of course in
positions that have no tablebase hits at all in their trees, we gain nothing at
all to compensate for the overhead, but in an iterative deepening framework, we
could count the number of tablebase hits on each iteration, and only call the
simulated parallel algorithm if a previous iteration had more than some
threshold of tablebae hits.

I have absolutely no idea how great the overhead of a simulated parallel search
would be. Has anyone ever experimented with using threads in this way? I have a
dim recollection of reading something here several months ago about using
threads to simulate a parallel search in order to develop and debug a parallel
search algorithm on a single CPU machine, but I think the idea was just for
development, with the final version intended for a multiple CPU platform. If the
idea is not practical, please tell me, as I would not want to waste my time on a
big experimental program that was doomed to failure from the start! If the idea
is practical, the my next inevitable question is what is the best reference for
parallel search algorithms, and is there any single algorithm that is considered
the most efficient?

Re: Parallelism on Single CPU machine? Robert Hyatt 07:40:17 12/27/98
Re: Parallelism on Single CPU machine? KarinsDad 23:12:57 12/26/98
- Re: Parallelism on Single CPU machine? Roberto Waldteufel 09:40:37 12/27/98
- Re: Parallelism on Single CPU machine? Robert Hyatt 07:42:47 12/27/98
  - Re: Parallelism on Single CPU machine? KarinsDad 09:33:41 12/27/98
Re: Parallelism on Single CPU machine? Komputer Korner 20:38:22 12/26/98
- Re: Parallelism on Single CPU machine? Roberto Waldteufel 09:53:57 12/27/98
  - Re: Parallelism on Single CPU machine? Komputer Korner 00:07:06 12/29/98

This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.