Author: Michel Langeveld
Date: 12:23:54 04/19/00
On April 19, 2000 at 14:54:34, Dann Corbit wrote:
>On April 19, 2000 at 14:33:38, Andrew Dados wrote:
>
>>On April 19, 2000 at 14:28:33, Andrew Dados wrote:
>>
>>>On April 19, 2000 at 13:49:25, Dann Corbit wrote:
>>>
>>>>What is the fastest way to fill a linear array of bytes with zero, given the
>>>>following conditions:
>>>>1. Intel PII or higher CPU
>>>>2. Guarantee that the number of bytes is an even number?
>>>>
>>>>I am porting a chess program, and memset() is the bottleneck. I don't need to
>>>>memset an arbitrary character. It's always zero.
>>>
>>>for pure 32bit windoze (assuming es==ds):
>>>
>>>asm
>>> mov edi, begin_address
>>> mov ecx, count ; number of words to fill
>>> xor eax,eax ; filling with 0x0000
>>> shr ecx,2 ; here carry gets set if count is not divisible by 4
>>> rep stosd
>>> jnc @finito
>>> stosw ; fill extra word if count mod 4 !=0
>>>@finito:
>>>end;
>>
>>oeps.. count above is of course number of *bytes* to fill.
>> if count was number of words, then shift ecx by 1 only....
>
>Thanks.
>
>Turns out, I have a guarantee that the objects will always be 4 byte integers,
>so here is what I have so far:
>
>/*
>On April 19, 2000 at 14:28:33, Andrew Dados wrote:
>
>On April 19, 2000 at 13:49:25, Dann Corbit wrote:
>
>What is the fastest way to fill a linear array of bytes with zero, given the
>following conditions:
>1. Intel PII or higher CPU
>2. Guarantee that the number of bytes is an even number?
>
>I am porting a chess program, and memset() is the bottleneck. I don't need to
>memset an arbitrary character. It's always zero.
>
>for pure 32bit windoze (assuming es==ds):
>*/
>void __cdecl fillit(unsigned long *begin_address, unsigned long count_of_longs)
>{
> _asm {
> mov edi, begin_address
> mov ecx, count_of_longs ; number of longs to fill
> mov eax, 0 ; filling with 0x0000
> rep stosd
> }
>}
Keep in mind that this is of course not the fastest approach for small sizes
(probably <= 4 bytes).
Also make sure that your void function gets inlined. I think it's better to use
__fastcall, or better still, to make a #define of it.
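
To illustrate both points, here is a rough sketch in portable C rather than MSVC inline asm. The names fill_zero_longs, FILL_ZERO_LONGS and SMALL_FILL_WORDS are made up for this example, and the cutoff of 4 words is a guess, not a measured value. The memset() call in the large branch only stands in for whatever bulk routine you end up using (for instance the rep stosd version quoted above). `static inline` is C99; an older compiler may need its own keyword such as __inline.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff (an assumption, not benchmarked): at or below this
   many 32-bit words, plain stores are used instead of the bulk routine. */
#define SMALL_FILL_WORDS 4

/* Zero count_of_longs 32-bit words starting at begin_address.
   Declared static inline so the compiler can expand it at the call site. */
static inline void fill_zero_longs(uint32_t *begin_address, size_t count_of_longs)
{
    if (count_of_longs <= SMALL_FILL_WORDS) {
        size_t i;
        /* Small case: a short loop of ordinary stores. */
        for (i = 0; i < count_of_longs; i++)
            begin_address[i] = 0;
        return;
    }
    /* Large case: placeholder for the bulk fill (e.g. a rep stosd routine). */
    memset(begin_address, 0, count_of_longs * sizeof *begin_address);
}

/* Macro variant: no call overhead by construction.
   Note that p is evaluated twice textually, so pass a plain pointer. */
#define FILL_ZERO_LONGS(p, n) memset((p), 0, (size_t)(n) * sizeof *(p))
```

Whether the small-size branch pays off depends on how often the short fills actually happen in your program, so it is worth timing both variants before keeping either.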
Last modified: Thu, 15 Apr 21 08:11:13 -0700