Computer Chess Club Archives


Subject: Re: Fruit 1.0 64-bit

Author: Ratko V Tomic

Date: 09:08:33 03/26/04


Intel's x86 instruction set penalizes pointer access when
the offset falls outside the range -128...+127 bytes from the
structure/class/stack-frame pointer: an offset inside that range
is encoded as a single displacement byte, while a larger one needs
a 4-byte displacement, so the instructions get longer. This affects
not only the structures and classes containing pointers, but also the
function arguments and auto variables (accessed via frame
pointers EBP or ESP).
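
To make the size difference concrete, here is a minimal sketch of the rule for 32-bit addressing. The helper names disp_bytes and disp_bytes_demo are made up for illustration, and the special cases for EBP and ESP as base registers are deliberately ignored:

```c
#include <assert.h>

/* Hypothetical helper: number of displacement bytes in an x86 (32-bit)
   memory operand of the form [reg + offset].
   Simplified: ignores the special encodings for EBP and ESP bases. */
int disp_bytes(long offset)
{
    if (offset == 0)
        return 0;                   /* no displacement byte at all */
    if (offset >= -128 && offset <= 127)
        return 1;                   /* 8-bit signed displacement   */
    return 4;                       /* 32-bit displacement         */
}

/* Boundary checks around the efficient range */
void disp_bytes_demo(void)
{
    assert(disp_bytes(127)  == 1);  /* mov eax,[ecx+127]: 8B 41 7F          */
    assert(disp_bytes(128)  == 4);  /* mov eax,[ecx+128]: 8B 81 80 00 00 00 */
    assert(disp_bytes(-128) == 1);
    assert(disp_bytes(-129) == 4);
}
```

So crossing from offset 127 to 128 turns a 3-byte load into a 6-byte one in this case.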

NOTE:

You can often make 32-bit programs faster & smaller by modifying
the structures exceeding 128 bytes. You break such structures
into 2 substructures and then use a pointer which points at the
second substructure, in the middle of the composite enclosing structure.
That way you can take advantage of the CPU-efficient offset subrange
-1...-128, which is not used by C/C++ compilers (you need a macro to
convert the field ptr to the structure ptr).

Example:

#include <stddef.h>    // offsetof

//--- get offset of field f within structure s

#define FLD(s,f)       offsetof(s,f)

//--- convert field ptr fp to struct ptr sp: fp=&s.fn --> &s

#define FLD2S(s,fn,fp) ((s*)((char*)(fp)-FLD(s,fn)))

Say, you have a large structure SC0 (greater than 128 bytes):

typedef struct {
  int a1,...;
  int b1,...;
} SC0;

You break it down into two halves:

typedef struct {
 int a1,...;     // up to 128 bytes
} SA;

typedef struct {
 int b1,...;     // up to 128 bytes
} SB;

typedef struct { // New struct SC replaces SC0
 SA sa;
 SB sb;
} SC;

Then instead of using ptr SC0 *pc0; you use ptr SB *pb; (pointing at
the sb member of SC) and replace access to pc0->a1 with
FLD2S(SC,sb,pb)->sa.a1 and pc0->b1 with pb->b1.

One would normally do all this via macros so the faster code doesn't
look bulkier and messier to type or read. In the case of structures
larger than 256 bytes, you would place the most accessed fields near
the pointer pb, i.e. at the front of SB and near the end of SA.
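
Putting the pieces together, a minimal compilable sketch could look like the one below. It assumes 4-byte ints and no padding between sa and sb; the arrays a[] and b[], the accessor macro PA and the functions sum_ends and split_demo are illustrative names, not part of the scheme itself. Note that the conversion subtracts the offset of sb, since pb points at the sb member:

```c
#include <assert.h>
#include <stddef.h>

#define FLD(s,f)       offsetof(s,f)
#define FLD2S(s,fn,fp) ((s*)((char*)(fp)-FLD(s,fn)))

typedef struct { int a[32]; } SA;    /* 128 bytes with 4-byte int */
typedef struct { int b[32]; } SB;    /* 128 bytes with 4-byte int */
typedef struct { SA sa; SB sb; } SC; /* pb points at sb, the middle of SC */

/* Hypothetical accessor: get the SA half from a ptr to the SB half */
#define PA(pb) (&FLD2S(SC, sb, pb)->sa)

/* All four fields are within -128...+127 bytes of pb:
   a[0] at -128, a[31] at -4, b[0] at 0, b[31] at +124 */
int sum_ends(SB *pb)
{
    return PA(pb)->a[0] + PA(pb)->a[31] + pb->b[0] + pb->b[31];
}

int split_demo(void)
{
    SC c = {0};
    c.sa.a[0] = 1;  c.sa.a[31] = 2;
    c.sb.b[0] = 3;  c.sb.b[31] = 4;
    assert((char*)&c.sb - (char*)&c == (ptrdiff_t)sizeof(SA)); /* no padding assumed */
    return sum_ends(&c.sb);          /* 1 + 2 + 3 + 4 */
}
```

With a plain SC0 pointer at the start of the structure, everything from offset 128 onward would need the long displacement form; here the whole 256 bytes is reachable with short ones.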

The same decomposition applies to functions and classes with lots of
fields/autovars-args. Functions can be decomposed into separate { blocks }
with each { block } having its own block-local autovars (or, alternatively,
broken into several sub-functions).
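
A rough sketch of the { block } variant follows. The function two_phase and its buffers are made up for illustration, and whether the two blocks actually end up sharing the same stack area depends on the compiler and its settings, so treat that part as an assumption to check in the generated code:

```c
/* Hypothetical example: two phases, each with its own block-local
   scratch buffer. Because the buffers' scopes do not overlap, a compiler
   is free to place them in the same stack area, which keeps the frame
   smaller than if both were declared at function scope. */
int two_phase(int n)
{
    int total = 0;                  /* the only autovar live in both phases */

    {   /* phase 1: block-local autovars */
        int squares[64];
        int i;
        for (i = 0; i < 64; i++)
            squares[i] = i * i;
        total += squares[n & 63];
    }

    {   /* phase 2: block-local autovars */
        int cubes[64];
        int i;
        for (i = 0; i < 64; i++)
            cubes[i] = i * i * i;
        total += cubes[n & 63];
    }

    return total;
}
```

For example, two_phase(3) adds 3*3 and 3*3*3 and returns 36.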






Last modified: Thu, 15 Apr 21 08:11:13 -0700
