Computer Chess Club Archives



Subject: Re: Seg fault problems on 64bit linux

Author: Carey

Date: 21:32:43 09/05/05



On September 05, 2005 at 22:49:55, Robert Hyatt wrote:

>On September 05, 2005 at 19:17:33, Carey wrote:
>
>>On September 05, 2005 at 18:01:27, Robert Hyatt wrote:
>>
>>>On September 05, 2005 at 16:58:01, Carey wrote:
>>>
>>>>On September 05, 2005 at 14:44:58, Robert Hyatt wrote:
>>>>
>>>>>
>>>>>    void *p = (void *) (((int) malloc(size+63) + 63) & ~63);
>>>>>
>>>>>What I do is malloc 63 bytes more than I need, add 63 to the resulting pointer,
>>>>>then and with a constant (an int unfortunately) that has the rightmost 6 bits
>>>>>cleared.
>>>>
>>>>I always did that separately.  Allocated it, then cast to unsigned int,
>>>>then masked off how much I needed, then added that to the original pointer.
>>>>That way the pointer never had a chance to be truncated.
>>>
>>>Sorry.  "cast to unsigned int" casts a 64 bit value to a 32 bit value.  You just
>>>lost the upper 32 bits...  Remember that _any_ int on these compilers is 32
>>>bits.  While pointers in 64 bit mode are 64 bits long.  Any conversions will
>>>completely wreck an address (pointer).
>>
>>
>>Right.  But that's not what I said.  Or at least not what I meant.  Rereading it
>>I can see how you misread what I said.
>>
>>You are doing the whole thing as a single statement.  That's the problem.  You
>>are trying to do too much at once.
>>
>>I said to do it separately.
>>
>>Leave the malloc return as a pointer.  This part never gets chopped to any
>>integer.  It stays safe.
>>
>>Then separately, we cast it to an int.  Or even a char.  Just enough that we
>>can do the alignment.  The loss in precision there is irrelevant because we
>>are working with a small integer (a few bits) anyway.
>>
>>Then we take that very small integer adjustment and add that to the original
>>pointer.  That type of adjustment is perfectly safe.  No different from doing
>>any of the normal pointer math, such as ptr=ptr+1;
>
>Perhaps.  I see what you are saying, but the only risk is that it is not
>possible to add ptr = ptr + int without some re-casting to get rid of the
>warnings/errors.  And there is a potential problem.


It should be possible pretty much the way I described.  Unless C99 changed some
stuff.

The key is that the pointer from malloc was already cast to "unsigned char *"
right at the call.  That's a guaranteed base type in the C specs.

Anything else might have holes in it or be unable to represent every particular
storage cell.

But an unsigned char * is guaranteed to be able to hold and represent any
pointer.

(In the old days, we had to be careful about 'far' pointers because they could
be unnormalized.  We had to take care of that before the alignment attempt.)

(The C specs are pretty flexible...  And kind of amusing.  They only define a
very few low level guaranteed base types.  The rest can have all sorts of weird
behavior.  For example, signed integers and signed chars can have holes (padding
bits) in the middle of them, but unsigned char can't.  The NULL constant is
guaranteed to be written as 0 in source even if that's inappropriate on the
system; the implementation has to convert it.  And a bunch of others.)
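
Just to make the unsigned char * guarantee concrete, here's a toy sketch (my
own example, nothing from the thread):

#include <stdlib.h>

int main(void)
{
    double *dp = malloc(sizeof *dp);

    /* Any object pointer may be converted to a character pointer
       and back; the specs guarantee you get the original pointer. */
    unsigned char *cp = (unsigned char *)dp;
    double *back = (double *)cp;

    free(dp);
    return (back == dp) ? 0 : 1;   /* always returns 0 */
}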



>>
>>That way the pointer itself never ever gets cast to an int.  It always stays as
>>the full pointer.  Only the adjustment calculation part ever gets chopped to an
>>int, and in that case, it's okay.  All we are needing is a few bits anyway.
>>
>>This is identical to what we used to have to do back in the days of 16 bit DOS
>>when mixing 16 bit near vs. 20/32 bit 'huge' & 'far' vs. 32 bit flat pointers.
>>(Although back in those days, we had to do a bit of extra work to make sure the
>>pointers were normalized, etc.)
>>
>>We used to do wrapper routines.  (Sometimes macros, but usually functions.)
>>Something like:
>>
>>void *AlignedMalloc(size_t Bytes)
>>{unsigned int a;
>> unsigned char *ptr;
>>/* for proper official standard behavior, must be unsigned char */
>>
>>ptr = (unsigned char *)malloc(Bytes+64);
>>a = (unsigned int)ptr;  /* truncating is harmless: only the low 6 bits get used */
>>a = a & 63;
>>a = 64 - a;
>>ptr = ptr + a;
>>return (void *)ptr;
>>}
>>
>>As you can see, the pointer is never at risk from being chopped.  Back in the
>>old days, we never really knew what size pointer we'd be working with.  It might
>>be compiled with 16 bit pointers, or 32 bit 'far'/'huge' pointers, or even on a
>>full brand new 32 bit 386 system.
>
>Right.  Only issue is the warning / error, since adding a 64 bit pointer and a
>32 bit integer is not permissible...


There shouldn't be any warnings.

The language spec doesn't know it's a 64 bit pointer.

As far as the C language is concerned it is *only* a pointer.  It has no
particular size.  (The implementation has one, of course, but the C specs don't
care whether it's 16, 24, 32, 64, or whatever bits.)

You are then adding an integer offset.  Which is well defined in the C specs.

So whether that integer offset is stored in a char, a short int, a 32 bit int,
or a full 64 bit int doesn't matter.

You are doing well defined math on a pointer, and the specs require the
compiler to generate code that automatically adapts and does it properly.  If
that means extending that char from 8 bits to 64 bits, then that's what it
does.  If it means converting an ambiguously sized 'int' to 64 bits, then
that's what it does.

I took a quick look through my C89 specs (can't find my copy of C99), and all
they say about pointer arithmetic is that the added value must be an "integral
type", which means any normal char, short, int, long, or long long.
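
To make that concrete, here's a tiny sketch (mine, not from any spec): no
matter what integral type holds the offset, the pointer math lands in the
same place.

#include <assert.h>

int main(void)
{
    unsigned char buf[128];
    unsigned char *p = buf;

    char      c = 5;
    short     s = 5;
    long long l = 5;   /* 'long long' is C99; use long for strict C89 */

    /* The compiler converts the integer operand as needed before
       doing the pointer arithmetic, so these all agree. */
    assert(p + c == p + s);
    assert(p + s == p + l);
    return 0;
}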


If you want to feel more comfortable, just assign the offset amount to a
ptrdiff_t.  That's a predefined signed integral type (ie: an int of some sort)
that is guaranteed to be big enough to hold pointer arithmetic results,
regardless of whether the pointers are 64 bits and the ints are 32 bits.  (If
the compiler gets this wrong, then it's broken.  It'll violate the specs.)

Right offhand, I don't know if you are supposed to be able to do basic math
operations on ptrdiff_t or not.  Meaning I'm not sure it's entirely legal /
correct to do the '&' to get the offset stuff.  It is an "integral type", so I
would expect so, but I'd need to check the specs first to make sure.

You could also do something like:

ptrdiff_t d;

d = MallocPtr - NULL;

That would correctly convert the pointer into a ptrdiff_t value for you to do
the math on.  Reduce d down to just the small adjustment you need, and then
when you are done you just do

MallocPtr += d;

The types are taken care of for you.


Another way to do the alignment is to do the ptrdiff_t subtraction like above,
check the alignment, and if it's not what you want, increment the malloc
pointer and recheck, repeating until satisfied.
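
Something like this sketch, say.  (I'm peeking at the low bits with C99's
uintptr_t here rather than the NULL subtraction, since uintptr_t is the one
integer type C99 actually blesses for holding a pointer value; it's optional
in the spec, but any flat 64 bit system will have it.  The function name is
just made up.)

#include <stdint.h>
#include <stdlib.h>

/* Bump-and-recheck: never store the whole pointer in an int, just
   nudge the pointer along until its low six bits read as zero. */
void *aligned_malloc64(size_t bytes)
{
    unsigned char *p = malloc(bytes + 63);   /* worst case: 63 bumps */

    if (p == NULL)
        return NULL;
    while (((uintptr_t)p & 63) != 0)
        p++;
    return p;
}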

(The only problem with using the NULL pointer like this is, as I said way
above...  Although NULL is defined as 0, the actual implementation is free to
use whatever it needs.  So if you align the pointer this way, there is a
chance (albeit a very remote chance on regular hardware) that you'll actually
be misaligning it.)

There are a lot of ways of doing this kind of stuff, without casting a pointer
to an int.

The best way....  That's really hard to say.

Just a matter of picking one you like.




>
>>
>>(Actually, there are other, technically better, ways to do that function.  This
>>assumes that a char is a byte is a cell, and so on.  It won't work right on
>>some exotic systems.  For those, it might be safer to just increment the
>>pointer up to 64 times until the lower bits show it's aligned.  If it hasn't
>>done it by then, you give a fatal error because you are on a really weird
>>system.)
>>
>>Of course, we also did similar things for calloc, etc.
>>
>>And, of course, doing a 'free' was more than a little difficult, since the
>>original pointer was lost.  This could be dealt with by storing it, or by just
>>not caring and letting the OS free the memory when we were done.
>>
>
>I saved the original when doing this in Crafty.  I no longer do it since I
>allocate memory differently (shmget()) which only allocates memory starting on a
>page boundary anyway...


Back in the bad old DOS days when I was doing this, I used to store the
original pointer a fixed distance below the aligned memory.  I had to allocate
a few extra bytes, but that usually was no big deal.
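
For what it's worth, here is a sketch of that stash-it-below trick in modern
terms (the names and the C99 uintptr_t are my choices here, not what I
actually used under DOS):

#include <stdint.h>
#include <stdlib.h>

/* 64-byte aligned malloc that stashes the original pointer just
   below the aligned block, so it can be freed later. */
void *AlignedMalloc(size_t Bytes)
{
    unsigned char *raw = malloc(Bytes + 64 + sizeof(void *));
    unsigned char *p;

    if (raw == NULL)
        return NULL;
    p = raw + sizeof(void *);          /* leave room for the stash */
    p += 64 - ((uintptr_t)p & 63);     /* bump up to a 64 boundary */
    ((void **)p)[-1] = raw;            /* remember the original    */
    return p;
}

void AlignedFree(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);        /* recover it and free it   */
}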

But usually in the DOS world, it worked just as well to not ever bother freeing
the memory and just let the OS do it when the program was done.

Nowadays, to be honest, I usually don't care that much about alignment.  As long
as it's word aligned, then that's good enough for most of the programming I do.
When you are allocating half a gig of memory, it usually doesn't matter too much
if the array happens to be page aligned or such.  Just as long as it meets basic
alignment.  And most compilers get that right.


>
>>
>>>>>32 bits, pointer (and long) = 64 bits...  Why an int is 32 bits on a 64 bit
>>>>>machine is a good question.  We really needed some better int types, but the
>>>>
>>>>Up to the compiler designer.
>>>>
>>>>Realistically, it makes quite a bit of sense.  So much code today is hardwired
>>>>for 32 bit ints that going fully 64 by default would cause a lot of code to
>>>>fail.  By keeping the int at 32 bits, most semi-properly written code will still
>>>>compile and work.
>>>
>>>Problem is all ints are _not_ 32 bits.  That was my point.  Declare an int on a
>>>Cray.  Or on an alpha...
>>
>>
>>I know.  The Cray etc. people complained about that back in the late 80's when
>>the original ANSI / ISO C standard was being done.
>>
>>The C standard people patiently explained the situation to them.  That their
>>charter was limited to "codifying existing practice"  (Their words.)  That they
>>only had limited authority to invent or drastically change.
>>
>>That's why it waited until the C99 standard.
>>
>>And there is absolutely nothing that can be done about plain 'int'.  It pretty
>>much is defined as the machine word.  Whatever that happens to be.
>
>Yes.  But on an "opteron" what would you call a "word" when the processor is in
>64 bit mode?  That is quite an aggravation...


[shrug]

Generally 64 bits.  Unless the compiler author wants to leave it at 32 bits for
compatibility with lots and lots of old programs.  (Not a great idea, but I can
understand why.  There can be a lot of hidden 32 bit'isms in programs.)

Nobody is forcing you to use plain 'int'.  The C99 specs give you 8, 16, 32,
and 64 bit datatypes.  They are guaranteed regardless of whether you are on an
8 bit, 16 bit, 32 bit or 64 bit system.
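
For example, with C99's <stdint.h> (assuming the implementation provides the
exact-width types, which any normal 8/16/32/64 bit machine does):

#include <stdint.h>

int8_t   a;   /* exactly  8 bits, signed          */
uint16_t b;   /* exactly 16 bits, unsigned        */
int32_t  c;   /* exactly 32 bits, signed          */
uint64_t bb;  /* exactly 64 bits, e.g. a bitboard */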

I agree it can be annoying.  I'm not disagreeing with you.

I'm just saying that's the way things are.  C was written a long long time ago.
It was used a long time before people really started caring about 64 bit
integers.

And now that people do care, they give you enough support to do things the way
you need to.  You just need to give up the habit of using plain 'char' and 'int'
and such.

It could be worse....  Look at Pascal.  It has three basic types: char, signed
integer, and an unspecified 'real' type.

You've got no idea what size they are.  No way to do unsigned at all.  And the
REAL format can be any size the compiler author wants... even 16 bits.

(The only thing I really liked about Pascal was the somewhat stronger type
checking.  It can be real nice at times.  Really cuts down on the possible
errors.)


>
>
>>
>>But, the standard does allow for some flexibility.  Hence, it's possible for a
>>version of C to have 32 bit ints even on a 64 bit system.
>
>And to default to either signed or unsigned for chars, etc.  This is less a
>"standard" than a "guideline" which is really poor...


Better than some languages.

The signedness of plain char was due entirely to what existing implementations
did.  Some did it signed, others did it unsigned.  Forcing a particular one
would have broken 50% of the existing code....

They knew at the time it'd be nice to fix a few issues like that, and remove
some of the "implementation defined" areas, but doing it would have broken half
the code and their charter just didn't give them enough room to do that.

Unlike the C++ people, who could make arbitrary decisions, break existing
implementations, and do whatever they want, the C89 people were a lot more
limited in what they could do.


It could have been worse... It could have been done like Pascal.  Practically a
useless language.

If you don't like the default nature of int or char, then don't use them.  Go
ahead and specify whether they are signed or unsigned and whatever particular
size you need.  C99 has the tools just waiting to be used.  You just have to
give up the habit of using 'char' and 'int'.  (A habit that can be very hard to
break...)


Don't misunderstand me... I agree with you that some things could (should?) have
been done better.  But they really did do a heck of a lot of work making it as
clear and well defined and compatible as they did.



>
>
>>
>>And for portability of the large number of 32 bit programs that might not be
>>64 bit safe, it does make sense.
>>
>>Not necessarily the best choice, but it does make sense.
>>
>>(In fact, I remember reading articles back in the days when people were moving
>>from 16 to 32 bits.  People were complaining about the difficulties of moving
>>from 16 to 32 bits.  And the Cray people spoke up and made similar comments
>>about 32 bit unix programs being ported to the Cray.)
>>
>>The only time it causes problems is if the program author does very stupid
>>things, like assuming that pointers and integers are the same size.
>>
>>Never assuming that is such a cardinal rule that no programmer today should
>>violate it.  But some do.
>>
>>That was a painful lesson we learned the hard way back then.  But since then,
>>most programmers have forgotten about it and are having to relearn it when
>>moving to 64 bits.
>>
>>The reality is that you should never ever expect a pointer to be any particular
>>size or value.  Always use ptrdiff_t, and so on.
>>
>>The reality is that if the C compiler author wants to, a pointer could be 96
>>bits or more.  The compiler author may decide to throw in some extra boundary
>>info.  Or maybe the pointer includes extra info such as a page table entry.  Or
>>whatever.
>>
>>A properly written C program will never notice what size a pointer is.  And the
>>size of the integer will only be relevant to the amount of regular computational
>>data it needs to hold.
>>
>>In which case, a 64 bit compiler with a 32 bit int can make some sense.
>>
>>(Again, not necessarily the best choice.  But it can help make unsafe int
>>behavior less likely to fail.  A lot of programs today depend on 32 bit
>>rollovers, etc.)


