Author: Robert Hyatt
Date: 07:21:44 09/06/05
On September 06, 2005 at 00:32:43, Carey wrote:

>On September 05, 2005 at 22:49:55, Robert Hyatt wrote:
>
>>On September 05, 2005 at 19:17:33, Carey wrote:
>>
>>>On September 05, 2005 at 18:01:27, Robert Hyatt wrote:
>>>
>>>>On September 05, 2005 at 16:58:01, Carey wrote:
>>>>
>>>>>On September 05, 2005 at 14:44:58, Robert Hyatt wrote:
>>>>>
>>>>>>
>>>>>> void *p = (void *) (((int) malloc(size+63) + 63) & ~63);
>>>>>>
>>>>>>What I do is malloc 63 bytes more than I need, add 63 to the resulting pointer,
>>>>>>then AND with a constant (an int, unfortunately) that has the rightmost 6 bits
>>>>>
>>>>>I always did that separately. Allocate it, then cast to unsigned int,
>>>>>then masked off how much I needed, then added that to the original pointer.
>>>>>That way the pointer never had a chance to be truncated.
>>>>
>>>>Sorry. "Cast to unsigned int" casts a 64 bit value to a 32 bit value. You just
>>>>lost the upper 32 bits... Remember that _any_ int on these compilers is 32
>>>>bits, while pointers in 64 bit mode are 64 bits long. Any conversion will
>>>>completely wreck an address (pointer).
>>>
>>>
>>>Right. But that's not what I said. Or at least not what I meant. Rereading it,
>>>I can see how you misread what I said.
>>>
>>>You are doing the whole thing as a single statement. That's the problem. You
>>>are trying to do too much at once.
>>>
>>>I said to do it separately.
>>>
>>>Leave the malloc return as a pointer. This part never gets chopped to any
>>>integer. It stays safe.
>>>
>>>Then, separately, we cast it to an int. Or even a char. Just enough that we
>>>can do the alignment. The loss of precision there is irrelevant, because we
>>>are working with a small integer (a few bits) anyway.
>>>
>>>Then we take that very small integer adjustment and add it to the original
>>>pointer. That kind of adjustment is perfectly safe. No different from doing
>>>any of the normal pointer math, such as ptr = ptr + 1;
>>
>>Perhaps. I see what you are saying, but the risk is that it is not
>>possible to add ptr = ptr + int without some re-casting to get rid of the
>>warnings/errors. And there is a potential problem.
>
>
>It should be possible pretty much the way I described. Unless C99 changed some
>stuff.
>
>The key is that the pointer from malloc was already cast to "unsigned char *"
>right at the call. That's a guaranteed base type in the C specs.
>
>Anything else might have holes in it or be unable to represent every particular
>storage cell.
>
>But an unsigned char * is guaranteed to be able to hold and represent any
>pointer.
>
>(In the old days, we had to be careful about 'far' pointers because they could
>be unnormalized. We had to take care of that before the alignment attempt.)
>
>(The C specs are pretty flexible... and kind of amusing. They define only a
>very few low level guaranteed base types. The rest can have all sorts of weird
>behavior. For example, signed integers and chars can have holes in the middle
>of them, but unsigned char can't. The NULL value is guaranteed to be 0 even if
>that is inappropriate on the system; the implementation has to convert it. And
>a bunch of others.)
>
>
>
>>>
>>>That way the pointer itself never ever gets cast to an int. It always stays
>>>the full pointer. Only the adjustment calculation ever gets chopped to an
>>>int, and in that case it's okay. All we need is a few bits anyway.
>>>
>>>This is identical to what we used to have to do back in the days of 16 bit DOS
>>>when mixing 16 bit near vs. 20/32 bit 'huge' & 'far' vs. 32 bit flat pointers.
>>>(Although back in those days, we had to do a bit of extra work to make sure
>>>the pointers were normalized, etc.)
>>>
>>>We used to do wrapper routines. (Sometimes macros, but usually functions.)
>>>Something like:
>>>
>>>void *AlignedMalloc(size_t Bytes)
>>>{
>>>  unsigned int a;
>>>  unsigned char *ptr; /* for proper standard behavior, must be unsigned char */
>>>
>>>  ptr = (unsigned char *)malloc(Bytes + 64);
>>>  a = (unsigned int)ptr; /* only the low bits matter, so truncation is harmless */
>>>  a = a & 63;
>>>  a = 64 - a;
>>>  ptr = ptr + a;
>>>  return (void *)ptr;
>>>}
>>>
>>>As you can see, the pointer is never at risk of being chopped. Back in the
>>>old days, we never really knew what size pointer we'd be working with. It
>>>might be compiled with 16 bit pointers, or 32 bit 'far'/'huge' pointers, or
>>>even on a full brand new 32 bit 386 system.
>>
>>Right. The only issue is the warning / error, since adding a 64 bit pointer
>>and a 32 bit integer is not permissible...
>
>
>There shouldn't be any warnings.
>
>The language spec doesn't know it's a 64 bit pointer.
>
>As far as the C language is concerned, it is *only* a pointer. It has no
>particular size. (The implementation has one, of course, but the C specs don't
>care whether it's 16, 24, 32, 64, or whatever bits.)
>
>You are then adding an integer offset, which is well defined in the C specs.
>
>So whether that integer offset is stored in a char, a short int, a 32 bit int,
>or a full 64 bit int doesn't matter.
>
>You are doing well defined math on a pointer, and the specs require the
>compiler to generate code that automatically adapts and does it properly. If
>that means extending that char from 8 bits to 64 bits, then that's what it
>does. If it means converting an ambiguous size 'int' to 64 bits, then that's
>what it does.
>
>I took a quick look through my C89 specs (can't find my copy of C99), and all
>they say about pointer arithmetic is that the value must be an "integral type",
>which means any normal char, short, int, long or long long.
>
>
>If you want to feel more comfortable, just assign the offset amount to a
>ptrdiff_t. That's a predefined signed integral type (i.e. an int of some sort)
>that is guaranteed to be big enough to hold pointer arithmetic, regardless of
>whether the pointers are 64 bits and the ints are 32 bits. (If the compiler
>gets this wrong, then it's broken. It'll violate the specs.)
>
>Right off hand, I don't know if you are supposed to be able to do basic math
>operations on ptrdiff_t or not. Meaning I'm not sure it's entirely legal /
>correct to do the '&' to get the offset. It is an "integral type", so I would
>expect so, but I'd need to check the specs first to make sure.
>
>You could also do something like:
>
>ptrdiff_t d;
>
>d = MallocPtr - NULL;
>
>That would correctly convert the pointer into a ptrdiff_t for you to do the
>math on. Then when you get done, you just do
>
>MallocPtr += d;
>
>and the types should be taken care of.
>
>
>Another way to do the alignment is to do the ptrdiff_t subtraction like above,
>check the alignment, and if it's not what you want, increment the malloc
>pointer and recheck, repeating until satisfied.
>
>(The only problem with using the NULL pointer like this is, as I said way
>above... although NULL is defined as 0, the actual implementation is free to
>use whatever it needs. So if you align the pointer this way, there is a chance
>(albeit a very remote chance on regular hardware) that you'll actually be
>misaligning it.)
>
>There are a lot of ways of doing this kind of stuff without casting a pointer
>to an int.
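For what it's worth, C99's <stdint.h> gives you uintptr_t, an unsigned integer
type defined to survive a round trip from a pointer (technically optional in
C99, but essentially every implementation provides it), so the mask math can be
done at full pointer width. A minimal sketch, assuming a C99 compiler, and not
anyone's production code:

#include <stdint.h>
#include <stdlib.h>

void *AlignedMalloc64(size_t bytes)
{
  unsigned char *ptr = (unsigned char *)malloc(bytes + 64);
  uintptr_t a;

  if (ptr == NULL)
    return NULL;
  a = (uintptr_t)ptr & 63;         /* distance past the last 64-byte boundary */
  return (void *)(ptr + (64 - a)); /* bump up to the next boundary */
}

No pointer is ever narrowed, and the only integer involved is one that is
guaranteed wide enough to hold it.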
>
>The best way... that's really hard to say.
>
>Just a matter of picking one you like.
>
>
>
>>
>>>
>>>(Actually, there are other, technically better, ways to do that function. It
>>>assumes that a char is a byte is a cell, and so on. It won't work right on
>>>some exotic systems. For those, it might be safer to just increment the
>>>pointer up to 64 times until the lower bits show it's aligned. If it hasn't
>>>done it by then, you give a fatal error, because you are on a really weird
>>>system.)
>>>
>>>Of course, we also did similar things for calloc, etc.
>>>
>>>And, of course, doing a 'free' was more than a little difficult, since the
>>>original pointer was lost. This could be dealt with by storing it, or by just
>>>not caring and letting the OS free the memory when we were done.
>>>
>>
>>I saved the original when doing this in Crafty. I no longer do it, since I
>>allocate memory differently (shmget()), which only allocates memory starting
>>on a page boundary anyway...
>
>
>Back in the bad old DOS days when I was doing this, I used to store it a fixed
>amount below the aligned memory. I had to allocate a few extra bytes, but that
>usually was no big deal.
>
>But usually in the DOS world, it worked just as well to never bother freeing
>the memory and just let the OS do it when the program was done.
>
>Nowadays, to be honest, I usually don't care that much about alignment. As
>long as it's word aligned, that's good enough for most of the programming I do.
>When you are allocating half a gig of memory, it usually doesn't matter too
>much whether the array happens to be page aligned or such, just as long as it
>meets basic alignment. And most compilers get that right.

I care because of cache line size. For example, in the hash table I always
fetch 4 entries in one "gulp", which is 64 bytes. I'd like them to be in one
cache line when possible. If I don't make sure that the hash table is aligned
on a 64-byte boundary, I can get two cache miss penalties on _every_ hash table
probe that would normally get one miss. That is significant...

>
>
>>
>>>
>>>>>>32 bits, pointer (and long) = 64 bits... Why an int is 32 bits on a 64 bit
>>>>>>machine is a good question. We really needed some better int types, but the
>>>>>
>>>>>Up to the compiler designer.
>>>>>
>>>>>Realistically, it makes quite a bit of sense. So much code today is
>>>>>hardwired for 32 bit ints that going fully 64 by default would cause a lot
>>>>>of code to fail. By keeping the int at 32 bits, most semi-properly written
>>>>>code will still compile and work.
>>>>
>>>>The problem is that all ints are _not_ 32 bits. That was my point. Declare
>>>>an int on a Cray. Or on an Alpha...
>>>
>>>
>>>I know. The Cray etc. people complained about that back in the late 80's when
>>>the original ANSI / ISO C standard was being done.
>>>
>>>The C standard people patiently explained the situation to them: that their
>>>charter was limited to "codifying existing practice" (their words), and that
>>>they had only limited authority to invent or drastically change things.
>>>
>>>That's why it waited until the C99 standard.
>>>
>>>And there is absolutely nothing that can be done about plain 'int'. It pretty
>>>much is defined as the machine word, whatever that happens to be.
>>
>>Yes. But on an "Opteron", what would you call a "word" when the processor is
>>in 64 bit mode? That is quite an aggravation...
>
>
>[shrug]
>
>Generally 64 bits. Unless the compiler author wants to leave it at 32 bits for
>compatibility with lots and lots of old programs. (Not a great idea, but I can
>understand why. There can be a lot of hidden 32 bit'isms in programs.)
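Going back to the 'free' problem above: Carey's "store it a fixed amount below
the aligned memory" trick might look like this today. A sketch only, reusing
the C99 uintptr_t from the earlier sketch; AlignedMalloc2 and AlignedFree2 are
made-up names, not from any library:

#include <stdint.h>
#include <stdlib.h>

void *AlignedMalloc2(size_t bytes)
{
  /* extra room: up to 64 bytes of padding plus one saved pointer */
  unsigned char *raw = (unsigned char *)malloc(bytes + 64 + sizeof(void *));
  unsigned char *aligned;

  if (raw == NULL)
    return NULL;
  aligned = raw + sizeof(void *);            /* leave space below */
  aligned += 64 - ((uintptr_t)aligned & 63); /* round up to a 64-byte boundary */
  ((void **)aligned)[-1] = raw;              /* stash the original pointer */
  return aligned;
}

void AlignedFree2(void *p)
{
  if (p != NULL)
    free(((void **)p)[-1]);                  /* recover the original pointer */
}

With 64-byte alignment, a "gulp" of four 16-byte hash entries then sits in
exactly one cache line, which is the point of the exercise.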
>
>Nobody is forcing you to use plain 'int'. The C99 specs give you 8, 16, 32,
>and 64 bit datatypes. They are guaranteed, regardless of whether you are on an
>8 bit, 16 bit, 32 bit or 64 bit system.
>
>I agree it can be annoying. I'm not disagreeing with you.
>
>I'm just saying that's the way things are. C was written a long, long time
>ago. It was used a long time before people really started caring about 64 bit
>integers.

Yes, but this problem also existed at the 16/32 boundary; that was the same
sort of "problem child" for programming...

>
>And now that people do care, they give you enough support to do things the way
>you need to. You just need to give up the habit of using plain 'char' and
>'int' and such.

The problem with that is: exactly what data type do I use for 64 bits? Not
every compiler is C99 compliant. Not every compiler supports "long long". So
it still remains an issue of portability...

>
>It could be worse... Look at Pascal. It has three types: char, signed int,
>and an unspecified 'real' type.
>
>You've got no idea what size they are. No way to do unsigned at all. And the
>REAL format can be any size the compiler author wants... even 16 bits.
>
>(The only thing I really liked about Pascal was the somewhat stronger type
>checking. It can be really nice at times. It really cuts down on the possible
>errors.)
>
>
>>
>>
>>>
>>>But the standard does allow for some flexibility. Hence, it's possible for a
>>>version of C to have 32 bit ints even on a 64 bit system.
>>
>>And to default to either signed or unsigned for chars, etc. This is less a
>>"standard" than a "guideline", which is really poor...
>
>
>Better than some languages.
>
>The signedness was due entirely to what existing implementations did. Some did
>it signed, others did it unsigned. Forcing a particular one would have broken
>50% of the existing code...
>
>They knew at the time it'd be nice to fix a few issues like that, and remove
>some of the "implementation defined" areas, but doing it would have broken half
>the code, and their charter just didn't give them enough room to do that.
>
>Unlike the C++ people, who could make arbitrary decisions, break existing
>implementations, and do whatever they wanted, the C89 people were a lot more
>limited in what they could do.
>
>
>It could have been worse... It could have been done like Pascal. Practically a
>useless language.
>
>If you don't like the default nature of int or char, then don't use them. Go
>ahead and specify whether they are signed or unsigned and whatever particular
>size you need. C99 has the tools just waiting to be used. You just have to
>give up the habit of using 'char' and 'int'. (A habit that can be very hard to
>break...)

Again, I will remind you that not all compilers are C99 compliant. If I were
just writing code for me, this would not be a problem. But I write code that
is compiled on every compiler and system known to man, and a few that are not...

>
>
>Don't misunderstand me... I agree with you that some things could (should?)
>have been done better. But they really did do a heck of a lot of work making
>it as clear and well defined and compatible as they did.
>
>
>
>>
>>
>>>
>>>And for portability, given the large number of 32 bit programs that might not
>>>be 64 bit safe, it does make sense.
>>>
>>>Not necessarily the best choice, but it does make sense.
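The usual dodge for that portability problem is to hide the 64 bit choice
behind a typedef and pick the spelling per compiler. A hypothetical sketch
(BITBOARD is an illustrative name here, not a claim about the actual Crafty
source):

#if defined(_MSC_VER)
  typedef unsigned __int64 BITBOARD;    /* Microsoft's pre-C99 spelling */
#elif defined(__GNUC__)
  typedef unsigned long long BITBOARD;  /* gcc supported this long before C99 */
#else
  #include <stdint.h>                   /* otherwise, hope for a C99 <stdint.h> */
  typedef uint64_t BITBOARD;
#endif

Everything else then uses BITBOARD, and only this one block ever has to change
for a new compiler.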
>>>
>>>(In fact, I remember reading articles back in the days when people were
>>>moving from 16 to 32 bits, complaining about the difficulties of the move.
>>>And the Cray people spoke up and made similar comments about 32 bit unix
>>>programs being ported to the Cray.)
>>>
>>>The only time it causes problems is if the program author does very stupid
>>>things, like violating the cardinal rule of never assuming that pointers and
>>>integers are the same size.
>>>
>>>That's such a cardinal rule that no programmer today should violate it. But
>>>some do.
>>>
>>>That was a painful lesson we learned the hard way back then. But since then,
>>>most programmers have forgotten it and are having to relearn it when moving
>>>to 64 bits.
>>>
>>>The reality is that you should never ever expect a pointer to be any
>>>particular size or value. Always use ptrdiff_t, and so on.
>>>
>>>The reality is that, if the C compiler author wants to, a pointer could be 96
>>>bits or more. The compiler author may decide to throw in some extra boundary
>>>info. Or maybe the pointer includes extra info such as a page table entry.
>>>Or whatever.
>>>
>>>A properly written C program will never notice what size a pointer is. And
>>>the size of the integer will only be relevant to the amount of regular
>>>computational data it needs to hold.
>>>
>>>In which case, a 64 bit compiler with a 32 bit int can make some sense.
>>>
>>>(Again, not necessarily the best choice. But it can help make unsafe int
>>>behavior less likely to fail. A lot of programs today depend on 32 bit
>>>rollovers, etc.)
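A closing illustration of that cardinal rule, assuming only C99's printf length
modifiers (%td for ptrdiff_t, %zu for size_t): pointer differences go into a
ptrdiff_t, object sizes into a size_t, and nothing ever assumes that
sizeof(int) == sizeof(void *).

#include <stddef.h>
#include <stdio.h>

int main(void)
{
  int a[10];
  int *p = &a[7], *q = &a[2];
  ptrdiff_t d = p - q;  /* well defined: both point into the same array */

  printf("elements apart: %td\n", d);
  printf("sizeof(int) = %zu, sizeof(void *) = %zu\n",
         sizeof(int), sizeof(void *));
  return 0;
}

On a typical 64 bit target the second line prints 4 and 8, which is exactly why
casting a pointer to a plain int is unsafe.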