[Python-3000] struni and the Apple four-character-codes

Ronald Oussoren ronaldoussoren at mac.com
Wed Jul 25 12:04:49 CEST 2007


I've CC-ed Jack Jansen as he has maintained the Mac libraries for ages (from way before OS9 was shiny and new).
 

On Wednesday, July 25, 2007, at 07:18AM, "Jeffrey Yasskin" <jyasskin at gmail.com> wrote:
>I'm looking through a couple of the OS X tests and have run into the
>question of what to do with four-character codes. (For those of you
>who are unfamiliar with these, Apple, around the dawn of time, decided
>that C constants like 'TEXT' (yes, those are single quotes) would
>compile to the uint32_t 0x54455854 (or maybe the other-endian version
>of that) so they could use these as cheap-but-readable type

AFAIK they are always converted as big-endian values.

>identifiers.) In Python 2, these are represented as 'str' instances,
>which PyMac_GetOSType() in Python/mactoolboxglue.c converts to the
>native int format. For Python 3, right now they're str8's, but str8 is
>theoretically supposed to go away. Because they're binary constants
>displayed as ASCII, not unicode text, I initially thought that 'bytes'
>was the appropriate type. Unfortunately, bytes is mutable, and I think
>it makes sense to hash these constants (and some code in aepack.py
>does).
>
>So, I'm stuck and wanted to ask the list for input. I see 5 options:
> 1) Make these str instances so they're immutable and just rely on
>convention and runtime errors to keep them in ascii.
> 2) Make them bytes, and cast them to something else when you want to
>make them keys in a dict.
> 3) Keep them str8 and give up on getting rid of it.
> 4) Make bytes immutable, add a 'buffer' type which acts like the
>current bytes type, and make these codes instances of bytes. [probably
>impossible this late in the game]
> 5) Make a new hashable class for these codes which converts them to
>and from ints and bytes and becomes the general argument type for the
>apple platform interface. [Cleanest, but lots of work that I'm not
>volunteering to do]

A 6th option is a subclass of int. Its constructor would accept a string containing the 4CC, and the repr/str methods would return the string representation of the code. IMHO this is the cleanest representation of 4CCs in Python because those codes are basically a "neat" way to enter integer literals in C.
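
A rough sketch of what such a subclass could look like, written in present-day Python 3 syntax. The name FourCC is hypothetical, and the use of latin-1 (so that every 32-bit value has a printable form) is a simplifying assumption rather than anything the Mac toolbox glue actually does:

```python
class FourCC(int):
    """Hypothetical int subclass for Apple four-character codes."""

    def __new__(cls, value):
        # Accept the textual form ('TEXT') as well as a plain integer.
        if isinstance(value, str):
            # Assumption: one byte per character, encoded as latin-1.
            value = value.encode("latin-1")
        if isinstance(value, (bytes, bytearray)):
            if len(value) != 4:
                raise ValueError("a four-character code needs exactly 4 bytes")
            # The codes are treated as big-endian 32-bit values.
            value = int.from_bytes(value, "big")
        return super().__new__(cls, value)

    def __str__(self):
        # Raises OverflowError if the value does not fit in 4 unsigned bytes.
        return self.to_bytes(4, "big").decode("latin-1")

    def __repr__(self):
        return "FourCC(%r)" % str(self)


code = FourCC("TEXT")
assert code == 0x54455854             # compares equal to the plain integer
assert hash(code) == hash(0x54455854)  # so it works as a dict key too
assert str(code) == "TEXT"
```

Because the instances are immutable ints, they are hashable without any extra work, which covers the aepack.py use case mentioned above.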

This would also solve a problem that PyObjC users sometimes run into: several C/Objective-C APIs return a dictionary where one of the values is an integer for which one would commonly write the literal as a 4CC. Comparing such a value against a 4CC string currently causes unexpected failures, but would do the right thing with this option.
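
To make the failure concrete, here is a small illustration. The dictionary and its "fileType" key are made up for the example and do not come from any real API; fourcc_to_int is likewise a hypothetical helper:

```python
# Hypothetical result of a C/Objective-C call: the value is a plain integer.
info = {"fileType": 0x54455854}

# Writing the literal as a string silently fails to match the integer.
assert info["fileType"] != "TEXT"


def fourcc_to_int(code):
    """Convert a 4-character ASCII code to its big-endian integer value."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a four-character code needs exactly 4 characters")
    return int.from_bytes(raw, "big")


# Once the code is converted to an integer, the comparison works.
assert info["fileType"] == fourcc_to_int("TEXT")
```

With an int subclass the explicit conversion would happen once, in the constructor, instead of at every comparison.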

Ronald


