[Adding the py3k list; please remove python-dev in followups.]
On 5/29/06, "Martin v. Löwis"
I thought Py3k will have a single integer type whose representation varies depending on the value being represented.
That's one proposal. Another is to have an abstract 'int' type with two concrete subtypes, e.g. 'short' and 'long', corresponding to today's int and long. At the C level the API should be unified so C programmers are isolated from the difference (they aren't today).
I haven't seen an actual proposal for such a type,
I'm not sure that my proposal above has ever been said out loud. I'm also not partial; I think we may have to do an experiment to decide.
so let me make one:
struct PyInt{
  struct PyObject ob;
  Py_ssize_t value_or_size;
  char is_long;
  digit ob_digit[1];
};
If is_long is false, then value_or_size is the value (represented as Py_ssize_t), else the value is in ob_digit, and value_or_size is the size.
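To make the two readings of value_or_size concrete, here is a minimal sketch of the layout described above. It is not the proposed implementation: the PyObject header is left out so it compiles without Python.h, and every Sk* / sk_* name is hypothetical. It also assumes 15-bit digits stored least significant first, and only handles non-negative values in the long form.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins so the sketch builds without Python.h (assumptions). */
typedef ptrdiff_t sk_ssize_t;     /* plays the role of Py_ssize_t */
typedef unsigned short sk_digit;  /* plays the role of digit */
#define SK_SHIFT 15               /* assumed number of value bits per digit */

typedef struct {
    /* the PyObject header is omitted in this sketch */
    sk_ssize_t value_or_size;  /* the value if !is_long, else the digit count */
    char is_long;
    sk_digit ob_digit[1];      /* over-allocated when is_long is set */
} SkInt;

/* Short form: the value lives directly in value_or_size. */
static SkInt *sk_new_short(sk_ssize_t value) {
    SkInt *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->value_or_size = value;
    n->is_long = 0;
    n->ob_digit[0] = 0;
    return n;
}

/* Long form (non-negative only here): digits are copied in,
   least significant first, and value_or_size holds their count. */
static SkInt *sk_new_long(const sk_digit *digits, sk_ssize_t ndigits) {
    assert(ndigits >= 1);
    SkInt *n = malloc(sizeof *n + (size_t)(ndigits - 1) * sizeof(sk_digit));
    if (n == NULL) return NULL;
    n->value_or_size = ndigits;
    n->is_long = 1;
    for (sk_ssize_t i = 0; i < ndigits; i++)
        n->ob_digit[i] = digits[i];
    return n;
}

/* Store the value in *out and return 0, or return -1 if a long-form
   value does not fit in a long long. */
static int sk_get_word(const SkInt *n, long long *out) {
    if (!n->is_long) {
        *out = n->value_or_size;
        return 0;
    }
    unsigned long long acc = 0;
    for (sk_ssize_t i = n->value_or_size; i-- > 0; ) {
        /* Shifting in another digit would exceed LLONG_MAX. */
        if (acc > ((unsigned long long)LLONG_MAX >> SK_SHIFT))
            return -1;
        acc = (acc << SK_SHIFT) | n->ob_digit[i];
    }
    *out = (long long)acc;
    return 0;
}
```

The only branch a reader of the value needs is the is_long test at the top of sk_get_word; everything else about the long form stays behind that check.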
Nice. I guess if we store the long value in big-endian order we could drop is_long, since the first digit of the long would always be nonzero. This would save a byte (on average) for the longs, but it would do nothing for the wasted space for short ints.
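The wasted space for short ints can be checked with sizeof. The sketch below is only an illustration: SkHeader is a hypothetical stand-in for the object header (a refcount plus a type pointer), the Sk* names are made up, and the exact numbers depend on the platform ABI. It compares the unified struct with a plain short int and with a long that uses a smaller size field.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the object header: refcount + type pointer. */
typedef struct {
    ptrdiff_t refcnt;
    void *type;
} SkHeader;

typedef unsigned short sk_digit;  /* assumed digit type */

/* Unified layout: value-or-size word, flag byte, then the digits. */
typedef struct {
    SkHeader ob;
    ptrdiff_t value_or_size;
    char is_long;
    sk_digit ob_digit[1];
} SkUnified;

/* Separate short int: just the header and one word. */
typedef struct {
    SkHeader ob;
    ptrdiff_t value;
} SkShort;

/* Separate long int with a smaller size field. */
typedef struct {
    SkHeader ob;
    int size;
    sk_digit ob_digit[1];
} SkLong;

/* Extra bytes a short value costs in the unified layout. */
static size_t sk_flag_overhead(void) {
    return sizeof(SkUnified) - sizeof(SkShort);
}

/* Print the sizes so they can be compared on the current platform. */
static void sk_report(void) {
    printf("unified: %zu bytes\n", sizeof(SkUnified));
    printf("short:   %zu bytes\n", sizeof(SkShort));
    printf("long:    %zu bytes\n", sizeof(SkLong));
    printf("flag overhead per short int: %zu bytes\n", sk_flag_overhead());
}
```

On common ABIs the flag byte and the first digit get padded out to a full word, so the overhead should come out as one word per integer, and dropping is_long alone would not get that word back as long as the digit array is still there.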
PyLong_* will be synonyms for PyInt_*.
Why do we need to keep the PyLong_* APIs at all? Even at the Python level we're not planning any backward compatibility features; at the C level I like even more freedom to break things.
PyInt_FromLong/AsLong will continue to exist; PyInt_AsLong will indicate an overflow with -1. Likewise, PyArg_ParseTuple "i" will continue to produce int, and raise an exception (OverflowError?) when the value is out of range.
C code can then decide whether to parse a Python integer as C int, long, long long, or ssize_t.
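A small sketch of what "indicate an overflow with -1" means for the caller. Since -1 is also a legitimate value, the return value alone is ambiguous and a separate error indicator has to be consulted (in CPython that would be PyErr_Occurred()). The names sk_as_int and sk_overflow_flag are hypothetical, and the global flag merely stands in for the interpreter's exception state.

```c
#include <limits.h>

/* Hypothetical stand-in for the interpreter's pending-exception state. */
static int sk_overflow_flag = 0;

/* Narrow a stored word-sized value to a C int, the way an "i"-style
   conversion would: return -1 and set the flag when out of range. */
static int sk_as_int(long long value) {
    sk_overflow_flag = 0;
    if (value > INT_MAX || value < INT_MIN) {
        sk_overflow_flag = 1;  /* the real API would raise OverflowError */
        return -1;
    }
    return (int)value;
}

/* Caller-side check: -1 is an error only if the flag is also set. */
static int sk_as_int_failed(int result) {
    return result == -1 && sk_overflow_flag;
}
```

A caller that wants a wider C type would use the corresponding wider conversion instead of widening the result of this one, so that in-range values never hit the error path.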
Nice. I like the unified API and I like using Py_ssize_t instead of long for the value; this ensures that an int can hold a pointer (if we allow for signed pointers) and matches the native word size better on Windows (I guess it makes no difference for any other platform, where ssize_t and long already have the same size).
I worry about all the wasted space for alignment caused by the extra flag byte though. That would be 4 bytes per integer on 32-bit machines (where they are currently 12 bytes) and 8 bytes on 64-bit machines (where they are currently 24 bytes). That's why I'd like my alternative proposal (int as ABC and two subclasses that may remain anonymous to the Python user); it'll save the alignment waste for short ints and will let us use a smaller int type for the size for long ints (if we care about the latter).
--
--Guido van Rossum (home page: http://www.python.org/~guido/)