[Python-Dev] FW: 64-bit port of Python

Tim Peters tim_one@email.msn.com
Thu, 10 Feb 2000 23:59:00 -0500


[leaving out areas of universal harmony]

>>> "Modules/arraymodule.c::728":
>>>
>>>   static PyObject *
>>>   array_buffer_info(self, args)
>>>       arrayobject *self;
>>>       PyObject *args;
>>>   {
>>>       return Py_BuildValue("ll",
>>>                            (long)(self->ob_item),
>>>                            (long)(self->ob_size));
>>>   }

[Greg Stein]
> Yah. That function is quite insane. *shudder*
>
> It should use PyLong_FromVoidPtr for the first item, and
> PyInt_FromLong for the second (IMO, it is reasonable to assume
> ob_size fits in a C long).

Until someone whines otherwise, ya.
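
For anyone wondering why the cast is "insane": here is a rough Python sketch
of what (long)(self->ob_item) does, in practice, on a platform where long is
32 bits but pointers are 64 (the Win64 case).  The address is made up, and
ctypes.c_int32 / ctypes.c_uint64 are only stand-ins for a 32-bit C long and a
pointer-sized integer; this is an illustration, not the arraymodule code.

```python
import ctypes

# Hypothetical 64-bit address, chosen only for illustration.
addr = 0x00007FFA12345678

# ctypes integer types do no overflow checking, so this keeps only the
# low 32 bits, much as a C cast to a 32-bit long typically would.
as_32bit_long = ctypes.c_int32(addr).value

# A pointer-sized (64-bit) integer holds the whole address.
as_pointer_sized = ctypes.c_uint64(addr).value

print(hex(as_32bit_long))     # high half of the address is gone
print(hex(as_pointer_sized))  # round-trips intact
```

That lost high half is the whole argument for PyLong_FromVoidPtr.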

> However, I'd simply argue that the function should probably be
> punted.  What is it for?

Haven't used it myself.  From the (array module) docs:

buffer_info()
    Return a tuple (address, length) giving the current memory
    address and the length in bytes of the buffer used to hold
    array's contents. This is occasionally useful when working
    with low-level (and inherently unsafe) I/O interfaces that
    require memory addresses, such as certain ioctl() operations.
    The returned numbers are valid as long as the array exists
    and no length-changing operations are applied to it.

So we have to assume it's used, and returns what it says it does.
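
A minimal sketch of that use from the Python side.  One caveat, which holds
for current CPython: the second item is the number of elements (it's ob_size,
per the C above), so the byte length is that times itemsize.  Reading through
the address with ctypes is only for show, and is exactly as unsafe as the
docs warn.

```python
import array
import ctypes

a = array.array('i', [10, 20, 30])      # 'i' is a C int
address, length = a.buffer_info()

# length counts elements; multiply by itemsize to get bytes.
nbytes = length * a.itemsize

# Peek at the same memory through the raw address.  Only valid while
# `a` is alive and hasn't been resized.
view = (ctypes.c_int * length).from_address(address)
values = list(view)
print(address, length, nbytes, values)
```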


[on "id"]
> Assuming that we can say that id() is allowed to return a
> PyLongObject, then this should just use PyLong_FromVoidPtr.

Yes, that would be fine.  The docs say it returns "an integer", which
generally means it's not specified whether it returns a (Python) int or
(Python) long.  Andrew is eradicating that distinction anyway.

> Tim! Wake up! New functions snuck in behind your back! hehe...

Oh, indeed they do!  I have been living in a fool's paradise, populated with
loving memories.  That is, since I stopped writing compilers for a living,
the only attention I've given to Python *internals* is when I've had to,
either to fix a bug (very rare) or speed something up (even rarer -- it's so
bloody fast already <wink>).  I'm delighted to see *you're* up to date!


[Trent Mick]

Trent, you should be on the Python-Dev list if you're going to (as I sure
hope you are!) be working on Python internals.  This requires Guido's
approval, but in his temporary absence I'm channeling his approval for you.
Guido can take it up with David Ascher if he doesn't like that (*nobody*
messes with David -- the guy is a hard core psycho).

[on id]
> This means that my System A and System B (above) get different
> resultant object types for id() just because the compiler used
> for their Python interpreter uses a different data model.

So long as they're using different memory models, and so long as Python
distinguishes between int and long, I'd say that's *expected*.  id()'s are
valid only for the life of a single run, and, as Greg said, the only thing
you can do with them that's guaranteed to work is compare them for equality
(well, you can use cmp on 'em too, but that's unusual and will work fine
anyway).
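
In Python terms, a small sketch of the one portable use.  Nothing here
depends on id() being an address, or on which integer type comes back.

```python
a = [1, 2, 3]
b = a            # same object
c = list(a)      # equal contents, different object

same = id(a) == id(b)
different = id(a) != id(c)

# id() equality agrees with `is` for objects alive at the same time.
print(same, different, (a is b), (a is c))
```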

> ...
> I know that no one should really need to be passing converted
> pointer results between platforms, but...

id() is not guaranteed to return an address to begin with, so anyone relying
on that is hosed regardless.

> ...
> I want to make a couple of suggestions about this 64-bit
> compatibility stuff. I will probably sound like I am on glue
> but please bear with me and let me try and convince you that
> I am not.

It did not sound like you're on glue.  It did sound like you have strong
opinions about how C "should be" used that don't coincide with the bulk of C
programmers' views, and that's enough to stop it right there:  part of why
extending & embedding Python is so popular is that the API caters to "lowest
common denominator" views of C.  Non-experts hooking up ancient legacy code
is *no problem* now; they need to learn the Python API, but the C part still
looks like C <wink>.

So, in the absence of evidence of *widespread* Win64 problems with Python, I
expect everyone here will favor the "find the handful of problems and just
fix 'em" approach implicit in what I've said and explicit in what Greg's
said.  Massive declaration changes don't seem *needed*, and would likely be
much more destabilizing (some of us here remember the Great Renaming without
fondness ...).

> ...
> PyInt was tied to C's 'long' based on the (reasonable) assumption
> that this would represent the largest native integral type on which
> Python was running

Na, it was C's "long" specifically, something every C programmer was & is
comfortable with.  It makes little sense to expose platform-specific
extensions to C's set of types as if they were somehow "std".  long isn't
always good enough anymore for Win64, but it's still good enough everywhere
else, and I'd be amazed to see others following MS's strange decision here.
Tail, dog, wag <wink>.
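
If you want to see which camp a given build is in, here's a sketch using
ctypes.  The model labels are the usual names; the helper function name is
mine, and the table only covers the common cases.

```python
import ctypes

def data_model():
    """Guess the C data model from the sizes of int, long and void*."""
    sizes = (ctypes.sizeof(ctypes.c_int),
             ctypes.sizeof(ctypes.c_long),
             ctypes.sizeof(ctypes.c_void_p))
    names = {
        (4, 4, 4): "ILP32",   # typical 32-bit platforms
        (4, 8, 8): "LP64",    # typical 64-bit Unix
        (4, 4, 8): "LLP64",   # Win64: long stays 32 bits
    }
    return names.get(sizes, "other"), sizes

model, sizes = data_model()

# The question that matters for casts like (long)pointer.
long_holds_pointer = sizes[1] >= sizes[2]
print(model, sizes, long_holds_pointer)
```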

Summary:

> On the Python/C API side, use things like:
>  - PyInt would be tied to intlongest_t
>  - extern DL_IMPORT(PyObject *) PyInt_FromLongest Py_PROTO((intlongest_t));

This is exactly the kind of thing that will make the API instantly repellent
to the people it's trying to attract.  I understand (& agree!) that your
scheme is better, but Python is more interested in getting used.

> ...
> Well, just using 'int' carries the implicit assumption that 'int'
> is at least 16 bits wide.

ANSI C guarantees that it is, BTW.

> (Yes, Tim. ActiveState *is* paying me to look at this stuff.)

Heh heh -- I know everything <wink>.


[back to Greg]
> People do use the id() value when they are printing a repr() of
> objects.  Those uses may overflow, though, because people are
> using '%d' or '%x' format codes. It should be %s.

%d and %x should get fixed -- the int/long distinction mostly just creates
stumbling blocks.
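
For what it's worth, in current Python (where the int/long split is gone)
both codes take integers of any size, so the repr idiom is safe.  A sketch
with a made-up class and a deliberately oversized value:

```python
class Widget:
    # Hypothetical class, only here to show the repr idiom.
    def __repr__(self):
        return "<Widget at 0x%x>" % id(self)

w = Widget()
text = repr(w)

# Formatting doesn't care whether the value fits in a C long.
big = 2 ** 70
as_decimal = "%d" % big
as_hex = "%x" % big
print(text, as_decimal, as_hex)
```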


[and on to Fredrik]
> footnote: assert Unix in (LP32, LP64), according to the single Unix
> specification (and if you dig up their rationale, you'll see why
> everything else is totally braindead -- I'm usually no Microsoft
> basher, but this really pisses me off)

Strangely enough, when KSR was doing its 64-bit Unix, it, DEC and Cray were
the *only* ones pushing LP64.  Everyone else was announcing MS-style plans.
By the time they got around to actually building HW and porting code, they
changed their minds.  Unfortunately, MS has dozens of millions of lines of
its *own* cheating code to port, and sizeof(long) == sizeof(DWORD) is a
universal bad assumption in that code.  I understood more of it a couple
years ago (when it was being planned), and expected they would take this
short-term easiest (for them) way out; I'm not sure they had a realistic
alternative; and *they're* sure they didn't.

BTW, when KSR folded, a 128-bit machine was on the drawing board.  Nothing
lasts forever <wink>.