[Python-Dev] 64-bit sequence and buffer protocol
Travis Oliphant
oliphant at ee.byu.edu
Tue Mar 29 23:04:23 CEST 2005
I'm posting to this list to reopen discussion of a problem in current
Python: a C int is used for lengths and indices in both the Python
sequence protocol and the Python buffer protocol.
The problem is that a C-int is typically only 4 bytes long, while there
are many applications (mmap, for example) that would like to access
sequences much larger than can be addressed with 32 bits. There are
two aspects to this problem:
1) Some 64-bit systems still define a C-int as 4 bytes long (so even
in-memory sequence objects could not be addressed using the sequence
protocol).
2) Even 32-bit systems have occasion to sequence a more abstract object
(perhaps it is not all in memory) which requires more than 32 bits to
address.
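
To make the size limit concrete, here is a minimal C sketch. The helper
names fits_in_c_int and report_widths are hypothetical and are not part of
the Python C-API; the sketch only checks whether a given length could be
reported through an int-sized slot on the platform it is compiled on.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Python C-API): returns 1 if a
   length can be reported through an int-based length slot, 0 otherwise. */
int fits_in_c_int(long long length)
{
    /* C99 guarantees long long is at least 64 bits wide. */
    assert(sizeof(long long) * CHAR_BIT >= 64);
    return length >= 0 && length <= (long long)INT_MAX;
}

/* Hypothetical helper: prints the widths that matter on this platform.
   On many 64-bit systems the pointer is 8 bytes while int stays at 4. */
void report_widths(void)
{
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    printf("INT_MAX        = %d\n", INT_MAX);
}
```

On a platform where int is 4 bytes, a 4 GB mmap'ed file has a length of
(long long)1 << 32, for which fits_in_c_int returns 0.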
These are the solutions I've seen:
1) Convert all C-ints to Py_LONG_LONG in the sequence and buffer protocols.
2) Add new C-APIs that mirror the current ones which use Py_LONG_LONG
instead of the current int.
3) Change Python to use the mapping protocol first (even for slicing)
when both the mapping and sequence protocols are defined.
4) Tell writers of such large objects to not use the sequence and/or
buffer protocols and instead use the mapping protocol and a different
"bytes" object (that currently they would have to implement themselves
and ignore the buffer protocol C-API).
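
As a rough illustration of what solution 2 might look like, here is a
sketch of a wide length function sitting beside an int-based one. Everything
in it is an assumption made for illustration: big_seq, big_seq_length and
big_seq_length_ll are hypothetical names, not existing Python C-API, and
plain long long stands in for Py_LONG_LONG so the sketch compiles without
Python.h.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical object whose item count may exceed INT_MAX. */
typedef struct {
    long long n_items;
} big_seq;

/* Hypothetical wide variant: reports the full length. */
long long big_seq_length_ll(const big_seq *s)
{
    assert(s != NULL);
    return s->n_items;
}

/* Hypothetical int-based variant kept for existing callers. It returns -1
   when the length does not fit in an int, rather than silently truncating. */
int big_seq_length(const big_seq *s)
{
    assert(s != NULL);
    if (s->n_items > (long long)INT_MAX) {
        return -1;
    }
    return (int)s->n_items;
}
```

The trade-off in this arrangement is that old callers keep working for
small objects and get an explicit failure for large ones, while new code
has to opt in to the wide function.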
What is the opinion of people on this list about how to fix the
problem? I believe Martin was looking at the problem and had told
Perry Greenfield he was "fixing it." Apparently at the recent PyCon,
Perry and he talked and Martin said the problem is harder than he had
initially thought. It would be good to document what some of these
problems are so that the community can assist in fixing them.
-Travis O.