[Python-Dev] Py_ssize_t
Guido van Rossum
guido at python.org
Tue Feb 20 16:57:48 CET 2007
On 2/20/07, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> After thinking more about Py_ssize_t, I'm surprised that we're not hearing about
> 64 bit users having a couple of major problems.
>
> If I'm understanding what was done for dictionaries, the hash table can grow
> larger than the range of hash values. Accordingly, I would expect large
> dictionaries to have an unacceptably large number of collisions. OTOH, we
> haven't heard a single complaint, so perhaps my understanding is off.
Not until the hash table has 4 billion entries. I believe that would be
96 GB just for the hash table, plus probably at least that much again
for that many unique key strings. Not to mention the values (but those
needn't be unique). I think the benefits of a 64-bit architecture start
once you use more than 2 or 3 GB of RAM, so there's quite a bit of
expansion space for 64-bit users before they run into this theoretical
problem.
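
For anyone who wants to check the 96 GB figure, here is a rough
back-of-the-envelope sketch. It assumes each table slot is three 8-byte
words (hash, key pointer, value pointer) on a 64-bit build, and it takes
"4 billion" to mean 2**32 slots; the constant names are made up for
illustration.

```python
# Rough size of a dict hash table with 2**32 slots on a 64-bit build.
# Assumption: one slot = hash + key pointer + value pointer, 8 bytes each.
SLOTS = 2 ** 32          # "4 billion" entries
WORD_BYTES = 8           # pointer-sized word on a 64-bit platform
WORDS_PER_SLOT = 3       # hash, key pointer, value pointer

table_bytes = SLOTS * WORDS_PER_SLOT * WORD_BYTES
table_gib = table_bytes // 2 ** 30

print(table_bytes, "bytes =", table_gib, "GiB")
```

That covers only the table itself; the key and value objects come on top
of it.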
> The other area where I expected to hear wailing and gnashing of teeth is users
> compiling with third-party extensions that haven't been updated to a Py_ssize_t
> API and still use longs. I would have expected some instability due to the size
> mismatches in function signatures -- the difference would only show up with
> giant-sized data structures -- the bigger they are, the harder they fall. OTOH,
> there have not been any complaints either -- I would have expected someone to
> submit a patch to pyport.h that allowed a #define to force Py_ssize_t back to a
> long so that the poster could make a reliable build that included non-updated
> third-party extensions.
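
To make the failure mode concrete, here is a small sketch of what happens
when a 64-bit length is squeezed through a 32-bit signed field. It assumes
the un-updated extension keeps sizes in a 32-bit C int, as the pre-2.5 API
did; the helper name is hypothetical, and ctypes is used only to mimic the
C truncation (it does no overflow checking).

```python
import ctypes

def as_32bit_int(length):
    """Mimic storing a length in a 32-bit signed C int (no overflow check)."""
    return ctypes.c_int32(length).value

# Lengths below 2**31 survive unchanged, so small data never shows the bug.
small = as_32bit_int(1000)

# Past 2**31 the value wraps around and can even come out negative.
wrapped_negative = as_32bit_int(2 ** 31 + 5)
wrapped_small = as_32bit_int(2 ** 32 + 7)

print(small, wrapped_negative, wrapped_small)
```

So the mismatch stays invisible until a single object grows past 2**31
items or bytes.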
>
> In the absence of a bug report, it's hard to know whether there is a real
> problem. Have all major third-party extensions adopted Py_ssize_t or is some
> divine force helping unconverted extensions work with converted Python code?
> Maybe the datasets just haven't gotten big enough yet.
My suspicion is that building Python for a 64-bit address space is
still a somewhat academic exercise. I know we don't do this at Google
(we switch to other languages long before the datasets become so large
we'd need a 64-bit address space for Python). What's your experience
at EWT?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)