Overflow error (was Vol 67, Issue 192)
Scott David Daniels
Scott.Daniels at Acm.Org
Mon Apr 13 16:13:40 EDT 2009
Dave Angel wrote:
> Ryniek90 wrote:
>>>> .... But I still haven't got an answer to the question:
>>>> "What's the max. length of string bytes which Python can hold?"
>>> sys.maxsize
>>> The largest positive integer supported by the platform’s
>>> Py_ssize_t type, and thus the maximum size lists, strings, dicts, and
>>> many other containers can have.
>> Thanks. I wanted to check this carefully, and I found
>> this: "strings (currently restricted to 2GiB)".
>> It's in PEP 353
>> <http://www.python.org/dev/peps/pep-0353/>. Besides this, I
>> found the following in the sys module's docstring:
>> maxint = 2147483647
>> maxunicode = 1114111
>> Added together, these give 2148597758.0 bytes, which equals
>> 2049.0624980926514 MiB.
This arithmetic makes very little sense. You are adding the maximum
value for a unicode code point and the maximum integer represented in
the underlying C compiler's int. If you were to do some kind of arithmetic
on those two numbers, I'd do:
sys.maxint / math.ceil(math.log(sys.maxunicode, 256))
That is "supposed to be" the number of Unicode characters in a
maximal-length sequence of bytes. However, it doesn't even manage
that, as (I believe) even for those Pythons with 32-bit Unicode
characters, sys.maxunicode is currently 0x10FFFF (the largest
code point defined by the Unicode Consortium), so the logarithm
rounds up to 3 bytes per character rather than the 4 actually used.
> How much RAM is in your system? Unless it's at least 50 GB, in a 64-bit
> OS, I'd keep my max chunk size much smaller than 2 GB. For a typical
> 32-bit system with 2 to 4 GB of RAM, I'd probably chunk the file a meg or
> so at a time. Using large sizes is almost always a huge waste of resources.
Agreed. If you must do arithmetic to determine the chunk length,
using the magic constant for practically everything, "42", can give you
the chunk size to use:
ord("4") * ord("2") * int("42") == 109200
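
More seriously, reading a file "a meg or so at a time" might look roughly like the sketch below. The names CHUNK_SIZE and iter_chunks are hypothetical, and an in-memory stream stands in for a real file opened in binary mode.

```python
import io

CHUNK_SIZE = 1024 * 1024  # one MiB per read, following the "a meg or so" advice


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive blocks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


# Hypothetical usage: an in-memory stream slightly larger than two chunks.
data = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 5))
sizes = [len(chunk) for chunk in iter_chunks(data)]
print(sizes)
```

Only one chunk is held in memory at a time, so the peak footprint stays near CHUNK_SIZE regardless of how large the file is.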
--Scott David Daniels
Scott.Daniels at Acm.Org