[Python-Dev] ssize_t: ints in header files

"Martin v. Löwis" martin at v.loewis.de
Mon May 29 23:35:45 CEST 2006


Neal Norwitz wrote:
> Should the following values be ints (limited to 2G)?
> 
>  * dict counts (ma_fill, ma_used, ma_mask)

I think Tim said he'll look into them.

>  * line #s and column #s

I think we should really formalize a limit, and then enforce it
throughout. Column numbers shouldn't exceed 16 bits, and line #s
shouldn't exceed 31 bits.
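
As a rough illustration of what "formalize and enforce" could look
like, here is a minimal sketch in C. All names (sketch_position,
sketch_set_position, the SKETCH_* macros) are hypothetical and not
part of CPython; only the 16-bit and 31-bit limits come from the
paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits taken from the suggestion above:
   columns fit in 16 bits, line numbers in 31 bits. */
#define SKETCH_MAX_COLUMN 0xFFFFL      /* 2^16 - 1 */
#define SKETCH_MAX_LINE   0x7FFFFFFFL  /* 2^31 - 1 */

typedef struct {
    int32_t  line;    /* always in [0, SKETCH_MAX_LINE] */
    uint16_t column;  /* always in [0, SKETCH_MAX_COLUMN] */
} sketch_position;

/* Returns 0 and fills *out when the position is within the limits,
   -1 otherwise (the caller would then reject the source). */
static int
sketch_set_position(sketch_position *out, long long line, long long column)
{
    assert(out != NULL);
    if (line < 0 || line > SKETCH_MAX_LINE)
        return -1;
    if (column < 0 || column > SKETCH_MAX_COLUMN)
        return -1;
    out->line = (int32_t)line;
    out->column = (uint16_t)column;
    return 0;
}
```

With the check done once at the boundary, code further down could
store positions in the narrow fields without re-validating them.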

>  * AST (asdl.h) sequences

We should first limit all the others, and then it should come
out naturally that AST sequences can be happily limited to 31 bits.

>  * recursion limit

This should be Py_ssize_t. While the stack is typically more limited,
it should be possible to configure it to exceed 4GiB on a 64-bit
machine.

>  * read/write/send/recv results and buf size

This is very tricky. Often, the underlying C library has int there
(e.g. msvcrt). Eventually, we should get rid of msvcrt, and then
hope that the underlying system API can deal with larger buffers.
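
Until then, one common workaround (not something proposed in this
message) is to split a large buffer into calls that each fit in an
int. The sketch below assumes a hypothetical int-based write routine,
passed in as a function pointer; sketch_write_fn, sketch_write_all
and sketch_counting_sink are illustrative names, not real APIs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical signature of an underlying write routine whose length
   and result are plain ints. Returns bytes written, or -1 on error. */
typedef int (*sketch_write_fn)(void *ctx, const char *buf, int len);

/* Write `size` bytes by issuing calls of at most INT_MAX bytes each.
   Returns 0 once everything is written, -1 on error or no progress. */
static int
sketch_write_all(sketch_write_fn write_fn, void *ctx,
                 const char *buf, size_t size)
{
    assert(write_fn != NULL);
    while (size > 0) {
        /* Clamp the request so it is representable as an int. */
        int chunk = size > (size_t)INT_MAX ? INT_MAX : (int)size;
        int n = write_fn(ctx, buf, chunk);
        if (n <= 0 || n > chunk)
            return -1;
        buf += n;
        size -= (size_t)n;
    }
    return 0;
}

/* Example sink: accepts at most 4 bytes per call and adds the number
   of accepted bytes to the size_t counter that ctx points to. */
static int
sketch_counting_sink(void *ctx, const char *buf, int len)
{
    size_t *total = ctx;
    int n = len < 4 ? len : 4;
    (void)buf;
    *total += (size_t)n;
    return n;
}
```

The loop also copes with short writes, which is why the example sink
deliberately accepts only a few bytes per call.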

>  * code object values like # args, # locals, stacksize, # instructions

IMO, they should all get formally limited to 15 bits (i.e. a signed short).
I think some are already thus limited through the byte code format.
Somebody would have to check, and document what the limits are.

Either we end up with different limits for each one, or, more likely,
the same limit, in which case I would suggest introducing another
symbolic type (e.g. Py_codesize_t or some such). Then we should
consistently reject source code that exceeds such a limit, and
elsewhere rely on the guarantee that the values will always be
limited to much less than the underlying data structures can hold.
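
A minimal sketch of such a symbolic type, assuming a single 15-bit
limit. Py_codesize_t is only a name floated above and does not exist
in CPython, so the sketch uses the hypothetical sketch_codesize_t and
a hypothetical conversion helper instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the suggested Py_codesize_t.
   A 15-bit limit fits in the non-negative range of a signed short. */
typedef short sketch_codesize_t;
#define SKETCH_CODESIZE_MAX 0x7FFF  /* 2^15 - 1 */

/* Convert a count gathered while compiling into the narrow type.
   Returns 0 on success, -1 if the value exceeds the limit, in which
   case the caller would reject the source code. */
static int
sketch_codesize_from_long(long value, sketch_codesize_t *out)
{
    assert(out != NULL);
    if (value < 0 || value > SKETCH_CODESIZE_MAX)
        return -1;
    *out = (sketch_codesize_t)value;
    return 0;
}
```

Funnelling every conversion through one checked helper is what would
let the rest of the code rely on the guarantee without further tests.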

>  * sre (i think i have a patch to fix this somewhere)

This is a huge set of changes, I think. SRE *should* support strings
larger than 4GiB. I could happily accept limitations for the regexes
themselves (e.g. size of the compiled expression, number of repeats,
etc).

Regards,
Martin

