[Tutor] Limitation of int() in converting strings

eryksun eryksun at gmail.com
Fri Dec 28 07:34:21 CET 2012


On Thu, Dec 27, 2012 at 12:13 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:
>
> I hadn't realised that. Does the int(obj) function use isinstance(obj,
> str) under the hood?

Yes. int_new and long_new use the macros PyString_Check (in 3.x
PyBytes_Check) and PyUnicode_Check, which check the type's tp_flags.
The C API can check for a subclass via tp_flags for the following
types:

    #define Py_TPFLAGS_INT_SUBCLASS         (1L<<23)
    #define Py_TPFLAGS_LONG_SUBCLASS        (1L<<24)
    #define Py_TPFLAGS_LIST_SUBCLASS        (1L<<25)
    #define Py_TPFLAGS_TUPLE_SUBCLASS       (1L<<26)
    #define Py_TPFLAGS_STRING_SUBCLASS      (1L<<27)
    #define Py_TPFLAGS_UNICODE_SUBCLASS     (1L<<28)
    #define Py_TPFLAGS_DICT_SUBCLASS        (1L<<29)
    #define Py_TPFLAGS_BASE_EXC_SUBCLASS    (1L<<30)
    #define Py_TPFLAGS_TYPE_SUBCLASS        (1L<<31)

In 3.x bit 27 is renamed Py_TPFLAGS_BYTES_SUBCLASS.
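
The same bits can be inspected from Python, as a rough sketch. It assumes CPython 3.x, where type.__flags__ exposes tp_flags as an implementation detail; the constant names, MyStr, and has_flag are made up for illustration, and the bit positions are copied from the list above.

```python
# Bit positions copied from the #define list above (CPython-specific).
LONG_SUBCLASS = 1 << 24
LIST_SUBCLASS = 1 << 25
BYTES_SUBCLASS = 1 << 27     # Py_TPFLAGS_STRING_SUBCLASS in 2.x
UNICODE_SUBCLASS = 1 << 28
DICT_SUBCLASS = 1 << 29


class MyStr(str):
    """Hypothetical str subclass; it inherits the unicode flag from str."""


def has_flag(tp, flag):
    # type.__flags__ mirrors the C-level tp_flags field in CPython.
    return bool(tp.__flags__ & flag)


print(has_flag(str, UNICODE_SUBCLASS))    # True
print(has_flag(MyStr, UNICODE_SUBCLASS))  # True, inherited by the subclass
print(has_flag(bytes, BYTES_SUBCLASS))    # True
print(has_flag(int, UNICODE_SUBCLASS))    # False
```

This is the check that PyUnicode_Check performs, which is why it behaves like isinstance(obj, str) for real subclasses.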

nb_int (__int__) in a type's PyNumberMethods is a unaryfunc, so
__int__ as designed can't take the optional "base" argument that's
used for strings. That case has to be special-cased in the constructor.
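
A small sketch of the consequence, written for 3.x. Celsius is a hypothetical class, not anything from the standard library: int() can call its __int__, but there is no way to pass a base through to it.

```python
class Celsius:
    """Hypothetical numeric-like type that only defines __int__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):
        # __int__ receives only self; there is no slot for a base argument.
        return int(self.degrees)


t = Celsius(21.7)
plain = int(t)          # goes through __int__, giving 21

base_error = None
try:
    int(t, 10)          # not a string, so an explicit base is rejected
except TypeError as exc:
    base_error = exc

print(plain)
print(type(base_error).__name__)
```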

Without a specified base, int_new (in 3.x long_new) redirects to the
abstract function PyNumber_Int (in 3.x PyNumber_Long). This tries
__int__ and __trunc__ (the latter returns an Integral, which is
converted to int) before checking for a string or char buffer.
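
One way to see that ordering in 3.x is a str subclass that also defines __int__. Answer is a made-up class for illustration. Without a base, __int__ is tried first; with a base, the constructor goes straight to string parsing.

```python
class Answer(str):
    """Hypothetical str subclass whose __int__ ignores the string value."""

    def __int__(self):
        return 42


a = Answer('7')

no_base = int(a)        # __int__ is tried before the string check: 42
with_base = int(a, 10)  # explicit base takes the string path: 7

print(no_base, with_base)
```

The two calls disagree on the same object, which illustrates why mixing both conversions in one constructor is awkward.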

Using the buffer interface is the reason the following works for a
bytearray in 2.x:

    >>> int(bytearray('123'))
    123

but specifying a base fails:

    >>> int(bytearray('123'), 10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: int() can't convert non-string with explicit base

long_new in 3.x adds a PyByteArray_Check:

    >>> int(bytearray(b'123'), 10)
    123

Regarding this whole debate, I think a separate constructor for
strings would have been cleaner, but I'm not Dutch.

Source links:

3.3, long_new (see 4277):
http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/longobject.c#l4248

3.3, PyNumber_Long:
http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/abstract.c#l1262

2.7.3, int_new:
http://hg.python.org/cpython/file/70274d53c1dd/Objects/intobject.c#l1049

2.7.3, PyNumber_Int:
http://hg.python.org/cpython/file/70274d53c1dd/Objects/abstract.c#l1610

