[Tutor] Limitation of int() in converting strings
eryksun
eryksun at gmail.com
Fri Dec 28 07:34:21 CET 2012
On Thu, Dec 27, 2012 at 12:13 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:
>
> I hadn't realised that. Does the int(obj) function use isinstance(obj,
> str) under the hood?
Yes. int_new and long_new use the macros PyString_Check (in 3.x
PyBytes_Check) and PyUnicode_Check, which check the type's tp_flags.
The C API can check for a subclass via tp_flags for the following
types:
#define Py_TPFLAGS_INT_SUBCLASS (1L<<23)
#define Py_TPFLAGS_LONG_SUBCLASS (1L<<24)
#define Py_TPFLAGS_LIST_SUBCLASS (1L<<25)
#define Py_TPFLAGS_TUPLE_SUBCLASS (1L<<26)
#define Py_TPFLAGS_STRING_SUBCLASS (1L<<27)
#define Py_TPFLAGS_UNICODE_SUBCLASS (1L<<28)
#define Py_TPFLAGS_DICT_SUBCLASS (1L<<29)
#define Py_TPFLAGS_BASE_EXC_SUBCLASS (1L<<30)
#define Py_TPFLAGS_TYPE_SUBCLASS (1L<<31)
In 3.x bit 27 is renamed Py_TPFLAGS_BYTES_SUBCLASS.
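As a rough illustration (not taken from the CPython sources), the same bits can be inspected from Python through the type's __flags__ attribute. The constant values are copied from the defines above; the helper has_flag and the class MyStr are made up for this sketch, and it assumes CPython 3.x, where these bit positions are unchanged.

```python
# Bit positions copied from the defines above (3.x naming for bit 27).
Py_TPFLAGS_LONG_SUBCLASS = 1 << 24
Py_TPFLAGS_BYTES_SUBCLASS = 1 << 27
Py_TPFLAGS_UNICODE_SUBCLASS = 1 << 28


def has_flag(tp, flag):
    """Hypothetical helper: True if the type's tp_flags has the bit set."""
    return bool(tp.__flags__ & flag)


class MyStr(str):  # hypothetical str subclass, inherits the flag from str
    pass


print(has_flag(MyStr, Py_TPFLAGS_UNICODE_SUBCLASS))  # True
print(has_flag(MyStr, Py_TPFLAGS_BYTES_SUBCLASS))    # False
print(has_flag(bool, Py_TPFLAGS_LONG_SUBCLASS))      # True, bool subclasses int
```

A subclass inherits the bit from its base, which is why a single mask test is enough for the C API to recognize it.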
nb_int (__int__) in a type's PyNumberMethods is a unaryfunc, so
__int__ as designed can't have the optional "base" argument that's
used for strings. That has to be special cased.
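A minimal sketch of what that means in practice, assuming a made-up class Answer that only defines __int__. The exact error text may differ between versions, so the example just prints whatever is raised.

```python
class Answer(object):  # hypothetical class for illustration
    def __int__(self):
        # Unary slot: there is no parameter through which a base could arrive.
        return 42


print(int(Answer()))  # 42

try:
    int(Answer(), 10)  # explicit base with a non-string object
except TypeError as exc:
    print("TypeError:", exc)
```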
Without a specified base, int_new (in 3.x long_new) redirects to the
abstract function PyNumber_Int (in 3.x PyNumber_Long). This tries
__int__ and __trunc__ (the latter returns an Integral, which is
converted to int) before checking for a string or char buffer.
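One visible consequence of that ordering, sketched with a made-up str subclass named Seven: without a base, __int__ wins over the string contents, while an explicit base takes the string path and parses the text.

```python
class Seven(str):  # hypothetical str subclass for illustration
    def __int__(self):
        return 42


s = Seven('7')

print(int(s))      # 42: __int__ is tried before the string check
print(int(s, 10))  # 7: with a base, the string contents are parsed
```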
Using the buffer interface is the reason the following works for a
bytearray in 2.x:
>>> int(bytearray('123'))
123
but specifying a base fails:
>>> int(bytearray('123'), 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() can't convert non-string with explicit base
long_new in 3.x adds a PyByteArray_Check:
>>> int(bytearray(b'123'), 10)
123
Regarding this whole debate, I think a separate constructor for
strings would have been cleaner, but I'm not Dutch.
Source links:
3.3, long_new (see line 4277):
http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/longobject.c#l4248
3.3, PyNumber_Long:
http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/abstract.c#l1262
2.7.3, int_new:
http://hg.python.org/cpython/file/70274d53c1dd/Objects/intobject.c#l1049
2.7.3, PyNumber_Int:
http://hg.python.org/cpython/file/70274d53c1dd/Objects/abstract.c#l1610