Re: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int()

> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
> +    # expects x to be a string if base is given.
> +    @support.cpython_only
> +    def test_base_arg_with_no_x_arg(self):
> +        self.assertEquals(int(base=6), 0)
> +        # Even invalid bases don't raise an exception.
> +        self.assertEquals(int(base=1), 0)
> +        self.assertEquals(int(base=1000), 0)
> +        self.assertEquals(int(base='foo'), 0)
I think the above behavior is buggy and should be changed rather than frozen into CPython with a test. According to the docs, PyPy does it right.
The current online doc gives the signature as int(x=0) int(x, base=10) <<where x is a string>>
The 3.3.0 docstring says "When converting a string, use the optional base. It is an error to supply a base when converting a non-string."
Certainly, accepting any object as a base, violating "The allowed values are 0 and 2-36." just because giving a base is itself invalid is crazy.
-- Terry Jan Reedy
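The documented contract quoted above can be checked with a short sketch. The helper name `outcome` is hypothetical, and the result of the bare `int(base=...)` calls is deliberately not assumed here, since it depends on the interpreter and version (the test under discussion is CPython-specific).

```python
def outcome(func, *args, **kwargs):
    """Return the call's result, or the name of the exception it raises."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__


# Documented behaviour: a base applies only when converting a string.
print(outcome(int, '21', base=6))     # 13, i.e. 2*6 + 1
print(outcome(int, 21, base=6))       # TypeError: non-string with explicit base
print(outcome(int, '0', base=1))      # ValueError: base outside 0 and 2-36
print(outcome(int, '0', base='foo'))  # TypeError: base is not an integer

# The disputed calls: no x at all, only a base. Results are printed,
# not assumed, because implementations differ on this point.
for base in (6, 1, 1000, 'foo'):
    print('int(base=%r) ->' % (base,), outcome(int, base=base))
```

Running the loop on different interpreters shows directly whether `int(base=...)` returns 0 or raises.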

On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
>> +    # expects x to be a string if base is given.
>> +    @support.cpython_only
>> +    def test_base_arg_with_no_x_arg(self):
>> +        self.assertEquals(int(base=6), 0)
>> +        # Even invalid bases don't raise an exception.
>> +        self.assertEquals(int(base=1), 0)
>> +        self.assertEquals(int(base=1000), 0)
>> +        self.assertEquals(int(base='foo'), 0)
> I think the above behavior is buggy and should be changed rather than frozen into CPython with a test. According to the docs, PyPy does it right.
I support further discussion here. (I did draft the patch, but it was a first version. I did not commit the patch.)
> The current online doc gives the signature as int(x=0) int(x, base=10) <<where x is a string>>
> The 3.3.0 docstring says "When converting a string, use the optional base. It is an error to supply a base when converting a non-string."
One way to partially explain CPython's behavior is that when base is provided, the function behaves as if x defaults to '0' rather than 0. This is similar to the behavior of str(), which defaults to b'' when encoding or errors is provided, but otherwise defaults to '': http://docs.python.org/dev/library/stdtypes.html#str
> Certainly, accepting any object as a base, violating "The allowed values are 0 and 2-36." just because giving a base is itself invalid is crazy.
For further background (and you can see this in the 2.7 commit), int(base='foo') did raise TypeError in 2.7, but this particular case was relaxed in Python 3.
--Chris

On 12/23/2012 4:47 PM, Chris Jerdonek wrote:
> On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>>> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
>>> +    # expects x to be a string if base is given.
>>> +    @support.cpython_only
>>> +    def test_base_arg_with_no_x_arg(self):
>>> +        self.assertEquals(int(base=6), 0)
>>> +        # Even invalid bases don't raise an exception.
>>> +        self.assertEquals(int(base=1), 0)
>>> +        self.assertEquals(int(base=1000), 0)
>>> +        self.assertEquals(int(base='foo'), 0)
>> I think the above behavior is buggy and should be changed rather than frozen into CPython with a test. According to the docs, PyPy does it right.
In any case, the discrepancy between doc and behavior is a bug and should be fixed one way or the other. Unlike int(), I do not see a realistic use case for int(base=x) that would make it anything other than a bug.
> I support further discussion here. (I did draft the patch, but it was a first version. I did not commit the patch.)
>> The current online doc gives the signature as int(x=0) int(x, base=10) <<where x is a string>>
>> The 3.3.0 docstring says "When converting a string, use the optional base. It is an error to supply a base when converting a non-string."
> One way to partially explain CPython's behavior is that when base is provided, the function behaves as if x defaults to '0' rather than 0.
That explanation does not work. int('0', base=invalid) and int(x='0', base=invalid) raise TypeError or ValueError. If providing a value explicitly changes behavior, then that value is not the default. To make '0' really be the base-present default, the doc and above behavior should be changed. Or, make '' the default and have int('', base=whatever) return 0 instead of raising. (This would be the actual parallel to the str case.)
> This is similar to the behavior of str(), which defaults to b'' when encoding or errors is provided, but otherwise defaults to '':
This is different. Providing b'' explicitly has no effect. str(encoding=x, errors=y) and str(b'', encoding=x, errors=y) act the same. If x or y is not a string, both raise TypeError. (Unlike int and base.) A bad encoding string is ignored because the encoding lookup is not done unless there is something to encode. (This is why the ignore-base base-default should be '', not '0'.) A bad error specification is (I believe) ignored for any error-free bytes/encoding pair because, again, the lookup is only done when needed.
> http://docs.python.org/dev/library/stdtypes.html#str
>> Certainly, accepting any object as a base, violating "The allowed values are 0 and 2-36." just because giving a base is itself invalid is crazy.
> For further background (and you can see this in the 2.7 commit), int(base='foo') did raise TypeError in 2.7, but this particular case was relaxed in Python 3.
Since the doc was not changed, that introduced a bug.
-- Terry Jan Reedy
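The criterion used in this reply (a value is the real default only if passing it explicitly behaves the same as omitting it) can be expressed as a minimal sketch. The helper names `outcome` and `same_as_omitted` are hypothetical, and the int() results are printed rather than assumed because they vary by interpreter and version.

```python
def outcome(func, *args, **kwargs):
    """Return ('ok', value), or ('raised', exception name) on failure."""
    try:
        return ('ok', func(*args, **kwargs))
    except Exception as exc:
        return ('raised', type(exc).__name__)


def same_as_omitted(func, candidate, **kwargs):
    """True if passing `candidate` explicitly behaves like omitting it."""
    return outcome(func, **kwargs) == outcome(func, candidate, **kwargs)


# str(): b'' really is the default once an encoding is supplied.
print(same_as_omitted(str, b'', encoding='utf-8'))  # True: both return ''
print(same_as_omitted(str, b'', encoding=123))      # True: both raise TypeError

# int(): an explicit empty string is rejected, so it is not a usable default.
print(outcome(int, '', base=10))                    # ('raised', 'ValueError')

# Whether '0' acts as the default for int() depends on the interpreter,
# so the comparison is printed rather than assumed.
for base in (6, 1, 'foo'):
    print(base, same_as_omitted(int, '0', base=base))
```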

On Sun, Dec 23, 2012 at 6:19 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> On 12/23/2012 4:47 PM, Chris Jerdonek wrote:
>> On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>>>> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
>>>> +    # expects x to be a string if base is given.
>>>> +    @support.cpython_only
>>>> +    def test_base_arg_with_no_x_arg(self):
>>>> +        self.assertEquals(int(base=6), 0)
>>>> +        # Even invalid bases don't raise an exception.
>>>> +        self.assertEquals(int(base=1), 0)
>>>> +        self.assertEquals(int(base=1000), 0)
>>>> +        self.assertEquals(int(base='foo'), 0)
>>> I think the above behavior is buggy and should be changed rather than frozen into CPython with a test. According to the docs, PyPy does it right.
> In any case, the discrepancy between doc and behavior is a bug and should be fixed one way or the other. Unlike int(), I do not see a realistic use case for int(base=x) that would make it anything other than a bug.
Just to be clear, I agree with you that something needs fixing (and again, I did not commit the patch). But I want to clarify a couple of your responses to my points.
>> One way to partially explain CPython's behavior is that when base is provided, the function behaves as if x defaults to '0' rather than 0.
> That explanation does not work. int('0', base=invalid) and int(x='0', base=invalid) raise TypeError or ValueError.
I was referring to the behavioral discrepancy between CPython returning 0 for int(base=valid) and the part of the docstring you quoted which says, "It is an error to supply a base when converting a non-string." I wasn't justifying the case of int(base=invalid). That's why I said it only "partially" explains the behavior. The int(base=valid) case is covered by the following line of the CPython-specific test that was committed (which in PyPy raises TypeError):
+ self.assertEquals(int(base=6), 0)
> If providing a value explicitly changes behavior, then that value is not the default. To make '0' really be the base-present default, the doc and above behavior should be changed. Or, make '' the default and have int('', base=whatever) return 0 instead of raising. (This would be the actual parallel to the str case.)
>> This is similar to the behavior of str(), which defaults to b'' when encoding or errors is provided, but otherwise defaults to '':
> This is different. Providing b'' explicitly has no effect. str(encoding=x, errors=y) and str(b'', encoding=x, errors=y) act the same. If x or y is not a string, both raise TypeError. (Unlike int and base.) A bad encoding string is ignored because the encoding lookup is not done unless there is something to encode. (This is why the ignore-base base-default should be '', not '0'.) A bad error specification is (I believe) ignored for any error-free bytes/encoding pair because, again, the lookup is only done when needed.
Again, I was referring to the "valid" case. My point was that str()'s object argument defaults to '' when encoding or errors isn't given, and otherwise defaults to b''. You can see that the object argument defaults to '' in the simpler case here:
>>> str(), str(object=''), str(object=b'')
('', '', "b''")
But when the encoding argument is given the default is different (it is b''):
>>> str(object='', encoding='utf-8')
TypeError: decoding str is not supported
>>> str(encoding='utf-8'), str(object=b'', encoding='utf-8')
('', '')
But again, these are clarifications of my comments. I'm not disagreeing with your larger point.
--Chris
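The two modes of str() described in this message can also be written as a runnable check. This is only a sketch of the behaviour reported above; the sample bytes and the use of `errors='strict'` are illustrative assumptions, not part of the original exchange.

```python
# Without encoding or errors, str(object) returns the informal string
# representation of the object, so bytes are not decoded.
assert str() == ''
assert str(object='') == ''
assert str(object=b'abc') == "b'abc'"

# With encoding or errors, the object is decoded instead, so it must be
# bytes-like, and an omitted object acts like b''.
assert str(encoding='utf-8') == ''
assert str(errors='strict') == ''
assert str(object=b'abc', encoding='utf-8') == 'abc'
assert str(object=b'abc', errors='strict') == 'abc'

# A str object cannot be decoded, so combining it with an encoding fails.
decoding_error = None
try:
    str(object='', encoding='utf-8')
except TypeError as exc:
    decoding_error = str(exc)
print('TypeError:', decoding_error)
```

Supplying `errors` alone is enough to switch str() into decoding mode, which is why the message says "encoding or errors".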

On 23.12.12 22:03, Terry Reedy wrote:
> I think the above behavior is buggy and should be changed rather than frozen into CPython with a test. According to the docs, PyPy does it right.
participants (3): Chris Jerdonek, Serhiy Storchaka, Terry Reedy