Warn about comparing bytes to int for `python3 -b`
One of the rather subtle issues with writing Python 2/3 code is that indexing on bytes in Python 2 returns a length-1 bytes object while in Python 3 it returns an int. Because ==/!= always returns True/False it can be a very subtle failure and tough to track down.
What do people think of extending -b/-bb in Python 3 to warn when performing equality between an int and a bytes object of any length? I don't want to restrict to length-1 bytes objects because people may be doing comparisons where the result can be length-1 or any other length and thus would still have a subtle bug to pick up. Do people think this would raise a ton of false-positives? Would people find it useful?
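A minimal sketch of the pitfall described above, assuming Python 3 semantics. The variable names are illustrative only and do not come from the thread.

```python
# Python 3: indexing bytes yields an int, while slicing yields bytes.
data = b"abc"

first = data[0]          # int 97 on Python 3 (a length-1 bytes object on Python 2)
first_slice = data[:1]   # b"a" on both Python 2 and Python 3

# Code written for Python 2 often compares the indexed value with a
# length-1 bytes literal. On Python 3 this never raises; it is simply False.
looks_like_a = (first == b"a")

# Slicing keeps the comparison bytes-to-bytes on both versions.
portable_check = (first_slice == b"a")

# Python 3 only: compare against the integer value instead.
int_check = (first == ord("a"))
```

Because `looks_like_a` is quietly `False` rather than an error, nothing points at the line that needs porting, which is the gap the proposed warning is meant to close.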
As long as it's part of -b/-bb this sounds like a useful (though small) bit of help for people in the last throes of porting a big package to PY3. As for how many false positives it will trigger, who knows? The most likely case would be if people use dicts whose keys can be bytestrings or ints -- I know that's a popular hobby when it comes to str/int, but I don't know if it's also common with bytes/int. I guess the only way to find out is to build and release it.
On Mon, Mar 16, 2015 at 8:11 AM, Brett Cannon <bcannon@gmail.com> wrote:
[...]
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido van Rossum (python.org/~guido)
http://bugs.python.org/issue23681 to track this for Python 3.5.
On Mon, Mar 16, 2015 at 11:16 AM Guido van Rossum <guido@python.org> wrote:
[...]
Hi,
2015-03-16 16:11 GMT+01:00 Brett Cannon <bcannon@gmail.com>:
One of the rather subtle issues with writing Python 2/3 code is that indexing on bytes in Python 2 returns a length-1 bytes object while in Python 3 it returns an int. Because ==/!= always returns True/False it can be a very subtle failure and tough to track down.
I worked on such a patch in the past, but I lost it :-) I can help to rewrite it if needed.
So yes, it *is* very useful to port a large Python 2 project to Python 3.
For example, you may not be able to run your application with Python 3 because a third party library cannot be imported on Python 3, so it blocks the whole work on porting an application to Python 3. Until the module is ported, you may want to prepare the port. Checking bytes==str just by reading the source code is difficult.
Other issues which can only be "seen" at runtime when running an application on Python 3:
- "x > 0" with x=None => TypeError is raised in Python 3
- x / 8 where x is an int => becomes a float in Python 3, it's hard to detect this issue in Python 2 just by reading the source code :-/
What do people think of extending -b/-bb in Python 3 to warn when performing equality between an int and a bytes object of any length? I don't want to restrict to length-1 bytes objects because people may be doing comparisons where the result can be length-1 or any other length and thus would still have a subtle bug to pick up. Do people think this would raise a ton of false-positives? Would people find it useful?
First ensure that the stdlib doesn't raise any BytesWarning exception.
For example, os.get_exec_path() has to modify warnings filters temporarily in Python 3 :-/
# {b'PATH': ...}.get('PATH') and {'PATH': ...}.get(b'PATH') emit a
# BytesWarning when using python -b or python -bb: ignore the warning
Victor
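The filter adjustment Victor mentions can be sketched roughly as follows. This is not the actual os.get_exec_path() source; `lookup_path` is a hypothetical helper that only illustrates temporarily ignoring BytesWarning around a mixed str/bytes dict lookup.

```python
import warnings


def lookup_path(env):
    """Return env's PATH entry, trying the str key first and then the bytes key.

    Hypothetical helper for illustration; not part of the standard library.
    """
    with warnings.catch_warnings():
        # Looking up a str key in a dict that has bytes keys (or the
        # reverse) can end up comparing str with bytes, which warns under
        # -b and raises under -bb. The filter is restored on exit.
        warnings.simplefilter("ignore", BytesWarning)
        value = env.get("PATH")
        if value is None:
            value = env.get(b"PATH")
    return value
```

Scoping the filter with `warnings.catch_warnings()` keeps the suppression local, so BytesWarning stays active for the rest of the program.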
On Mon, Mar 16, 2015 at 11:22 AM Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
2015-03-16 16:11 GMT+01:00 Brett Cannon <bcannon@gmail.com>:
One of the rather subtle issues with writing Python 2/3 code is that indexing on bytes in Python 2 returns a length-1 bytes object while in Python 3 it returns an int. Because ==/!= always returns True/False it can be a very subtle failure and tough to track down.
I worked on such a patch in the past, but I lost it :-) I can help to rewrite it if needed.
I filed http://bugs.python.org/issue23681 if you want to help (although I don't expect the patch to be complicated; I might ask you for a code review though =).
So yes, it *is* very useful to port a large Python 2 project to Python 3.
For example, you may not be able to run your application with Python 3 because a third party library cannot be imported on Python 3, so it blocks the whole work on porting an application to Python 3. Until the module is ported, you may want to prepare the port. Checking bytes==str just by reading the source code is difficult.
Other issues which can only be "seen" at runtime when running an application on Python 3 :
- "x > 0" with x=None => TypeError is raised in Python 3
Testing can easily catch that *and* have a traceback to trace down where the problem originated from, so I'm not worried.
- x / 8 where x is an int => becomes a float in Python 3, it's hard to detect this issue in Python 2 just by reading the source code :-/
-Q in Python 2 and/or adding the division __future__ statement solves this one.
Basically I'm just trying to plug the last few holes that can't be automated by tools like Modernize or Futurize *and* won't surface where the problem is easily during testing under Python 2 or 3 (e.g., a traceback during testing makes it easy to find unless you swallow the exception, in which case a warning won't help you either if you use -Werror).
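For comparison, a small sketch of the two runtime differences discussed above as they appear on Python 3. The names are illustrative. On Python 2 the division behaviour shown here would need `from __future__ import division` at the top of the module, which is left out so the snippet runs unchanged wherever it is pasted.

```python
def compare_with_none():
    """Show that ordering None against an int fails loudly on Python 3."""
    x = None
    try:
        return x > 0
    except TypeError as exc:
        # Python 3 raises, so a test run produces a traceback that points
        # at the offending line.
        return "TypeError: {}".format(exc)


bits = 20
true_quotient = bits / 8    # float on Python 3 (true division)
floor_quotient = bits // 8  # int on both Python 2 and Python 3
```

Both cases either raise or can be switched on ahead of time in Python 2, which is why they are considered easier to catch than a bytes/int comparison that silently returns False.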
What do people think of extending -b/-bb in Python 3 to warn when performing equality between an int and a bytes object of any length? I don't want to restrict to length-1 bytes objects because people may be doing comparisons where the result can be length-1 or any other length and thus would still have a subtle bug to pick up. Do people think this would raise a ton of false-positives? Would people find it useful?
First ensure that the stdlib doesn't raise any BytesWarning exception.
Yep, I always test with -Werror whenever I muck with something that can raise a warning (side-effect from helping to write _warnings.c).
-Brett
For example, os.get_exec_path() has to modify warnings filters temporarily in Python 3 :-/
# {b'PATH': ...}.get('PATH') and {'PATH': ...}.get(b'PATH') emit a
# BytesWarning when using python -b or python -bb: ignore the warning
Victor
Just to update this thread, Serhiy beat me to a patch and committed it. Through the change he found a few latent bugs in the stdlib itself so this idea has already proven/paid for itself.
On Mon, Mar 16, 2015 at 1:25 PM Brett Cannon <bcannon@gmail.com> wrote:
[...]
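A rough way to observe the resulting behaviour, assuming a CPython 3.5 or later interpreter where the bytes/int check discussed in this thread is available under -b/-bb. The `CHILD` program and the `run_child` helper are hypothetical names used only for this sketch.

```python
import subprocess
import sys

# Child program: index a bytes object (an int on Python 3) and compare it
# with a length-1 bytes object, as ported Python 2 code might do.
CHILD = "data = bytes([97, 98]); print(data[0] == b'a')"


def run_child(flag):
    """Run CHILD in a fresh interpreter started with the given flag."""
    return subprocess.run(
        [sys.executable, flag, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


warned = run_child("-b")    # BytesWarning reported on stderr; comparison is still False
errored = run_child("-bb")  # BytesWarning is raised as an exception instead
```

Inspecting `warned.stderr` and `errored.returncode` shows the difference between the two flags: -b only reports the comparison, while -bb stops the program at it.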
participants (3)
- Brett Cannon
- Guido van Rossum
- Victor Stinner