Just to update this thread: Serhiy beat me to a patch and committed it. While making the change he found a few latent bugs in the stdlib itself, so the idea has already paid for itself.
On Mon, Mar 16, 2015 at 1:25 PM Brett Cannon <bcannon@gmail.com> wrote:
On Mon, Mar 16, 2015 at 11:22 AM Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
2015-03-16 16:11 GMT+01:00 Brett Cannon <bcannon@gmail.com>:
One of the rather subtle issues with writing Python 2/3 code is that indexing on bytes in Python 2 returns a length-1 bytes object while in Python 3 it returns an int. Because ==/!= always return True/False instead of raising, the failure can be very subtle and tough to track down.
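To make the failure mode concrete, here is a minimal sketch of the comparison in question. The variable names are only illustrative, and the comments describe the behaviour as I understand it on each major version.

```python
data = b"abc"

# Python 3: indexing yields the int 97. Python 2: it yields the str 'a'.
first = data[0]

# On Python 3 an int never compares equal to a bytes object, so this is
# silently False. No exception is raised that would point at the bug.
matches_naive = (first == b"a")

# Slicing returns a length-1 bytes object on both versions.
matches_slice = (data[0:1] == b"a")

# Comparing against the ordinal is only correct on Python 3.
matches_int = (first == ord("a"))

print(matches_naive, matches_slice, matches_int)
```

Slicing is the spelling that behaves the same on both versions, which is why the silent False in the naive form is so easy to miss.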
I worked on such a patch in the past, but I lost it :-) I can help rewrite it if needed.
I filed http://bugs.python.org/issue23681 if you want to help (although I don't expect the patch to be complicated; I might ask you for a code review though =).
So yes, it *is* very useful to port a large Python 2 project to Python 3.
For example, you may not be able to run your application with Python 3 because a third party library cannot be imported on Python 3, so it blocks the whole work on porting an application to Python 3. Until the module is ported, you may want to prepare the port. Checking bytes==str just by reading the source code is difficult.
Other issues which can only be "seen" at runtime when running an application on Python 3:
- "x > 0" with x=None => TypeError is raised in Python 3
Testing can easily catch that *and* have a traceback to trace down where the problem originated from, so I'm not worried.
- x / 8 where x is an int => becomes a float in Python 3, it's hard to detect this issue in Python 2 just by reading the source code :-/
-Q in Python 2 and/or adding the division __future__ statement solves this one.
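The two runtime differences discussed above can be sketched as follows. This is only an illustration: `is_positive` is a made-up helper, and the comments describe Python 3 behaviour with the Python 2 behaviour noted alongside.

```python
def is_positive(x):
    # Python 2 orders None below every int, so None > 0 is simply False.
    # Python 3 refuses to order None against an int and raises TypeError.
    return x > 0


try:
    result = is_positive(None)
except TypeError:
    # Only reached on Python 3; the traceback would show where None came from.
    result = "TypeError"

x = 17
true_div = x / 8    # float on Python 3; floor division on Python 2 unless
                    # the module starts with "from __future__ import division"
floor_div = x // 8  # floor division on both versions

print(result, true_div, floor_div)
```

The first case fails loudly on Python 3, while the second only changes a value, which is why the `__future__` import (or -Q) is the practical way to catch it while still on Python 2.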
Basically I'm just trying to plug the last few holes that can't be automated by tools like Modernize or Futurize *and* whose origin won't surface easily during testing under Python 2 or 3 (e.g., a traceback during testing makes the problem easy to find unless you swallow the exception, in which case a warning won't help you either if you use -Werror).
What do people think of extending -b/-bb in Python 3 to warn when performing equality between an int and a bytes object of any length? I don't want to restrict to length-1 bytes objects because people may be doing comparisons where the result can be length-1 or any other length and thus would still have a subtle bug to pick up. Do people think this would raise a ton of false-positives? Would people find it useful?
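A rough way to see what -bb does with these comparisons is to run small snippets in a child interpreter. This is a sketch, not part of any patch: `run_with_bb` is a hypothetical helper, and the bytes/int case assumes an interpreter that includes the change tracked in issue23681 (3.5 and later, as far as I know). The bytes/str case is what -b/-bb already covered.

```python
import subprocess
import sys


def run_with_bb(snippet):
    """Run snippet under "python -bb" and return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-bb", "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


# bytes compared with str: already turned into an error by -bb.
str_rc, str_err = run_with_bb("b'a' == 'a'")

# int (from indexing bytes) compared with bytes: the case proposed here.
int_rc, int_err = run_with_bb("b'abc'[0] == b'a'")

print(str_rc, int_rc)
```

With -b instead of -bb the same comparisons only print a warning, and the child exits normally.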
First ensure that the stdlib doesn't raise any BytesWarning exception.
Yep, I always test with -Werror whenever I muck with something that can raise a warning (side-effect from helping to write _warnings.c).
-Brett
For example, os.get_exec_path() has to modify warnings filters temporarily in Python 3 :-/
# {b'PATH': ...}.get('PATH') and {'PATH': ...}.get(b'PATH') emit a
# BytesWarning when using python -b or python -bb: ignore the warning
Victor
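For readers who have not seen the pattern, the temporary filter change mentioned for os.get_exec_path() looks roughly like the sketch below. `lookup_path` is a hypothetical helper written for illustration, not the stdlib code. It assumes, as the quoted comment says, that a dict lookup mixing str and bytes keys can emit BytesWarning under -b/-bb.

```python
import warnings


def lookup_path(env):
    """Return env's PATH entry whether its keys are str or bytes, else None."""
    with warnings.catch_warnings():
        # Filters changed inside this block are restored on exit, so the
        # rest of the program still sees BytesWarning as configured.
        warnings.simplefilter("ignore", BytesWarning)
        path = env.get("PATH")
        if path is None:
            path = env.get(b"PATH")
    return path


print(lookup_path({"PATH": "/usr/bin"}))
print(lookup_path({b"PATH": b"/usr/bin"}))
```

Scoping the filter with catch_warnings() keeps the suppression local to the one lookup that is known to be harmless.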