[Python-Dev] Division of tool labour in porting Python 2 code to 2/3

Fri Jun 6 18:37:30 CEST 2014

Glyph and Alex's email about what they would like to see to assist in
writing Python 2/3 code got me thinking about where in the toolchain
various warnings and such should go, so that energy can be directed at
developing whatever future toolchain will assist in porting.

There seem to be three places where issues are/can be caught once a
project has embarked down the road of 2/3 source compatibility:

   1. -3 warnings
   2. Some linter tool
   3. Failing tests

-3 warnings are for things that we know are flat-out wrong and that do not
cause massive compatibility issues in the stdlib. For instance, the warning
that buffer() is not in Python 3 (already a py3k warning; Glyph made a
mistake when he asked for it as a new warning) is a perfect example of
something that isn't excessively noisy and won't cause issues when people
run with it.

But what about warning about classic classes? The stdlib is full of them
and they were purposefully left alone for compatibility reasons. But there
is a subtle semantic difference between classic and new-style classes, and
so 2/3 code should consider switching (this is when people chime in saying
"this is why we want a 2.8 release!", but that still isn't happening). If
this were made a py3k warning in 2.7 then the stdlib itself would spew out
warnings which we can't change due to compatibility, so that makes it not
useful (http://bugs.python.org/issue21231). But as part of a lint tool
specific to Python 2.7 that kind of warning would not be an issue and is
easily managed and integrated into CI setups to make sure there are no
regressions.
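
As a rough sketch of what such a lint check might look like (an
illustration only, not an existing tool; the function name
find_classic_classes is made up here), the ast module can flag class
statements that list no base classes. It assumes the check runs under the
interpreter the code targets, since ast.parse uses the running
interpreter's grammar, and it ignores a module-level __metaclass__ = type,
which also makes a bare class new-style in 2.7.

```python
import ast


def find_classic_classes(source, filename="<string>"):
    """Return (name, lineno) for each class statement with no base classes.

    Hypothetical helper: under Python 2.7 a class without bases is a
    classic class unless __metaclass__ says otherwise (not checked here).
    """
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and not node.bases:
            hits.append((node.name, node.lineno))
    # ast.walk makes no ordering promise, so report in source order.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = "class Old:\n    pass\n\nclass New(object):\n    pass\n"
    for name, lineno in find_classic_classes(sample):
        print("line %d: class %s has no base classes" % (lineno, name))
```

A CI job could run a check like this over the project's own files and fail
on any hit, without touching the stdlib at all.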

Lastly, there are things like string/unicode comparisons.
http://bugs.python.org/issue21401 has a patch from Victor which warns when
comparing strings and unicode in Python 2.7. Much like the classic classes
example, the stdlib becomes rather noisy due to APIs that handle either/or,
etc. But unlike the classic classes example, you just can't systematically
verify that two variables are always going to be str vs. unicode in Python
2.7 if they aren't literals. If people want to implement type constraint
graphs for 2.7 code to help find them then that's great, but I personally
don't have that kind of time. In this instance it would seem like relying
on a project's unit tests to find this sort of problem is the best option.
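
To illustrate leaning on tests here, a project could pin down the type at
an API boundary and assert it directly. This is only a sketch: header_name
is a hypothetical function standing in for whatever API accepts either
bytes or text, and the ASCII decoding is an assumption made for the
example.

```python
import unittest

text_type = type(u"")  # unicode on Python 2, str on Python 3


def header_name(raw):
    """Hypothetical function under test: normalize a header name to text."""
    if isinstance(raw, bytes):  # bytes is an alias for str on Python 2
        raw = raw.decode("ascii")
    return raw.lower()


class HeaderNameTests(unittest.TestCase):
    def test_bytes_and_text_give_equal_text(self):
        from_bytes = header_name(b"Content-Type")
        from_text = header_name(u"Content-Type")
        # Check the type explicitly; on Python 2 an equality check alone
        # would pass even if one side were still a byte string.
        self.assertIsInstance(from_bytes, text_type)
        self.assertIsInstance(from_text, text_type)
        self.assertEqual(from_bytes, from_text)


if __name__ == "__main__":
    unittest.main()
```

Running the same test file under both 2.7 and 3.x is what surfaces the
mismatch, since no static check sees the runtime types.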

With those three levels in mind, where do we draw the line between them?
Take for instance the print statement. Right now there is no
warning with -3. Do we add one and then update the 2.7 stdlib to prevent
warnings being generated by the stdlib? Or do we add it to some linter tool
to pick up when people accidentally leave one in their code?
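
For a sense of what the linter option could involve, here is a minimal
sketch (the name find_print_statements is made up, and this is not an
existing tool) that uses the tokenize module to flag a print that starts a
statement and is not followed by an opening parenthesis. It works on
tokens rather than a parse tree, so it does not depend on which grammar
the running interpreter accepts. It assumes the source is passed as text
(unicode), and it will miss cases such as print ("a"), "b".

```python
import io
import tokenize


def find_print_statements(source):
    """Return line numbers where `print` looks like a statement.

    Hypothetical heuristic: `print` at the start of a statement and not
    immediately followed by '('.
    """
    skip = (tokenize.COMMENT, tokenize.NL)
    starters = (tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT)
    readline = io.StringIO(source).readline
    tokens = [t for t in tokenize.generate_tokens(readline) if t[0] not in skip]

    hits = []
    for i, tok in enumerate(tokens):
        tok_type, tok_text, start = tok[0], tok[1], tok[2]
        if tok_type != tokenize.NAME or tok_text != "print":
            continue
        prev = tokens[i - 1] if i > 0 else None
        at_start = (
            prev is None or prev[0] in starters or prev[1] in (";", ":")
        )
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        followed_by_paren = nxt is not None and nxt[1] == "("
        if at_start and not followed_by_paren:
            hits.append(start[0])  # start is a (row, column) pair
    return hits
```

For example, given a file containing print "hello" on line 2 and
print("ok") on line 3, only line 2 would be reported.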

The reason I ask is that once this is clear I'm willing to spearhead the
tooling work we talked about at the language summit, to make sure there's
a clear path for people wanting to port which is as easy as (reasonably)
possible. But I don't want to start on it until I have a clear indication
of what people are going to be okay with.