Problem with -3 switch

John Machin sjmachin at lexicon.net
Mon Jan 12 07:32:07 CET 2009


On Jan 12, 12:23 pm, Carl Banks <pavlovevide... at gmail.com> wrote:
> On Jan 9, 6:11 pm, John Machin <sjmac... at lexicon.net> wrote:
>
> > On Jan 10, 6:58 am, Carl Banks <pavlovevide... at gmail.com> wrote:
>
> > > On Jan 9, 12:36 pm, "J. Cliff Dyer" <j... at sdf.lonestar.org> wrote:
>
> > > > On Fri, 2009-01-09 at 13:13 -0500, Steve Holden wrote:
> > > > > Aivar Annamaa wrote:
> > > > > >> As was recently pointed out in a nearly identical thread, the -3
> > > > > >> switch only points out problems that the 2to3 converter tool can't
> > > > > >> automatically fix. Changing print to print() on the other hand is
> > > > > >> easily fixed by 2to3.
>
> > > > > >> Cheers,
> > > > > >> Chris
>
> > > > > > I see.
> > > > > > So i gotta keep my own discipline with print() then :)
>
> > > > > Only if you don't want to run your 2.x code through 2to3 before you use
> > > > > it as Python 3.x code.
>
> > > > > regards
> > > > >  Steve
>
> > > > And mind you, if you follow that route, you are programming in a
> > > > mightily crippled language.
>
> > > How do you figure?
>
> > > I expect that it'd be a PITA in some cases to use the transitional
> > > dialect (like getting all your Us in place), but that doesn't mean the
> > > language is crippled.
>
> > What is this "transitional dialect"? What does "getting all your Us in
> > place" mean?
>
> Transitional dialect is the subset of Python 2.6 that can be
> translated to Python3 with 2to3 tool.

I'd never seen it called "transitional dialect" before.

>  Getting all your Us in place
> refers to prepending a u to strings to make them unicode objects,
> which is something 2to3 users are highly advised to do to keep hassles
> to a minimum.  (Getting Bs in place would be a good idea too.)

Ummm ... I'm not understanding something. 2to3 changes u"foo" to
"foo", doesn't it? What's the point of going through the code and
changing all non-binary "foo" to u"foo" only so that 2to3 can rip the
u off again? What hassles? Who's doing the highly-advising where and
with what supporting argument?

"Getting Bs into place" is necessary eventually. Whether it is
worthwhile trying to find these in advance, or waiting for them to be
picked up at testing time is a bit of a toss-up.

Let's look at this hypothetical but fairly realistic piece of 2.x
code:
OLE2_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"
def is_ole2_file(filepath):
     return open(filepath, "rb").read(8) == OLE2_SIGNATURE

This is already syntactically valid 3.x code, and won't be changed by
2to3, but it won't work in 3.x because b"x" != "x" for all x. In this
case, the cause of test failures should be readily apparent; in other
cases the unexpected exception or test failure may surface some
distance away from the offending literal.
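
A minimal sketch of that failure mode, assuming it is run under Python
3. The names matches, str_result and bytes_result are made up for this
illustration, and the header value stands in for what
open(filepath, "rb").read(8) would return for a real OLE2 file.

```python
# Illustration only; intended to be run under Python 3.
SIGNATURE_AS_STR = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"     # str in 3.x
SIGNATURE_AS_BYTES = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"  # bytes in 3.x

def matches(header, signature):
    # Hypothetical helper: the same comparison is_ole2_file() performs.
    return header == signature

# Stand-in for the first 8 bytes read from a file opened in "rb" mode.
header = SIGNATURE_AS_BYTES

# bytes never compares equal to str in 3.x, so this quietly yields False
# rather than raising an exception.
str_result = matches(header, SIGNATURE_AS_STR)

# Comparing bytes with bytes behaves as the 2.x code intended.
bytes_result = matches(header, SIGNATURE_AS_BYTES)
```

Note that the comparison does not raise anything by default; it just
returns False, which is why the symptom can show up far from the
literal that caused it.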

The 3.x version needs to have the effect of:
OLE2_SIGNATURE = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"
def is_ole2_file(filepath):
     return open(filepath, "rb").read(8) == OLE2_SIGNATURE

So in my regional variation of the transitional dialect, this becomes:
from timemachine import *
OLE2_SIGNATURE = BYTES_LITERAL("\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1")
def is_ole2_file(filepath):
     return open(filepath, "rb").read(8) == OLE2_SIGNATURE
# NOTE: don't change "rb"
...
and timemachine.py contains (amongst other things):
import sys
python_version = sys.version_info[:2] # e.g. version 2.4 -> (2, 4)
if python_version >= (3, 0):
    BYTES_LITERAL = lambda x: x.encode('latin1')
else:
    BYTES_LITERAL = lambda x: x
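
For what it's worth, here is a self-contained sketch of how the pieces
fit together, with the shim inlined instead of imported so it can be
run as a single file. The temporary file and its contents are invented
test data, not part of the original code. The latin1 codec is used
because it maps code points 0-255 one-to-one onto byte values 0-255, so
the escapes in the literal come out as the same bytes on 3.x.

```python
import os
import sys
import tempfile

# Same idea as the timemachine.py fragment, inlined for this example.
if sys.version_info[:2] >= (3, 0):
    BYTES_LITERAL = lambda x: x.encode('latin1')
else:
    BYTES_LITERAL = lambda x: x

OLE2_SIGNATURE = BYTES_LITERAL("\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1")

def is_ole2_file(filepath):
    # try/finally rather than "with" so this also runs on old 2.x versions.
    f = open(filepath, "rb")
    try:
        return f.read(8) == OLE2_SIGNATURE
    finally:
        f.close()

# Hypothetical test data: a temp file that starts with the signature.
fd, path = tempfile.mkstemp()
try:
    out = os.fdopen(fd, "wb")
    try:
        out.write(OLE2_SIGNATURE + BYTES_LITERAL("rest of file"))
    finally:
        out.close()
    result = is_ole2_file(path)
finally:
    os.remove(path)
```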

It is probably worthwhile taking an up-front inventory of all file
open() calls and [c]StringIO.StringIO() calls -- is the file being used
as a text file or a binary file?
If a text file, check that any default encoding is appropriate.
If a binary file, ensure there's a "b" in the mode (real file) or you
supply (in 3.x) an io.BytesIO() instance, not an io.StringIO()
instance.
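
A rough sketch of the text/binary distinction, assuming Python 3 (the
io module also exists in 2.6). The file contents, the choice of UTF-8
and the variable names are made up for illustration; in real code the
encoding would be whatever the data actually uses.

```python
import io
import os
import tempfile

# Real file: name the encoding for text, put "b" in the mode for binary.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u"caf\xe9")
    with io.open(path, "r", encoding="utf-8") as f:
        text = f.read()   # decoded text
    with io.open(path, "rb") as f:
        raw = f.read()    # undecoded bytes
finally:
    os.remove(path)

# In-memory file: io.StringIO holds text, io.BytesIO holds bytes.
text_file = io.StringIO(u"some text")
binary_file = io.BytesIO(b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1" + b"payload")
header = binary_file.read(8)

# Writing text to a binary in-memory file is rejected outright.
try:
    io.BytesIO().write(u"text")
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```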

>
> > Steve & Cliff are talking about the rather small subset of Python that
> > is not only valid syntax in both 2.x and 3.x but also has the same
> > meaning in 2.x and 3.x.
>
> That would be a crippled language, yes.  But I do not believe that's
> what Steve and Cliff are referring to.  Steve wrote of "running your
> code through 2to3", and that was what Cliff followed up to, so I
> believe they are both referring to writing valid code in 2.6 which is
> able to be translated through 2to3, and then generating 3.0 code using
> 2to3.  That is not a crippled language at all, just a PITA sometimes.

Uh huh; I assumed that "crippled" was being applied to the worse of
the two options :-)

Cheers,
John


