
Chris McDonough <chrism <at> plope.com> writes:
It's great to have software that installs easily. That said, the versions of Python that my software supports are (and have to be) my choice.
Of course. And if I understand correctly, that's 2.6, 2.7, 3.2 and later versions. I'll ignore 2.5 and earlier in this specific reply.
None of them would so much as bat an eyelash if I told them today they had to use Python 3.3 (if it existed in a final released form anyway) to use my software. It's just a minor drop in the bucket of inconvenience they have to currently withstand.
Their pain (lacklustre library support and transliterating examples from 2.x to 3.x) would be the same under 3.2 and 3.3 (unless for some perverse reason people only made libraries work under one of 3.2 and 3.3, but not both). Is it really that hard to transliterate 2.x examples to 3.x in the literal-string dimension? I can't believe it is, as the target audience is programmers.
If the lack of u'' literal is what's holding them back, that's germane to the discussion of the PEP. If it's not, then why propose the PEP?
Like I said in an earlier email, u'' literal support is by no means the only issue for people who want to straddle. But it *is* an issue, and it's incredibly low-hanging fruit with near-zero real-world impact if it is reintroduced.
But the implication of the PEP is that lack of u'' support is a major hindrance to porting, justifying the production of the PEP and this discussion. And it's not low-hanging fruit with near-zero real-world impact if we're going to deprecate it at some point (which Guido was talking about) - you're just moving the pain to a later date, unless we don't ever deprecate. I feel, like some others, that 'xxx' is natural for text, u'xxx' is inelegant by comparison, and u('xxx') a little more inelegant still. However, allowing u'' syntax in 3.3 as per this PEP, while keeping it optional, permits any combination of u'xxx' and 'xxx' in code in a 3.x context, which doesn't seem to me to be an ideal situation, especially if you have hit-and-run contributors who are not necessarily attuned to project conventions.
You cast it as "backtracking" to reintroduce the syntax, but things have changed from when the decision to omit it was first made. Its omission introduces pain in a world where it's expected that we don't use 2to3 to automatically translate code at installation time.
I'm calling it like it is. "reintroduce" in this case means undoing something already done, so it's appropriate to say "backtracking". I don't agree that things have changed. If I want to write code that works on 2.x and 3.x without the pain of running 2to3 after every change, and I'm only interested in supporting >= 2.6 (your situation, IIUC), then I use "from __future__ import unicode_literals" - that's what it was created for, wasn't it? - and use 'xxx' where I need text, b'xxx' where I need bytes, and a function to deliver native strings where they're needed. If I have a 2.x project full of u'' code which I need to bring into this approach, then I run 2to3, review what it tells me, and make the changes necessary (as far as literals go, that's adding the unicode_literals import to all files, and converting u'xxx' -> 'xxx'). When I test the result, I will find numerous failures, some of which point to places where I should have used native strings (e.g. kwargs keys), which I then fix. Other areas will be where I needed to use bytes (e.g. encoding/decoding/hashing), which I will also fix. I use six or a similar approach to sort out any other issues which crop up, e.g. metaclass syntax, execfile, and so on. After a relatively modest amount of work, I have a codebase that works on 2.x and 3.x, and all I have to remember is that 'xxx' is Unicode, and if I create a new module, I need to add the future import (on the assumption that I might add literal strings later, if not now). After that, it seems to be plain sailing, and I don't have to switch mental gears re. string literals.
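To make the approach concrete, here is a minimal sketch of what a module looks like under that scheme. The helper name native() and the PY2 flag are just illustrative choices (six and similar libraries provide their own equivalents), and the ASCII default encoding is an assumption that suits identifiers such as kwargs keys:

```python
from __future__ import unicode_literals  # must be the first statement in the module

import sys

PY2 = sys.version_info[0] == 2


def native(s, encoding='ascii'):
    """Return s as the interpreter's native str type.

    Hypothetical helper for illustration only: byte string on 2.x,
    Unicode string on 3.x.
    """
    if PY2 and not isinstance(s, str):
        return s.encode(encoding)   # 2.x: unicode -> native byte str
    if not PY2 and isinstance(s, bytes):
        return s.decode(encoding)   # 3.x: bytes -> native str
    return s


text = 'xxx'          # Unicode text on 2.6+ and on 3.x
data = b'xxx'         # bytes on 2.6+ and on 3.x
key = native('name')  # native str, e.g. for kwargs keys


def greet(**kwargs):
    return kwargs['name']


result = greet(**{key: text})
```

The only per-module overhead is the future import on the first line; everything else reads as ordinary 3.x code.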
If you look at a piece of code as something that exists in one of the two states "ported" or "not-ported", sure. But code often needs to be changed, and people of varying buy-in levels need to understand and change such code. It's just much easier for them to assume that the same syntax works on some versions of Python 2 and Python 3 and be done with it rather than need to explain the introduction of a function that only exists to paper over a syntax omission.
Well, according to the approach I described above, that one thing needs to be the present 3.x syntax - 'xxx' is text, b'xxx' is bytes, and f('xxx') is native string (or whatever name you want instead of f). With the unicode_literals import, that syntax works on 2.6+ and 3.2+, so ISTM it should work within the constraints you mentioned for your software.

Regards,

Vinay Sajip