Totally confused by the str/bytes/unicode differences introduced in Python 3.x

Terry Reedy tjreedy at udel.edu
Sat Jan 17 23:10:53 CET 2009


Martin v. Löwis wrote:
>>> Does he intend to maintain two separate codebases, one 2.x and the
>>> other 3.x?
>> I think I have no other choice.
>> Why? Is it theoretically possible to maintain a single code base for
>> both 2.x and 3.x?
> 
> That is certainly possible! One might have to make tradeoffs wrt.
> readability sometimes, but I found that this approach works quite
> well for Django. I think Mark Hammond is also working on maintaining
> a single code base for both 2.x and 3.x, for PythonWin.

Where 'single codebase' means that the code runs as is in 2.x and as 
autoconverted by 2to3 (or possibly a custom converter) in 3.x.

One barrier to doing this is when the 2.x code has a mix of string 
literals with some being character strings that should not have 'b' 
prepended and some being true byte strings that should have 'b' 
prepended.  (Many programs do not have such a mix.)
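
To illustrate the mix, here is a minimal sketch with made-up names (not 
taken from any real program).  In 3.x the two kinds of literal are 
different types and do not combine, which is why each literal has to be 
classified correctly.  2.6 and later accept the 'b' prefix and treat the 
result as an ordinary str, so the same source still runs in 2.x.

```python
# Hypothetical names, for illustration only.
HEADER_NAME = "Content-Type"     # character string: no 'b' wanted
GIF_MAGIC = b"GIF89a"            # true byte string: needs the 'b' in 3.x


def starts_with_magic(data):
    """Check raw file data against the bytes constant."""
    return data.startswith(GIF_MAGIC)


def mixing_fails():
    """Return True if text + bytes raises TypeError (it does in 3.x)."""
    try:
        HEADER_NAME + GIF_MAGIC
    except TypeError:
        return True
    return False
```

In 2.x mixing_fails() returns False, because both constants are plain 
str there; the mistake only shows up once the code runs under 3.x.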

One approach to dealing with string constants I have not yet seen 
discussed here is to put them all in separate file(s) to be imported. 
Group the text and bytes separately.  Then marking the bytes with a 'b', 
either by hand or by program, would be easy.
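
The "by program" route might look something like the sketch below.  It 
is only an assumption about how one could do it, not an existing tool: 
add_b_prefix and the sample constants are hypothetical names.  It uses 
the standard tokenize module to find string literals and assumes the 
file holds nothing but bytes constants (a docstring would get a 'b' 
too).  Literals that already carry a prefix such as r or u are left 
alone.

```python
import io
import tokenize

# Hypothetical contents of a bytes-only constants file, written 2.x style.
BYTES_CONSTANTS_SOURCE = (
    'EOL = "\\r\\n"\n'
    "GIF_MAGIC = 'GIF89a'  # file signature\n"
)


def add_b_prefix(source):
    """Prepend 'b' to every unprefixed string literal in source."""
    lines = source.splitlines(True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk backwards so earlier column offsets stay valid after each insert.
    for tok in reversed(tokens):
        if tok.type == tokenize.STRING and tok.string[0] in "'\"":
            row, col = tok.start
            line = lines[row - 1]
            lines[row - 1] = line[:col] + "b" + line[col:]
    return "".join(lines)


converted = add_b_prefix(BYTES_CONSTANTS_SOURCE)
```

Because the text constants live in a different file, the script never 
has to guess which literals are bytes; it just marks everything in the 
file it is pointed at.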

tjr



