On Sat, 15 Jul 2006 14:35:08 +1000, Nick Coghlan firstname.lastname@example.org wrote:
A __future__ import would allow these behaviors to be upgraded module-by-module.
No it wouldn't.
Yes it would! :)
__future__ works solely on the semantics of different pieces of syntax, because any syntax changes are purely local to the current module.
Changing all the literals in a module to be unicode instances instead of str instances is merely scratching the surface of the problem - such a module would still cause severe problems for any non-Unicode aware applications that expected it to return strings.
A module with the given __future__ import could be written to expect that literals are unicode instances instead of str, and encode them appropriately when passing to modules that expect str. This doesn't solve the problem, but unlike -U, you can make fixes which will work persistently without having to treat the type of every string literal as unknown.
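A rough sketch of that boundary pattern, written for a Python 3 interpreter, where literals are already text (the state the proposed __future__ import would produce). `legacy_checksum` is a hypothetical stand-in for a module that expects octets, and UTF-8 is an assumed encoding; neither comes from this thread.

```python
# Hypothetical legacy API: accepts only byte strings (the old 'str').
def legacy_checksum(data):
    if not isinstance(data, bytes):
        raise TypeError("legacy_checksum expects bytes")
    return sum(data) % 256


# Module written on the assumption that every literal is text.
GREETING = "hello"


def checksum_of_greeting(encoding="utf-8"):
    # Encode exactly once, at the boundary with the byte-oriented module.
    return legacy_checksum(GREETING.encode(encoding))
```

Because the literal type is fixed by the module itself, the encode call can stay at the boundary permanently instead of being guarded by a type check.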
The obvious way to write code that works under -U and still works in normal Python is to .encode('charmap') every value intended to be an octet, and put 'u' in front of every string intended to be unicode. That would seem to defeat the purpose of changing the default literal type.
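To make the cost concrete, here is a sketch of what that dual-mode style might look like. The helper name `octets` is hypothetical. The snippet is meant for a Python 3 interpreter, which resembles -U in that unprefixed literals are text, and it relies on the 'charmap' codec mapping code points 0-255 to the same byte values when no mapping table is supplied.

```python
def octets(literal):
    # Accept a literal of either type and always hand back bytes.
    if isinstance(literal, bytes):
        return literal
    return literal.encode('charmap')


HEADER = octets('GIF89a')   # intended as octets: wrapped explicitly
TITLE = u'caf\xe9'          # intended as text: prefixed explicitly
```

Every literal ends up annotated one way or the other, which is the overhead being objected to above.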
Think of it this way: with the u'' literal prefix you can upgrade the behavior on a per-literal basis ;-)
Seriously, it really does make more sense to upgrade to Unicode on a per-literal basis than on a per-module or per-process basis. When upgrading to Unicode you have to carefully consider the path each changed object will take through the code.
Note that it also helps to set the default encoding to 'undefined'. That way you disable the coercion of strings to Unicode, and every place where this implicit conversion takes place shows up as an error, allowing you to take proper action (i.e. explicit conversion or changing the string to Unicode as appropriate).
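As an illustration of why such a setting flushes out implicit conversions, the sketch below pokes at the codec directly. It assumes the codec meant here is the standard library's 'undefined' codec, which raises on every encode or decode. `probe` is a hypothetical helper, and the snippet does not change the interpreter's default encoding.

```python
def probe(value):
    # Returns True if the 'undefined' codec refuses the conversion.
    try:
        if isinstance(value, bytes):
            value.decode('undefined')
        else:
            value.encode('undefined')
    except UnicodeError:
        return True
    return False
```

With that codec as the default, any implicit str/unicode coercion would fail in the same way, pointing at the exact line that needs an explicit conversion.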
Later on, you can then simply remove the u'' prefix when moving to Py3k. In fact, Py3k could simply ignore the prefix for you (and maybe generate an optional warning) to make the transition easier.