[Python-Dev] Community buildbots

M.-A. Lemburg mal at egenix.com
Sat Jul 15 21:15:04 CEST 2006


glyph at divmod.com wrote:
> On Sat, 15 Jul 2006 14:35:08 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> glyph at divmod.com wrote:
>>> A __future__ import would allow these behaviors to be upgraded
>>> module-by-module.
>> No it wouldn't.
> 
> Yes it would! :)
> 
>> __future__ works solely on the semantics of different pieces of syntax, 
>> because any syntax changes are purely local to the current module.
> ...
>> Changing all the literals in a module to be unicode instances instead of str 
>> instances is merely scratching the surface of the problem - such a module 
>> would still cause severe problems for any non-Unicode aware applications 
>> that expected it to return strings.
> 
> A module with the given __future__ import could be written to expect that
> literals are unicode instances instead of str, and encode them appropriately
> when passing to modules that expect str.  This doesn't solve the problem, but
> unlike -U, you can make fixes which will work persistently without having to
> treat the type of every string literal as unknown.
> 
> The obvious way to write code that works under -U and still works in normal
> Python is to .encode('charmap') every value intended to be an octet, and put
> 'u' in front of every string intended to be unicode.  That would seem to
> defeat the purpose of changing the default literal type.

Think of it this way: with the u'' literal prefix you can upgrade
the behavior on a per-literal basis ;-)

Seriously, it really does make more sense to upgrade to
Unicode on a per-literal basis than on a per-module or
per-process basis. When upgrading to Unicode you have to carefully
consider the path each changed object will take through the
code.
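
To illustrate what following one literal through the code can look
like, here is a minimal sketch. legacy_checksum() is a hypothetical
stand-in for a module that only understands octets, and UTF-8 is an
assumed encoding chosen for the example; neither comes from the
thread.

```python
def legacy_checksum(data):
    """Hypothetical octet-only API, standing in for a non-Unicode-aware module."""
    if not isinstance(data, bytes):
        raise TypeError("legacy_checksum() expects octets")
    # bytearray() yields integers on both Python 2 and Python 3.
    return sum(bytearray(data)) % 256


# This one literal has been upgraded to Unicode with the u'' prefix.
title = u'caf\xe9'

# At the boundary to the octet-only code the conversion is explicit,
# so the path of the upgraded object stays visible in the source.
checksum = legacy_checksum(title.encode('utf-8'))
```

Only the call sites that actually receive the upgraded literal need
an explicit encode; the rest of the module is left alone.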

Note that it also helps to set the default encoding
to 'undefined'. That way you disable the coercion of strings
to Unicode, and all the places where this implicit conversion
takes place show up as errors, allowing you to take proper
action (i.e. explicit conversion or changing the string to
Unicode as appropriate).
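
As a rough sketch of the effect: the codec Python ships for this
purpose is registered under the name 'undefined', and it refuses
every conversion. coerce_with_default() below is a hypothetical
helper that only imitates an implicit coercion through the default
encoding; it is not how the interpreter does it internally, and the
UTF-8 input is an assumption made for the example.

```python
def coerce_with_default(octets, default_encoding):
    """Hypothetical stand-in for an implicit string-to-Unicode coercion."""
    return octets.decode(default_encoding)


octets = b'caf\xc3\xa9'  # assumed to be UTF-8 encoded data

# With 'undefined' as the encoding, the hidden conversion fails loudly
# instead of silently producing a result.
try:
    coerce_with_default(octets, 'undefined')
    surfaced = False
except UnicodeError:
    surfaced = True

# The proper action is then an explicit conversion with a known encoding.
explicit = octets.decode('utf-8')
```

Each error raised this way marks one place where the code was
relying on an implicit conversion.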

Later on, you can then simply remove the u'' prefix when
moving to Py3k. In fact, Py3k could simply ignore the prefix
for you (and maybe generate an optional warning) to make the
transition easier.
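
A small sketch of the "ignore the prefix" idea, assuming an
interpreter that accepts u'' as a no-op (Python 3.3 and later do).
The variable names are only for illustration.

```python
# Assumes Python 3.3 or later, where the u'' prefix is accepted and ignored.
prefixed = u'caf\xe9'
bare = 'caf\xe9'

# Both literals produce the same kind of object with the same contents,
# so code upgraded literal by literal keeps working unchanged.
same_type = type(prefixed) is type(bare)
same_value = prefixed == bare
```

On such an interpreter, removing the prefixes becomes a cosmetic
cleanup rather than a requirement.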

-- 
Marc-Andre Lemburg
eGenix.com