[Python-Dev] from __future__ import unicode_strings?

Jean-Paul Calderone exarkun at divmod.com
Thu Feb 16 22:53:57 CET 2006


On Thu, 16 Feb 2006 16:29:40 +0100, "M.-A. Lemburg" <mal at egenix.com> wrote:
>Jean-Paul Calderone wrote:
>> On Thu, 16 Feb 2006 11:24:35 +0100, "M.-A. Lemburg" <mal at egenix.com> wrote:
>>> Neil Schemenauer wrote:
>>>> On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote:
>>>>> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote:
>>>>>
>>>>>>>     from __future__ import unicode_strings
>>>>>> Didn't we have a command-line option to do this? I believe it was
>>>>>> removed because nobody could see the point. (Or am I hallucinating?
>>>>>> After several days of non-stop discussing bytes that must be
>>>>>> considered a possibility.)
>>>>> We do, and it's not been removed: the -U switch.
>>>> As Guido alluded, the global switch is useless.  A per-module switch
>>>> is something that could actually be useful.  One nice advantage is that
>>>> you would write code that works the same with Jython (wrt string
>>>> literals anyhow).
>>> The global switch is not useless. Its purpose is to test the
>>> standard library (or any other piece of Python code) for Unicode
>>> compatibility.
>>>
>>> Since we're not even close to such compatibility, I'm not sure
>>> how useful a per-module switch would be.
>>
>> Just what Neil suggested: developers writing new code benefit from having the behavior which will ultimately be Python's default, rather than the behavior that is known to be destined for obsolescence.
>>
>> Being able to turn this on per-module is useful for the same reason the rest of the future system is useful on a per-module basis.  It's easier to convert things incrementally than monolithically.
>
>Sure, but in this case the option would not only affect the module
>you define it in, but also all other code that now gets Unicode
>objects instead of strings as a result of the Unicode literals
>defined in these modules.

This is precisely correct.  It is also exactly parallel to the only other __future__ import which changes any behavior.  Personally, I _also_ like future division.  Is it generally considered to have been a mistake?
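
A rough sketch of that parallel, for illustration only: a value computed under future division escapes the module that opted in and reaches code written with the old behavior in mind.  The names producer_source and pick are hypothetical, and passing the feature's compiler flag to compile() stands in here for writing "from __future__ import division" at the top of a module file.

```python
import __future__

# Hypothetical one-line "module" that has opted into future division.
producer_source = "half = 7 / 2"

producer_namespace = {}
code = compile(producer_source, "<producer>", "exec",
               __future__.division.compiler_flag, True)
exec(code, producer_namespace)

# The result leaves the opted-in module as a float, not an int.
half = producer_namespace["half"]


def pick(items, index):
    """Hypothetical consumer elsewhere that assumes an integer index."""
    return items[index]


try:
    pick(["a", "b", "c", "d"], half)
    consumer_failed = False
except TypeError:
    # List indices must be integers, so the float is rejected here.
    consumer_failed = True

print(half, consumer_failed)
```

The consumer never imported anything from __future__, yet it is the one that fails.  That is the same kind of leakage described above for Unicode literals.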

>
>It is rather likely that you'll start hitting Unicode-related
>compatibility bugs in the standard lib more often than you'd
>like.

You can guess this.  I'll guess that it isn't the case.  And who's to say how often I'd like that to happen, anyway? :)  Anyone who's afraid that will happen can avoid using the import.  Voila, problem solved.

>
>It's usually better to switch to Unicode in a controlled manner:
>not by switching all literals to Unicode, but only some, then
>test things, then switch over some more, etc.

There's nothing uncontrolled about this proposed feature, so this statement doesn't hold any meaning.

>
>This can be done by prepending the literal with the u"" modifier.
>

Anyone who is happier converting one string literal at a time can do this.  Anyone who would rather convert a module at a time can use the future import.
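
The two conversion styles can be sketched side by side.  This is a minimal illustration under stated assumptions: unicode_strings is only the name proposed in this thread, so the sketch uses unicode_literals, the feature name that the __future__ module provides from Python 2.6 on.  The helper run_module and the one-line module bodies are hypothetical.

```python
import __future__


def run_module(source, flags):
    """Compile and run a hypothetical module body with the given future flags."""
    namespace = {}
    exec(compile(source, "<module>", "exec", flags, True), namespace)
    return namespace


# Module converted as a whole: every plain literal becomes a Unicode string.
converted = run_module('greeting = "hello"',
                       __future__.unicode_literals.compiler_flag)

# Module converted one literal at a time with the u"" prefix.
per_literal = run_module('greeting = u"hello"', 0)

# Untouched module: a byte string under Python 2, already text under Python 3.
plain = run_module('greeting = "hello"', 0)

unicode_type = type(u"")
print(isinstance(converted["greeting"], unicode_type))
print(isinstance(per_literal["greeting"], unicode_type))
```

Both routes end with the same kind of object.  They differ only in how much source gets converted in one step.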

Jean-Paul

