[Python-Dev] unicode_string future, str -> basestring, fix or feature

Guido van Rossum guido at python.org
Mon Mar 3 03:44:20 CET 2014


AFAICT, in that message Victor was only talking about allowing Unicode
filenames.

Making everything polymorphic is clearly pulling on the thread that will
unravel the entire sweater.

But... The start of this thread was about changing a few occurrences of
isinstance(..., str) to use basestring, and that's a different matter. The
Python 2 Unicode design calls for mixing of Unicode and 8-bit strings as
long as the latter contain 7-bit ASCII -- the code in turtle violates that
design by insisting on an 8-bit string. The underlying Tkinter module
handles Unicode strings just fine (and not just 7-bit ASCII).
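
To make the distinction concrete, here is a minimal sketch of the two kinds of check. The helper names (accepts_text, accepts_str_only) and the text_types alias are hypothetical and do not come from turtle; the try/except fallback is only there so the snippet also runs on Python 3, where basestring no longer exists.

```python
# Sketch only: these helpers are hypothetical, not code taken from turtle.
try:
    text_types = basestring  # Python 2: common base class of str and unicode
except NameError:
    text_types = str         # Python 3: str is the only text type


def accepts_text(value):
    """Accept any text string, whether 8-bit or Unicode."""
    return isinstance(value, text_types)


def accepts_str_only(value):
    """The stricter check: on Python 2 this rejects unicode objects."""
    return isinstance(value, str)


if __name__ == "__main__":
    for candidate in ("red", u"red", 42):
        print(repr(candidate), accepts_text(candidate), accepts_str_only(candidate))
```

On Python 2 the two helpers disagree only for unicode arguments, which is exactly the case the proposed change is meant to cover.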

As far as lib2to3 goes, using basestring instead of str actually
disambiguates things -- with str it can't tell for sure whether text or
binary was meant, but with basestring it's a safe bet that the intention
was text.

Finally, in most places Python 2.7 *does* handle Unicode filenames just
fine.


On Sun, Mar 2, 2014 at 6:26 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> Terry Reedy writes:
>  > On 3/2/2014 4:23 PM, Serhiy Storchaka wrote:
>
>  > > Patches which add support for unicode strings were accepted for some
>  > > issues (e.g. http://bugs.python.org/issue19099) and rejected for other
>  > > issues (e.g. http://bugs.python.org/issue20014 and
>  > > http://bugs.python.org/issue20015). Some issues (e.g.
>  > > http://bugs.python.org/issue18695) hang in an undefined state.
>  >
>  > If Antoine and Guido don't reverse themselves, those could perhaps be
>  > re-opened. It strikes me as borderline, depending on the interpretation of
>  > 'string'. I am not surprised there have been different resolutions.
>
> I agree with Victor in http://bugs.python.org/issue18695#msg208857:
> there's no "bug".  It is just that in the design of 2.x 'str' is not
> Unicode, and the "fix" is Python 3.  This may be an area where 2to3
> could give more help.
>
> As Victor points out in that message, the issue-by-issue approach to
> this inconsistency is just whack-a-mole.
>
> I would worry not only about the whack-a-mole aspect where 'unicode'
> objects leak into contexts where they're not supported, but also that
> this could confuse tools like 2to3.
>
> I agree that usage of the word "string" is all too often ambiguous in
> the documentation, but that doesn't justify a wholesale overhaul of
> the Python 2.7 API to make everything polymorphic.



-- 
--Guido van Rossum (python.org/~guido)