[Python-Dev] bytes / unicode

Stephen J. Turnbull stephen at xemacs.org
Fri Jun 25 09:05:43 CEST 2010


Guido van Rossum writes:
 > On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

 > Understood, but both the majority of str/bytes methods and several
 > existing APIs (e.g. many in the os module, like os.listdir()) do it
 > this way.

Understood.

 > Also, IMO a polymorphic function should *not* accept *mixed*
 > bytes/text input -- join('x', b'y') should be rejected.

Agreed.
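
For concreteness, here is a minimal sketch of that contract, using
posixpath.join from the standard library as a stand-in for the join
under discussion (the choice of function and the try_join helper are
my assumptions for illustration, not anything proposed in this
thread).  Same-type arguments go through; mixed arguments are
rejected with TypeError.

```python
import posixpath

def try_join(a, b):
    """Hypothetical helper: return the joined path, or the name of the
    exception type if the join is rejected."""
    try:
        return posixpath.join(a, b)
    except TypeError as exc:
        return type(exc).__name__

text_result = try_join('x', 'y')      # str in, str out
bytes_result = try_join(b'x', b'y')   # bytes in, bytes out
mixed_result = try_join('x', b'y')    # mixed input is rejected
```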

 > But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make
 > sense to me.
 > 
 > So, actually, I *don't* understand what you mean by needing LBYL.

Consider docutils.  Some folks assert that URIs *are* bytes and should
be manipulated as such.  So base URIs should be bytes.  But there are
various ways to refer to a base URI and combine it with a relative URI
taken from literal text in reST.  That literal text will be
represented as str.  So you want to use urljoin, but this usage isn't
polymorphic: the base is bytes and the relative part is str.

If you forget to do a conversion here, urljoin will raise, of course.
But converting that late may not be appropriate.  AIUI Philip at least
wants ways to raise exceptions earlier than that on some code paths.
That's LBYL, no?
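
To make the distinction concrete, here is a rough sketch, assuming a
present-day Python 3 in which urllib.parse.urljoin accepts either
all-str or all-bytes arguments and raises TypeError on a mix.  The
BaseURI class is hypothetical, invented only for this example; it is
not part of docutils or the standard library.  It checks types at
the point where the values enter, rather than waiting for urljoin to
complain.

```python
from urllib.parse import urljoin

class BaseURI:
    """Hypothetical holder for a base URI that is required to be bytes."""

    def __init__(self, value):
        # LBYL: check the type as soon as the base URI is supplied.
        if not isinstance(value, bytes):
            raise TypeError("base URI must be bytes, got %s"
                            % type(value).__name__)
        self.value = value

    def join(self, relative):
        # LBYL again: reject a str reference before attempting the join.
        if not isinstance(relative, bytes):
            raise TypeError("relative URI must be bytes, got %s"
                            % type(relative).__name__)
        return urljoin(self.value, relative)

base = BaseURI(b"http://example.com/docs/")

# The literal text from the document is str, so it is converted
# explicitly before the join.
joined = base.join("page.html".encode("ascii"))

# Without the early checks, the mismatch only surfaces inside urljoin.
try:
    urljoin(b"http://example.com/docs/", "page.html")
    late_error = None
except TypeError as exc:
    late_error = type(exc).__name__
```

Whether the check belongs in the constructor, in join, or in both
depends on which code paths need to fail early; the sketch only shows
where such checks could sit.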

