[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Tue Feb 14 07:52:13 CET 2006

Phillip J. Eby wrote:
> I was just pointing out that since byte strings are bytes by definition, 
> then simply putting those bytes in a bytes() object doesn't alter the 
> existing encoding.  So, using latin-1 when converting a string to bytes 
> actually seems like the the One Obvious Way to do it.

This is a misconception. In Python 2.x, the type str already *is* a
bytes type. So if S is an instance of 2.x str, bytes(S) does not need
to do any conversion. You don't need to assume it is latin-1: it's
already bytes.

> In fact, the 'encoding' argument seems useless in the case of str objects, 
> and it seems it should default to latin-1 for unicode objects.

I agree with the former, but not with the latter. There shouldn't be a
conversion of Unicode objects to bytes at all. If you want bytes from
a Unicode string U, write

  bytes(U.encode(encoding))

Regards,
Martin