[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Guido van Rossum guido at python.org
Wed Feb 15 00:13:29 CET 2006


On 2/13/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum wrote:
> >>In py3k, when the str object is eliminated, then what do you have?
> >>Perhaps
> >>- bytes("\x80"), you get an error, encoding is required. There is no
> >>such thing as "default encoding" anymore, as there's no str object.
> >>- bytes("\x80", encoding="latin-1"), you get a bytestring with a
> >>single byte of value 0x80.
> >
> > Yes to both again.
>
> Please reconsider, and don't give bytes() an encoding= argument.
> It doesn't need one. In Python 3, people should write
>
>   "\x80".encode("latin-1")
>
> if they absolutely want to, although they better write
>
>   bytes([0x80])
>
> Now, the first form isn't valid in 2.5, but
>
>   bytes(u"\x80".encode("latin-1"))
>
> could work in all versions.

In 3.0, I agree that .encode() should return a bytes object.

I'd almost be convinced that in 2.x bytes() doesn't need an encoding
argument, except it will require excessive copying.
bytes(u.encode("utf8")) will certainly use 2*len(u) bytes  space (plus
a constant); bytes(u, "utf8") only needs len(u) bytes. In 3.0,
bytes(s.encode(xxx)) would also create an extra copy, since the bytes
type is mutable (we all agree on that, don't we?).

I think that's a good enough argument for 2.x. We could keep the
extended API as an alternative form in 3.x, or automatically translate
calls to bytes(x, y) into x.encode(y).

BTW I think we'll need a new PEP instead of PEP 332. The latter has
almost no details relevant to this discussion, and it seems to treat
bytes as a near-synonym for str in 2.x. That's not the way this
discussion is going it seems.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


More information about the Python-Dev mailing list