[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Wed Feb 15 00:13:36 CET 2006

On 2/14/06, Adam Olsen <rhamph at gmail.com> wrote:
> I'm starting to wonder, do we really need anything fancy?  Wouldn't it
> be sufficient to have a way to compactly store 8-bit integers?
>
> In 2.x we could convert unicode like this:
> bytes(ord(c) for c in u"It's...".encode('utf-8'))

Yuck.

> u"It's...".byteencode('utf-8')  # Shortcut for above

Yuck**2. I'd like to avoid adding new APIs to existing types to return
bytes instead of str. (It's okay to change existing APIs to *accept*
bytes as an alternative to str though.)

> In 3.0 it changes to:
> "It's...".encode('utf-8')
> u"It's...".byteencode('utf-8')  # Same as above, kept for compatibility

No. 3.0 won't have "backward compatibility" features. That's the whole
point of 3.0.

> Passing a str or unicode directly to bytes() would be an error.
> repr(bytes(...)) would produce bytes([1,2,3]).

I'm fine with that.

> Probably need a __bytes__() method that print can call, or even better
> a __print__(file) method[0].  The write() methods would of course have
> to support bytes objects.

Right on the latter.

> I realize it would be odd for the interactive interpret to print them
> as a list of ints by default:
> >>> u"It's...".byteencode('utf-8')
> [73, 116, 39, 115, 46, 46, 46]

No. This prints the repr() which should include the type. bytes([73,
116, 39, 115, 46, 46, 46]) is the right thing to print here.

> But maybe it's time we stopped hiding the real nature of bytes from users?

That's the whole point.

> [0] By this I mean calling objects recursively and telling them what
> file to print to, rather than getting a temporary string from them and
> printing that.  I always wondered why you could do that from C
> extensions but not from Python code.

I want to keep the Python-level API small.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)