[Python-Dev] methods on the bytes object (was: Crazy idea for str.join)

Josiah Carlson jcarlson at uci.edu
Sun Apr 30 12:01:11 CEST 2006


"Guido van Rossum" <guido at python.org> wrote:
> On 4/29/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> > I understand the underlying implementation of str.join can be a bit
> > convoluted (with the auto-promotion to unicode and all), but I don't
> > suppose there is any chance to get str.join to support objects which
> > implement the buffer interface as one of the items in the sequence?
> 
> In Py3k, buffers won't be compatible with strings -- buffers will be
> about bytes, while strings will be about characters. Given that future
> I don't think we should mess with the semantics in 2.x; one change in
> the near(ish) future is enough of a transition.

This brings up something I hadn't thought of previously.  While unicode
will obviously keep its .join() method when it becomes str in 3.x, will
bytes objects get a .join() method?  The bytes PEP describes very little
about the type beyond it basically being an array of 8-bit integers.
That's fine and all, but it kills many of the parsing and modification
use-cases that are performed on strings via the non-__xxx__ methods.


Specifically in the case of bytes.join(), the current common use-case of
<literal>.join(...) would become something similar to
bytes(<literal>).join(...), unless bytes objects got a syntax... Or
maybe I'm missing something?
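
To make the two spellings concrete, here is a rough sketch.  It assumes a
bytes type that has a .join() method and a b"..." literal syntax, neither
of which the PEP promises; both exist in Python 3 as later released, so
the snippet runs there.  The sample data is made up for illustration.

```python
# Hypothetical sample data; any iterable of bytes objects would do.
parts = [b"GET", b"/index.html", b"HTTP/1.0"]

# Without a literal syntax: build the separator through the constructor.
sep_from_constructor = bytes(" ", "ascii")
joined_via_constructor = sep_from_constructor.join(parts)

# With a bytes literal (an assumption here), mirroring <literal>.join(...).
joined_via_literal = b" ".join(parts)

# Both spellings produce the same result; only the ceremony differs.
print(joined_via_constructor == joined_via_literal)
```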


Anyways, when the bytes type was first being discussed, I had hoped that
it would basically become array.array("B", ...) + non-unicode str.
That would allow bytes to do everything that str was doing before, plus
a few new tricks (almost like an mmap...), minus those operations which
require immutability.
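
As a rough sketch of that combination, the snippet below contrasts
array.array("B", ...) with a type that has both item assignment and
str-style methods.  It uses bytearray from Python 3 as later released as a
stand-in; that is only an approximation of the idea, not something the
bytes PEP specifies, and the sample data is invented.

```python
from array import array

# array.array("B", ...) gives mutable 8-bit integers, but no parsing helpers.
raw = array("B", b"key=value")
raw[0] = ord("K")                       # in-place modification works
has_split = hasattr(raw, "split")       # no str-style methods on array

# bytearray (stand-in for the hoped-for type) offers both sides.
buf = bytearray(b"key=value")
buf[0] = ord("K")                       # array-like item assignment
name, _, value = buf.partition(b"=")    # str-like parsing
buf.extend(b";path=/")                  # grows in place, unlike str
fields = buf.split(b";")                # str-like splitting

print(has_split, name, value, fields)
```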


 - Josiah


