[Python-3000] Immutable bytes -- looking for volunteer
Jim Jewett
jimjjewett at gmail.com
Wed Sep 26 00:14:19 CEST 2007
> How about we take the existing PyString implementation (Python 2's
> str, currently still present as str8 in py3k), remove the locale and
> unicode mixing support, and call it bytes.
By "unicode mixing support", do you mean just encode/decode?
But isn't bytes one sensible way to store an encoded str, so that
decode (only) would still make sense?
I would have expected to drop text or character-oriented methods,
because they should really be done on the (decoded) unicode version.
Given the use of bytes in wire protocols, I could also understand
saying that these methods only work on ASCII, and either raise an
exception or return false for other byte values.
Text- or character-oriented methods:
'capitalize', 'center', 'endswith', 'expandtabs', 'isalnum',
'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper',
'ljust', 'lower', 'lstrip', 'rjust', 'rstrip', 'splitlines', 'strip',
'swapcase', 'title', 'translate', 'upper', 'zfill'
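To make the "ASCII only" option concrete, here is a small sketch. It shows how the bytes type in current Python 3 releases treats non-ASCII byte values, which is not necessarily the behavior that was on the table when this was written. The variable names are only illustrative.

```python
# Sketch: character-oriented methods on bytes that only understand ASCII.
ascii_word = b"hello"
latin1_word = "h\u00e9llo".encode("latin-1")  # b'h\xe9llo'

ascii_is_alpha = ascii_word.isalpha()    # every byte is an ASCII letter
latin1_is_alpha = latin1_word.isalpha()  # 0xE9 is not an ASCII letter
latin1_upper = latin1_word.upper()       # only ASCII letters change case

# Doing the work on the decoded text handles the accented letter as well.
decoded_upper = latin1_word.decode("latin-1").upper().encode("latin-1")
```

Here the predicate returns false for the non-ASCII byte rather than raising, and upper() leaves that byte untouched; the decode/encode round trip is the "do it on the unicode version" alternative.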
> It would mean more fixes beyond what Jeffrey and Adam did, since
> iterating over a bytes instance would return a bytes instance of
> length 1 instead of a small int,
That makes sense.
> and the bytes constructor would
> change accordingly (no more initializing a bytes object from a list of
> ints).
Why not?
I expect the literal b"ASCII string" to be the most common
constructor, but I don't see the problem with a sequence of ints (or
hex) as an alternative constructor.
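For illustration, the three spellings might look like this. The sketch uses the constructors that exist in current Python 3 (`bytes(iterable_of_ints)` and `bytes.fromhex`); whether the proposal under discussion would keep them is exactly the open question, so treat it as an example rather than a decision.

```python
# Three ways to build the same immutable byte string.
from_literal = b"ASCII"
from_ints = bytes([65, 83, 67, 73, 73])     # sequence of small ints
from_hex = bytes.fromhex("41 53 43 49 49")  # pairs of hex digits
```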
> The (new) buffer object would also have to change to be more
> compatible with the (new) bytes object -- bytes<-->buffer conversions
> should be 1-1, and iterating over a buffer instance would also have to
> return a length-1 buffer (or bytes???) instance.
I would return a bytes instance. If you return a length-1 buffer, and
someone then modifies it, it isn't clear whether the change should be
reflected in the original source buffer. If someone does want an
in-place filter, they can always use enumerate and slicing.
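A rough sketch of such an in-place filter follows. It assumes a mutable byte type that supports slice assignment; Python 3's bytearray stands in for the (new) buffer object here, and `filter_in_place` is a hypothetical helper name, not an existing API.

```python
def filter_in_place(buf, func):
    """Apply func to each length-1 bytes chunk of buf, writing results back."""
    for i, _ in enumerate(buf):
        chunk = bytes(buf[i:i + 1])      # immutable length-1 copy
        replacement = func(chunk)
        if len(replacement) != 1:
            raise ValueError("func must return exactly one byte")
        buf[i:i + 1] = replacement       # explicit write-back by slice

source = bytearray(b"some data")
filter_in_place(source, bytes.upper)
```

Because each chunk is an immutable copy, changing the source takes an explicit slice assignment, so there is no ambiguity about whether a modification propagates.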
Can we assume that the two types are unequal, but that you can search
a buffer for a (constant) bytes?
>>> mybytes = b"some data"
>>> mybuffer = buffer(mybytes)
>>> mybuffer == mybytes
False
>>> mybuffer.startswith(mybytes) and \
... mybuffer.endswith(mybytes) and \
... len(mybuffer) == len(mybytes)
True
-jJ