[Python-Dev] Re: adding a bytes sequence type to Python

M.-A. Lemburg mal at egenix.com
Wed Aug 18 00:30:27 CEST 2004


Bob Ippolito wrote:
> 
> On Aug 17, 2004, at 5:33 PM, Guido van Rossum wrote:
> 
>>> So, how will it be different from:
>>>
>>>      from array import array
>>>
>>>      def bytes(*initializer):
>>>          return array('B',*initializer)
>>>
>>> Even if it's desirable for 'bytes' to be an actual type (e.g. subclassing
>>> ArrayType), it might help the definition process to describe the difference
>>> between the new type and a byte array.
>>
>>
>> Not a whole lot different, except for the ability to use a string as
>> alternate argument to the constructor, and the fact that it's going to
>> be an actual type, and that it should support the buffer API (which
>> array mysteriously doesn't?).
>>
>> The string argument support may not even be necessary -- an
>> alternative way to spell that would be to let s.decode() return a
>> bytes object, which has the advantage of being explicit about the
>> encoding; there's even a base64 encoding already!  But it would be a
>> bigger incompatibility, more likely to break existing code using
>> decode() and expecting to get a string.
> 
> 
> IMHO current uses of decode and encode are really confusing.  Many 
> decodes are from str -> unicode, and many encodes are from unicode -> 
> str (or str -> unicode -> str implicitly, which is usually going to fail 
> miserably)... while yet others like zlib, base64, etc. are str <-> str.  
> Technically unicode.decode(base64) should certainly work, but it doesn't 
> because unicode doesn't have a decode method.

Unicode objects do have one in 2.4. Note that in 2.4, .decode() and
.encode() guarantee that you get a basestring instance. If you want
more flexibility in terms of return type, the new codecs.encode() and
codecs.decode() functions allow arbitrary types as return values.
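
To illustrate the difference between the methods and the codecs
functions, here is a minimal sketch. It assumes a current Python 3
interpreter, which postdates this message: there the methods are
restricted to str <-> bytes rather than to basestring, but the split
between type-restricted methods and unrestricted codecs.encode() /
codecs.decode() is the same idea. The variable names are illustrative
only.

```python
import codecs

text = "hello"

# Method form: the return type is fixed.
# str.encode() gives bytes, bytes.decode() gives str.
raw = text.encode("utf-8")
assert isinstance(raw, bytes)
assert isinstance(raw.decode("utf-8"), str)

# Function form: the return type is whatever the codec produces,
# so data -> data transformations such as base64 work (bytes -> bytes).
packed = codecs.encode(raw, "base64")
unpacked = codecs.decode(packed, "base64")

# A str -> str codec goes through the same functions.
rotated = codecs.encode(text, "rot13")

# The method form refuses codecs that are not text encodings.
try:
    raw.decode("base64")
    method_rejected = False
except LookupError:
    method_rejected = True
```

Keeping the methods type-restricted means callers can rely on what
they get back, while the functions remain available for codecs that
do not fit that mould.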

> I don't have a proposed solution at the moment, but perhaps these 
> operations should either be outside of the data types altogether (i.e. 
> use codecs only) or there should be separate methods for doing separate 
> things (character translations versus data->data transformations).

It all depends on whether you are discussing placing binary
data into the Python source file (by some means of using
literals) or just working with bytes you got from a file,
generator, socket, etc.

The current discussion is mixing these contexts a bit too
much, I believe, which is probably why people keep misunderstanding
each other (at least that's how I perceive the debate).

-- 
Marc-Andre Lemburg
eGenix.com


