[I18n-sig] Re: [Python-Dev] Unicode debate

M.-A. Lemburg mal@lemburg.com
Tue, 02 May 2000 17:27:39 +0200


Guido van Rossum wrote:
> 
> [MAL]
> > Let's not make the same mistake again: Unicode objects should *not*
> > be used to hold binary data. Please use buffers instead.
> 
> Easier said than done -- Python doesn't really have a buffer data
> type.  Or do you mean the array module?  It's not trivial to read a
> file into an array (although it's possible; there are even two ways).
> Fact is, most of Python's standard library and built-in objects use
> (8-bit) strings as buffers.
> 
> I agree there's no reason to extend this to Unicode strings.
> 
> > BTW, I think that this behaviour should be changed:
> >
> > >>> buffer('binary') + 'data'
> > 'binarydata'
> >
> > while:
> >
> > >>> 'data' + buffer('binary')
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > TypeError: illegal argument type for built-in operation
> >
> > IMHO, buffer objects should never coerce to strings, but instead
> > return a buffer object holding the combined contents. The
> > same applies to slicing buffer objects:
> >
> > >>> buffer('binary')[2:5]
> > 'nar'
> >
> > should preferably be buffer('nar').
> 
> Note that a buffer object doesn't hold data!  It's only a pointer to
> data.  I can't off-hand explain the asymmetry though.

Dang, you're right...
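
To illustrate the "pointer, not container" point: a minimal sketch, assuming
a present-day Python 3 interpreter, where memoryview plays roughly the role
of the buffer object discussed here. Neither memoryview nor bytearray is part
of the API in this thread; they are stand-ins chosen for illustration.

```python
# Assumption: Python 3, with memoryview standing in for the old buffer() object.
backing = bytearray(b"binary")   # the object that actually owns the bytes
view = memoryview(backing)       # refers to backing's memory; no copy is made

part = view[2:5]                 # slicing a view yields another view, not a string
backing[2] = ord("N")            # change the underlying data in place

# The slice holds no data of its own, so it sees the change.
assert isinstance(part, memoryview)
assert bytes(part) == b"Nar"
```

Because the view only points at the underlying object, "a buffer holding the
combined contents" would need some object to own the new data first.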
 
> > --
> >
> > Hmm, perhaps we need something like a data string object
> > to get this 100% right ?!
> >
> > >>> d = data("...data...")
> > or
> > >>> d = d"...data..."
> > >>> print type(d)
> > <type 'data'>
> >
> > >>> 'string' + d
> > d"string...data..."
> > >>> u'string' + d
> > d"s\000t\000r\000i\000n\000g\000...data..."
> >
> > >>> d[:5]
> > d"...da"
> >
> > etc.
> >
> > Ideally, string and Unicode objects would then be subclasses
> > of this type in Py3K.
> 
> Not clear.  I'd rather do the equivalent of byte arrays in Java, for
> which no "string literal" notations exist.

Anyway, one way or another I think we should make it clear
to users that they should start using some other type for
storing binary data.
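
As a rough sketch of what "some other type" could look like with the array
module mentioned above: the example below loads the same file contents into
an array of unsigned bytes in two ways. It assumes the Python 3 spelling of
the array API (frombytes, fromfile); whether these are the "two ways"
referred to above is a guess, not something stated in this thread.

```python
# Assumption: Python 3 names for the array module's methods.
import tempfile
from array import array

payload = bytes([0, 1, 2, 253, 254, 255])   # arbitrary binary data, not text

with tempfile.TemporaryFile() as f:
    f.write(payload)

    # Way 1: read the raw bytes, then append them to an array of unsigned bytes.
    f.seek(0)
    via_bytes = array("B")
    via_bytes.frombytes(f.read())

    # Way 2: let the array pull a known number of items from the file itself.
    f.seek(0)
    via_file = array("B")
    via_file.fromfile(f, len(payload))

# Both routes end up with the same six byte values.
assert via_bytes == via_file
```

Either way the result is a sequence of small integers rather than a string,
so it cannot be mistaken for text or silently mixed with Unicode.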

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/