[I18n-sig] Re: [Python-Dev] Unicode debate
Tue, 02 May 2000 12:46:06 +0200
Moshe Zadka wrote:
> I'd much prefer Python to reflect a
> fundamental truth about Unicode, which at least makes sure binary-goop can
> pass through Unicode and remain unharmed, than to reflect a nasty problem
> with UTF-8 (not everything is legal).
Let's not make the same mistake again: Unicode objects should *not*
be used to hold binary data. Please use buffers instead.
BTW, I think that this behaviour should be changed:
>>> buffer('binary') + 'data'
>>> 'data' + buffer('binary')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: illegal argument type for built-in operation
IMHO, buffer objects should never coerce to strings, but instead
return a buffer object holding the combined contents. The
same applies to slicing buffer objects: the result of a slice such as
buffer('binary')[2:5] should preferably be buffer('nar').
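
To make the intended semantics concrete, here is a minimal sketch of a
type that behaves the way I describe: concatenation in either order and
slicing both return the same type and never fall back to a plain string.
The class name Buf and the helper _raw are hypothetical, made up for this
illustration, and the sketch is written against a Python with a separate
bytes type rather than against the current buffer object.

```python
class Buf:
    """Hypothetical binary container that never coerces to a string."""

    def __init__(self, data=b""):
        self._data = bytes(data)

    def __add__(self, other):
        # Buf + other: the result stays a Buf.
        raw = _raw(other)
        if raw is None:
            return NotImplemented
        return Buf(self._data + raw)

    def __radd__(self, other):
        # other + Buf: also a Buf, so both operand orders work.
        raw = _raw(other)
        if raw is None:
            return NotImplemented
        return Buf(raw + self._data)

    def __getitem__(self, index):
        # Slices return a Buf; single indexes return the byte value.
        if isinstance(index, slice):
            return Buf(self._data[index])
        return self._data[index]

    def __eq__(self, other):
        return isinstance(other, Buf) and self._data == other._data

    def __hash__(self):
        return hash(self._data)

    def __repr__(self):
        return "Buf(%r)" % (self._data,)


def _raw(value):
    """Return the underlying bytes of value, or None if it is not binary."""
    if isinstance(value, Buf):
        return value._data
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    return None
```

With this sketch, Buf(b'binary') + b'data' and b'data' + Buf(b'binary')
both give a Buf, and Buf(b'binary')[2:5] gives a Buf holding b'nar'.
Text strings are deliberately rejected, so binary data and text cannot be
mixed by accident.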
Hmm, perhaps we need something like a data string object
to get this 100% right ?!
>>> d = data("...data...")
>>> d = d"...data..."
>>> print type(d)
>>> 'string' + d
>>> u'string' + d
Ideally, string and Unicode objects would then be subclasses
of this type in Py3K.
Python Pages: http://www.lemburg.com/python/