[Python-Dev] marshal (was:Buffer interface in abstract.c? )

M.-A. Lemburg mal@lemburg.com
Fri, 06 Aug 1999 10:54:20 +0200


Jack Jansen wrote:
> 
> Recently, "Fredrik Lundh" <fredrik@pythonware.com> said:
> > on the other hand, I think something is missing from
> > the buffer design; I definitely don't like that people
> > can write and marshal objects that happen to
> > implement the buffer interface, only to find that
> > Python didn't do what they expected...
> >
> > >>> import unicode
> > >>> import marshal
> > >>> u = unicode.unicode
> > >>> s = u("foo")
> > >>> data = marshal.dumps(s)
> > >>> marshal.loads(data)
> > 'f\000o\000o\000'
> > >>> type(marshal.loads(data))
> > <type 'string'>

Why do Unicode objects implement the bf_getcharbuffer slot ? I thought
that unicode objects use a two-byte character representation.

Note that implementing the char buffer interface will also give
you strange results with other code that uses
PyArg_ParseTuple(...,"s#",...), e.g. you could search through
Unicode strings as if they were normal 1-byte/char strings (and
most certainly not find what you're looking for, I guess).

> Hmm again, I think I'd like it better if marshal.dumps() would barf on
> attempts to write unrepresentable data. Currently unrepresentable
> objects are written as TYPE_UNKNOWN (unless they have bufferness (or
> should I call that "a buffer-aspect"? :-)), which means you think you
> are writing correctly marshalled data but you'll be in for an
> exception when you try to read it back...

I'd prefer an exception on write too.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                   147 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/