[Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c

Guido van Rossum guido at python.org
Mon Oct 24 20:06:43 CEST 2005


On 10/24/05, Phil Thompson <phil at riverbankcomputing.co.uk> wrote:
> I'm implementing a string-like object in an extension module and trying to
> make it as interoperable with the standard string object as possible. To do
> this I'm implementing the relevant slots and the buffer interface. For most
> things this is fine, but there are a small number of methods in
> stringobject.c that don't use the buffer interface - and I don't understand
> why.
>
> Specifically...
>
> string_contains() doesn't, which means that...
>
>     MyString("foo") in "foobar"
>
> ...doesn't work.
>
> s.join(sequence) only allows sequence to contain string or unicode objects.
>
> s.strip([chars]) only allows chars to be a string or unicode object. Same for
> lstrip() and rstrip().
>
> s.ljust(width[, fillchar]) only allows fillchar to be a string object (not
> even a unicode object). Same for rjust() and center().
>
> Other methods happily allow types that support the buffer interface as well as
> string and unicode objects.
>
> I'm happy to submit a patch - I just wanted to make sure that this behaviour
> wasn't intentional for some reason.

A concern I'd have with fixing this is that Unicode objects also
support the buffer API. In any situation where either str or unicode
is accepted I'd be reluctant to guess whether a buffer object was
meant to be str-like or Unicode-like. I think this covers all the
cases you mention here.
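
To make the ambiguity concrete, here is a minimal sketch. It is a Python 3 analogy, not the 2.x C code in stringobject.c, and the buffer contents and the two encodings are assumptions chosen for illustration. The same raw bytes give different text depending on whether the reader treats them as str-like or Unicode-like.

```python
# Python 3 analogy: raw bytes alone do not say which kind of text they hold.
# "raw" stands in for whatever a hypothetical buffer-exporting object hands over.
raw = "foo".encode("utf-16-le")

# Reading one character per byte (a str-like interpretation).
as_8bit = raw.decode("latin-1")

# Reading 16-bit code units (a Unicode-like interpretation).
as_wide = raw.decode("utf-16-le")

print(repr(as_8bit))
print(repr(as_wide))
```

The receiver of a bare buffer has no reliable way to tell which of the two readings the exporter intended.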

We need to support this better in Python 3000, but I'm not sure you
can do much better in Python 2.x. Subclassing from str is unlikely to
work for you, because too many places would then assume that the
internal representation is also the same as for str.
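
The subclassing problem can be sketched as follows. This uses Python 3's str rather than the 2.x string type, and the class name Shouty is hypothetical. The point it illustrates is that built-in methods read the inherited internal data directly and never consult the subclass's overrides.

```python
class Shouty(str):
    """Hypothetical str subclass that wants to present itself in upper case."""

    def __str__(self):
        # Only code that explicitly calls str() on the object sees this.
        return str.upper(self)


s = Shouty("foo")

# An explicit conversion goes through the override.
converted = str(s)

# join() reads the underlying str data directly and ignores __str__.
joined = "-".join([s, "bar"])

print(converted)
print(joined)
```

A string-like type whose real data lives somewhere other than the inherited str storage would be bypassed in the same way.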

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

