[Python-Dev] Proof of the pudding: str.partition()
jcarlson at uci.edu
Thu Sep 1 08:03:22 CEST 2005
Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Josiah Carlson wrote:
> > A bit of free thought brings me to the (half-baked) idea that if string
> > methods accepted any object which conformed to the buffer interface;
> > mmap, buffer, array, ... instances could gain all of the really
> > convenient methods that make strings the objects to use in many cases.
> Not a bad idea, but they couldn't literally be string methods.
> They'd have to be standalone functions like we used to have in
> the string module before it got mercilessly deprecated. :-)
> Not sure what happens to this when the unicode/bytearray future
> arrives, though. Treating a buffer of bytes as a character
> string isn't going to be so straightforward then.
Here's my thought:
One could modify string methods to check the type of the input (string,
unicode, or other). That check sets a flag for whether the method
returns strings, unicode objects, or buffers. One uses the
PyObject_As*Buffer() functions (e.g. PyObject_AsCharBuffer()) to pull the
char* and length for any input offering the buffer interface.
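A rough sketch of that idea, written against modern Python 3 rather than the 2005 C API: memoryview stands in for the old buffer object, and buffer_partition is a hypothetical standalone function, not an existing string method or library call. It accepts any object offering the buffer interface and returns views on the same memory instead of new strings.

```python
import array


def buffer_partition(data, sep):
    """Split any buffer-protocol object around the first occurrence of sep.

    Hypothetical helper. Returns three memoryview slices
    (head, separator, tail) that share memory with `data`.
    """
    view = memoryview(data).cast('B')   # flat, byte-oriented view of the input
    needle = bytes(memoryview(sep))     # separator as plain bytes
    if not needle:
        raise ValueError("empty separator")
    # bytes.find does the search; tobytes() makes a temporary copy for it.
    index = view.tobytes().find(needle)
    if index < 0:
        # Mirror str.partition on a miss: everything, then two empty pieces.
        return view, view[0:0], view[0:0]
    end = index + len(needle)
    return view[:index], view[index:end], view[end:]


head, sep, tail = buffer_partition(bytearray(b'key=value'), b'=')
print(bytes(head), bytes(sep), bytes(tail))

numbers = array.array('B', [1, 2, 0, 3, 4])
head, sep, tail = buffer_partition(numbers, b'\x00')
print(head.tolist(), tail.tolist())
```

The same call should work on an mmap instance, since mmap also exposes the buffer interface. Returning views rather than copies is one possible choice here; a real implementation could just as well return the input's own type.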
Now here's the fun part: One makes the methods aware of the type of the
self parameter. One sets the 'split' method for the buffer object to be
Unicode does indeed get tricky: how does one handle buffers of unicode
objects? Right now, you get the raw pointer and underlying length in bytes
(len(buffer(u'hello')) == 10). If there were a unicode buffer (perhaps
ubuffer), that would work, but I'm not sure I really like it.
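To illustrate the length mismatch in present-day terms: the figure of 10 corresponds to a build that stores unicode as 2-byte units. Python 3's str no longer exposes the buffer interface at all, so the sketch below assumes an explicit encoding as a stand-in for the internal representation that Python 2's buffer() exposed.

```python
text = u'hello'

# Python 3 str does not offer the buffer interface, so encode explicitly.
utf16 = memoryview(text.encode('utf-16-le'))   # 2 bytes per code unit
utf32 = memoryview(text.encode('utf-32-le'))   # 4 bytes per code point

print(len(text))    # number of characters
print(len(utf16))   # number of bytes at 2 bytes per unit
print(len(utf32))   # number of bytes at 4 bytes per unit

# The byte view only becomes text again once the encoding is known.
roundtrip = utf16.tobytes().decode('utf-16-le')
```

The byte length depends on the storage width, which is why a plain byte buffer over unicode data cannot be treated as a sequence of characters without extra information.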
I notice that much of the discussion concerns 'string views', which to me
seems like another way of saying 'buffer'; and if there is a 'string view',
there would necessarily need to be a 'unicode view'.
As for the bytes type, from what I understand, it should directly
support buffers without issue.
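For what it is worth, a small check of how this looks in Python 3, assuming its bytes type corresponds to the one under discussion here (at the time of this thread it was only a proposal). The variable names are illustrative only.

```python
import array

data = b'key=value'

# bytes exports the buffer interface, so a zero-copy view can be taken.
view = memoryview(data)
sep_index = data.find(bytearray(b'='))   # bytes methods accept other buffer objects
value = bytes(view[sep_index + 1:])      # copy the slice back out as bytes

# Other buffer providers convert to bytes directly.
greeting = bytes(array.array('B', [104, 105]))

print(sep_index, value, greeting)
```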