[Python-ideas] Why don't CPython strings implement slicing using a view?

Skip Montanaro skip.montanaro at gmail.com
Thu May 7 22:12:58 CEST 2015


I haven't seen anyone else mention it, so I will point out:
interoperability with C. In C, strings are NUL-terminated. PyStringObject
instances do (or used to) have NUL-terminated strings in them. According to
unicodeobject.h, that seems still to be the case:

typedef struct {
    /* There are 4 forms of Unicode strings:
    ...
    wchar_t *wstr;              /* wchar_t representation (null-terminated) */
} PyASCIIObject;

and:

typedef struct {
    PyASCIIObject _base;
    Py_ssize_t utf8_length;     /* Number of bytes in utf8, excluding the
                                 * terminating \0. */
    char *utf8;                 /* UTF-8 representation (null-terminated) */
    Py_ssize_t wstr_length;     /* Number of code points in wstr, possible
                                 * surrogates count as two code points. */
} PyCompactUnicodeObject;

The raw string is NUL-terminated precisely so that copying isn't required in
most cases before passing it to C. Making s[1:-1] a view onto the underlying
string data in s would require you to copy the data whenever you want to pass
the view into C, just so you could tack on that NUL. That happens a lot, so
it's likely you wouldn't save much work, and you would cause a lot more churn
in Python's memory allocator. The only place you could avoid the copy is if
the view you are dealing with is a strict suffix of s.
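
To make the suffix point concrete, here is a rough sketch using ctypes. It is
only an illustration, not how CPython works internally: buf stands in for a
string's NUL-terminated buffer, and read_as_c and view_for_c are hypothetical
helpers invented for this example. ctypes.string_at with no size argument
reads up to the first NUL, which is what a C function taking a char * would do.

```python
import ctypes

# Stand-in for a string's internal buffer (an assumption for illustration).
# create_string_buffer appends the terminating NUL, so len(buf) is 12.
buf = ctypes.create_string_buffer(b"hello world")
base = ctypes.addressof(buf)


def read_as_c(address):
    """Read bytes the way C would from a char *: up to the first NUL."""
    return ctypes.string_at(address)


def view_for_c(offset, length):
    """Hypothetical helper: hand the view buf[offset:offset+length] to C.

    Returns (bytes C would see, whether a copy was needed).
    """
    total = len(buf) - 1  # length of the data, excluding the terminator
    if offset + length == total:
        # The view ends where the data ends, so the existing NUL terminates it.
        return read_as_c(base + offset), False
    # Otherwise the view must be copied so a NUL can be appended.
    copy = ctypes.create_string_buffer(buf.raw[offset:offset + length])
    return read_as_c(ctypes.addressof(copy)), True


print(view_for_c(6, 5))     # suffix "world": no copy needed
print(view_for_c(1, 4))     # interior "ello": copy needed
print(read_as_c(base + 1))  # without the copy, C reads on to the original NUL
```

The last line shows the problem: pointing C at the interior view without
copying yields b"ello world", not b"ello".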

Skip