[Python-ideas] str.startswith taking any iterator instead of just tuple

James Powell james at dontusethiscode.com
Thu Jan 2 21:29:21 CET 2014


Some functions and methods accept a tuple of arguments that is
looped over internally, e.g.,

    'spam'.startswith(('s', 'z')) # 'spam' starts with 's' or with 'z'
    isinstance(42, (float, int))

In these cases, CPython uses PyTuple_Check and PyTuple_GET_ITEM to
perform this internal iteration.

As a result, the following are considered invalid:

    'spam'.startswith(['s', 'z'])
    'spam'.startswith({'s', 'z'})
    'spam'.startswith(x for x in 'sz')

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: startswith first arg must be str, unicode, or tuple
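
The restriction can be checked with a short script. This is only a sketch,
assuming Python 3; the helper name accepts is hypothetical and is not part
of any library. It also shows that isinstance rejects a list in the same way:

```python
def accepts(call):
    """Return True if call() succeeds, False if it raises TypeError."""
    try:
        call()
    except TypeError:
        return False
    return True


# Each entry records whether the argument type is accepted.
results = {
    'startswith tuple': accepts(lambda: 'spam'.startswith(('s', 'z'))),
    'startswith list': accepts(lambda: 'spam'.startswith(['s', 'z'])),
    'startswith set': accepts(lambda: 'spam'.startswith({'s', 'z'})),
    'startswith generator': accepts(
        lambda: 'spam'.startswith(x for x in 'sz')),
    'isinstance tuple': accepts(lambda: isinstance(42, (float, int))),
    'isinstance list': accepts(lambda: isinstance(42, [float, int])),
}

for name, ok in results.items():
    print(name, 'accepted' if ok else 'TypeError')
```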

There are two common workarounds:

    'spam'.startswith(tuple({'s', 'z'}))
    any('spam'.startswith(c) for c in {'s', 'z'})
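
The two workarounds give the same answer but consume their input
differently. The sketch below, assuming Python 3 and using made-up
variable names, suggests that tuple() exhausts the iterable up front,
while any() stops at the first match:

```python
prefixes = ['s', 'z']

# Workaround 1: materialise the iterable as a tuple first.
via_tuple = 'spam'.startswith(tuple(prefixes))

# Workaround 2: test each prefix in turn, stopping at the first match.
via_any = any('spam'.startswith(p) for p in prefixes)

# any() short-circuits, so a generator is only consumed up to the
# first matching prefix; the rest is still available afterwards.
gen = (p for p in ['s', 'z', 'q'])
matched = any('spam'.startswith(p) for p in gen)
remaining = list(gen)

# tuple() consumes the whole generator before startswith runs.
gen2 = (p for p in ['s', 'z', 'q'])
matched2 = 'spam'.startswith(tuple(gen2))
remaining2 = list(gen2)
```

Which one is preferable probably depends on whether the iterable is
large or expensive to produce.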

Of course, the following construction already has a clear, separate meaning:

    'spam'.startswith('sz') # 'spam' starts with 'sz'

In these cases, could we replace the PyTuple_Check with a check that
accepts any iterable? Alternatively, could we add this as an additional
branch?

The code would look something like:

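    it = PyObject_GetIter(subobj);
    if (it == NULL)
        return NULL;

    iternext = *Py_TYPE(it)->tp_iternext;
    for (;;) {
        substring = iternext(it);
        if (substring == NULL) {
            /* NULL means exhausted (maybe with StopIteration) or error */
            Py_DECREF(it);
            if (PyErr_Occurred()) {
                if (!PyErr_ExceptionMatches(PyExc_StopIteration))
                    return NULL;
                PyErr_Clear();
            }
            Py_RETURN_FALSE;
        }
        result = tailmatch(self, substring, start, end, -1);
        Py_DECREF(substring);
        if (result) {
            Py_DECREF(it);
            if (result == -1)
                return NULL; /* tailmatch reported an error */
            Py_RETURN_TRUE;
        }
    }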

Of course, for methods like .startswith, the change would need to leave
the existing behaviour for a plain string unchanged. The following should
always check whether 'spam' starts with 'sz', not whether it starts with
's' or with 'z':

    'spam'.startswith('sz')
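
The intended semantics can be modelled in pure Python. This is only a
sketch under the assumption of Python 3; the function name startswith_any
is hypothetical and is not an existing or proposed API. A str argument is
checked first so that it keeps its current meaning as a single prefix:

```python
def startswith_any(s, prefixes):
    """Hypothetical model of the proposed behaviour of str.startswith."""
    # A plain string is a single prefix, never a collection of characters.
    if isinstance(prefixes, str):
        return s.startswith(prefixes)
    # Any other iterable is looped over; the first match wins.
    for prefix in prefixes:
        if s.startswith(prefix):
            return True
    return False


print(startswith_any('spam', 'sz'))               # single prefix 'sz'
print(startswith_any('spam', ['s', 'z']))         # list of prefixes
print(startswith_any('spam', {'s', 'z'}))         # set of prefixes
print(startswith_any('spam', (x for x in 'sz')))  # generator of prefixes
```

Checking for str before iterating is what keeps 'sz' from being treated
as the two prefixes 's' and 'z'.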

I searched bugs.python.org and python-ideas for any previous discussion
of this topic. If this seems reasonable, I can submit an enhancement to
bugs.python.org with a patch for unicodeobject.c:unicode_startswith.

Cheers,
James Powell

follow: @dontusethiscode + @nycpython
attend: nycpython.org + flask-nyc.org
read: seriously.dontusethiscode.com


