[Python-ideas] oneof() and multi split and replace for strings

Josiah Carlson jcarlson at uci.edu
Tue Jan 23 09:29:47 CET 2007


"Calvin Spealman" <ironfroggy at gmail.com> wrote:
> I don't know if this would make sense to try to push to 2.6 or 3.0. I
> was talking with some people about how the ability to split or replace
> on multiple substrings would be added to python, without adding new
> methods or having ugly tuple passing requirements like s.split(('foo',
> 'bar'), 4). This idea came to mind, so I wanted to toss it out there
> for scrutiny. It would be a builtin, but can be implemented in
> python like this, basically:

Whether it is a tuple being passed or a "magic" container, I don't
think it matters, though I would lean towards a tuple because it is five
fewer characters to type out and one fewer data type to worry about.

> class oneof(list):
>     def __init__(self, *args):
>         list.__init__(self)
>         self.extend(args)
>     def __eq__(self, o):
>         return o in self
> 
> assert 'bar' == oneof('bar', 'baz')

If all you wanted to test was containment, it would be faster (in terms
of string searching) to use a set, and perhaps to just test containment
with "in" rather than overloading "==".

    class oneof(set):
        def __init__(self, *args):
            set.__init__(self)
            # update() must be called on the instance, not on the set type
            self.update(args)

    assert 'bar' in oneof('bar', 'baz')

The underlying string search implementation probably shouldn't concern
itself with specific types, though there are some gotchas if one were to
provide it with an iterator that isn't restartable: once the first search
has consumed it, later searches would see no options at all.
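
To make that gotcha concrete, here is a minimal sketch. The helper
find_any() is made up for illustration and is not an existing or
proposed API; it just searches for each option in turn.

```python
def find_any(source, options):
    """Hypothetical helper: lowest index at which any option occurs, or -1."""
    best = -1
    for opt in options:
        i = source.find(opt)
        if i != -1 and (best == -1 or i < best):
            best = i
    return best

# A one-shot iterator is exhausted by the first search.
options = iter(['foo', 'bar'])
first = find_any('xxbar', options)    # finds 'bar' at index 2
second = find_any('xxbar', options)   # nothing left to iterate over: -1

# Materializing the options once (for example as a tuple) avoids this.
frozen = tuple(['foo', 'bar'])
```

With a tuple, repeated searches behave the same way every time, which is
one more reason I would accept a tuple rather than an arbitrary iterable.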

> 1) Would the multi-substring operations be welcomed?

This has been discussed before on python-dev. I believe the general
consensus was that it would be convenient at times, but also that the
answer was "use re":

    import re

    def replace(options, becomes, source, count=0):
        # escape each option so it is matched literally, then alternate
        pattern = '(%s)'%('|'.join(re.escape(i) for i in options))
        return re.sub(pattern, becomes, source, count=count)
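
One caveat with the re approach: alternation tries the options from left
to right, so if one option is a prefix of another, the order matters. A
variant that sorts the longest options first is sketched below. The
sorting step is my own addition for illustration, not something that came
out of the earlier discussion.

```python
import re

def replace(options, becomes, source, count=0):
    # Try longer options first, so 'foobar' wins over its prefix 'foo'.
    ordered = sorted(options, key=len, reverse=True)
    pattern = '|'.join(re.escape(o) for o in ordered)
    return re.sub(pattern, becomes, source, count=count)

result = replace(('foo', 'bar'), 'X', 'foo bar baz')
limited = replace(('foo', 'bar'), 'X', 'foo bar foo', 1)
prefix = replace(('foo', 'foobar'), 'X', 'foobar')
```

Here count=0 means "replace everything", as with re.sub itself, and the
last call replaces the whole of 'foobar' rather than only its 'foo' prefix.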

> 2) Could this be a good way to add those to the API without breaking things?

No.  Adding a data structure that is so simple just to offer
multi-substring search, replace, split, etc., isn't worthwhile. Use a
tuple.

> 3) What version would it target?

If it were to be added, I would say 3.x.  The fewer API alterations in
2.5 -> 2.6, the better.

> 4) What all functions and methods should support this or generally
> might gain value from some similar solution?

If any, split.  Beyond that, it smells to me like a regular expression
job (though split still smells like a regular expression job to me).
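
For what it's worth, a multi-substring split can be had from re in much
the same way. The name multi_split() below is hypothetical and only there
for the sake of the example.

```python
import re

def multi_split(options, source, maxsplit=0):
    """Hypothetical helper: split source on any of the literal options."""
    pattern = '|'.join(re.escape(o) for o in options)
    return re.split(pattern, source, maxsplit=maxsplit)

parts = multi_split(('foo', 'bar'), 'a foo b bar c')
limited = multi_split(('foo', 'bar'), 'a foo b bar c', 1)
```

As with re.split itself, maxsplit=0 means no limit on the number of
splits.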


 - Josiah



