oneof() and multi split and replace for strings
I don't know if this would make sense to try to push to 2.6 or 3.0. I was talking with some people about how the ability to split or replace on multiple substrings could be added to Python without adding new methods or requiring ugly tuple passing like s.split(('foo', 'bar'), 4). This idea came to mind, so I wanted to toss it out there for scrutiny. It would be a builtin, but it can be implemented in Python like this, basically:

    class oneof(list):
        def __init__(self, *args):
            list.__init__(self)
            self.extend(args)
        def __eq__(self, o):
            return o in self

    assert 'bar' == oneof('bar', 'baz')

In addition to the new type, .replace, .split, and other appropriate functions would be updated to take this as the substring argument to locate, and would match any one of the substrings it contains. I've asked a few people and gotten good responses on the general idea so far, but what do you all think?

1) Would the multi-substring operations be welcomed?
2) Could this be a good way to add those to the API without breaking things?
3) What version would it target?
4) Which functions and methods should support this, or generally might gain value from some similar solution?

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
"Calvin Spealman" <ironfroggy@gmail.com> wrote:
I don't know if this would make sense to try to push to 2.6 or 3.0. I was talking with some people about how the ability to split or replace on multiple substrings could be added to Python without adding new methods or requiring ugly tuple passing like s.split(('foo', 'bar'), 4). This idea came to mind, so I wanted to toss it out there for scrutiny. It would be a builtin, but it can be implemented in Python like this, basically:
Whether a tuple or a "magic" container is passed, I don't think it matters, though I would lean towards a tuple because it is five fewer characters to type and one fewer data type to worry about.
    class oneof(list):
        def __init__(self, *args):
            list.__init__(self)
            self.extend(args)
        def __eq__(self, o):
            return o in self

    assert 'bar' == oneof('bar', 'baz')
If all you wanted to test was containment, it would be faster (in terms of string searching) to use a set, and perhaps just use containment:

    class oneof(set):
        def __init__(self, *args):
            set.__init__(self)
            self.update(args)

    assert 'bar' in oneof('bar', 'baz')

The underlying string search implementation probably shouldn't concern itself with specific types, though there are some gotchas if one were to provide it with an iterator that isn't restartable.
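To illustrate the non-restartable iterator gotcha, here is a minimal sketch. The helper names contains_any and contains_any_safe are made up for this example and are not part of any proposal; the point is only that a search which walks its substrings once per input string silently misbehaves when handed a one-shot iterator.

```python
def contains_any(haystacks, needles):
    """For each haystack, report whether any needle occurs in it.

    Hypothetical helper: it iterates over `needles` once per haystack,
    which is exactly what goes wrong with a one-shot iterator.
    """
    return [any(n in h for n in needles) for h in haystacks]


def contains_any_safe(haystacks, needles):
    """Same as above, but materializes the needles first."""
    needles = tuple(needles)  # a tuple can be walked any number of times
    return [any(n in h for n in needles) for h in haystacks]


texts = ['a bar', 'a foo']

# A tuple is restartable, so both strings are checked against both needles.
print(contains_any(texts, ('foo', 'bar')))              # [True, True]

# A generator is used up while checking the first string.
print(contains_any(texts, (s for s in ('foo', 'bar'))))  # [True, False]

# Materializing up front avoids the problem.
print(contains_any_safe(texts, (s for s in ('foo', 'bar'))))  # [True, True]
```

Whatever container ends up being accepted, copying it into a tuple (or set) at the top of the function is probably the simplest defence.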
1) Would the multi-substring operations be welcomed?
This has been discussed before on python-dev. I believe the general consensus was that it would be convenient at times, but also that the answer was "use re":

    import re

    def replace(options, becomes, source, count=0):
        pattern = '(%s)' % '|'.join(re.escape(i) for i in options)
        return re.sub(pattern, becomes, source, count=count)
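One caveat with the alternation approach is worth a small sketch: the re module tries alternatives left to right and takes the first one that matches, not the longest, so an option that is a prefix of another can shadow it. The variant below (replace_any is a hypothetical name, and the longest-first ordering is my assumption about the desired behaviour, not something stated in the thread) sorts the options by length before joining them.

```python
import re


def replace_any(options, becomes, source, count=0):
    """Replace any of `options` in `source` with `becomes`.

    Hypothetical variant of the re-based replace: options are tried
    longest-first so that 'foobar' is not shadowed by its prefix 'foo'.
    """
    ordered = sorted(options, key=len, reverse=True)
    pattern = '|'.join(re.escape(o) for o in ordered)
    return re.sub(pattern, becomes, source, count=count)


# With the options in their given order, 'foo' would win and leave 'Xbar X'.
print(replace_any(('foo', 'foobar'), 'X', 'foobar foo'))     # X X

# re.escape keeps '.' literal, so 'axb' is left alone.
print(replace_any(('a.b', 'c'), '-', 'a.b axb c', count=2))  # - axb -
```

As with re.sub itself, count=0 means "replace every occurrence".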
2) Could this be a good way to add those to the API without breaking things?
No. Adding a data structure that is so simple just to offer multi-substring search, replace, split, etc., isn't worthwhile. Use a tuple.
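For what it's worth, there is precedent for the tuple spelling: str.startswith and str.endswith have accepted a tuple of alternatives since 2.5. The sketch below shows that, plus a find_any helper that is purely hypothetical (no such method exists on str) to show how a function could accept either a single string or a tuple.

```python
name = 'report.tar.gz'

# Existing behaviour: a tuple means "any one of these".
print(name.endswith(('.gz', '.bz2')))     # True
print(name.startswith(('data', 'log')))   # False


def find_any(s, subs):
    """Hypothetical helper: lowest index at which any of `subs` occurs, or -1.

    Accepts a single string or a tuple of strings, mirroring the way
    startswith/endswith treat their first argument.
    """
    if isinstance(subs, str):
        subs = (subs,)
    hits = [i for i in (s.find(sub) for sub in subs) if i != -1]
    return min(hits) if hits else -1


print(find_any('spam foo bar', ('bar', 'foo')))  # 5
print(find_any('spam foo bar', 'bar'))           # 9
print(find_any('spam foo bar', ('baz',)))        # -1
```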
3) What version would it target?
If it were to be added, I would say 3.x. The fewer API alterations in 2.5 -> 2.6, the better.
4) What all functions and methods should support this or generally might gain value from some similar solution?
If any, split. Beyond that, it smells to me like a regular expression job (though split still smells like a regular expression job to me).

- Josiah
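For reference, the split case is also short when written with re. This is only a sketch: split_any is a made-up name, the longest-first ordering and the ValueError on an empty separator list are my assumptions, and note that re.split uses maxsplit=0 for "no limit" where str.split uses -1.

```python
import re


def split_any(source, seps, maxsplit=0):
    """Split `source` on any of the literal separators in `seps`.

    Hypothetical helper. maxsplit follows re.split: 0 means no limit.
    """
    if not seps:
        raise ValueError('need at least one separator')
    ordered = sorted(seps, key=len, reverse=True)  # longest alternative first
    pattern = '|'.join(re.escape(s) for s in ordered)
    return re.split(pattern, source, maxsplit=maxsplit)


text = 'a foo b bar c foo d'
print(split_any(text, (' foo ', ' bar ')))              # ['a', 'b', 'c', 'd']
print(split_any(text, (' foo ', ' bar '), maxsplit=2))  # ['a', 'b', 'c foo d']
```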
Calvin Spealman wrote:
I don't know if this would make sense to try to push to 2.6 or 3.0. I was talking with some people about how the ability to split or replace on multiple substrings could be added to Python without adding new methods or requiring ugly tuple passing like s.split(('foo', 'bar'), 4). This idea came to mind, so I wanted to toss it out there for scrutiny. It would be a builtin, but it can be implemented in Python like this, basically:
    class oneof(list):
        def __init__(self, *args):
            list.__init__(self)
            self.extend(args)
        def __eq__(self, o):
            return o in self

    assert 'bar' == oneof('bar', 'baz')
'bar' in 'bar baz' ?
In addition to the new type, .replace, .split, and other appropriate functions would be updated to take this as the substring argument to locate and would match any one of the substrings it contains. I've asked a few people and gotten good responses on the general idea so far, but what do you all think?
1) Would the multi-substring operations be welcomed?
2) Could this be a good way to add those to the API without breaking things?
3) What version would it target?
4) What all functions and methods should support this or generally might gain value from some similar solution?
It doesn't feel right to me to have this as a builtin. This falls in the category of mid-level functionality, but I'm never sure where to draw the line between having simple objects to do more complex things on and having more complex objects. I tend to prefer the first case.

Having more functions in the library to do more complex, but common, things to strings and lists of strings would be nice. Examples of this are fnmatch.py and textwrap.py. Both use re to do the work, but present an easier-to-use interface.

The functions in textwrap could be extended to accept lists of lines. Functions to do justifying, right and full, might also be useful.

Having a simpler word-matching alternative to do web-style multi-term searches on lists of strings would be nice. (Something I could use to handle search requests right now.) It could be designed to be presentable to users, like fnmatch. The simpler pattern matching would work in many programming situations as well.

Like fnmatch and textwrap, it would use re to do the actual work. Functions like split_pattern(), replace_pattern(), and partition_pattern() could also be available in the same module to handle the cases you suggested.

Cheers,
Ron
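As a rough sketch of what such a module might contain, here are two of the pieces: partition_pattern() (the name comes from the message above, but the signature and behaviour are guesses modelled on str.partition) and a multi-term search, filter_terms(), whose name and semantics (every whitespace-separated term must appear, case ignored) are entirely assumptions. Both lean on re, in the same spirit as fnmatch and textwrap.

```python
import re


def partition_pattern(source, seps):
    """Like str.partition, but split at the first occurrence of any separator.

    Assumed signature: `seps` is a non-empty tuple of literal strings.
    Returns (head, separator, tail), or (source, '', '') if nothing matches.
    """
    ordered = sorted(seps, key=len, reverse=True)  # longest alternative first
    pattern = '|'.join(re.escape(s) for s in ordered)
    m = re.search(pattern, source)
    if m is None:
        return source, '', ''
    return source[:m.start()], m.group(0), source[m.end():]


def filter_terms(lines, query):
    """Keep the lines that contain every term in `query`, ignoring case.

    Hypothetical web-style search: terms are split on whitespace and
    matched literally, so users never see regular expression syntax.
    """
    terms = [re.compile(re.escape(t), re.IGNORECASE) for t in query.split()]
    return [line for line in lines if all(t.search(line) for t in terms)]


print(partition_pattern('key=value;rest', ('=', ';')))  # ('key', '=', 'value;rest')
print(partition_pattern('abc', ('=',)))                 # ('abc', '', '')

lines = ['Split on foo', 'replace bar', 'split and replace']
print(filter_terms(lines, 'SPLIT replace'))             # ['split and replace']
```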
participants (3)
- Calvin Spealman
- Josiah Carlson
- Ron Adam