[Python-Dev] Missing arguments in RE functions
noamr at myrealbox.com
Tue Sep 7 21:34:11 CEST 2004
I've just finished teaching Python to a group of people, and regular
expressions were part of the course. I have encountered a few missing
features (that is, optional arguments) in RE functions. I've checked,
and it seems to me that they can be added very easily.
The first missing feature is the "flags" argument in the findall and
finditer functions. Searching for all occurrences of an RE is, of course,
a legitimate action, and I had to use (?s) in my RE instead of adding
re.DOTALL, which, in my opinion, is a lot clearer.
The solution is simple: the functions sub, subn, split, findall and
finditer all first compile the given RE, with the flags argument set to
0, and then run the appropriate method. As far as I can see, they could
all take an additional optional argument, flags=0, and compile the RE
with it.
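To illustrate, here is a minimal sketch of what I mean. The helper name
findall_with_flags is hypothetical and only stands in for the proposed
signature; it is not an existing function in the re module. It compiles
the RE with the given flags and then runs the method, as described above:

```python
import re

def findall_with_flags(pattern, string, flags=0):
    """Hypothetical helper: compile with the given flags, then call findall."""
    return re.compile(pattern, flags).findall(string)

text = "<a>one\ntwo</a>"

# The workaround: the flag is embedded in the RE itself.
inline = re.findall(r"(?s)<a>(.*)</a>", text)

# The proposed style: the flag is passed as a separate argument.
explicit = findall_with_flags(r"<a>(.*)</a>", text, re.DOTALL)

# Without any flag, "." does not match the newline, so nothing is found.
no_flag = findall_with_flags(r"<a>(.*)</a>", text)
```

Both the inline and the explicit forms find the same text; only the
readability differs.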
The second missing feature is the ability to specify start and end
indices when doing matches and searches. This feature is available when
using a compiled RE, but isn't mentioned at all for any of the
straightforward functions (that's why I didn't even know it was
possible until I checked just now - I naturally assumed that all the
functionality is available when using the functions).
I think these should be added to the functions match, search, findall
and finditer. This feature isn't documented for the findall and finditer
methods, but I checked, and it seems to work fine.
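For reference, a small sketch of the start and end indices as they work
on a compiled RE. The sample text and the variable names are only for
illustration:

```python
import re

word = re.compile(r"[a-z]+")
text = "alpha beta gamma"

# match() tries to match at index 6 instead of at the start of the string.
m = word.match(text, 6)

# search() only looks at the part of the string from index 5 up to 10.
s = word.search(text, 5, 10)

# findall() and finditer() take the same optional indices.
found = word.findall(text, 6)
spans = [hit.span() for hit in word.finditer(text, 0, 10)]
```

The spans reported by the match objects are relative to the whole
string, not to the start index, so they can be used directly as the
start index of the next call.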
(In case you are interested in the use case: the exercise was to parse
an XML file. It was done by first matching the beginning of a tag, then
trying to match attributes, and so on - each match starts from where the
previous successful match ended. Since I didn't know of this feature,
it was done by replacing the original string with a substring after
every match, which is terribly inefficient.)
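A rough sketch of how the exercise could look with start indices. The
three REs and the function parse_tag are simplified assumptions made up
for this example, not the actual course material and certainly not a
real XML parser:

```python
import re

# Simplified, hypothetical REs for the pieces of a tag.
TAG_OPEN = re.compile(r"<(\w+)")
ATTR = re.compile(r'\s+(\w+)="([^"]*)"')
TAG_CLOSE = re.compile(r"\s*/?>")

def parse_tag(text, pos=0):
    """Return (name, attrs, end_pos) for a tag starting at pos, or None."""
    m = TAG_OPEN.match(text, pos)
    if m is None:
        return None
    name = m.group(1)
    pos = m.end()          # continue where the previous match ended
    attrs = {}
    while True:
        m = ATTR.match(text, pos)
        if m is None:
            break
        attrs[m.group(1)] = m.group(2)
        pos = m.end()      # no substring is created, only the index moves
    m = TAG_CLOSE.match(text, pos)
    if m is None:
        return None
    return name, attrs, m.end()

source = '<img src="a.png" alt="x"/>'
result = parse_tag(source)
```

Each match starts at the index where the previous one ended, so the
original string is never copied.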
If you approve, I can create a patch in a few minutes and send it.
Have a good day,