[Python-Dev] Missing arguments in RE functions
Noam Raphael
noamr at myrealbox.com
Fri Sep 10 01:03:05 CEST 2004
I've read the objections. I understand being careful about extending an
API, but I still think that there are things to improve, even when being
conservative about the API.
I think that the straightforward functions should be taken seriously.
The reason is that although you can write
re.compile(pattern).match(...), re.match(pattern, ...) is shorter and
just as clear - I think of the fact that REs are first compiled and then
applied as an implementation issue, which lets you save time when
applying the same RE many times. The documentation agrees with me - let me
quote:
=====================
The sequence
    prog = re.compile(pat)
    result = prog.match(str)
is equivalent to
    result = re.match(pat, str)
but the version using compile() is more efficient when the expression
will be used several times in a single program.
=====================
findall(string)
Identical to the findall() function, using the compiled pattern.
=====================
Not only are the straightforward functions not presented as being
"only there for trivial cases"; the methods of the compiled RE are
presented as sometimes-more-efficient versions of the straightforward
functions. This is why I didn't even know, until I did my research
before sending my message to python-dev, that you could match from a
given start position. I studied the page documenting the functions,
because I didn't want to bother my students at an early stage with the
fact that REs are first compiled and then applied, and I didn't find any
mention of the start position option.
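To make the start position option concrete, here is a minimal sketch. The sample text and pattern are made up for illustration; the only thing it relies on is that the match() method of a compiled pattern accepts an optional start position, while the re.match() function does not.

```python
import re

text = "spam eggs spam"
prog = re.compile(r"spam")

# The module-level function always starts matching at index 0.
first = re.match(r"spam", text)

# The compiled pattern's match() accepts an optional start position.
second = prog.match(text, 10)

# Without that parameter, the closest equivalent is to slice the string,
# which copies it and reports positions relative to the slice.
sliced = re.match(r"spam", text[10:])

print(first.span())   # (0, 4)
print(second.span())  # (10, 14)
print(sliced.span())  # (0, 4)
```

Note that the two are not completely interchangeable: with a start position, '^' still refers to the real beginning of the string, whereas in a slice it matches at the beginning of the slice.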
So, as I see it, there are two options.
The first one is to decide that the functions are a legitimate way of
using REs in Python, and add the optional parameters that I added in my
patch. In this way, anything you can do with the compiled pattern you
could do using the functions. (I'm not that big an expert in REs, but I
checked through the documentation and didn't find any functionality that
would be missing from the functions after adding these parameters.)
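A rough sketch of what the first option amounts to, written as a wrapper instead of a change to the module. The name match_at and its parameter order are hypothetical, chosen only for illustration; they are not taken from the patch.

```python
import re
import sys


def match_at(pattern, string, flags=0, pos=0, endpos=sys.maxsize):
    """Hypothetical helper: like re.match(), plus start and end positions."""
    # Compile with the given flags, then delegate to the method, which
    # already accepts the start and end positions.
    return re.compile(pattern, flags).match(string, pos, endpos)


m = match_at(r"eggs", "spam eggs", pos=5)
print(m.span())  # (5, 9)

# Flags and positions can be combined in a single call.
m2 = match_at(r"EGGS", "spam eggs", re.IGNORECASE, 5)
print(m2.group())  # eggs
```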
The second option is to decide that the functions are only a shortcut,
meant for use in trivial cases. In that case, two things should be done,
IMHO. The main thing is to update the documentation to make that clear.
That means at least adding a prominent note to the "module contents" page,
stating something like "these functions are here only as shortcuts; to
access the full functionality, use compiled patterns". I think that in
this case the documentation should be updated further, by changing all
the function explanations to something like "equivalent to
re.compile(pattern, flags).match(string)", instead of the detailed
explanations now given. The second thing, which should be done even if the
functions are considered shortcuts, is to add the "flags" parameter to
the findall() and finditer() functions. I really can't see any reason
why the search() and match() functions should have that parameter and
findall() and finditer() shouldn't - they all take two arguments, pattern
and string. Why should the optional parameter be available only for the
older functions?
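A minimal sketch of what this means in practice, assuming (as described above) that the findall() and finditer() functions take no flags parameter: the flags have to go through a compiled pattern, or be written inline at the start of the pattern. The sample text and pattern are made up for illustration.

```python
import re

text = "Spam, SPAM and spam"

# Compiling first is one way to get flags into findall() and finditer().
prog = re.compile(r"spam", re.IGNORECASE)
all_matches = prog.findall(text)
starts = [m.start() for m in prog.finditer(text)]

# An inline flag at the start of the pattern has the same effect.
inline_matches = re.findall(r"(?i)spam", text)

print(all_matches)     # ['Spam', 'SPAM', 'spam']
print(starts)          # [0, 6, 15]
print(inline_matches)  # ['Spam', 'SPAM', 'spam']
```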
And a final note: the parameters for start and end positions are already
available in the findall() and finditer() methods. Should this be left
an undocumented feature? It seems to me perfectly legitimate to search
for all the matches of a specific RE in a substring without actually
copying all the characters of the substring to another string.
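For example, assuming the findall() and finditer() methods accept start and end positions as described above, a search restricted to part of a string might look like this (the text, pattern and positions are made up for illustration):

```python
import re

text = "a1 b22 c333 d4444"
prog = re.compile(r"\d+")

# Search only the region text[3:11] without building the slice.
in_window = prog.findall(text, 3, 11)
spans = [m.span() for m in prog.finditer(text, 3, 11)]

# The slicing version finds the same matches, but copies the substring
# and reports positions relative to the slice instead of the original.
sliced_spans = [m.span() for m in prog.finditer(text[3:11])]

print(in_window)     # ['22', '333']
print(spans)         # [(4, 6), (8, 11)]
print(sliced_spans)  # [(1, 3), (5, 8)]
```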
Noam
(P.S. Can you please add me to the CC of your replies? It would make it
easier for me to reply, since I'm not a member of python-dev.)