Why do we have both re.match and re.search?

Hi, I hope the question is not too silly, but I would like to understand the advantages of having both re.match() and re.search(). Wouldn't it be clearer to have just one function with one additional parameter, like this: re.search(regexp, text, from_beginning=True|False) ? This way we would prevent people from writing ".*" in front of the regexp used with re.match(), as mentioned in the documentation. Thanks.

Hi,

On 10/02/2016 22:59, Luca Sangiacomo wrote:
Hi, I hope the question is not too silly, but I would like to understand the advantages of having both re.match() and re.search(). Wouldn't it be clearer to have just one function with one additional parameter, like this:
re.search(regexp, text, from_beginning=True|False) ?
Actually you can just do re.search("^regexp", text). But with match you express the intent to match the text against something, while with search you express that you are looking for something in the text. Maybe that was the idea?
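To make the comparison above concrete, here is a minimal sketch with a made-up sample string. It shows that an anchored search() behaves like match() by default, and it also shows one caveat: under re.MULTILINE, "^" matches after every newline as well, so the two stop being interchangeable.

```python
import re

# Sample text invented for this illustration.
text = "second line\nfirst word"

# By default, "^" anchors at the start of the string, so an anchored
# search() finds the same span as match().
anchored = re.search(r"^second", text)
matched = re.match(r"second", text)

# Without the anchor, search() scans forward; match() only tries position 0.
found_later = re.search(r"first", text)
not_at_start = re.match(r"first", text)

# With re.MULTILINE, "^" also matches right after each newline, so the
# anchored search() succeeds on the second line while match() still fails.
multiline_search = re.search(r"^first", text, re.MULTILINE)
multiline_match = re.match(r"first", text, re.MULTILINE)
```

If the MULTILINE behaviour matters, "\A" anchors only at the very start of the string regardless of flags.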
This way we would prevent people from writing ".*" in front of the regexp used with re.match(), as mentioned in the documentation.
Thanks.

On Wed, Feb 10, 2016 at 10:59:18PM +0100, Luca Sangiacomo wrote:
Hi, I hope the question is not too silly, but I would like to understand the advantages of having both re.match() and re.search(). Wouldn't it be clearer to have just one function with one additional parameter, like this:
re.search(regexp, text, from_beginning=True|False) ?
I guess the most important reason now is backwards compatibility. The oldest Python I have installed here is version 1.5, and it has the brand new "re" module (intended as a replacement for the old "regex" module). Both have search() and match() top-level functions. So my guess is that you would have to track down the author of the original "regex" module.

But a more general answer is the principle, "Functions shouldn't take constant bool arguments". It is an API design principle which (if I remember correctly) Guido has stated a number of times. Functions should not take a boolean argument which (1) exists only to select between two different modes and (2) is nearly always given as a constant.

Do you ever find yourself writing code like this?

    if some_calculation():
        result = re.match(regex, string)
    else:
        result = re.search(regex, string)

If you do, that would be a hint that perhaps match() and search() should be combined so you can write:

    result = re.search(regex, string, some_calculation())

But I expect that you almost never do. I would expect that if we combined the two functions into one, we would nearly always call it with a constant bool:

    # I always forget whether True means match from the start or not,
    # and which is the default...
    result = re.search(regex, string, False)

which suggests that search() is actually two different functions, and should be split into two, just as we have now.

It's a general principle, not a law of nature, so you may find exceptions in the standard library. But if I were designing the re module from scratch, I would either keep the two distinct functions, or just provide search() and let users use ^ to anchor the search to the beginning.
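The argument above can be sketched in runnable form. The find() wrapper below is hypothetical; it is not part of the re module and exists only to show what the proposed combined API would look like at its call sites, next to the two functions we actually have.

```python
import re


def find(pattern, string, from_beginning=False):
    """Hypothetical combined API (not in the re module).

    The boolean only selects between two modes, which is exactly the
    kind of argument the design principle advises against.
    """
    if from_beginning:
        return re.match(pattern, string)
    return re.search(pattern, string)


sample = "abc 123"  # made-up input for illustration

# With the combined function, call sites pass a literal True or False,
# and the reader has to remember what the flag means.
combined_anywhere = find(r"\d+", sample, False)
combined_at_start = find(r"\d+", sample, True)

# With two functions, the name carries the meaning.
separate_anywhere = re.search(r"\d+", sample)
separate_at_start = re.match(r"\d+", sample)
```

Passing the flag by keyword (from_beginning=True) would help readability somewhat, but the argument would still almost always be a constant.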
This way we would prevent people from writing ".*" in front of the regexp used with re.match(), as mentioned in the documentation.
I only see one example that does that: https://docs.python.org/3/library/re.html#checking-for-a-pair

Perhaps it should be changed.

-- Steve
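For reference, here is a sketch of what such a change could look like. The first pattern is written along the lines of the one in that documentation section (a leading ".*" so that match() can find a pair anywhere in the hand); the second drops the prefix and uses search() instead. The helper pair_rank() and the sample hands are assumptions made for this illustration, and the equivalence is only checked for these three inputs.

```python
import re

# Sample five-card hands, chosen for illustration.
hands = ["717ak", "718ak", "354aa"]

# Pattern with a leading ".*" so that match() can find a pair anywhere.
with_match = re.compile(r".*(.).*\1")

# The same idea without the prefix, relying on search() to scan forward.
with_search = re.compile(r"(.).*\1")


def pair_rank(result):
    """Return the repeated card, or None when there is no pair."""
    return result.group(1) if result else None


via_match = [pair_rank(with_match.match(hand)) for hand in hands]
via_search = [pair_rank(with_search.search(hand)) for hand in hands]
```

On hands containing more than one pair, the greedy ".*" prefix and the left-to-right scan of search() may capture different cards, so the two forms are not guaranteed to agree in general.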
participants (3)
- Luca Sangiacomo
- Michel Desmoulin
- Steven D'Aprano