[Tutor] Consecutive_zeros
avi.e.gross at gmail.com
Tue Jul 5 12:43:09 EDT 2022
Peter,
It may be tinkering, but it is a useful feature. Had I bothered to find the
manual page for max() and seen a way to specify a default of zero, I would
indeed have been quite happy.
There are many scenarios where you could argue a feature is syntactic sugar,
but when something is done commonly and can produce a bad result, it is
quite nice to have a way of avoiding that without lots of code. For example, I
have seen lots of functions in R that can be told to remove NA values, or
not to drop a dimension when consolidating info, so the result has a known
size even if not an optimal size.
In this case, I would think it to be a BUG in the implementation of max()
for it to return an error like this:
>>> max([])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    max([])
ValueError: max() arg is an empty sequence
Basically it was given nothing, and mathematically the maximum, the minimum
and many other such functions are meaningless when given nothing. So
obviously a good approach is to test what you are working on before making
such a call, or to wrap the call in try/except, catch the error and deal
with it then.
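As a rough sketch of those two approaches (the function names are made up
for illustration, and falling back to 0 is an assumption that only suits
problems like this one):

```python
def longest_checked(lengths):
    # Approach 1: test the input first; an empty list has no maximum.
    if not lengths:
        return 0
    return max(lengths)


def longest_caught(lengths):
    # Approach 2: let max() raise on empty input, then handle the error.
    try:
        return max(lengths)
    except ValueError:
        return 0
```

Either one works, but both take several lines to say "use 0 when there is
nothing to measure".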
Those are valid approaches BUT we have many scenarios, like the one being
solved, where zero is a floor for counting numbers and is also
appropriate as a description of the length of a pattern that does not
exist. In this case, an empty list still means no instances of the pattern
were found, so the longest pattern is 0 units long.
So it is perfectly reasonable to be able to add a default so the function
does not blow up the program:
>>> max([], default=0)
0
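Applied to the problem in this thread, that might look something like the
following. It is only a sketch: it borrows the re.findall() idea from the
message quoted below and swaps in 0 as the default, on the assumption that
"no zeros at all" should count as a longest run of length 0. The default
argument needs Python 3.4 or later.

```python
import re


def consecutive_zeros(string):
    # re.findall() returns every run of one or more "0" characters;
    # default=0 covers a string that contains no such run.
    return max(map(len, re.findall("0+", string)), default=0)


print(consecutive_zeros("1010010001000010000001"))  # 6
print(consecutive_zeros("ham spam"))                # 0
```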
I hasten to add that in many situations this approach is not valid and the
code should blow up unless you take steps to avoid it as further work can be
meaningless.
I am curious how you feel about various other areas such as defaults when
fetching an item from a dictionary. Yes, you can write elaborate code that
prevents a mishap. Realistically, some people would create a slew of short
functions like safe_max() that probably would use one of the techniques
internally and return something more gently, so what is wrong with making a
small change in max() itself that does this upon request?
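To make the comparison concrete, here is a small sketch. The dictionary
contents and the safe_max() helper are hypothetical, written just for this
message; dict.get() and the default argument to max() are the real
built-in ways of asking for a fallback.

```python
run_counts = {"0": 3}

# Plain indexing raises KeyError for a missing key;
# dict.get() returns the supplied default instead.
missing = run_counts.get("1", 0)


def safe_max(values, fallback=0):
    # Hypothetical wrapper of the kind people write for themselves:
    # return the fallback instead of raising on empty input.
    try:
        return max(values)
    except ValueError:
        return fallback


# The built-in keyword gives the same result with no helper at all.
assert safe_max([]) == max([], default=0) == 0
```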
I know max() was not necessarily designed with regular expression matches
(and their length) in mind. But many real world problems share the same
concept albeit not always with a default of 0.
-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Peter Otten
Sent: Tuesday, July 5, 2022 3:55 AM
To: tutor at python.org
Subject: Re: [Tutor] Consecutive_zeros
On 04/07/2022 18:49, avi.e.gross at gmail.com wrote:
> So I went back to my previous somewhat joking suggestion and thought
> of a way of shortening it as the "re" module also has a
> regular-expression version called re.split() that has the nice side
> effect of including an empty string when I ask it to split on all runs of
> non-zero.
>
> re.split("[^0]", "1010010001000010000001")
> ['', '0', '00', '000', '0000', '000000', '']
>
> That '' at the end of the resulting list takes care of the edge
> condition for a string with no zeroes at all:
>
> re.split("[^0]", "No zeroes")
> ['', '', '', '', '', '', '', '', '', '']
>
> Yes, lots of empty string but when you hand something like that to get
> lengths, you get at least one zero which handles always getting a
> number, unlike my earlier offering which needed to check if it matched
> anything.
>
> So here is a tad shorter and perhaps more direct version of the
> requested function which is in effect a one-liner:
>
> def consecutive_zeros(string):
>     return(max([len(zs) for zs in re.split("[^0]", string)]))
Python's developers like to tinker as much as its users -- but they disguise
it as adding syntactic sugar or useful features ;)
One of these features is a default value for max() which allows you to stick
with re.findall() while following the cult of the one-liner:
>>> max(map(len, re.findall("0+", "1010010001000010000001")), default=-1)
6
>>> max(map(len, re.findall("0+", "ham spam")), default=-1)
-1