Mailman 3 Default return values to int and float - Python-ideas


Default return values to int and float

David Townshend

3 Oct 2011

2:52 a.m.

My idea is fairly simple: add a "default" argument to int and float, allowing a return value if the conversion fails. E.g:

...

...
...
>>> float('cannot convert this', default=0.0)
0.0

I think there are many use cases for this: every time float() or int() is called with data that cannot be guaranteed to be numeric, it has to be checked and some sort of default behaviour applied. The above example is just much cleaner than:

    try:
        return float(s)
    except ValueError:
        return 0.0

Any takers?

David
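For comparison, here is a minimal sketch of how the proposed behaviour can be had today with a user-level helper. The name float_or_default is made up for illustration; the built-in float() accepts no default argument.

```python
def float_or_default(value, default=0.0):
    """Hypothetical helper: return float(value), or `default` if parsing fails."""
    try:
        return float(value)
    except ValueError:
        # Only ValueError is caught, mirroring the try/except in the message.
        return default


print(float_or_default("3.5"))
print(float_or_default("cannot convert this"))
```

Note that a non-string, non-numeric input such as None raises TypeError rather than ValueError, so it would still propagate from this sketch.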


Chris Rebert

3 Oct

4:57 a.m.

On Mon, Oct 3, 2011 at 12:52 AM, David Townshend wrote:

...

My idea is fairly simple: add a "default" argument to int and float, allowing a return value if the conversion fails. E.g:

...
...
...
>>> float('cannot convert this', default=0.0)
0.0

I think there are many use cases for this, every time float() or int() are called with data that cannot be guaranteed to be numeric, it has to be checked and some sort of default behaviour applied. The above example is just much cleaner than:

    try:
        return float(s)
    except ValueError:
        return 0.0

Important consideration: Would the default value be typechecked or not? (i.e. Does something like `float(s, {})` raise TypeError?) It's not uncommon to use None as the result value when the input is invalid, but not typechecking would then leave the door open to strangeness like my example. Or would None just be inelegantly special-cased, or...?

This could be one of those instances where Python is better off leaving people to write their own short one-off functions to get the /exact/ behavior desired in their individual circumstances.

Cheers,
Chris

Masklinn

5:41 a.m.

-0 on proposal, no big judgement (although it might cause issues: if `int` and `float` can take a default, why not `dict` or `Decimal` as well?), but

On 2011-10-03, at 11:57, Chris Rebert wrote:

...

Or would None just be inelegantly special-cased, or…?

Why inelegantly? isinstance(default, (cls, types.NoneType)) is pretty elegant and clearly expresses the type constraint, which is an anonymous sum type[0]. The only issue is that Sphinx has no support for sum types for the moment.

[0] http://en.wikipedia.org/wiki/Sum_type
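A rough sketch of what such a type-checked default could look like as a user-level function. The name convert_checked is hypothetical, and type(None) is used in place of types.NoneType because the latter is not available in every Python version.

```python
def convert_checked(cls, value, default=None):
    """Hypothetical helper: convert `value` with `cls`, falling back to `default`.

    The default must be an instance of `cls` or None, as in the isinstance
    check suggested in the message.
    """
    if not isinstance(default, (cls, type(None))):
        raise TypeError(
            "default must be %s or None, not %s"
            % (cls.__name__, type(default).__name__)
        )
    try:
        return cls(value)
    except ValueError:
        return default


print(convert_checked(float, "oops", 0.0))
print(convert_checked(float, "oops"))
```

One consequence of a strict check like this: convert_checked(float, "oops", 0) raises TypeError, because the int 0 is not a float instance.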

Chris Rebert

5:49 a.m.

On Mon, Oct 3, 2011 at 3:41 AM, Masklinn wrote:

...

-0 on proposal, no big judgement (although it might cause issues: if `int` and `float` can take a default, why not `dict` or `Decimal` as well?), but On 2011-10-03, at 11:57 , Chris Rebert wrote:

...
Or would None just be inelegantly special-cased, or…?

Why inelegantly?

There are use-cases (albeit relatively rare) for non-None null values. Special-casing NoneType would exclude those use-cases (which is an entirely reasonable trade-off to choose).

Cheers,
Chris

David Townshend

7:39 a.m.

...

Important consideration: Would the default value be typechecked or not? (i.e. Does something like `float(s, {})` raise TypeError?) It's not uncommon to use None as the result value when the input is invalid, but not typechecking would then leave the door open to strangeness like my example. Or would None just be inelegantly special-cased, or...? This could be one of those instances where Python is better off leaving people to write their own short one-off functions to get the /exact/ behavior desired in their individual circumstances.

I would suggest not checking type for exactly that reason. One-off functions are fine if they are one-off, but in most cases where this behaviour is needed the one-off function is exactly the same in every case.

I don't think David is arguing for the default behavior to change -- merely

...

that you get a dict.get style default. Kinda similar to getattr/2 raising AttributeError, and getattr/3 returning the default value.

Yes, dict.get is exactly the sort of thing I was going for. I think that there are also a few other methods dotted throughout the stdlib that have an optional "default" argument like this, so this isn't really a new idea; it's only new in that it applies to int and float.

pv = float(ln['PV']) if ln['PV'] else None

...

pv = float(ln['PV'], default=None)

I wouldn't implement it this way because of the problems already pointed out. I would use a try statement (as in my first example), which would be more robust, but which cannot be written as a one-liner. If you were to write your example in a series of try statements, it would end up four times longer and much less readable!
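The getattr-style behaviour mentioned in this message (raise when no default is given, return the default when one is) can be sketched with a private sentinel. The names to_float and _MISSING are illustrative assumptions, not part of any proposal in the thread.

```python
_MISSING = object()  # sentinel meaning "the caller passed no default"


def to_float(value, default=_MISSING):
    """Hypothetical helper mirroring getattr(obj, name[, default])."""
    try:
        return float(value)
    except ValueError:
        if default is _MISSING:
            raise  # two-argument getattr style: no default, so propagate
        return default


print(to_float("2.5"))
print(to_float("n/a", None))
```

Using a sentinel rather than None as the "not supplied" marker keeps None available as a legitimate default value.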

Nick Coghlan

12:03 p.m.

On Mon, Oct 3, 2011 at 5:57 AM, Chris Rebert wrote:

...

This could be one of those instances where Python is better off leaving people to write their own short one-off functions to get the /exact/ behavior desired in their individual circumstances.

+1

We get into similar discussions when it comes to higher order itertools. Eventually, there are enough subtle variations that it becomes a better option to let users write their own utility functions.

In this case, there are at least 2 useful variants:

    def convert(target_type, obj, default):
        if obj:
            return target_type(obj)
        return default

    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

Exceptions potentially ignored include TypeError, AttributeError and ValueError. However, you may also want to include logging so that you can go through the logs later to find data that needs cleaning. Except even in Dirkjan's example code, we see cases that don't fit either model (they're comparing against a *particular* value and otherwise just calling float()).

The question is whether there is a useful alternative that is as general purpose as getattr/3 (which ignores AttributeError) and dict.get/2 (which ignores KeyError). However, I think there are too many variations in conversion function APIs and the way they are used for that to be of sufficiently general use - we'd be adding something new that everyone has to learn, but far too much of the time they'd have to do their own conversion anyway.

I'd sooner see a getitem/3 builtin that could be used to ignore any LookupError the way dict.get/3 allows KeyError to be ignored.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
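As an illustration of the logging variant mentioned in this message, here is one possible shape for it. The logger name "conversion" and the choice of ignored exceptions are assumptions made for the sketch.

```python
import logging

logger = logging.getLogger("conversion")  # logger name is an arbitrary choice


def try_convert(target_type, obj, default, ignored=(TypeError, ValueError)):
    """Convert obj with target_type; on an ignored error, log it and return default."""
    try:
        return target_type(obj)
    except ignored:
        # Record the bad input so the data can be cleaned up later.
        logger.warning(
            "could not convert %r to %s; using %r",
            obj, target_type.__name__, default,
        )
        return default


print(try_convert(int, "12", 0))
print(try_convert(int, "abc", 0))
print(try_convert(float, None, 0.0))
```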

Jan Kaliszewski

6:24 p.m.

New subject: getitem(obj, key, default) [was: Default return values to int and float]

Nick Coghlan dixit (2011-10-03, 13:03):

...

I'd sooner see a getitem/3 builtin that could be used to ignore any LookupError the way dict.get/3 allows KeyError to be ignored.

+1. It's probably quite a common case.

Regards.
*j
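A minimal sketch of what a three-argument getitem might look like if written as an ordinary function. This is only an illustration of the idea under discussion; no such builtin exists.

```python
def getitem(obj, key, default=None):
    """Hypothetical helper: obj[key], or default on any LookupError."""
    try:
        return obj[key]
    except LookupError:  # base class of both IndexError and KeyError
        return default


print(getitem([1, 2, 3], 5, "missing"))
print(getitem({"a": 1}, "a"))
```

One caveat of this generic form is that a LookupError raised from inside a custom __getitem__ for an unrelated reason is swallowed as well.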

Raymond Hettinger

9:04 p.m.

New subject: getitem(obj, key, default) [was: Default return values to int and float]

On Oct 3, 2011, at 7:24 PM, Jan Kaliszewski wrote:

...

Nick Coghlan dixit (2011-10-03, 13:03):

...
I'd sooner see a getitem/3 builtin that could be used to ignore any LookupError the way dict.get/3 allows KeyError to be ignored.

+1.

It's probably quite common case.

How many times does this silly idea have to get shot down?

Do you see other languages implementing get defaults on sequences?

Do you see lots of python users implementing this in a util module because it is an important operation?

Can you find examples of real-world code that would be significantly improved with list.get() functionality?

Does this make any semantic sense to users (i.e. they specifically want the i-th item of a sequence but don't even know how long the sequence is)?

Refuse to hypergeneralize dict.get() into a context where it doesn't make sense (it does make sense for mappings, but not for sequences; sequence indices are all about position while mapping keys have a deeper relationship to the corresponding values).

Raymond

Nick Coghlan

9:48 p.m.

New subject: getitem(obj, key, default) [was: Default return values to int and float]

On Mon, Oct 3, 2011 at 10:04 PM, Raymond Hettinger wrote:

...

Does this make any semantic sense to users (i.e. they specifically want the i-th item of a sequence but don't even know how long the sequence is)?

I still occasionally want to do it with sys.argv to implement optional positional arguments before I remind myself to quit messing around reinventing the wheel and just import argparse. But yeah, being a better idea than the conversion function proposals is a far cry from being a good idea :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
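For reference, a short sketch of the argparse route for an optional positional argument. The argument names source and dest and the default "out.txt" are invented for the example.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("source")
# nargs="?" makes the positional optional; default is used when it is omitted.
parser.add_argument("dest", nargs="?", default="out.txt")

args = parser.parse_args(["in.txt"])
print(args.source, args.dest)

args2 = parser.parse_args(["in.txt", "result.txt"])
print(args2.source, args2.dest)
```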

Greg Ewing

11:59 p.m.

New subject: getitem(obj, key, default) [was: Default return values to int and float]

Jan Kaliszewski wrote:

...

Nick Coghlan dixit (2011-10-03, 13:03):

...
I'd sooner see a getitem/3 builtin that could be used to ignore any LookupError the way dict.get/3 allows KeyError to be ignored.

Well, maybe. Some useful properties of dict.get() are that it avoids the overhead of catching an exception, and there is no danger of catching something coming from somewhere other than the indexing operation itself. You wouldn't get those from a generic version. -- Greg

Carl Matthew Johnson

11 p.m.

On Oct 3, 2011, at 7:03 AM, Nick Coghlan wrote:

...

    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

This reminds me of the string.index vs. string.find discussion we had a while back. In basically any situation where an exception can be raised, it's sometimes nice to return a None-like value and sometimes nice to have an out-of-band exception. I have a certain amount of admiration for the pattern in Go of returning (value, error) from most functions that might have an error, but for Python as it is today, there's no One Obvious Way to Do It yet, and there's probably none forthcoming. A slightly more generalized form of try_convert might be useful (see below), but then again, we can't just pack every possible 5 line function into the standard library…

...

    def catch(exception, f, *args, kwargs={}, default=None):
        try:
            return f(*args, **kwargs)
        except exception:
            return default

...

...
...
>>> catch(ValueError, "abc".index, "z", default="Not Found")
'Not Found'
>>> catch(ValueError, float, "zero", default=0.0)
0.0

Steven D'Aprano

4 Oct

6:13 p.m.

Carl Matthew Johnson wrote:

...

This reminds me of the string.index vs. string.find discussion we had a while back. In basically any situation where an exception can be raised, it's sometimes nice to return a None-like value and sometimes nice to have an out-of-band exception. I have a certain amount of admiration for the pattern in Go of returning (value, error) from most functions that might have an error, but for Python as it is today, there's no One Obvious Way to Do It yet, and there's probably none forthcoming.

I beg to differ. Raising an exception *is* the One Obvious Way in Python. But OOW does not mean "Only One Way", and the existence of raise doesn't mean that there can't be a Second Not-So-Obvious Way, such as returning a "not found" value. However, returning None as re.match does is better than returning -1 as str.find does, as -1 can be mistaken for a valid result but None can't be. -- Steven

Guido van Rossum

9:21 p.m.

On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote:

...

Carl Matthew Johnson wrote:

...
This reminds me of the string.index vs. string.find discussion we had a while back. In basically any situation where an exception can be raised, it's sometimes nice to return a None-like value and sometimes nice to have an out-of-band exception. I have a certain amount of admiration for the pattern in Go of returning (value, error) from most functions that might have an error, but for Python as it is today, there's no One Obvious Way to Do It yet, and there's probably none forthcoming.

I beg to differ. Raising an exception *is* the One Obvious Way in Python. But OOW does not mean "Only One Way", and the existence of raise doesn't mean that there can't be a Second Not-So-Obvious Way, such as returning a "not found" value.

However, returning None as re.match does is better than returning -1 as str.find does, as -1 can be mistaken for a valid result but None can't be.

What works for re.match doesn't work for str.find. With re.match, the result when cast to bool is true when there's a match and false when there isn't. That's elegant.

But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good. Hence the -1: the intent was that people should write "if s.find(x) >= 0" -- but clearly that didn't work out either, it's too easy to forget the ">= 0" part. We also have str.index which raises an exception, but people dislike writing try/except blocks. We now have "if x in s" for situations where you don't care where the match occurred, but unfortunately if you need to check whether *and* where a match occurred, your options are str.find (easy to forget the ">= 0" part), str.index (cumbersome to write the try/except block), or "if x in s: i = s.index(x); ..." which looks compact but does a redundant second linear search. (It is also too attractive since it can be used without introducing a variable.)

Other ideas: returning some more structured object than an integer (like re.match does) feels like overkill, and returning an (index, success) tuple is begging for lots of mysterious occurrences of [0] or [1].

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

--
--Guido van Rossum (python.org/~guido)
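The three options listed in this message, and the truthiness trap with str.find, can be put side by side. The strings used here are arbitrary sample data.

```python
s = "hello world"
x = "world"

# Option 1: str.find, remembering the ">= 0" comparison.
i = s.find(x)
if i >= 0:
    print("find:", i)

# Option 2: str.index, with the try/except block.
try:
    j = s.index(x)
except ValueError:
    j = None
print("index:", j)

# Option 3: membership test followed by a second linear search.
k = s.index(x) if x in s else None
print("in + index:", k)

# The trap: a match at position 0 is falsy, and "not found" (-1) is truthy.
match_at_start = bool(s.find("hello"))
no_match = bool(s.find("zzz"))
print(match_at_start, no_match)
```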

Nick Coghlan

9:48 p.m.

On Tue, Oct 4, 2011 at 10:21 PM, Guido van Rossum wrote:

...

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

You're not the only one - there's a reason str.find/index discussions always seem to devolve into attempts to find tolerable expression syntaxes for converting a particular exception type into a default value for the expression :P Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Ron Adam

5 Oct

12:30 a.m.

On Tue, 2011-10-04 at 19:21 -0700, Guido van Rossum wrote:

...

On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote:

...
Carl Matthew Johnson wrote:

...
This reminds me of the string.index vs. string.find discussion we had a while back. In basically any situation where an exception can be raised, it's sometimes nice to return a None-like value and sometimes nice to have an out-of-band exception. I have a certain amount of admiration for the pattern in Go of returning (value, error) from most functions that might have an error, but for Python as it is today, there's no One Obvious Way to Do It yet, and there's probably none forthcoming.

I beg to differ. Raising an exception *is* the One Obvious Way in Python. But OOW does not mean "Only One Way", and the existence of raise doesn't mean that there can't be a Second Not-So-Obvious Way, such as returning a "not found" value.

However, returning None as re.match does is better than returning -1 as str.find does, as -1 can be mistaken for a valid result but None can't be.

What works for re.match doesn't work for str.find. With re.match, the result when cast to bool is true when there's a match and false when there isn't. That's elegant.

But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good. Hence the -1: the intent was that people should write "if s.find(x) >= 0" -- but clearly that didn't work out either, it's too easy to forget the ">= 0" part. We also have str.index which raised an exception, but people dislike writing try/except blocks. We now have "if x in s" for situations where you don't care where the match occurred, but unfortunately if you need to check whether *and* where a match occurred, your options are str.find (easy to forget the ">= 0" part), str.index (cumbersome to write the try/except block), or "if x in s: i = s.index(x); ..." which looks compact but does a redundant second linear search. (It is also too attractive since it can be used without introducing a variable.)

Other ideas: returning some more structured object than an integer (like re.match does) feels like overkill, and returning an (index, success) tuple is begging for lots of mysterious occurrences of [0] or [1].

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

There are also the newer partition and rpartition methods, which I tend to forget about.

I really don't like the '-1' for a not found case. It just gets in the way.

If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception.

...

...
...
>>> s[len(s):]
''

Let's say we didn't have a split method and needed to write one.

If s.find returned len(s) as the not found...

    def split(s, x):
        result = []
        start = 0
        while start < len(s):
            i = s.find(x, start)
            result.append(s[start:i])   # No check needed here.
            start = i + len(x)
        return result

Of course you could use this same pattern for other things.

cheers,
Ron
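Since str.find does not actually return len(s), the idea can only be tried out through a wrapper. The sketch below uses a hypothetical find_or_len helper for that, and also shows str.partition, which reports "not found" as an empty separator. The function names are assumptions made for the illustration.

```python
def find_or_len(s, x, start=0):
    """Hypothetical variant of str.find that returns len(s) when x is absent."""
    i = s.find(x, start)
    return len(s) if i == -1 else i


def split_sketch(s, x):
    """Split s on x using find_or_len, following the loop in the message."""
    if not x:
        raise ValueError("empty separator")  # avoids an endless loop
    result = []
    start = 0
    while start < len(s):
        i = find_or_len(s, x, start)
        result.append(s[start:i])  # no check needed: i is always sliceable
        start = i + len(x)
    return result


print(split_sketch("a,b,c", ","))

# str.partition always returns a 3-tuple; the middle item is '' if x is absent.
print("key=value".partition("="))
print("novalue".partition("="))
```

Unlike str.split, this sketch drops a trailing empty field, so "a,b," yields two items rather than three.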

Matt Joiner

2:02 a.m.

-1 to this idea unless it gives significant performance boosts for one or more of the Python implementations.

On Oct 5, 2011 4:30 PM, "Ron Adam" wrote:

...

On Tue, 2011-10-04 at 19:21 -0700, Guido van Rossum wrote:

...
On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote:

...
Carl Matthew Johnson wrote:

...
This reminds me of the string.index vs. string.find discussion we had a while back. In basically any situation where an exception can be raised, it's sometimes nice to return a None-like value and sometimes nice to have an out-of-band exception. I have a certain amount of admiration for the pattern in Go of returning (value, error) from most functions that might have an error, but for Python as it is today, there's no One Obvious Way to Do It yet, and there's probably none forthcoming.

I beg to differ. Raising an exception *is* the One Obvious Way in Python. But OOW does not mean "Only One Way", and the existence of raise doesn't mean that there can't be a Second Not-So-Obvious Way, such as returning a "not found" value.

However, returning None as re.match does is better than returning -1 as str.find does, as -1 can be mistaken for a valid result but None can't be.

What works for re.match doesn't work for str.find. With re.match, the result when cast to bool is true when there's a match and false when there isn't. That's elegant.

But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good. Hence the -1: the intent was that people should write "if s.find(x) >= 0" -- but clearly that didn't work out either, it's too easy to forget the ">= 0" part. We also have str.index which raised an exception, but people dislike writing try/except blocks. We now have "if x in s" for situations where you don't care where the match occurred, but unfortunately if you need to check whether *and* where a match occurred, your options are str.find (easy to forget the ">= 0" part), str.index (cumbersome to write the try/except block), or "if x in s: i = s.index(x); ..." which looks compact but does a redundant second linear search. (It is also too attractive since it can be used without introducing a variable.)

Other ideas: returning some more structured object than an integer (like re.match does) feels like overkill, and returning an (index, success) tuple is begging for lots of mysterious occurrences of [0] or [1].

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

There is also the newer partition and rpartition methods, which I tend to forget about.

I really don't like the '-1' for a not found case. They just get in the way.

If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception.

...
...
...
s[len(s):] ''

Lets say we didn't have a split method and needed to write one.

If s.find returned len(s) as the not found...

def split(s, x): result = [] start = 0 while start < len(s): i = s.find(x, start) result.append(s[start:i]) # No check needed here. start = i + len(x) return result

Of course you could you this same pattern for other things.

cheers, Ron


Ethan Furman

10:06 a.m.

Ron Adam wrote:

...

I really don't like the '-1' for a not found case. They just get in the way.

If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception.

So every time we want to know if s.find() failed, we have to compare to len(s)? No thanks. ~Ethan~

Ron Adam

1:33 p.m.

On Wed, 2011-10-05 at 08:06 -0700, Ethan Furman wrote:

...

Ron Adam wrote:

...
I really don't like the '-1' for a not found case. They just get in the way.

If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception.

So every time we want to know if s.find() failed, we have to compare to len(s)?

I think probably None would have been better than -1. At least then you will get an error if you try to use it as an index.

The problem with len(s) as a failure is when you consider rfind(). It should return one position before the beginning (-1), which coincidentally it does, but you still can't use it as a slice index because it will give you one from the end instead.

So until we find another way to do negative sequence indexing, it won't work quite as nicely as it should.

Cheers,
Ron
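The slicing problem with a -1 result from rfind can be seen directly in a small example. The sample string is arbitrary.

```python
s = "abc"

i = s.rfind("z")  # "z" is absent, so rfind returns -1
print(i)

# Used as a slice bound, -1 means "one from the end", not "before the start".
tail = s[i:]
head = s[:i]
print(repr(tail), repr(head))
```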

MRAB

1:48 p.m.

On 05/10/2011 19:33, Ron Adam wrote:

...

On Wed, 2011-10-05 at 08:06 -0700, Ethan Furman wrote:

...
Ron Adam wrote:

...
I really don't like the '-1' for a not found case. They just get in the way.

If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception.

So every time we want to know if s.find() failed, we have to compare to len(s)?

I think probably None would have been better than -1. At least then you will get an error if you try to use it as an index.

None will be rejected as an index, but not as part of a slice:

...

...
...
>>> s = "abcdef"
>>> s[None : 4]
'abcd'
>>> s[4 : None]
'ef'

...

The problem with len(s) as a failure, is when you consider rfind(). It should return 1 before the beginning, which coincidentally it does, but you still can't use it as a slice index because it will give you 1 from the end instead.

So until we find a another way to do negative sequence indexing. It won't quite work as nice as it should.

Ethan Furman

12:38 a.m.

Guido van Rossum wrote:

...

On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote:

...
However, returning None as re.match does is better than returning -1 as str.find does, as -1 can be mistaken for a valid result but None can't be.

What works for re.match doesn't work for str.find. With re.match, the result when cast to bool is true when there's a match and false when there isn't. That's elegant.

[snip]

...

But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good.

[snip]

...

Other ideas: returning some more structured object than an integer (like re.match does) feels like overkill

[snip]

...

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

What's correct code worth? My contributions to other Open Source projects are so minor as to not register, but the first bug report/patch I ever submitted was a str.find issue.

A structured object that behaved like an int /except/ for its boolean checks might do the trick here. Something like:

    class FindResult(int):
        def __bool__(self):
            return self != -1

Code that checks for -1 (like it should) will keep working, and code that doesn't will start working.

~Ethan~
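To see how such a class would behave, here is a small runnable sketch. The wrapper function find is hypothetical; the real str.find returns a plain int.

```python
class FindResult(int):
    """An int whose truth value is False only for -1 (the not-found marker)."""

    def __bool__(self):
        return self != -1


def find(s, x):
    """Hypothetical wrapper returning a FindResult instead of a plain int."""
    return FindResult(s.find(x))


hit_at_start = find("abc", "a")
miss = find("abc", "z")

print(bool(hit_at_start), int(hit_at_start))
print(bool(miss), int(miss))
```

Arithmetic on a FindResult yields a plain int, so the special truth value does not survive something like find(s, x) + 1.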

Greg Ewing

1:08 a.m.

Guido van Rossum wrote:

...

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

Could a with-statement be used here somehow?

    with finding(x, s) as i:
        ...

--
Greg

Ron Adam

7:42 a.m.

On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...

Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

    for i in finding(x, s):
        ...
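The iterator variant can be sketched as a generator. The thread does not define finding, so the body below is an assumption: it takes the substring x first and the string s second, matching the call shown in the message.

```python
def finding(x, s):
    """Yield the index of each occurrence of substring x in string s."""
    start = 0
    while True:
        i = s.find(x, start)
        if i == -1:
            return  # no more matches: the generator simply ends
        yield i
        start = i + 1  # advance by one so overlapping matches are reported


for i in finding("n", "banana"):
    print(i)

# With no match the loop body never runs; next() with a default shows the same.
first = next(finding("z", "banana"), None)
print(first)
```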

Ethan Furman

9:31 a.m.

Ron Adam wrote:

...

On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...
Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count. Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

for i in finding(x, s): ...

How would the case of not found be handled in either of these proposals?

    with finding(x, s) as i:
        ...
    if not i:  # same problem as str.find, unless i is not a simple int

    for i in finding(x, s):
        ...
    else:
        # unless for loop has a break, this will happen...
        # not a problem until you want more than just the first
        # occurrence of s in x

~Ethan~

Nick Coghlan

12:25 p.m.

On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote:

...

Ron Adam wrote:

...
On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...
Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

for i in finding(x, s): ...

How would the case of not found be handled in either of these proposals?

By never executing the body of the loop. It's still a thoroughly unnatural API for the 0 or 1 case, though. -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)

Ethan Furman

12:56 p.m.

Nick Coghlan wrote:

...

On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote:

...
Ron Adam wrote:

...
On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...
Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

for i in finding(x, s): ...

How would the case of not found be handled in either of these proposals?

By never executing the body of the loop. It's still a thoroughly unnatural API for the 0 or 1 case, though.

Let me rephrase:

    found = "I don't want to get into the cart!".find('z')
    if found >= 0:
        # do stuff if found
    else:
        # do stuff if not found

or

    found = "I don't want to get into the cart!".find('n')
    while found >= 0:
        # do stuff if found
        found = "I don't want to get into the cart!".find('n', found+1)
        if found == -1:
            break
    else:
        print('false branch')
        # do stuff if not found

How would we reliably get the false branch with the above proposals?

~Ethan~

MRAB

1:42 p.m.

On 05/10/2011 18:56, Ethan Furman wrote:

...

Nick Coghlan wrote:

...
On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote:

...
Ron Adam wrote:

...
On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...
Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

for i in finding(x, s): ...

How would the case of not found be handled in either of these proposals?

By never executing the body of the loop. It's still a thoroughly unnatural API for the 0 or 1 case, though.

Let me rephrase:

found = "I don't want to get into the cart!".find('z') if found >= 0: # do stuff if found else: # do stuff if not found

or

found = "I don't want to get into the cart!".find('n') while found >= 0: # do stuff if found found = "I don't want to get into the cart!".find('n', found+1) if found == -1: break else: print('false branch') # do stuff if not found

How would we reliably get the false branch with the above proposals?

We've had the discussion before about how to handle the case when the body of the loop isn't executed at all.

I had the thought that a possible syntax could be:

    found = "I don't want to get into the cart!".find('n')
    while found >= 0:
        # do stuff if found
        found = "I don't want to get into the cart!".find('n', found+1)
    or:
        print('false branch')
        # do stuff if not found

but I think I'll leave it there.
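Without new syntax, one way to get a reliable "not found" branch in current Python is to collect the match positions first and branch on whether any were found. The sketch below uses re.finditer from the standard library; the function name report is made up for the example.

```python
import re

text = "I don't want to get into the cart!"


def report(x, s):
    """Return a description of where x occurs in s, or a not-found message."""
    # re.escape makes x match literally; finditer yields non-overlapping matches.
    positions = [m.start() for m in re.finditer(re.escape(x), s)]
    if positions:
        return "found at %s" % positions
    return "false branch"


print(report("n", text))
print(report("z", text))
```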

Ron Adam

12:58 p.m.

On Wed, 2011-10-05 at 07:31 -0700, Ethan Furman wrote:

...

Ron Adam wrote:

...
On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote:

...
Guido van Rossum wrote:

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count. Could a with-statement be used here somehow?

with finding(x, s) as i: ...

Or an iterator.

for i in finding(x, s): ...

How would the case of not found be handled in either of these proposals?

with finding(x, s) as i: ... if not i: # same problem as str.find, unless i is not a simple int

I'll let Nick answer this one because I'm not sure about it.

...

for i in finding(x, s): ... else: # unless for loop has a break, this will happen... # not a problem until you want more than just the first # occurrence of s in x

    for i in finding(x, s):
        if i > 25:
            break
        <build or get a result>
    else:
        return result
    raise ValueError("string 's' had an 'x' after position 25")

Cheers,
Ron

Ethan Furman

1:07 p.m.

Ron Adam wrote:

...

On Wed, 2011-10-05 at 07:31 -0700, Ethan Furman wrote:

...
for i in finding(x, s): ... else: # unless for loop has a break, this will happen... # not a problem until you want more than just the first # occurrence of s in x

for i in finding(x, s): if i > 25: break <build or get a result> else: return result raise(ValueError("string 's' had an 'x' after position 25")

And how did you decide on the magical number 25? ~Ethan~

Ron Adam

1:35 p.m.

On Wed, 2011-10-05 at 11:07 -0700, Ethan Furman wrote:

...

Ron Adam wrote:

...
On Wed, 2011-10-05 at 07:31 -0700, Ethan Furman wrote:

...
for i in finding(x, s): ... else: # unless for loop has a break, this will happen... # not a problem until you want more than just the first # occurrence of s in x

for i in finding(x, s): if i > 25: break <build or get a result> else: return result raise(ValueError("string 's' had an 'x' after position 25")

And how did you decide on the magical number 25?

I knew I should have used 42. ;-) Cheers, Ron

Terry Reedy

4:17 p.m.

On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...

But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good.

People would have to test that the result 'is None' or 'is not None'. That is no worse than testing '== -1' or '>= 0'. I claim it is better because continuing to compute with None as if it were a number will more likely quickly raise an error, whereas doing so with a '-1' that looks like a legitimate string position (the last), but does not really mean that, might never raise an error but lead to erroneous output. (I said 'more likely' because None is valid in slicings, same as -1.)

Example: define char_before(s,c) as returning the character before the first occurrence of c in s. Ignoring the s.startswith(c) case:

...

...
...
    >>> s='abcde'
    >>> s[s.find('e')-1]
    'd' # Great, it works
    >>> s[s.find('f')-1]
    'd' # Whoops, not so great.

s[None] fails, as it should.

You usually try to avoid such easy bug bait. I cannot think of any other built-in function that returns such a valid but invalid result.
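The pitfall described above can be reproduced with a small sketch. The two helper names below are made up for illustration (they do not appear in the thread); only str.find and str.index are real methods.

```python
def char_before_find(s, c):
    """char_before built on str.find (ignores the s.startswith(c) case)."""
    # When c is absent, find() returns -1, so this silently indexes s[-2].
    return s[s.find(c) - 1]


def char_before_index(s, c):
    """Same idea built on str.index, which raises ValueError when c is absent."""
    return s[s.index(c) - 1]


s = 'abcde'
print(char_before_find(s, 'e'))   # 'd', the intended answer
print(char_before_find(s, 'f'))   # also 'd', although 'f' is not in s

try:
    char_before_index(s, 'f')
except ValueError as exc:
    print('not found:', exc)
```

The str.index version fails loudly on the missing character, while the str.find version returns a plausible-looking but meaningless result.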

...

Hence the -1: the intent was that people should write "if s.find(x)>= 0" -- but clearly that didn't work out either, it's too easy to forget the ">= 0" part.

As easy or easier than forgetting '== None'

...

We also have str.index which raised an exception, but people dislike writing try/except blocks.

Given that try/except blocks are routinely used for flow control in Python, and that some experts even advocate using them over if/else (leap first), I am tempted to ask why such people are using Python. I am curious, though, why this exception is more objectionable than all the others -- and why you apparently give such objections for this function more weight than for others. One could justify out-of-range IndexError on the basis that an in-range indexing could return any object, including None, so that the signal *must* not be a normal return (even of an exception object). However, Python comes with numerous, probably 100s of functions with restricted output ranges that raise exceptions (TypeError, ValueError, AttributeError, etc) instead of returning, for instance, None. For example, consider int('a'): why not None instead of ValueError? One reason is that s[:int('a')] would then return s instead of raising an error. I strongly suspect that if we did not have str.find now, we would not add it, and certainly not in its current form.

...

I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

So let's deprecate it for eventual removal, maybe in Py4. -- Terry Jan Reedy

Jim Jewett

6 Oct 6 Oct

3:32 p.m.

On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...

On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...

Given that try/except blocks are routinely used for flow control in Python, and that some experts even advocate using them over if/else (leap first), I am tempted to ask why such people are using Python. I am curious, though, why this exception is more objectionable than all the others

str.index is a "little" method that it is tempting to use inline, even as part of a comprehension. There isn't a good way to handle exceptions without a full statement. -jJ

Terry Reedy

5:04 p.m.

On 10/6/2011 4:32 PM, Jim Jewett wrote:

...

On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...
Given that try/except blocks are routinely used for flow control in Python, and that some experts even advocate using them over if/else (leap first), I am tempted to ask why such people are using Python. I am curious, though, why this exception is more objectionable than all the others

str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

That is an argument *for* raising an exception on error. If one uses .find or .index in a situation where success is certain, then it does not matter what would happen on failure. If failure is possible, and users are tempted to skip checking, then the interpreter should raise a fuss (exception), as it does for other 'little' methods like arithmetic and subscript operations. a.find(b) can raise an AttributeError or TypeError, so returning -1 instead of raising ValueError only partly avoids possible exceptions.

...

There isn't a good way to handle exceptions without a full statement.

Neither is there a good way to handle error return values without a full statement. The try/except form may require fewer lines than the if/else form. -- Terry Jan Reedy

Jim Jewett

7 Oct 7 Oct

2:18 p.m.

On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:

...

On 10/6/2011 4:32 PM, Jim Jewett wrote:

...
On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...

...
...
...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...

...
...
... try/except blocks are routinely used for flow control in Python ... even advocate using them over if/else (leap first)

...

...
str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

...

That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

When I want an inline catch, it always meets the following criteria:

(a) The "exception" is actually expected, at least occasionally.
(b) The exception is caused by (bad/missing/irrelevant...) input -- nothing is wrong with my computational environment.
(c) I do NOT need extra user input; I already know what to do with it. Typically, I just filter it out, though I may replace it with a placeholder and/or echo it to another output stream instead.
(d) The algorithm SHOULD continue to process the remaining (mostly good) data.

Sometimes, the "bad" data is itself in a known format (like a "." instead of a number); but ... not always.

-jJ
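A minimal sketch of the kind of processing criteria (a) to (d) describe, written with today's statement-level try/except. The function name, the placeholder argument and the log stream are assumptions made for illustration; nothing here is an API proposed in the thread.

```python
import sys


def parse_numbers(fields, placeholder=None, log=sys.stderr):
    """Convert each field to float, substituting placeholder for bad input."""
    result = []
    for field in fields:
        try:
            result.append(float(field))
        except ValueError:
            # Expected now and then: the input is bad, the environment is fine.
            # Echo the bad field to another stream and keep going.
            print('skipping bad field: %r' % (field,), file=log)
            result.append(placeholder)
    return result


print(parse_numbers(['1.5', '.', '3', 'n/a']))
```

The loop keeps processing the remaining good data, which is the behaviour the list above asks for; what it cannot do is fit inside a comprehension.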

Michael Foord

4:54 p.m.

On 7 October 2011 20:18, Jim Jewett wrote:

...

On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:

...
On 10/6/2011 4:32 PM, Jim Jewett wrote:

...
On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
...
...
...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...
...
...
... try/except blocks are routinely used for flow control in Python ... even advocate using them over if/else (leap first)

...
...
str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

...
That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

When I want an inline catch, it always meets the following criteria:

(a) The "exception" is actually expected, at least occasionally. (b) The exception is caused by (bad/missing/irrelevant...) input -- nothing is wrong with my computational environment. (c) I do NOT need extra user input; I already know what to do with it.

Typically, I just filter it out, though I may replace it with a placeholder and/or echo it to another output stream instead.

(d) The algorithm SHOULD continue to process the remaining (mostly good) data.

Sometimes, the "bad" data is itself in a known format (like a "." instead of a number); but ... not always.

Yeah, I've quite often worked on data sets where you just need to process what you can and ignore (or replace with placeholders) what you can't. Michael Foord

...

-jJ _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas

-- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html

Terry Reedy

8:11 p.m.

On 10/7/2011 3:18 PM, Jim Jewett wrote:

...

On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:

...
On 10/6/2011 4:32 PM, Jim Jewett wrote:

...
On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
...
...
...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...
...
...
... try/except blocks are routinely used for flow control in Python ... even advocate using them over if/else (leap first)

...
...
str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

...
That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

There is nothing specific to finding the index of a substring in a string in the above statement. If one has a collection of strings, some of which represent ints and some not, the failure of int(item) on some of them is not unexpected.

...

When I want an inline catch, it always meets the following criteria:

Please define 'inline catch' and show how to do it with str.find 'inline' without calling the function twice.

...

(a) The "exception" is actually expected, at least occasionally. (b) The exception is caused by (bad/missing/irrelevant...) input -- nothing is wrong with my computational environment. (c) I do NOT need extra user input; I already know what to do with it.

Typically, I just filter it out, though I may replace it with a placeholder and/or echo it to another output stream instead.

(d) The algorithm SHOULD continue to process the remaining (mostly good) data.

Again, nothing specific to why finding a substring index should be rather unique in having near-duplicate functions, one of which is clearly badly designed by using an inappropriate unix/c-ism. -- Terry Jan Reedy

Ron Adam

10:52 p.m.

On Fri, 2011-10-07 at 15:18 -0400, Jim Jewett wrote:

...

On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:

...
On 10/6/2011 4:32 PM, Jim Jewett wrote:

...
On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
...
...
...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...
...
...
... try/except blocks are routinely used for flow control in Python ... even advocate using them over if/else (leap first)

...
...
str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

...
That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

When I want an inline catch, it always meets the following criteria:

(a) The "exception" is actually expected, at least occasionally.

Sometimes I feel exceptions are overly general. Ok, so I got a ValueError exception from some block of code... But is it the one I expected, or is it one from a routine in a library I imported and wasn't caught or handled correctly. (i.e., my routine called a function in another module someone else wrote.)

One answer to that is to put the try except around the fewest lines of code possible so that it doesn't catch exceptions that aren't related to some specific condition. That leads to possibly quite a few more try-except blocks, and possibly more nested try-except blocks. At some point, it may start to seem like it's a better idea to avoid them rather than use them.

What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

    try:
        ...
        i = s.index('bar')
        ...
    except ValueError from s.index as exc:
        <handle s.index ValueError>

In this case, the only ValueError the except will catch is one originating in s.index.

So instead of creating more exception types to handle ever increasing circumstances, we increase the ability to detect them depending on the context.

So then I can put a larger block of code inside a try-except and put as many excepts on after the try block to detect various exceptions of the same type, (or different types), raised from possibly different sub parts within that block of code. And if need be, let them bubble out, or handle them.

Just a thought...

Cheers, Ron
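For comparison, the "fewest lines possible" approach mentioned above can be sketched with the existing try/except/else form. This is not the proposed "except ... from" syntax; the function name and arguments are hypothetical.

```python
def value_after(s, marker, convert):
    """Return convert(text following marker), or None if marker is absent."""
    try:
        i = s.index(marker)      # the only call whose ValueError is expected
    except ValueError:
        return None              # marker not found
    else:
        # Runs only when s.index succeeded. A ValueError raised by convert()
        # is not caught by the except clause above and propagates normally.
        return convert(s[i + len(marker):])


print(value_after('id=42', 'id=', int))
print(value_after('no marker here', 'id=', int))

try:
    value_after('id=abc', 'id=', int)
except ValueError as exc:
    print('propagated from int():', exc)
```

The else clause keeps the guarded region down to the single s.index call, at the cost of one try statement per call that needs guarding.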

David Townshend

8 Oct 8 Oct

5:28 a.m.

Many of these issues might be solved by providing a one line alternative to the rather unwieldy try statement:

    try:
        return function()
    except ValueError:
        return default

I can't settle on a good syntax though. Two suggestions:

    return function() except(ValueError) default

    return default if except(ValueError) else function()

The second is more readable, but seems a bit backwards, like it's handling the exception before it occurs. Is this idea worth pursuing if we can find the right syntax?

On Oct 8, 2011 5:52 AM, "Ron Adam" wrote:

...

On Fri, 2011-10-07 at 15:18 -0400, Jim Jewett wrote:

...
On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:

...
On 10/6/2011 4:32 PM, Jim Jewett wrote:

...
On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:

...
On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
...
...
...
We also have str.index which raised an exception, but people dislike writing try/except blocks.

...
...
...
... try/except blocks are routinely used for flow control in Python ... even advocate using them over if/else (leap first)

...
...
str.index is a "little" method that it is tempting to use inline, even as part of a comprehension.

...
That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

When I want an inline catch, it always meets the following criteria:

(a) The "exception" is actually expected, at least occasionally.

Sometime I feel exceptions are overly general. Ok, so I got a ValueError exception from some block of code... But is it the one I expected, or is it one from a routine in a library I imported and wasn't caught or handled correctly. (ie.. my routine called a function in another module someone else wrote.)

One answer to that is to put the try except around the fewest lines of code possible so that it doesn't catch exceptions that aren't related to some specific condition. That leads to possibly quite a few more try-except blocks, and possibly more nested try-except blocks. At some point, it may start to seem like it's a better idea to avoid them rather than use them.

What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

try: ... i = s.index('bar') ... except ValueError from s.index as exc: <handle s.index ValueError>

In this case, the only ValueError the except will catch is one originating in s.index.

So instead of creating more exception types to handle ever increasing circumstances, we increase the ability to detect them depending on the context.

So then I can put a larger block of code inside a try-except and put as many excepts on after the try block to detect various exceptions of the same type, (or different types), raised from possibly different sub parts within that block of code. And if need be, let them bubble out, or handle them.

Just a thought...

Cheers, Ron


Steven D'Aprano

6:23 a.m.

Ron Adam wrote:

...

Sometime I feel exceptions are overly general. Ok, so I got a ValueError exception from some block of code... But is it the one I expected, or is it one from a routine in a library I imported and wasn't caught or handled correctly. (ie.. my routine called a function in another module someone else wrote.)

I can't say I've very often cared where the exception comes from. But that's because I generally wrap the smallest amount of code possible in an try...except block, so there's only a limited number of places it could come from.

...

One answer to that is to put the try except around the fewest lines of code possible so that it doesn't catch exceptions that aren't related to some specific condition.

Exactly.

...

That leads to possibly quite a few more try-except blocks, and possibly more nested try-except blocks. At some point, it may start to seem like it's a better idea to avoid them rather than use them.

Or refactor parts of your code into a function.

...

What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

try: ... i = s.index('bar') ... except ValueError from s.index as exc: <handle s.index ValueError>

I can't imagine that this would even be *possible*, but even if it is, I would say it certainly isn't *desirable*.

(1) You're repeating code you expect to fail: you write s.index twice, even though it only gets called once.

(2) The semantics are messy and unclear. Suppose you have this:

    try:
        ...
        i = s.index(a)
        j = s.index(b) + s.index(c)
        t = s
        k = t.index(d)
        method = s.index
        l = method(e)
        ...
    except ValueError from s.index:
        ...

Which potential s.index exceptions will get caught? All of them? Some of them? Only i and j? What if you want to catch only some but not others?

How will this construct apply if s or s.index is rebound, or deleted, inside the try block?

    s += "spam"
    m = s.index(f)

What about this?

    alist = ['spam', 'ham', s, 'eggs']
    results = [x.index('cheese') for x in alist]

Should your proposal catch an exception in the list comp?

What if you call a function which happens to call s.index? Will that be caught?

    x = some_function(spam, ham, s, eggs)  # happens to call s.index

-- Steven

Jan Kaliszewski

3:17 p.m.

Steven D'Aprano dixit (2011-10-08, 22:23):

...

Ron Adam wrote: [snip]

...
What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

try: ... i = s.index('bar') ... except ValueError from s.index as exc: <handle s.index ValueError>

I can't imagine that this would even be *possible*, but even if it is, I would say it certainly isn't *desirable*.

(1) You're repeating code you expect to fail: you write s.index twice, even though it only gets called once.

(2) The semantics are messy and unclear. Suppose you have this:

try: ... i = s.index(a) j = s.index(b) + s.index(c) t = s k = t.index(d) method = s.index l = method(e) ... except ValueError from s.index: ...

Which potential s.index exceptions will get caught? All of them? Some of them? Only i and j? What if you want to catch only some but not others? [snip]

Maybe labeling interesting lines of code could be more suitable? E.g.:

    try:
        ...
        'risky one' i = s.index('bar')
        ...
    except ValueError from 'risky one' as exc:

Cheers. *j

Ron Adam

9 Oct 9 Oct

1:31 a.m.

On Sat, 2011-10-08 at 22:23 +1100, Steven D'Aprano wrote:

...

Ron Adam wrote:

...

...
That leads to possibly quite a few more try-except blocks, and possibly more nested try-except blocks. At some point, it may start to seem like it's a better idea to avoid them rather than use them.

Or refactor parts of your code into a function.

Of course refactoring a bit of code so as to be sensitive to the context would help, but it's not always as straightforward as it seems. It's not a tool you would use everywhere.

...

...
What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

try: ... i = s.index('bar') ... except ValueError from s.index as exc: <handle s.index ValueError>

I can't imagine that this would even be *possible*, but even if it is, I would say it certainly isn't *desirable*.

(1) You're repeating code you expect to fail: you write s.index twice, even though it only gets called once.

Right, and if all you are interested in is just that, you would just wrap that part in regular try except and not do it this way.

...

(2) The semantics are messy and unclear. Suppose you have this:

try: ... i = s.index(a) j = s.index(b) + s.index(c) t = s k = t.index(d) method = s.index l = method(e) ... except ValueError from s.index: ...

Which potential s.index exceptions will get caught? All of them? Some of them? Only i and j? What if you want to catch only some but not others?

Any of them you put inside the try-except block. But it would not catch a ValueError caused by some other function or method in the '...' part of the example.

...

How will this construct apply if s or s.index is rebound, or deleted, inside the try block?

s += "spam" m = s.index(f)

The rebinding of the name doesn't matter as it is an object comparison. It may work more like...

    try:
        ...
        s += "spam"
        m = s.index(f)
        ...
    except ValueError as e:
        if e.__cause__ is str.index:
            ...
        raise

...

What about this?

alist = ['spam', 'ham', s, 'eggs'] results = [x.index('cheese') for x in alist]

Should your proposal catch an exception in the list comp?

Sure, why not?

...

What if you call a function which happens to call s.index? Will that be caught?

x = some_function(spam, ham, s, eggs) # happens to call s.index

The str.index method would be the source of the exception unless some_function() catches it. It could raise a new exception and then it would be reported as the cause. Cheers, Ron

Steven D'Aprano

7 Oct 7 Oct

8:45 p.m.

Terry Reedy wrote:

...

On 10/4/2011 10:21 PM, Guido van Rossum wrote:

...
But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good.

People would have to test that the result 'is None' or 'is not None'. That is no worse than testing '== -1' or '>= 0'. I claim it is better because continuing to compute with None as if it were a number will more likely quickly raise an error, whereas doing so with a '-1' that looks like a legitimate string position (the last), but does not really mean that, might never raise an error but lead to erroneous output. (I said 'more likely' because None is valid in slicings, same as -1.)

Agreed. But... [...]

...

...
I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.

So let's deprecate it for eventual removal, maybe in Py4.

Let's not. Although I stand by my earlier claim that "raise an exception" is the Obvious Way to deal with error conditions in Python, for some reason that logic doesn't seem to apply to searching. Perhaps because "not found" is not an error, it just feels uncomfortable, to many people. Whenever I use list.index, I always find myself missing list.find.

Perhaps it is because using try...except requires more effort. It just feels wrong to write (for example):

    try:
        n = s.index('spam')
    except ValueError:
        pass
    else:
        s = s[n:]

instead of:

    n = s.find('spam')
    if n >= 0:
        s = s[n:]

This is especially a factor when using the interactive interpreter. (I also wonder about the performance hit of catching an exception vs. testing the return code. In a tight loop, catching the exceptions may be costly.)

I don't think there is any perfect solution here, but allowing people the choice between index and find seems like a good plan to me. Using -1 as the not-found sentinel seems to be a mistake though, None would have been better. That None is valid in slices is actually a point in its favour for at least two use-cases:

    # extract everything after the substring (inclusive)
    # or the entire string if not found
    n = s.find('spam')
    substr = s[n:]

    # extract everything before the substring (exclusive)
    # or the entire string if not found
    n = s.find('spam')
    substr = s[:n]

There are other cases, of course, but using None instead of -1 will generally give you an exception pretty quickly instead of silently doing the wrong thing.

-- Steven
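The two slicing use-cases can be tried out with a small wrapper. find_or_none is a hypothetical helper standing in for a str.find that returns None; the real str.find still returns -1.

```python
def find_or_none(s, sub):
    """Hypothetical str.find variant: None instead of -1 when sub is absent."""
    n = s.find(sub)
    return None if n == -1 else n


def from_substring(s, sub):
    """Everything from sub onwards (inclusive), or all of s if sub is absent."""
    return s[find_or_none(s, sub):]


def before_substring(s, sub):
    """Everything before sub (exclusive), or all of s if sub is absent."""
    return s[:find_or_none(s, sub)]


print(from_substring('eggs and spam', 'spam'))
print(from_substring('eggs', 'spam'))      # whole string: s[None:] is s[:]
print(before_substring('eggs', 'spam'))    # whole string: s[:None] is s[:]

# With the real -1 sentinel the "not found" slice is silently wrong:
print('eggs'['eggs'.find('spam'):])        # just the last character

# Arithmetic on the None result fails at once instead of mis-indexing:
try:
    'abcde'[find_or_none('abcde', 'f') - 1]
except TypeError as exc:
    print('TypeError:', exc)
```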

Jan Kaliszewski

6 Oct 6 Oct

10:47 a.m.

Guido van Rossum dixit (2011-10-04, 19:21):

...

Other ideas: returning some more structured object than an integer (like re.match does) feels like overkill, and returning an (index, success) tuple is begging for lots of mysterious occurrences of [0] or [1].

A lightweight builtin type whose instances would have `index` attribute might do the job well (together with None as not-found). A naive pure-Python implementation:

    class Found(object):
        __slots__ = 'index',
        def __init__(self, index):
            self.index = index

Example usage:

    found = s.find('foo')
    if found:  # or more explicit: `if found is not None:`
        print('foo found at %d' % found.index)
    else:  # found is None
        print('foo not found')

Of course that would be probably a new method, not str.find(), say: str.search(). Then it could be possible to make it a bit more universal, accepting substring tuples (as startswith/endswith already do):

Example usage:

    one_of = 'foo', 'bar', 'baz'
    found = s.search(one_of)
    if found:
        print('%s found at %d' % (found.substring, found.index))
    else:
        print('None of %s found' % (one_of,))

The 4th line could be respelled as:

    index, substring = found
    print('%s found at %d' % (substring, index))

A naive implementation of s.search() result type:

    class Found(object):
        __slots__ = 'index', 'substring'
        def __init__(self, index, substring):
            self.index = index
            self.substring = substring
        def __iter__(self):
            yield self.index
            yield self.substring

Cheers, *j
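A runnable sketch of the idea as a free function, since str has no search() method. Returning the earliest match when several substrings are present is an assumption of this sketch; the message above does not say which match should win.

```python
class Found(object):
    """Result object: where a substring was found, and which one."""
    __slots__ = ('index', 'substring')

    def __init__(self, index, substring):
        self.index = index
        self.substring = substring

    def __iter__(self):
        # Allows unpacking: index, substring = found
        yield self.index
        yield self.substring


def search(s, subs):
    """Return a Found for the earliest match of any of subs in s, else None."""
    if isinstance(subs, str):
        subs = (subs,)
    best = None
    for sub in subs:
        i = s.find(sub)
        if i != -1 and (best is None or i < best.index):
            best = Found(i, sub)
    return best


one_of = 'foo', 'bar', 'baz'
found = search('spam bar foo', one_of)
if found:
    index, substring = found
    print('%s found at %d' % (substring, index))
else:
    print('None of %s found' % (one_of,))
```

A match at position 0 still yields a truthy Found instance, so `if found:` does not confuse "found at the start" with "not found".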

Ron Adam

12:42 p.m.

On Thu, 2011-10-06 at 17:47 +0200, Jan Kaliszewski wrote:

...

A lightweight builtin type whose instances would have `index` attribute might do the job well (together with None as not-found).

A naive pure-Python implementation:

class Found(object): __slots__ = 'index', def __init__(self, index): self.index = index

...

... found = s.search(one_of) ...

It seems to me, the methods on the string object should be the lower level fast C methods that allow for efficient higher level functions to be built. A search class may be a good addition to string.py. It already has higher order stuff for composing strings, format and template, but nothing for decomposing strings. Cheers, Ron

Jan Kaliszewski

4:41 p.m.

Ron Adam dixit (2011-10-06, 12:42):

...

It seems to me, the methods on the string object should be the lower level fast C methods that allow for efficient higher level functions to be built.

I suggest to make it as a C method and type. The Python implementation I gave is an illustration only ("naive implementation"). Cheers. *j

David Townshend

4 Oct 4 Oct

1:58 a.m.

...

    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

The problem with a general convert function is that to make it work, you would need to account for several variations and the signature gets rather clunky. Personally, I think that the try format:

    try:
        return float('some text')
    except ValueError:
        return 42

is more readable than

    try_convert('some text', float, 42, (ValueError,))

because it is clear what it does. The second form is shorter, but not as descriptive. However,

    float('some text', default=42)

follows the existing syntax quite nicely, and is more readable than either of the other options.

A generalised try_convert method would be useful, but I think I would rather see a one-line version of the try statements, perhaps something like this:

    x = try float('some text') else 42 if ValueError
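For reference, the quoted try_convert helper runs as an ordinary function once written out. This sketch keeps the quoted signature; note that float('some text', default=42) is the proposal under discussion and is not accepted by the real float().

```python
def try_convert(target_type, obj, default, ignored=(TypeError,)):
    """Return target_type(obj), or default if one of the ignored errors occurs."""
    try:
        return target_type(obj)
    except ignored:
        return default


# ValueError has to be requested explicitly; the quoted default only
# ignores TypeError.
print(try_convert(float, 'some text', 42, ignored=ValueError))
print(try_convert(float, '1.5', 42, ignored=ValueError))
print(try_convert(int, None, 0))

try:
    try_convert(float, 'some text', 42)
except ValueError as exc:
    print('not ignored by default:', exc)
```

The need to spell out which exceptions to ignore is part of what makes the signature feel clunky next to a plain try statement.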

Masklinn

2:37 a.m.

On 2011-10-04, at 08:58 , David Townshend wrote:

...

...
    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

The problem with a general convert function is that to make it work, you would need to account for several variations and the signature gets rather clunky. Personally, I think that the try format:

try: return float('some text') except ValueError: return 42

is more readable than

try_convert('some text', float, 42, (ValueError,))

because it is clear what it does. The second form is shorter, but not as descriptive. However,

float('some text', default=42)

follows the existing syntax quite nicely, and is more readable than either of the other options.

A generalised try_convert method would be useful, but I think I would rather see a one-line version of the try statements, perhaps something like this:

x = try float('some text') else 42 if ValueError

That's basically what the function you've rejected does (you got the arguments order wrong):

x = try_convert(float, 'some text', default=42, ignored=ValueError)

Just rename an argument or two and you have the exact same thing.

David Townshend

3:41 a.m.

On Tue, Oct 4, 2011 at 9:37 AM, Masklinn wrote:

...

On 2011-10-04, at 08:58 , David Townshend wrote:

...
...
    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

The problem with a general convert function is that to make it work, you would need to account for several variations and the signature gets rather clunky. Personally, I think that the try format:

try: return float('some text') except ValueError: return 42

is more readable than

try_convert('some text', float, 42, (ValueError,))

because it is clear what it does. The second form is shorter, but not as descriptive. However,

float('some text', default=42)

follows the existing syntax quite nicely, and is more readable than either of the other options.

A generalised try_convert method would be useful, but I think I would rather see a one-line version of the try statements, perhaps something like this:

x = try float('some text') else 42 if ValueError

That's basically what the function you've rejected does (you got the arguments order wrong):

x = try_convert(float, 'some text', default=42, ignored=ValueError)

Just rename an argument or two and you have the exact same thing.

Same functionality, but try_convert is a function with lots of arguments whereas my alternative is an expression. But to be honest, I don't really like either. In cases that require the level of control that try_convert provides, the try statement is cleaner. The point I'm really trying to make is that my initial proposal was for a specific but common use case (float and int), not a general-purpose conversion tool.

Paul Moore

3:51 a.m.

On 4 October 2011 09:41, David Townshend wrote:

...

Same functionality, but try_convert is a function with lots of arguments whereas my alternative is an expression. But to be honest, I don't really like either. In cases that require the level of control that try_convert provides, the try statement is cleaner. The point I'm really trying to make is that my initial proposal was for a specific but common use case (float and int), not a general-purpose conversion tool.

I think the point you're missing is that most people here don't see using a default in place of garbage input (as opposed to just for empty input) as a "common" use case. Certainly not common enough to warrant a language change rather than a private utility function... Paul.

Bruce Leban

9:55 p.m.

On Mon, Oct 3, 2011 at 11:58 PM, David Townshend wrote:

...

A generalised try_convert method would be useful, but I think I would rather see a one-line version of the try statements, perhaps something like this:

x = try float('some text') else 42 if ValueError

for parallelism with if/else operator I'd like

    float('some text') except ValueError then 42

which is equivalent to calling:

    def f():
        try:
            return float('some text')
        except ValueError:
            return 42

For example,

    float(foo) except ValueError then None if foo else 0

or equivalently:

    float(foo) if foo else 0 except ValueError then None

Of course this requires a new keyword so the chances of this being added are slim.

--- Bruce
w00t! Gruyere security codelab graduated from Google Labs! http://j.mp/googlelabs-gruyere Not too late to give it a 5-star rating if you like it. :-)

Greg Ewing

3 Oct 3 Oct

5:40 a.m.

David Townshend wrote:

...

My idea is fairly simple: add a "default" argument to int and float, allowing a return value if the conversion fails. E.g:

...
...
...
float('cannot convert this', default=0.0)

I think I'd be more likely to want to report an error to the user than to blindly return a default value.

If I did want to do this, I'd be happy to write my own function for it.

It could even be made generic:

    def convert(text, func, default):
        try:
            return func(text)
        except ValueError:
            return default

-- Greg

Laurens Van Houtven

5:42 a.m.

I don't think David is arguing for the default behavior to change -- merely that you get a dict.get style default. Kinda similar to getattr/2 raising AttributeError, and getattr/3 returning the default value.

cheers
lvh

On 03 Oct 2011, at 12:40, Greg Ewing wrote:

...

Greg Ewing

5:54 a.m.

Laurens Van Houtven wrote:

...

I don't think David is arguing for the default behavior to change -- merely that you get a dict.get style default.

I know, but experience shows that the dict.get() default is very useful in practice. I'm skeptical that the proposed feature would -- or should -- be used often enough to justify complexifying the constructor signature of int et al.

The big difference as I see it is that, very often, failing to find something in a dict is not an error, but an entirely normal occurrence. On the other hand, passing something that isn't a valid int representation to int() is most likely the result of a user entering something nonsensical. In that case, the principle that "errors should not pass silently" applies.

What's worse, it could become an attractive nuisance, encouraging people to mask bad input rather than provide the user with appropriate feedback.

-- Greg
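For readers following the analogy, here is a small sketch of the existing default-taking lookups mentioned in this exchange, next to the current behaviour of int(). The dictionary, the `Config` class and the variable names are invented for illustration; only `dict.get`, `getattr` and `int` are real built-ins.

```python
settings = {"retries": "3"}

# dict.get: a missing key is an ordinary outcome, so a default reads naturally.
timeout = settings.get("timeout", "30")


class Config:
    retries = 3


cfg = Config()

# Two-argument getattr raises AttributeError for a missing attribute...
try:
    getattr(cfg, "timeout")
    missing_raises = False
except AttributeError:
    missing_raises = True

# ...while the three-argument form returns the supplied default instead.
timeout_attr = getattr(cfg, "timeout", 30)

# int() has no such default argument: invalid text raises ValueError.
try:
    int("three")
    int_raises = False
except ValueError:
    int_raises = True
```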

Dirkjan Ochtman

6:07 a.m.

On Mon, Oct 3, 2011 at 12:54, Greg Ewing wrote:

...

The big difference as I see it is that, very often, failing to find something in a dict is not an error, but an entirely normal occurrence. On the other hand, passing something that isn't a valid int representation to int() is most likely the result of a user entering something nonsensical. In that case, the principle that "errors should not pass silently" applies.

Hmm, not really true in my experience. Here's some actual code from my codebase at work:

v = float(row[dat]) if row[dat] else 0.0
d.append(float(row[t]) if row[t] else 0.0)
gen = (float(i) if i != '.' else None for i in row[1:])
limits = [(float(i) if i != '.' else None) for i in ln[5:15]]
line[i] = (None if line[i] == '.' else float(line[i]))
ls.append(float(row[i]) if row[i] else None)
data[row['s']] = float(val) if '.' in val else int(val)
cur.append(float(ln[f]) if ln[f] else None)
cur.append(float(ln['DL']) if ln['DL'] else None)
pv = float(ln['PV']) if ln['PV'] else None
mgn = float(ln['MGN']) if ln['MGN'] else None
f = lambda x: float(x) if x else 1
data[sn] += float(row['PC']) if row['PC'] else 0.0, row['PCC']
ubsc = 1 if not row['CSCALE'] else float(row['CSCALE'])
scale = float(row['ESCALE']) if row['ESCALE'] else 1.0
efp = float(row['FSCALE']) if row['FSCALE'] else 1.0
convert = lambda x: float(x) if x else None

In other words, this happens a lot in code where you deal with data from a third party that you want to convert to some neater structure of Python objects (in cases where a null value occurs in that data, which I would suggest is fairly common out there in the Real World). Throwing a ValueError is usually not the right thing to do here, because you still want to use all the other data that you got even if one or two values are unavailable.

Converting two of the above examples:

pv = float(ln['PV']) if ln['PV'] else None
pv = float(ln['PV'], default=None)

d.append(float(row[t]) if row[t] else 0.0)
d.append(float(row[t], default=0.0))

It's a little shorter and seems easier to read to me (less repetition).

Cheers,

Dirkjan

Mark Dickinson

7:06 a.m.

On Mon, Oct 3, 2011 at 12:07 PM, Dirkjan Ochtman wrote:

...

Converting two of the above examples:

pv = float(ln['PV']) if ln['PV'] else None
pv = float(ln['PV'], default=None)

d.append(float(row[t]) if row[t] else 0.0)
d.append(float(row[t], default=0.0))

It's a little shorter and seems easier to read to me (less repetition).

But the two versions you give aren't equivalent. With:

pv = float(ln['PV']) if ln['PV'] else None

we'll get a ValueError if ln['PV'] contains some non-float, non-empty garbage value. With:

pv = float(ln['PV'], default=None)

and (IIUC) the proposed semantics, that garbage value will be turned into None instead, which is definitely not what I'd want to happen in normal usage.

-- Mark
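The difference Mark describes can be made concrete with a short sketch. Since `float()` accepts no `default` argument, `float_or_default` below is a hypothetical stand-in for the proposed semantics, and `float_if_present` wraps the existing conditional idiom; both names are assumptions made for this illustration.

```python
def float_or_default(text, default=None):
    # Hypothetical stand-in for the proposed float(text, default=...):
    # any string that fails to parse yields the default.
    try:
        return float(text)
    except ValueError:
        return default


def float_if_present(text):
    # The existing idiom: only an empty (falsy) value maps to None;
    # non-empty garbage still raises ValueError.
    return float(text) if text else None


for sample in ["1.5", "", "n/a"]:
    try:
        idiom = float_if_present(sample)
    except ValueError:
        idiom = "ValueError"
    print(repr(sample), "->", float_or_default(sample), "vs", idiom)
```

Both helpers agree on `"1.5"` and on the empty string. They differ only on `"n/a"`, where the stand-in quietly returns None and the conditional idiom raises.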

Dirkjan Ochtman

7:11 a.m.

On Mon, Oct 3, 2011 at 14:06, Mark Dickinson wrote:

...

But the two versions you give aren't equivalent. With:

pv = float(ln['PV']) if ln['PV'] else None

we'll get a ValueError if ln['PV'] contains some non-float, non-empty garbage value. With:

pv = float(ln['PV'], default=None)

and (IIUC) the proposed semantics, that garbage value will be turned into None instead, which is definitely not what I'd want to happen in normal usage.

Yeah, I guess you're right, and I'd definitely not want unexpected garbage values to go unnoticed. Cheers, Dirkjan

Greg Ewing

3:52 p.m.

Dirkjan Ochtman wrote:

...

Hmm, not really true in my experience. Here's some actual code from my codebase at work:

v = float(row[dat]) if row[dat] else 0.0
d.append(float(row[t]) if row[t] else 0.0)
gen = (float(i) if i != '.' else None for i in row[1:])

This is different. You're looking for a particular value (such as an empty string or None) and treating it as equivalent to zero. That's not what the OP suggested -- he wants *any* invalid string to return the default value. That's analogous to using a bare except instead of catching a particular exception. -- Greg
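One way to capture the narrower behaviour Greg is describing, where only known placeholder values map to a default, is a helper along these lines. This is a sketch only: the name `float_or_missing` and the choice of `""` and `"."` as placeholders are assumptions drawn from the examples quoted above.

```python
def float_or_missing(text, missing=("", "."), default=None):
    """Return default for known placeholder values; otherwise parse strictly."""
    if text in missing:
        # Tuple membership compares whole values, so "1.5" is not matched by ".".
        return default
    # Anything else must be a valid float, or ValueError propagates.
    return float(text)


print(float_or_missing(""))                  # None
print(float_or_missing(".", default=0.0))    # 0.0
print(float_or_missing("2.5"))               # 2.5
```

Unexpected input such as `"abc"` still raises ValueError, so bad data is not silently masked.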

Michael Foord

8:55 a.m.

On 3 October 2011 08:52, David Townshend wrote:

...

My idea is fairly simple: add a "default" argument to int and float, allowing a return value if the conversion fails. E.g:

...
...
...
float('cannot convert this', default=0.0) 0.0

Something similar to this is pretty common in other languages. For example .NET has System.Double.TryParse:

http://msdn.microsoft.com/en-us/library/994c0zb1.aspx

The pattern there is equivalent to returning an extra result as well as the converted value - a boolean indicating whether or not the conversion succeeded (with the "converted value" being 0.0 where conversion fails). A Python version might look like:

success, value = float.parse('thing')
if success:
    ...

Part of the rationale for this approach in .NET is that exception handling is very expensive, so calling TryParse is much more efficient than catching the exception if parsing fails.

All the best,

Michael Foord
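Python's float type has no `parse` method, so the snippet above is a suggestion rather than working code. A minimal sketch of the same pattern as a plain function might look like this; the name `try_parse_float` is hypothetical, and catching only ValueError is an assumption rather than a faithful copy of the .NET behaviour.

```python
def try_parse_float(text):
    """Return (success, value); value is 0.0 when parsing fails."""
    try:
        return True, float(text)
    except ValueError:
        # Mirror the described pattern: a flag plus a placeholder value.
        return False, 0.0


success, value = try_parse_float("thing")
if success:
    print("parsed", value)
else:
    print("could not parse; placeholder is", value)
```

Unlike a silent default, the caller has to look at `success` to tell a real 0.0 apart from a failed parse.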


Massimo Di Pierro

9:10 a.m.

+1

On Oct 3, 2011, at 8:55 AM, Michael Foord wrote:

...


Massimo Di Pierro

9:14 a.m.

or

float("cannot convert this") or 0.0 if ValueError

i.e. map

x = [expression] or [value] if [exception]

into

try:
    x = [expression]
except [exception]:
    x = [value]

On Oct 3, 2011, at 8:55 AM, Michael Foord wrote:

...


Jan Kaliszewski

6 Oct

4:54 p.m.

Michael Foord dixit (2011-10-03, 14:55):

...

http://msdn.microsoft.com/en-us/library/994c0zb1.aspx

The pattern there is equivalent to returning an extra result as well as the converted value - a boolean indicating whether or not the conversion succeeded (with the "converted value" being 0.0 where conversion fails). A Python version might look like:

success, value = float.parse('thing')
if success:
    ...

Nice. +1 from me. *j

59 comments

23 participants

participants (23)

Bruce Leban
Carl Matthew Johnson
Chris Rebert
David Townshend
Dirkjan Ochtman
Ethan Furman
Greg Ewing
Guido van Rossum
Jan Kaliszewski
Jim Jewett
Laurens Van Houtven
Mark Dickinson
Masklinn
Massimo Di Pierro
Matt Joiner
Michael Foord
MRAB
Nick Coghlan
Paul Moore
Raymond Hettinger
Ron Adam
Steven D'Aprano
Terry Reedy
