> In close to 10 years of experience with python I have never encountered anything like this.

Here's a small selection of the StackOverflow questions from people who encountered this exact issue:

https://stackoverflow.com/questions/25336726/why-cant-i-iterate-twice-over-the-same-iterator-how-can-i-reset-the-iterator
https://stackoverflow.com/questions/10255273/iterating-on-a-file-doesnt-work-the-second-time?noredirect=1&lq=1
https://stackoverflow.com/questions/3906137/why-cant-i-call-read-twice-on-an-open-file
https://stackoverflow.com/questions/17777219/zip-variable-empty-after-first-use
https://stackoverflow.com/questions/42246819/loop-over-results-from-path-glob-pathlib
https://stackoverflow.com/questions/21715268/list-returned-by-map-function-disappears-after-one-use
https://stackoverflow.com/questions/14637154/performing-len-on-list-of-a-zip-object-clears-zip
https://stackoverflow.com/questions/44420135/filter-object-becomes-empty-after-iteration

Note that questions usually get few votes, and "what's wrong with my code" questions are especially poorly received, so getting even a couple of votes is a strong signal. The questions above range from 10 to 124 (!) votes, and have a combined 250k+ views.

These are the people I'd like to help.

> If you could give a full real-life scenario, then it might expose the problem (if it exists) better.

Open a log file, count the number of lines, then find both the longest "error" entry and the number of unique "error" entries. Implemented in the most obvious way I can, using builtin functions, it has *two* such bugs (reusing the exhausted "f" and the exhausted "error_lines").

import re
error_regex = re.compile('^ERROR: ')

with open('logs.txt') as f:
    n_lines = len(list(f))
    error_lines = filter(error_regex.match, f)
    longest_error = max(error_lines, key=len, default='')
    n_unique_errors = len(set(error_lines))

print(f'{n_lines=}\n{longest_error=}\n{n_unique_errors=}')


Is it hard to fix? No, not at all: just store "list(f)" in a variable and replace "filter" with a longer list comprehension. Is it easy to spot? For an experienced developer, in this short example, with all the parts introduced together, yes. But having a natural solution silently give wrong answers is dangerous. At least having a warning would break the false sense of security.
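For comparison, here is a sketch of the fixed version. With the original code, "n_lines" comes out right but "longest_error" is the empty default and "n_unique_errors" is 0, because both "f" and "error_lines" are already exhausted by the time they are reused. The "summarize" helper name and the sample log contents are made up for illustration, and io.StringIO stands in for open('logs.txt') so the snippet runs on its own.

```python
import io
import re

error_regex = re.compile('^ERROR: ')

def summarize(f):
    # Read the file once into a list; a list can be iterated any number of times.
    lines = list(f)
    n_lines = len(lines)
    # List comprehension instead of filter(), so error_lines is reusable as well.
    error_lines = [line for line in lines if error_regex.match(line)]
    longest_error = max(error_lines, key=len, default='')
    n_unique_errors = len(set(error_lines))
    return n_lines, longest_error, n_unique_errors

# Stand-in for open('logs.txt'); these contents are invented for the example.
sample = io.StringIO('INFO: start\nERROR: disk full\nERROR: x\nERROR: disk full\n')
n_lines, longest_error, n_unique_errors = summarize(sample)
print(f'{n_lines=}\n{longest_error=}\n{n_unique_errors=}')
```

The cost is holding the whole file in memory, which is exactly the trade-off that makes the lazy version tempting in the first place.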

> If I wanted sorted numbers, then ValueError wouldn’t help, because I do not get sorted numbers.

I do want sorted numbers, but what can Python do in the face of broken code? There's a reason it raises errors for 1/0, str.invalid, and len(None). It's not "helpful" to the program, but it stops execution from continuing with a bad state.

I understand that backwards compatibility will probably prevent us from raising a new error. But a warning could help a lot of people.
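To show roughly what such a warning could look like, here is a pure-Python sketch. "WarnOnReuse" is a hypothetical wrapper invented for this email, not an existing API, and a real implementation would presumably live inside the builtin iterators themselves. It only assumes the standard "warnings" module and the ordinary iterator protocol.

```python
import warnings

class WarnOnReuse:
    """Hypothetical wrapper: warn when an exhausted iterator is iterated again."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._exhausted = False

    def __iter__(self):
        # A new for-loop / max() / set() call starts here via iter().
        if self._exhausted:
            warnings.warn('iterating over an already exhausted iterator',
                          RuntimeWarning, stacklevel=2)
        return self

    def __next__(self):
        try:
            return next(self._it)
        except StopIteration:
            # Remember that the underlying iterator has run dry.
            self._exhausted = True
            raise

strings = WarnOnReuse(filter(bool, ['aa', '', 'bbb', 'c']))
longest = max(strings, key=len)        # first pass consumes the iterator

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    n_unique = len(set(strings))       # second pass is still empty, but warns
```

This sketch only checks when iteration is restarted through iter(), so bare next() calls on an exhausted iterator stay silent, and partially consuming an iterator and then resuming it does not trigger anything.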

I'm tempted to patch the Python interpreter and test some popular packages, to verify whether iterating over an exhausted iterator on purpose is as rare as I think it is.

On Tue, Jun 13, 2023, at 6:50 PM, Dom Grigonis wrote:
In close to 10 years of experience with python I have never encountered anything like this.

If I need to use a list later I never do ANY assignments to it. Why would I?

In the last example I would:
```
strings = ['aa', '', 'bbb', 'c']
longest = max(filter(bool, strings), key=len)
n_unique = len(set(strings))
```

And in initial example I don’t see why would I ever do this. It is very unclear what is the scenario here:
```???
numbers = (i for i in range(5))
assert 5 not in numbers
sorted(numbers)
```
1. If I wanted sorted numbers, then ValueError wouldn’t help, because I do not get sorted numbers.
2. If I wanted unmodified list and if it was modified then it is an error, your solution doesn’t work either.
3. If sorting is ok only on non-empty iterator, then just `assert sorted` after sorting.

If you could give a full real-life scenario, then it might expose the problem (if it exists) better.
"There should be one-- and preferably only one --obvious way to do it."

There is either: something to be improved or you are not using that "one obvious" way.

On 13 Jun 2023, at 18:05, BoppreH via Python-ideas <python-ideas@python.org> wrote:

@ChrisA: Shadowing "iter()" would only help with Barry's example.

@Jonathan: Updating documentation is helpful, but I find an automated check better. Too often the most obvious way to accomplish something silently triggers this behavior:

strings = ['aa', '', 'bbb', 'c']
strings = filter(bool, strings) # Adding this step makes n_unique always 0.
longest = max(strings, key=len)
n_unique = len(set(strings))

I feel like a warning here would save time and prevent bugs, and that my is_exhausted proposal, if implemented directly in the generators, is an easy way to accomplish this.

And I have to say I'm surprised by the responses. Does nobody else hit bugs like this and wish they were automatically detected? To be clear, raising ValueError is just an example; logging a warning would already be helpful, like Go's race condition detector.


--
BoppreH
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org