[Python-ideas] Re: Incremental step on road to improving situation around iterable strings

24 Feb 2020

I would like to reiterate a point that I think is very important and that
many people seem to be brushing aside. We don't have to *break* existing code.
We can get a lot of value, at least in terms of aiding debugging, just by
adding a warning. That warning never has to evolve into an exception,
certainly not anytime soon. The most damage it would do is some clutter in
stderr, and that would only be for some time while libraries adapted.
People add deprecation warnings all the time.
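
To make the "warning only" idea concrete, here is a minimal sketch. It cannot
change the built-in str, so it uses a hypothetical subclass, WarnOnIterStr,
whose bare iteration emits a DeprecationWarning while an explicit chars()
method (the spelling used later in this message) stays silent. The class name,
the helper and the warning text are all assumptions made for illustration, not
part of any actual proposal text.

```python
import warnings


class WarnOnIterStr(str):
    """Hypothetical str subclass: bare iteration warns, chars() does not."""

    def __iter__(self):
        # Existing code keeps working; it only produces a warning.
        warnings.warn(
            "iterating directly over a str; use chars() to be explicit",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__iter__()

    def chars(self):
        # Explicit spelling: same characters, no warning.
        return super().__iter__()


def collect_warnings(func):
    """Run func and return (result, list of warnings it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func()
    return result, caught


s = WarnOnIterStr("abc")
implicit, implicit_warnings = collect_warnings(lambda: list(s))
explicit, explicit_warnings = collect_warnings(lambda: list(s.chars()))
```

Both calls return the same characters; only the implicit one records a
warning, so nothing that works today would stop working.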

Consider again the example of taking the boolean of a numpy array or pandas
Series. Making that raise an error certainly broke some existing code, and it
broke the usual convention that bool() of a container is determined by len().
But most importantly, it was a
reversible change. Right now, the maintainers could look at the community
reaction and decide to make bool(array) work again as expected, or maybe in
a new way. Doing so wouldn't break any working code because no working code
uses bool(array). But they have chosen not to, presumably because they
believe the current behaviour is still for the best. So here's what I take
from all this:

1. The numpy/pandas 'experiment' of forcing users to state their intentions
explicitly to avoid subtle logical bugs is deemed a success.
2. If our 'experiment' failed and users were really offended by seeing
warnings, we could undo it. We'd leave chars() behind as a no-op. No code
would be broken by the reversal. So the extent of the damage in the worst
case scenario would be even more limited. You might complain that there'd
then be two ways to iterate over characters, but redundancy like that is
not always bad: I always choose to add .keys() when I iterate over a dict
even though it's redundant, because it makes the code clearer.
3. Regarding a point made by Chris: introducing the error in bool() is
considered OK even though it's sometimes hard to see where bool() is being
used, such as when a user writes `df[0 < df.val < 1]`, which is the
equivalent of `df[0 < df.val and df.val < 1]`, when they want the behaviour
of `df[(0 < df.val) & (df.val < 1)]`.
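
The hidden bool() call in point 3 can be reproduced without pandas. The sketch
below uses a hypothetical Column class as a stand-in for a Series: comparisons
and & work elementwise, and bool() raises. It is only an illustration of how
Python evaluates these expressions, not of pandas internals.

```python
class Column:
    """Hypothetical stand-in for a pandas Series: elementwise ops, no bool()."""

    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, other):
        # Also used for `scalar < column` via the reflected comparison.
        return Column(v > other for v in self.values)

    def __lt__(self, other):
        return Column(v < other for v in self.values)

    def __and__(self, other):
        return Column(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        raise ValueError("truth value of a Column is ambiguous")


def raises_value_error(func):
    """Return True if calling func raises ValueError."""
    try:
        func()
    except ValueError:
        return True
    return False


val = Column([-1, 0.5, 2])

# `0 < val < 1` means `(0 < val) and (val < 1)`, so bool() is called implicitly.
chained_fails = raises_value_error(lambda: 0 < val < 1)

# `&` binds tighter than `<`, so this is `0 < (val & val) < 1`: still chained.
unparenthesized_fails = raises_value_error(lambda: 0 < val & val < 1)

# Parenthesizing each comparison gives the intended elementwise mask.
mask = ((0 < val) & (val < 1)).values
```

Neither of the first two expressions contains the word bool, yet both end up
calling it; only the fully parenthesized form avoids it.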

On Mon, Feb 24, 2020 at 10:12 PM Brandt Bucher <brandtbucher@gmail.com>
wrote:
...
I agree with the numerous posters who have brought up the
backward-compatibility concern. This change *would* break lots of code. At
the same time, this bites me consistently, so I'd like to do something
soon... at least sooner than 6.0 ;).
I believe that this is better solved by static analysis. I suggested some
time ago on typing-sig that we explore adding a `Chr` type to typing, and
type `str` as a `Sequence[Chr]` rather than a `Sequence[str]`. You can read
the proposal here (it's not very complex at all, and should be
backward-compatible for all but the hairiest cases, which just need either
a cast or an annotation):
https://mail.python.org/archives/list/typing-sig@python.org/thread/OLCQHSNCL...
With it, we have a path forward where type-checkers like mypy assure us
that we're really doing what we think we're doing with that string, and
require explicit annotations or casts for the ambiguous cases. That
discussion fizzled out, but I'm still very much interested in exploring the
idea if it seems like a realistic alternative. I think it makes much more
sense than changing the mostly-sensible, well-known, often-used runtime
behavior of strings.
Brandt

Alex Hall