[Python-ideas] Re: Adding pep8-casing-compliant aliases for the entire stdlib

Nov. 11, 2021

      Okay, so from the replies so far it looks like this is very quickly going
into the 'never gonna happen' dumpster, so in the interests of salvaging
*something* out of it:
...
I'm a -1 on this proposal, as I don't see any way of doing it that
wouldn't cause a huge amount of disruption. Yes, the situation — especially
with regard to unittest and logging — is far from ideal. But, it's what
we've got.
See, I just dislike having to settle for 'it's what we've got'. With these
two modules in particular, a lot of the arguments that have been made so
far either don't apply or are not as strong (such as the confusion of
having builtins.list, builtins.List and typing.List). Additionally, these
are significantly more ancillary portions of the stdlib than the builtins,
much less likely to cause as severe a disruption (I personally don't
believe a backward-compatible change like this, which only adds aliases,
would be as disruptive as many people claim, but I concede that that's
subjective), and much less likely to have their implementation change so
drastically that types would be swapped out for factory functions or
vice versa.

So perhaps we could narrow the scope of this down to just adding snake_case
aliases to the logging and unittest modules (and any other places in the
stdlib where camelCase names still exist), and leave the lowercase names
alone. I'm realizing that I actually just plain forgot to list converting
camelCase names to snake_case as one of the bullet-points in the original
post. That was an oversight; I did fully intend that to also be in scope.

I'm very much of the philosophy of not letting perfect be the enemy of good
(and yes, I know that my 'good' might be your 'evil', that's life), so if
we can at least get modules like logging and unittest offering snake_case
aliases for all their names I would call that at least a moderate win.
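
To make the narrowed scope concrete, here is a minimal sketch of what such aliases could look like. The names `get_logger`, `SnakeCaseTestCase`, `assert_equal` and `assert_true` are hypothetical; none of them exist in the stdlib, and a real implementation would presumably add the aliases inside the modules themselves rather than in a wrapper like this.

```python
import logging
import unittest

# Hypothetical alias: the stdlib only provides logging.getLogger.
get_logger = logging.getLogger


class SnakeCaseTestCase(unittest.TestCase):
    """Hypothetical TestCase exposing snake_case spellings."""

    # Each alias is the very same function object as the camelCase name.
    assert_equal = unittest.TestCase.assertEqual
    assert_true = unittest.TestCase.assertTrue


class ExampleTest(SnakeCaseTestCase):
    def test_aliases(self):
        self.assert_equal(1 + 1, 2)
        self.assert_true(get_logger("demo") is logging.getLogger("demo"))


def run_example():
    """Run ExampleTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the aliases are plain name bindings, existing code that uses `assertEqual` or `getLogger` is unaffected.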

With that out of the way, I'll just address a few other points:

Yes, it IS an argument against any change, and the question is: how
strong are the arguments in favour of the change? With new syntax or
APIs, the advantage is the increased expressiveness; with this, it's
the exact same thing, but spelled differently. So your choices are (a)
the thing that works all the way back to Python 2.x and will continue
to work in the future; or (b) the otherwise-identical thing that only
works from version X onwards. There's no incentive to move, so people
won't move, so the bulk of code out there will continue to use the
existing names.
I think saying there's no advantage is definitely a bit uncharitable.
There's a gain in consistency and clarity, fewer surprises, and a lower
cognitive load (that's the entire point of casing conventions, just by
looking at a name you can glean some information about what sort of thing
it references). And I think you're 100% wrong about people not switching to
new names over time, especially if linters started flagging the old names
as warnings and the 'best-practices' recommendation became using the new
names.

Someone mentioned a 20-year-old codebase too large to ever refactor,
because the clients would be unwilling to pay for refactors that add no
functionality. This is the exception, not the rule. I doubt even 5% of
applications written in python are run for 20 years, and long-lived
libraries usually have at least one or two major versions with large
internal refactors over timescales that large. But that's all irrelevant
anyway, because I'm not proposing removing the existing names. No
backwards-compatibility will be harmed, and such projects will be able to
happily go on using legacy names forever if they choose.

Since most code is a lot more short-lived than that we would likely be in a
position where a majority of code being actively *run* in the wild would
use the new names within 15-20 years, and the vast majority of *new code*
being written would be using them.

They're not the strongest argument. The strongest argument is churn -
lots and lots of changes for zero benefit.
I must not be understanding what you mean by churn in this context, because
to me this seems quite minor in terms of changes.

From the implementation side:
- A one-time addition of aliases to the stdlib
- If something is deprecated/removed/renamed in the future, its alias would
also have to be removed
- That's it, there wouldn't be any maintenance needed beyond that, new
additions would just use pep8-compliant casing
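
The "one-time addition of aliases" could in principle be generated mechanically. The following is only a sketch under stated assumptions: `camel_to_snake` and `snake_case_aliases` are hypothetical helpers, and the sketch only looks at module-level functions in `logging` (methods such as `Logger.setLevel` would need separate handling).

```python
import logging
import re
import types


def camel_to_snake(name: str) -> str:
    """Convert e.g. 'getLogger' to 'get_logger'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_case_aliases(module: types.ModuleType) -> dict:
    """Map new snake_case names to the existing camelCase callables."""
    aliases = {}
    for name, obj in vars(module).items():
        # Skip private names and names that start with an uppercase letter.
        if name.startswith("_") or not name[0].islower():
            continue
        # Skip names that are already lowercase, non-callables, and classes.
        if name == name.lower() or not callable(obj) or isinstance(obj, type):
            continue
        aliases[camel_to_snake(name)] = obj
    return aliases


aliases = snake_case_aliases(logging)
```

The resulting dictionary maps each proposed name to the existing object, so `aliases["get_logger"]` is the same function as `logging.getLogger`.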

From the usage side:
- When writing new code, prefer the new names. You'll even be helpfully
nudged along by your IDE
- *If* you choose to do so, optionally do a refactor pass to change
existing names in your current projects/codebases. If you choose not to,
just relax your linter's legacy-name check so it doesn't bother you.

The last bullet-point is the big one where I can see the argument that
there would be a lot of churn. But the key thing is it's fully optional. As
long as *new code* tends to use the aliases more than 50% of the time,
we'll be trending in the right direction.

That's an unknowable, because things change. If str can change from
being a function to a type early in Python 2.x, and range can change
from being a function to a type in 3.0, then what next? Will compile
become a type some day? Or chr? Would an equally hypothetical "Python
invented in 2032" need to rename another bunch of things?
But changing a type to a function and vice-versa *is* a
backwards-incompatible change. If someone has written code that subclasses
`collections.deque` and a new python version converts `collections.deque`
into a factory function, that code will, with certainty, break when that
person upgrades their interpreter to the new version.
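
A small sketch of that breakage, assuming a hypothetical `deque_factory` function that stands in for a future where `collections.deque` had been turned into a factory. `BoundedHistory` and `subclass_error` are illustrative names only.

```python
import collections


class BoundedHistory(collections.deque):
    """Works today because collections.deque is a type."""

    def latest(self):
        return self[-1]


def deque_factory(iterable=()):
    """Hypothetical stand-in for a deque that became a plain function."""
    return collections.deque(iterable)


def subclass_error(base):
    """Return the TypeError raised by subclassing `base`, or None."""
    try:
        class Sub(base):
            pass
    except TypeError as exc:
        return exc
    return None
```

Calling `subclass_error(collections.deque)` returns `None`, while `subclass_error(deque_factory)` returns a `TypeError`, which is the failure a subclassing codebase would hit after such a change.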

I see no problem changing a name as part of an already-breaking change. In
fact, it's probably safer, with less risk of subtle bugs.

The distinction between "this is a type" and "this is a function" is
often relatively insignificant. The crux of your proposal is that it
should be more significant, and that the fundamental APIs of various
core Python callables should reflect this distinction. This is a lot
of churn and only a philosophical advantage, not a practical one.
As mentioned above, I don't think the distinction between functions and
types is anywhere near as minor as you're suggesting. There's also the fact
that reliably being able to tell if something is a type based on its casing
immediately tells you useful implementation details you may be able to use,
without needing to go read the documentation.

For example, when I look at `builtins.range` it's basically a black box
unless I dig deeper. Since I can't rely on it being a type or a function
from its name alone I don't know if it will return a `range` object, or a
generic generator, or something else. If it were called `builtins.Range`,
I'd immediately know that I could, for example, check if something is a
`Range` object with `isinstance()`. I also instantly know that I can
potentially subclass it if I want to extend its functionality. That alone
has value. It's not just a purely philosophical advantage, as you're
suggesting.
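
For reference, this is how the current lowercase `range` behaves on CPython. The `isinstance()` check works as described; subclassing is only "potentially" available, since being a type does not by itself guarantee that a class accepts subclasses.

```python
r = range(3)

# Because range is a type, instances can be recognised with isinstance().
is_range = isinstance(r, range)
is_type = isinstance(range, type)

# CPython refuses range as a base class, so this raises TypeError.
try:
    class MyRange(range):
        pass
    subclassable = True
except TypeError:
    subclassable = False
```

Here `is_range` and `is_type` are both true, while `subclassable` ends up false.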

But at this stage I'm not so naive as to think this is a battle I can win,
so could we refocus the discussion on the new scope which is:

- Add pep8-compliant aliases for camelCased public-facing names in the
stdlib (such as logging and unittest) in a similar manner as was done with
threading
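
For readers unfamiliar with the threading precedent: `threading.current_thread` is the snake_case name, and the older `threading.currentThread` spelling was kept as an alias (deprecated since Python 3.10). The sketch below looks the legacy name up defensively, on the assumption that it may be absent in some future version.

```python
import threading
import warnings

# The snake_case spelling is the primary name.
current = threading.current_thread()

# The camelCase spelling is a legacy alias; it may not exist forever.
legacy = getattr(threading, "currentThread", None)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Both spellings return the same Thread object.
    same_thread = legacy is None or legacy() is current

# On Python 3.10+ the alias emits a DeprecationWarning; older versions stay silent.
deprecation_seen = any(
    issubclass(w.category, DeprecationWarning) for w in caught
)
```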

Cheers everyone :)

On Thu, Nov 11, 2021 at 5:51 PM Chris Angelico <rosuav@gmail.com> wrote:
...
ISTM that this indicates that you're putting too much focus on PEP 8
too early. At no time does the document ever state that all Python
code ever written must comply with it. New Python programmers should
not feel like they're being forced into a naming convention.
On Fri, Nov 12, 2021 at 2:22 AM Matt del Valle <matthewgdv@gmail.com>
wrote:
That's fair enough for people learning python as a hobby or in a context
that's a bit more casual than agile. I generally find myself training
junior/graduate people with OOP backgrounds other than Python (Java, C#,
etc.) for my data engineering team and pep8-compliance is quite important,
because they won't even be able to successfully make commits unless they
can pass the flake8 pre-commit hook, let alone get as far as a merge
review. So I'd say that pep8 is quite important from the get-go.
...
To clarify: Your pre-commit hook is what is mandating this. Not
Python. If your organization has this requirement, then it is your
organization's decision how to do things.
I suspect that the vast majority of Python programmers do not have
pre-commit hooks that run flake8 (though I don't have stats). PEP 8 is
incredibly important precisely because your organization has made it
so.
...
...
Pop quiz: Which of these are types and which are functions (or
something else)?
bool, classmethod, divmod, enumerate, globals, map, property, sorted,
super, zip
collections.deque, collections.namedtuple
Does it even matter? Especially: does it matter enough to force a name
change if a function is replaced with a type, or vice versa?
I'd argue this is actually a point *in favor* of the proposal rather
than against. In going through your list I actually discovered lots of
things I found extremely surprising (collections.deque is a type as I
expected whereas collections.namedtuple is actually a factory function).
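
The quiz can be answered mechanically with `isinstance(obj, type)`. This sketch reflects CPython 3; the variable names are illustrative.

```python
import builtins
import collections

quiz = ["bool", "classmethod", "divmod", "enumerate", "globals", "map",
        "property", "sorted", "super", "zip"]

candidates = {name: getattr(builtins, name) for name in quiz}
candidates["collections.deque"] = collections.deque
candidates["collections.namedtuple"] = collections.namedtuple

# Split the quiz entries into types and everything else (plain functions).
types_ = sorted(n for n, obj in candidates.items() if isinstance(obj, type))
others = sorted(n for n, obj in candidates.items() if not isinstance(obj, type))
```

On CPython 3 the non-types are `divmod`, `globals`, `sorted` and `collections.namedtuple`; everything else in the list is a type.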
Not all of those have been the same at all times. For instance, range
is a function in Python 2, but a class in 3. Should it be renamed? And
I'll have to get someone else to confirm, but I believe that str, int,
etc were functions for a lot of Python's history.
...
If pep8 had been introduced at the dawn of (python-)time and the stdlib
had been designed with pep8-compliance in mind, there would be no surprise
at all on the day that I think to myself: 'Hmm, I want to subclass
namedtuple, let's try:'
class MyNamedTuple(collections.namedtuple):
    ...
and discover that it doesn't work.
A lot of people want this (or the similar concept "isinstance(x,
collections.namedtuple)"), and it doesn't work because namedtuple is a
category of classes, rather than a class itself. But suppose that were
to change in the future - if namedtuple becomes a superclass or
metaclass, and is then a type instead of a function. Does it
immediately have to be renamed NamedTuple, or should it remain as it
is? The churn has no value.
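
A short sketch of the behaviour being described: `isinstance()` against `collections.namedtuple` fails because it is a function, and a duck-typed check is one common workaround. `Point` and `looks_like_namedtuple` are hypothetical names used only for illustration.

```python
import collections

Point = collections.namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# namedtuple is a factory function, so it cannot be isinstance()'s second argument.
try:
    isinstance(p, collections.namedtuple)
    check_failed = False
except TypeError:
    check_failed = True


def looks_like_namedtuple(obj):
    """Duck-typed check: a tuple whose class defines _fields."""
    return isinstance(obj, tuple) and hasattr(type(obj), "_fields")
```

The class returned by the factory (`Point` here) is an ordinary type, so `isinstance(p, Point)` and `isinstance(p, tuple)` both work as usual.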
...
...
Wait, are you deprecating them or not? I'm confused. Earlier you were
saying that you clearly wanted to stop people from using the old
names, but now you're saying there's no rush to deprecate them.
This is a semantic misunderstanding, I should have been clearer. In my
head, 'officially deprecating' something means saying that it will
*definitely* be removed in the future, and optionally specifying the
precise date at which it will be removed. Since my suggestion doesn't
necessarily involve ever actually removing the current names from the
language (only potentially, if a future steering council decides to as part
of a future proposal), I didn't consider this to be deprecating them,
exactly. Hopefully that's cleared up. Sorry for the confusing wording.
Understood. In this case, it sounds like you ARE deprecating them, but
without a specific removal date.
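
One possible mechanism for "deprecated, but without a removal date" is an alias that still works but emits a `DeprecationWarning`. Whether the old names should warn at all is exactly what is being debated here, so treat this purely as a sketch; `deprecated_alias` and `get_level_name` are hypothetical, and the toy function below is not the real `logging.getLevelName`.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Wrap new_func so calls through the old name warn but still work."""
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated, use {new_func.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)
    return wrapper


def get_level_name(level):
    """Hypothetical new spelling (toy implementation for the demo)."""
    return {10: "DEBUG", 20: "INFO"}.get(level, f"Level {level}")


# The legacy camelCase name is kept and forwards to the new function.
getLevelName = deprecated_alias(get_level_name, "getLevelName")
```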
...
...
Suppose that these aliases were added in Python 3.11. Anyone who wants
to write code compatible with 3.10 would want to continue using the
existing names, and since the existing names would keep on being
supported for the foreseeable future, there would be little incentive
to change until several versions have passed. At that point, both
names might start being used in parallel (we're talking probably 2025
or thereabouts, although it might start sooner if people are also
using syntactic features from 3.11+), but it would take a VERY long
time for adoption to reach the level where the older names are "barely ever
even seen". Probably never.
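
The compatibility concern can be illustrated with the kind of shim a library would need in order to prefer a new name while still supporting older versions. `logging.get_logger` is a hypothetical future alias that does not exist today, so the `getattr()` call below falls back to the existing name.

```python
import logging

# Prefer the (hypothetical) snake_case alias if this Python provides it,
# otherwise use the name that works on every version.
get_logger = getattr(logging, "get_logger", logging.getLogger)

log = get_logger("compat-demo")
```

Code that has to carry a shim like this gains nothing over simply calling `logging.getLogger`, which is the incentive problem being described.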
This is an argument that can be made for any change ever made to python,
and consequently I think it's an extremely weak argument. Why change
anything if people won't be confident using it for at least 5 years because
of compatibility concerns with current language versions? Well, as it turns
out time passes and eventually that distant 5-year mark is in the past. I
just checked and... wow yeah f-strings were introduced in 2016! Feels like
only yesterday :)
Yes, it IS an argument against any change, and the question is: how
strong are the arguments in favour of the change? With new syntax or
APIs, the advantage is the increased expressiveness; with this, it's
the exact same thing, but spelled differently. So your choices are (a)
the thing that works all the way back to Python 2.x and will continue
to work in the future; or (b) the otherwise-identical thing that only
works from version X onwards. There's no incentive to move, so people
won't move, so the bulk of code out there will continue to use the
existing names.
...
...
Absolutely no value in adding aliases for everything, especially
things that can be shadowed. It's not hugely common, but suppose that
you deliberately shadow the name "list" in your project - now the List
alias has become disconnected from it, unless you explicitly shadow
that one as well. Conversely, a much more common practice is to
actually use the capitalized version as a variant:
class List(list):
    ...
This would now be shadowing just one, but not the other, of the
built-ins. Confusion would abound.
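
A sketch of the disconnect being described, assuming a hypothetical `List` alias for the built-in `list` (no such built-in exists). The shadowing is confined to a function here so the demo does not disturb the rest of the module.

```python
import builtins

# Hypothetical alias, standing in for a builtins.List that does not exist.
List = builtins.list


def shadowing_demo():
    # Deliberately shadow the lowercase name inside this scope only.
    class list(builtins.list):
        """An instrumented list that counts appends."""
        appends = 0

        def append(self, item):
            type(self).appends += 1
            super().append(item)

    tracked = list()       # uses the instrumented subclass
    tracked.append("a")
    untracked = List()     # the alias still points at the original built-in
    untracked.append("b")
    return list.appends, type(untracked) is builtins.list
```

Only the append made through the shadowed `list` is counted; the one made through `List` bypasses the instrumentation.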
I think this is a fair point, but the same can be said of people
accidentally shadowing any number of other builtins (it's probably most
common with builtins like `type` or `max`). Fortunately every linter/IDE
I've used has had warnings for this so it should be something that happens
rarely, if ever.
This isn't about accidental shadowing - it's deliberate shadowing.
...
I'd also say that subclassing list as `List` is probably just bad style
that should be discouraged anyway, because presumably if you're subclassing
list you're doing it to extend it in some way, and you should pick a more
descriptive name like `ChainableList` (if you're implementing a list where
inplace methods return self and allow chaining, for example), rather than
just `List`.
Maybe. Sometimes, you really just want a perfectly ordinary list, but
instrumented in some way. Who knows. In any case, having two names for
the same thing would make this very confusing.
...
I think what it boils down to for me is this:
If we went to a disconnected alternate universe where python had never
been invented and introduced it today, in 2021, would we introduce it with
a uniform naming convention, or the historical backwards-supporting
mishmash of casing we've ended up with? Since I think the answer is pretty
clear, I'm strongly in favor of making this minimally-invasive change that
at least works towards uniform casing, even if that dizzying utopia is far
beyond the horizon. Our grandchildren might thank us :p
That's an unknowable, because things change. If str can change from
being a function to a type early in Python 2.x, and range can change
from being a function to a type in 3.0, then what next? Will compile
become a type some day? Or chr? Would an equally hypothetical "Python
invented in 2032" need to rename another bunch of things?
...
I do concede that some awkward shadowing edge-cases are the strongest
argument against this proposal. I personally don't think they're that
strong of an argument compared to the eventual payoff, but that's just my
subjective opinion.
They're not the strongest argument. The strongest argument is churn -
lots and lots of changes for zero benefit.
The distinction between "this is a type" and "this is a function" is
often relatively insignificant. The crux of your proposal is that it
should be more significant, and that the fundamental APIs of various
core Python callables should reflect this distinction. This is a lot
of churn and only a philosophical advantage, not a practical one.
This isn't the first time someone has had false expectations about PEP
8 and the standard library. I'm seriously wondering if flake8 does
more harm than good by being so strict.
ChrisA