On Thu, May 23, 2019 at 5:45 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 23, 2019 at 02:06:13PM -0700, Brett Cannon wrote:
> On Wed, May 22, 2019 at 1:23 PM Sean Wallitsch <
> sean.wallitsch@dreamworks.com> wrote:
>
> > My apologies for that oversight. My understanding is that many of the
> > methods present in aifc depend heavily on audioop for reading and writing.
> >
>
> But are people using audioop directly?

Yes they are.

https://old.reddit.com/r/Python/comments/brgl8v/pep_594_removing_dead_batteries_from_the_standard/eodvexl/

Shouldn't the PEP be responsible for establishing (as well as any
negative can be proven) that these modules aren't used, rather than
merely assuming that they aren't?

Of course it is hard to establish that a module isn't being used. Not
all code can be searched for on the Internet: there is a huge amount of
non-open-source code, written by users, that never leaves their
computers or sits behind corporate firewalls.

Exactly, hence why we are going through the PEP process on this and not simply deprecating everything outright without public discussion.
 

The difficulty of proving this negative requires that we be especially
conservative about removal. At the very least, we should demonstrate
that the package is *an active burden*.

It's not enough to say "nobody has touched this module for ages" since
stable, working code doesn't rot unless we change the language (or the
std lib!) and break it.

So far, the PEP's record hasn't been too good: out of the 31 modules
listed in the original draft, the PEP marks four as "keep", leaving 27
to be deprecated. So far, we've found:

I don't think it's fair to be saying Christian isn't doing "too good" simply because he took a stab at trying to figure out some way to know which modules would make sense to remove and he got it wrong for some of them from some people's perspective. As you pointed out, there's no scientific way to do this ahead of time due to closed source code (e.g. the VFX industry is not publishing all of their asset pipeline tools so there was no way to know ahead of time without asking like we are with the PEP), so he had to start somewhere. And this initial list was discussed at the PyCon US 2018 language summit as a good one to start from so he isn't entirely guessing without some initial support to try this list out.
 

- the sound libraries are in heavy use by hobbyists and the professional
audio-visual industry; aifc in particular is a mature, stable library
that isn't broken and doesn't need much attention;

- cgi and cgitb are used by people who don't need a heavyweight HTML
solution (see Reddit);

- colorsys, fileinput and nntplib (at least) have been added to the
"keep" list;

- the removal of spwd (and crypt?) has been controversial.

So far, nearly 30% of the so-called "dead batteries" turn out to be not
so dead after all.

I may have missed some. Nor have I even talked much about modules which
I personally use occasionally, like binhex, because it's not about *me*,
it's about the community. As much as I would prefer binhex to remain, if
it is removed I will be disappointed but I'll make do.

I personally think it's about both us and the community. The community can and does ask for stuff all the time, but we have to balance that with what we as a team are capable of handling and in my opinion the stdlib is too big right now for us to maintain appropriately. Plus there's an asymmetric ask here when the community says they want something while we have to keep it going.
 

Speaking of binhex, I tried to get a sense of how much of a burden it
actually is. There was a comment by Guido in *2007*:

https://github.com/python/cpython/commit/34a042d301d6ab88645046a6dfa6c38265ca4b39

"This is the last time I fix binhex. If it breaks again it goes in the
dustbin"

which of course is Guido's prerogative to say Won't Fix. It's now 2019 and
it hasn't broken again, or at least it hasn't broken again sufficiently
to make any difference. Serhiy found one issue:

https://bugs.python.org/issue29566

but I'm not sure if that's actually a bug or not. There was a
documentation bug (in the source code) in 2017:

https://bugs.python.org/issue29557

and one enhancement request to clean up the code:

https://bugs.python.org/issue34063

Other than that, Serhiy touched binhex.py as part of a mass patch to
almost the whole stdlib to modernise file handling to use "with".

So by my count, in 12 years since Guido's frustrated comment about
binhex, it has seen:

- removal of an obsolete and ambiguous comment from the source code;
- addition of a few with blocks to modernise file handling;
- one enhancement request still outstanding;
- one possible bug (or maybe not) still outstanding.


I may have missed some, but we're talking about one issue per three
years. How is this a maintenance burden?

Then multiply that by however many modules stay in the PEP -- and which we know don't all have such small stats compared to binhex -- and it starts to become death by a thousand paper cuts. Numerically all of that seems small, but all of that required time that we all have a limited amount of (both in the open source contribution sense and in the living-and-breathing sense). My point is none of this is free no matter how minuscule we think the cost is.

You can also flip that question around and ask why are there any enhancement requests or open issues if maintenance is so easy for binhex? Why has no one signed up for the module in the experts index if it's such a minor load to maintain? My guess is we all have other priorities that we value more, which makes total sense, but it also tells me that binhex is not high on anyone's list. And for me, even ignoring a module still takes effort.

For me personally, a module not being in this PEP means we consider it critical to a large portion of users of Python where being "in the box" is an important aspect as well as having us directly maintain the module. If a module is in this PEP it says to me that we consider it a nice-to-have for the community, but not at the cost of the modules not listed, or the CPython interpreter, or the language itself which is the other stuff we are planning to keep maintaining after this PEP.

Anyway, in the end we are never all going to agree on how much maintenance effort is too much to warrant keeping something in the stdlib. The best I think we can do is try to reach consensus, then work out what guidelines we want for what belongs in the stdlib going forward, along with how to maintain what we do choose to keep.