The two use cases you describe (scripters and teachers) leave me lukewarm -- scripters live in the wild west and can just pip install whatever (that's what it means to be scripting) and teachers tend to want a customized bundle anyway -- let the edu world get together and create their own recommended bundle.

As long as it's not going to be bundled, i.e. there's just going to be some list of packages that we recommend to 3rd party repackagers, then I'm fine with it. But they must remain clearly marked as 3rd party packages in whatever docs we provide, and live in site-packages.

I would really like to see what you'd add to the list besides requests -- I really don't see why the teaching use case would need the regex module (unless it's a class in regular expressions).

--Guido

On Sun, Oct 29, 2017 at 12:54 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 29 October 2017 at 15:16, Guido van Rossum <guido@python.org> wrote:
Why? What's wrong with pip install?

At a technical level, this would just be a really thin wrapper around 'pip install' (even thinner than ensurepip in general, since these libraries *wouldn't* be bundled for offline installation, only listed by name).
 
Why complicate things? Your motivation is really weak here. "beneficial"? "difficult cases"?

The main recurring problems with "pip install" are a lack of discoverability and a potential lack of availability (depending on the environment).

This then causes a couple of key undesirable outcomes:

- folks using Python as a teaching language have to choose between teaching with just the standard library APIs, requiring that learners restrict themselves to a particular preconfigured learning environment, or making a detour into package management tools in order to ensure learners have access to the APIs they actually want to use (this isn't hypothetical - I was a technical reviewer for a book that justified teaching XML-RPC over HTTPS+JSON on the basis that xmlrpc was in the standard library, and requests wasn't)
- folks using Python purely as a scripting language (i.e. without app level dependency management) may end up having to restrict themselves to the standard library API, even when there's a well-established frequently preferred alternative for what they're doing (e.g. requests for API management, regex for enhanced regular expressions)

The underlying problem is that our reasons for omitting these particular libraries from the standard library relate mainly to publisher side concerns like the logistics of ongoing bug fixing and support, *not* end user concerns like software reliability or API usability. This means that if educators aren't teaching them, or redistributors aren't providing them, then they're actively doing their users a disservice (as opposed to other cases like web frameworks and similar, where there are multiple competing options, you're only going to want one of them in any given application, and the relevant trade-offs between the available options depend greatly on exactly what you're doing).

Now, the Python-for-data-science community have taken a particular direction around handling this, and there's an additional library set beyond the standard library that's pretty much taken for granted in a data science context. While conda has been the focal point for those efforts more recently, it started a long time ago with initiatives like Python(x, y) and the Enthought Python Distribution.

Similarly, initiatives like Raspberry Pi are able to assume a particular learning environment (Raspbian in the Pi's case), rather than coping with arbitrary starting points.

Curated lists like the "awesome-python" one that Stephan linked don't really help that much with the discoverability problem, since they become just another thing for people to learn: How do they find out such lists exist in the first place? Given such a list, how do they determine if the recommendations it offers are actually relevant to their needs? Since assessing a published package API against your needs as a user is a skill that has to be learned like any other, it can be a lot easier to get started in a more prescriptive environment that says "This is what you have to work with for now, we'll explain more about your options for branching out later".

The proposal in this thread thus stems from asking the question "Who is going to be best positioned to offer authoritative advice on which third party modules may be preferable to their standard library counterparts for end users of Python?" and answering it with "The standard library module maintainers that are already responsible for deciding whether or not to place appropriate See Also links in the module documentation".

All the proposal does is to suggest taking those existing recommendations from the documentation and converting them into a more readily executable form.

I'm not particularly wedded to any particular approach to making the recommendations available in a more machine-friendly form, though - it's just the "offer something more machine friendly than scraping the docs for recommendation links" aspect that I'm interested in. For example, we could skip touching ensurepip or venv at all, and instead limit this to a documentation proposal to collect these recommendations from the documentation, and publish them within the `venv` module docs as a "recommended-libraries.txt" file (using pip's requirements.txt format). That would be sufficient to allow straightforward 3rd party automation, without necessarily committing to providing such automation ourselves.
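
As a rough illustration of what "straightforward 3rd party automation" could look like, here is a minimal sketch. It assumes a "recommended-libraries.txt" file containing one plain package name per line, with optional "#" comments. The function names are hypothetical, nothing like this exists in venv or ensurepip, and the comment handling is a simplification of pip's actual requirements file rules.

```python
import sys


def parse_recommendations(text):
    """Return the package names from requirements-style text.

    Simplified: assumes plain names, one per line, with '#' comments.
    """
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


def pip_install_command(path):
    """Build (but do not run) the pip command that installs the listed packages."""
    return [sys.executable, "-m", "pip", "install", "-r", str(path)]


if __name__ == "__main__":
    sample = "# hypothetical recommended-libraries.txt\nrequests\nregex  # enhanced re\n"
    print(parse_recommendations(sample))
    print(pip_install_command("recommended-libraries.txt"))
```

A repackager could pass the resulting command to subprocess.run, or just read the parsed names to decide what to preinstall; either way the file itself stays a plain list of third party package names.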

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia



--
--Guido van Rossum (python.org/~guido)