[Python-ideas] Defining an easily installable "Recommended baseline package set"

Sun Oct 29 03:54:22 EDT 2017

On 29 October 2017 at 15:16, Guido van Rossum <guido at python.org> wrote:

> Why? What's wrong with pip install?
>

At a technical level, this would just be a really thin wrapper around 'pip
install' (even thinner than ensurepip in general, since these libraries
*wouldn't* be bundled for offline installation, only listed by name).
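
As a rough illustration of how thin such a wrapper could be, here is a minimal sketch. The function names and the RECOMMENDED list are hypothetical (requests and regex are just the examples mentioned further down in this message), and none of this reflects an actual ensurepip or venv API.

```python
import subprocess
import sys

# Hypothetical list: the real contents would come from the stdlib docs.
RECOMMENDED = ["requests", "regex"]


def build_install_command(names, python=sys.executable):
    """Return the argv for installing the named packages with pip."""
    return [python, "-m", "pip", "install", *names]


def install_recommended(names=RECOMMENDED):
    """Run pip in a subprocess; packages are listed by name, not bundled."""
    subprocess.run(build_install_command(names), check=True)


# Only print the command here; call install_recommended() to actually run it.
print(build_install_command(RECOMMENDED, python="python3"))
```

Since the packages are only named rather than bundled, this still needs network access at install time, unlike ensurepip's bundled wheels.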

> Why complicate things? Your motivation is really weak here. "beneficial"?
> "difficult cases"?
>

The main recurring problems with "pip install" are a lack of
discoverability and a potential lack of availability (depending on the
environment).

This then causes a couple of key undesirable outcomes:

- folks using Python as a teaching language have to choose between teaching
with just the standard library APIs, requiring that learners restrict
themselves to a particular preconfigured learning environment, or making a
detour into package management tools in order to ensure learners have
access to the APIs they actually want to use (this isn't hypothetical - I
was a technical reviewer for a book that justified teaching XML-RPC over
HTTPS+JSON on the basis that xmlrpc was in the standard library, and
requests wasn't)
- folks using Python purely as a scripting language (i.e. without app level
dependency management) may end up having to restrict themselves to the
standard library API, even when there's a well-established, frequently
preferred alternative for what they're doing (e.g. requests for API
management, regex for enhanced regular expressions)

The underlying problem is that our reasons for omitting these particular
libraries from the standard library relate mainly to publisher side
concerns like the logistics of ongoing bug fixing and support, *not* end
user concerns like software reliability or API usability. This means that
if educators aren't teaching them, or redistributors aren't providing them,
then they're actively doing their users a disservice (as opposed to other
cases like web frameworks and similar, where there are multiple competing
options, you're only going to want one of them in any given application,
and the relevant trade-offs between the available options depend greatly on
exactly what you're doing).

Now, the Python-for-data-science community have taken a particular
direction around handling this, and there's an additional library set
beyond the standard library that's pretty much taken for granted in a data
science context. While conda has been the focal point for those efforts
more recently, that work started a long time ago with initiatives like
Python(x,y) and the Enthought Python Distribution.

Similarly, initiatives like Raspberry Pi are able to assume a particular
learning environment (Raspbian in the Pi's case), rather than coping with
arbitrary starting points.

Curated lists like the "awesome-python" one that Stephan linked don't
really help that much with the discoverability problem, since they become
just another thing for people to learn: How do they find out such lists
exist in the first place? Given such a list, how do they determine if the
recommendations it offers are actually relevant to their needs? Since
assessing a published package API against your needs as a user is a skill
that has to be learned like any other, it can be a lot easier to get
started in a more prescriptive environment that says "This is what you have
to work with for now, we'll explain more about your options for branching
out later".

The proposal in this thread thus stems from asking the question "Who is
going to be best positioned to offer authoritative advice on which third
party modules may be preferable to their standard library counterparts for
end users of Python?" and answering it with "The standard library module
maintainers that are already responsible for deciding whether or not to
place appropriate See Also links in the module documentation".

All the proposal does is suggest taking those existing recommendations
from the documentation and converting them into a more readily executable
form.

I'm not wedded to any particular approach to making the recommendations
available in a more machine-friendly form, though - it's just the "offer
something more machine-friendly than scraping the docs for
recommendation links" aspect that I'm interested in. For example, we could
skip touching ensurepip or venv at all, and instead limit this to a
documentation proposal to collect these recommendations from the
documentation, and publish them within the `venv` module docs as a
"recommended-libraries.txt" file (using pip's requirements.txt format).
That would be sufficient to allow straightforward 3rd party automation,
without necessarily committing to providing such automation ourselves.
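
To make the "3rd party automation" point concrete, a minimal sketch of a consumer of such a file follows. The file contents shown are an assumption for illustration (only the "recommended-libraries.txt" name and the requirements.txt format come from the suggestion above), and parse_recommendations is a hypothetical helper that handles only plain names, blank lines and comments, not the full requirements syntax.

```python
import sys

# Assumed example contents of a published "recommended-libraries.txt".
EXAMPLE_FILE = """\
# Recommended alternatives to standard library modules
requests   # see also: urllib.request
regex      # see also: re
"""


def parse_recommendations(text):
    """Return requirement lines, skipping blanks and '#' comments."""
    requirements = []
    for line in text.splitlines():
        # Drop inline comments (a '#' preceded by whitespace), then trim.
        line = line.split(" #", 1)[0].strip()
        # Skip blank lines and whole-line comments.
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements


def pip_command_for_file(path, python=sys.executable):
    """Return the argv that asks pip to install everything listed in path."""
    return [python, "-m", "pip", "install", "-r", path]


print(parse_recommendations(EXAMPLE_FILE))
```

In practice a tool would probably just hand the file straight to pip via "-r"; the parsing step is only there to show that the list is trivially readable by other tooling as well.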

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia