[Python-ideas] Defining an easily installable "Recommended baseline package set"

Nathaniel Smith njs at pobox.com
Wed Nov 1 22:36:42 EDT 2017

On Wed, Nov 1, 2017 at 7:41 AM, Guido van Rossum <guido at python.org> wrote:
> Can you write 1-2 paragraphs with the argument for each?
> On Tue, Oct 31, 2017 at 10:01 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> - lxml

My impression (probably others are more knowledgeable) is that lxml
has more or less replaced the stdlib 'xml' package as the de facto
standard -- sort of similar to the urllib2/requests situation. AFAIK
lxml has never been proposed for stdlib inclusion and I believe the
fact that it's all in Cython would be a barrier even if the
maintainers were amenable. But it might be helpful to our users to put
a box at the top of the 'xml' docs suggesting people check out 'lxml',
similar to the one on the urllib2 docs.
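
For what it's worth, here is a rough sketch of why switching is usually cheap: lxml.etree follows the ElementTree API closely enough that simple code can try lxml first and fall back to the stdlib. The sample document and the 'ids' name are made up for illustration, and the fallback pattern is just one common idiom, not something either project mandates.

```python
import io

# Prefer lxml if it is installed; otherwise fall back to the stdlib parser.
try:
    from lxml import etree  # third-party, may not be available
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample document, used only for this illustration.
SAMPLE = b"<root><item id='1'>a</item><item id='2'>b</item></root>"

# parse(), getroot(), iter() and get() behave the same in both libraries.
tree = etree.parse(io.BytesIO(SAMPLE))
ids = [element.get("id") for element in tree.getroot().iter("item")]
print(ids)
```

This only covers the overlapping ElementTree-style API; lxml-specific features such as full XPath have no stdlib fallback.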

>> - numpy

Numpy's arrays are a foundational data structure and de facto
standard, and would probably fit naturally in the stdlib semantically,
but for a number of logistical/implementational reasons it doesn't
make sense to merge. Probably doesn't make much difference whether
python-dev "blesses" it or not in practice, since there aren't any
real competitors inside or outside the stdlib; it'd more just be an
acknowledgement of the status quo.

>> - cryptography

Conceptually, core cryptographic operations are the kind of
functionality that you might expect to see in the stdlib, but the
unique sensitivity of crypto code makes this a bad idea. Historically
there have been a variety of basic crypto packages for Python, but at
this point IIUC the other ones are all considered
obsolete-and-potentially-dangerous and the consensus is everyone
should move to 'cryptography', so documenting that in this PEP might
help send people in the right direction.

>> - idna

This is a bit of a funny one. IDNA functionality is pretty fundamental
-- you need it to do unicode<->bytes conversions on hostnames, so
basically anyone doing networking needs it. Python ships with some
built-in IDNA functionality (as the "idna" codec), but it's using an
obsolete standard (IDNA2003, versus the current IDNA2008, see
bpo-17305), and IIRC Christian thinks the whole codec-based design is
the wrong approach... basically what we have in the stdlib has been
broken for most of a decade and there doesn't seem to be anyone
interested in fixing it. So... in the long run the stdlib support
should either be fixed or deprecated. I'm not sure which is better.
(The argument for deprecating it would be that IIUC you need to update
the tables whenever a new Unicode standard comes out, and since it's a
networking thing you want to keep in sync with the rest of the world,
which is easier with a standalone library. I don't know how much this
matters in practice.) But right now, this library is just better than
the stdlib functionality, and it wouldn't hurt to document that.
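
To make the difference concrete, here is a small sketch comparing the two. The helper names and the example hostname are mine, picked for illustration. The stdlib codec applies the IDNA2003 mapping, which folds "ß" to "ss", while IDNA2008 treats "ß" as a valid character in its own right, so the two can resolve to different hosts. The third-party 'idna' package is optional here; the sketch returns None if it isn't installed.

```python
def encode_stdlib(hostname):
    """Encode with the built-in 'idna' codec (IDNA2003 rules)."""
    return hostname.encode("idna")


def encode_idna2008(hostname):
    """Encode with the third-party 'idna' package, if it is installed."""
    try:
        import idna  # third-party: pip install idna
    except ImportError:
        return None  # package not available; nothing to compare against
    return idna.encode(hostname)


host = "faß.de"  # hypothetical hostname, chosen because it contains 'ß'

# IDNA2003 nameprep maps 'ß' to 'ss', so the label comes out as plain ASCII.
print(encode_stdlib(host))  # b'fass.de'

# Under IDNA2008 the 'ß' is kept and the label is punycode-encoded instead.
print(encode_idna2008(host))
```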


Nathaniel J. Smith -- https://vorpus.org
