On Thu, Apr 15, 2021 at 9:36 AM David Mertz <mertz@gnosis.cx> wrote:
I was so hopeful about this, but in the end... not really.  I have not used this capability before.  Here are a few different situations I know of:

...
 
...
re PackageNotFoundError('re')
statistics PackageNotFoundError('statistics')
pandas 1.2.4
vaex 4.1.0
bs4 PackageNotFoundError('bs4')

It seems like (somehow or another) importlib.metadata is doing something perhaps more useful for vaex.  But it is doing distinctly worse for re, statistics, and bs4.

Funny you should try bs4, which I also used as an example in another post.

But what you have found is that there is a difference between a package (which you can import) and a "distribution", which is something you install.

importlib.metadata is looking for distributions.

re and statistics are not distributions, but rather standard library packages (modules).

bs4 is not a distribution; it is a package that is provided by the "beautifulsoup4" distribution.

In [3]: importlib.metadata.version("beautifulsoup4")
Out[3]: '4.9.3'

Frankly, I'm still a bit confused about the distinction, but I do know for sure that a single distribution (maybe something that you can install via PyPI) can install more than one top-level package -- and certainly the packages/modules installed can have a different name than the distribution name.

And if a distribution installs more than one package, they may have different version numbers.
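To make the package-vs-distribution mapping concrete, here is a rough sketch of going from an import name to whatever distribution(s) provide it. It leans on importlib.metadata.packages_distributions(), which only exists in Python 3.10 and later; on older versions it just falls back to treating the import name as a distribution name. The helper name version_for_import_name is made up for illustration.

```python
import importlib.metadata as md


def version_for_import_name(import_name):
    """Return {distribution_name: version} for a top-level import name.

    Hypothetical helper, not part of importlib.metadata.
    """
    # packages_distributions() maps top-level import names to the
    # distributions that provide them (Python 3.10+ only).
    mapping_func = getattr(md, "packages_distributions", None)
    if mapping_func is not None:
        candidates = mapping_func().get(import_name, [])
    else:
        # Fallback for older Pythons: guess that the names match.
        candidates = [import_name]

    versions = {}
    for dist_name in candidates:
        try:
            versions[dist_name] = md.version(dist_name)
        except md.PackageNotFoundError:
            # No installed distribution by that name.
            pass
    return versions
```

On Python 3.10+ with beautifulsoup4 installed, I'd expect the lookup for "bs4" to come back keyed by "beautifulsoup4", while "re" and "statistics" should give an empty dict, since no distribution provides them.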

I'm not sure what to make of all this, though I'm leaning toward better supporting the distinction by asking for __version__ strings in top-level packages -- and maybe making importlib.metadata.version a bit smarter about looking for packages, and not just distributions.

If you look at the docstring of metadata.version:

"""
In [2]: importlib.metadata.version?
Signature: importlib.metadata.version(distribution_name)
Docstring:
Get the version string for the named package.

:param distribution_name: The name of the distribution package to query.
:return: The version string for the package as defined in the package's
    "Version" metadata key.
File:      ~/miniconda3/envs/py3/lib/python3.9/importlib/metadata.py
Type:      function
"""

It's a bit inconsistent with the use of the term "distribution" vs "package". That should get cleaned up if nothing else.

Also, the Exception raised is "PackageNotFoundError" -- which should maybe be "DistributionNotFoundError"?


Version is arguably useful from the package user side. As I believe Victor mentioned, there are two uses for version information: display to the user -- for which version strings are fine, or programmatic comparison -- for which something like the Version object is very helpful. Do we only need to use version information programmatically when we are creating (or installing) packages? I don't think so -- I know I have code that (poorly) does version checking programmatically.
 
Or rather, the below is what I would find really nice to be able to do.

ver = robust_version(module)
if ver >= (5, 2, 1):
    doit_modern_style()
elif ver < (5, 2, 1):
    doit_old_style()
else:
    doit_unversioned_style()

Exactly -- and I think we are close: if packages (modules) had compliant __version__ strings, then you could do
(with the Version object from packaging -- maybe why it should be in the stdlib)

from packaging.version import Version

ver = Version(module.__version__)
if ver >= Version("5.2.1"):
    doit_modern_style()
elif ver < Version("5.2.1"):
    doit_old_style()
else:
    doit_unversioned_style()

And if my PR is accepted (which looks unlikely) to allow comparison between Version objects and strings:

ver = Version(module.__version__)
if ver >= "5.2.1":
    doit_modern_style()
elif ver < "5.2.1":
    doit_old_style()
else:
    doit_unversioned_style()
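For the "unversioned" branch to ever be reached, something has to cope with modules that have no usable version at all. Here is a minimal sketch of what the hypothetical robust_version() could look like using only the stdlib. It assumes a naive "leading dotted integers" parse rather than real PEP 440 handling (which is what packaging's Version is for), and it returns None when nothing can be found, so the caller has to test for None before comparing.

```python
import importlib.metadata
import re


def robust_version(module):
    """Hypothetical helper: return a tuple of ints, or None if unknown."""
    # Ask the module itself first.
    raw = getattr(module, "__version__", None)
    if raw is None:
        # Fall back to distribution metadata; this only helps when the
        # distribution name happens to match the module name.
        try:
            raw = importlib.metadata.version(module.__name__)
        except importlib.metadata.PackageNotFoundError:
            return None
    # Naive parse: keep only the leading dotted integers ("1.2.4rc1" -> 1.2.4).
    match = re.match(r"\d+(?:\.\d+)*", str(raw))
    if match is None:
        return None
    return tuple(int(part) for part in match.group().split("."))
```

With that, the check becomes "if ver is None: doit_unversioned_style()" followed by the tuple comparisons. Plain tuples are cruder than Version objects, though: (5, 2) compares as less than (5, 2, 0), and pre-release suffixes are silently dropped.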

A key distinction here from the importlib.metadata approach is the level of indirection -- I like being able to ask a module itself what version it is, rather than asking some other part of the system to go look it up for me. So I could do:

import numpy as np

print("using np version:", np.__version__)

And this is pretty consistent with the rest of Python, where many objects (functions, classes, modules) have a __name__ attribute. If things "know" their name, shouldn't they "know" their version as well?

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython