Re: [Distutils] distlib updated - comments sought
On 5 October 2012 10:27, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
+1 on this. Can you share a little more on your hint idea? I have a
specific use case in mind, and would like to know if it's covered. It's an intranet webpage that hosts packages in a very odd format, unfortunately, so I need to write code to get the packages; just providing URLs isn't possible.
For example, setuptools uses "dependency_links" which indicate URLs which lead to downloadable packages. I haven't thought it through in detail, but my package.yaml format preserves those links. I would aim to design in such a way that custom schemes aren't hard to support. One more area I need to look at is pip's PackageFinder, which is the minimum standard we need to support.
[Sorry, I hadn't meant to go off-list - blame gmail...]
Right, that's not going to be enough for me. I have a web page where files are hosted at http://my.domain/datastore/VARYING_ENCODED_ID/filename.tar.gz. The problem is that VARYING_ENCODED_ID changes regularly. It's possible to calculate it in code, but only based on the current date and the package name :-( It really isn't worth asking why, but I'm stuck with it.
I'd envisage a design consisting of independent "Locators", one for a local directory, one for PyPI using XMLRPC, one for PyPI's "simple" webpage interface (IMO, we need both PyPI types, as the simple interface generates a lot of useless web scraping - look at the simple page for lxml, for example...). Users could add locators for any custom stores as needed. I wrote a 30-minute proof of concept, which is at https://bitbucket.org/pmoore/tools/src/67b33c15efad/bin/Locator.py if you're interested.
Plan B for something like this would be to use one of the "run a local PyPI server" packages in existence, and write a front-end to the web store that I can run locally, and point to as an extra PyPI index. That may well be a more general solution, but I don't think any of the existing PyPI server packages supports this type of usage, so it would need some work.
Paul.
Paul Moore <p.f.moore <at> gmail.com> writes:
Right, that's not going to be enough for me. I have a web page where files are hosted at http://my.domain/datastore/VARYING_ENCODED_ID/filename.tar.gz. The problem is that VARYING_ENCODED_ID changes regularly. It's possible to calculate it in code, but only based on the current date and the package name :-( It really isn't worth asking why, but I'm stuck with it.
I'd envisage a design consisting of independent "Locators", one for a local directory, one for PyPI using XMLRPC, one for PyPI's "simple" webpage interface (IMO, we need both PyPI types, as the simple interface generates a lot of useless web scraping - look at the simple page for lxml, for example...). Users could add locators for any custom stores as needed. I wrote a 30-minute proof of concept, which is at https://bitbucket.org/pmoore/tools/src/67b33c15efad/bin/Locator.py if you're interested.
I'll take a look - it sounds like the sort of approach I would take, to have a high level interface for locating distributions that allows enumeration of distributions, means of fetching them etc., with the ability to slot in implementations for unusual cases like yours. Regards, Vinay Sajip
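A minimal sketch of the "Locator" idea discussed above, for illustration only. It is not the code from Paul's proof of concept and not distlib's API: every class and method name here is hypothetical, and the encoded-ID calculation is a stand-in (the real scheme is site-specific and not described in the thread). The archive name `<package>.tar.gz` is likewise an assumption.

```python
import datetime
import hashlib
import pathlib


class Locator:
    """Hypothetical base class: maps a project name to candidate download URLs."""

    def get_urls(self, name):
        raise NotImplementedError


class DirectoryLocator(Locator):
    """Finds archives in a local directory whose filename starts with '<name>-'."""

    def __init__(self, path):
        self.path = pathlib.Path(path).resolve()

    def get_urls(self, name):
        prefix = name.lower() + "-"
        return sorted(
            entry.as_uri()
            for entry in self.path.iterdir()
            if entry.is_file() and entry.name.lower().startswith(prefix)
        )


class DatastoreLocator(Locator):
    """Custom store whose URL contains an ID computed from the date and the name."""

    def __init__(self, base_url, today=None):
        self.base_url = base_url.rstrip("/")
        self.today = today or datetime.date.today()

    def encoded_id(self, name):
        # Placeholder only: a real store would use its own calculation here.
        key = "{}:{}".format(self.today.isoformat(), name)
        return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]

    def get_urls(self, name):
        return ["{}/{}/{}.tar.gz".format(self.base_url, self.encoded_id(name), name)]


class AggregatingLocator(Locator):
    """Asks each configured locator in turn and concatenates the results."""

    def __init__(self, *locators):
        self.locators = locators

    def get_urls(self, name):
        urls = []
        for locator in self.locators:
            urls.extend(locator.get_urls(name))
        return urls
```

With this shape, a user with an unusual store only has to write one small subclass and pass it to the aggregating locator alongside whatever PyPI locators the library provides.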
Check out the reasonably modular and cleanly written pyg installer. https://github.com/rubik/pyg/ It has its own independent package database / metadata parser. https://github.com/rubik/pkgtools/blob/master/pkgtools/pkg.py It has a pypi interface too.
On Fri, Oct 5, 2012 at 6:29 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Paul Moore <p.f.moore <at> gmail.com> writes:
Right, that's not going to be enough for me. I have a web page where files are hosted at http://my.domain/datastore/VARYING_ENCODED_ID/filename.tar.gz. The problem is that VARYING_ENCODED_ID changes regularly. It's possible to calculate it in code, but only based on the current date and the package name :-( It really isn't worth asking why, but I'm stuck with it.
I'd envisage a design consisting of independent "Locators", one for a local directory, one for PyPI using XMLRPC, one for PyPI's "simple" webpage interface (IMO, we need both PyPI types, as the simple interface generates a lot of useless web scraping - look at the simple page for lxml, for example...). Users could add locators for any custom stores as needed. I wrote a 30-minute proof of concept, which is at https://bitbucket.org/pmoore/tools/src/67b33c15efad/bin/Locator.py if you're interested.
I'll take a look - it sounds like the sort of approach I would take, to have a high level interface for locating distributions that allows enumeration of distributions, means of fetching them etc., with the ability to slot in implementations for unusual cases like yours.
Is this supposed to go into the stdlib? Simple reference implementations (wsgiref) fare better there than frameworks (distutils). The pluggable installer metaframework belongs on pypi.
Daniel Holth <dholth <at> gmail.com> writes:
Is this supposed to go into the stdlib? Simple reference implementations (wsgiref) fare better there than frameworks (distutils). The pluggable installer metaframework belongs on pypi.
I'm thinking of distlib as a library rather than a framework. However, it's probably too early to say definitively which parts might belong in the stdlib and which mightn't. Not every library with a service provider interface is automatically a framework. Regards, Vinay Sajip
On 5 October 2012 14:47, Daniel Holth <dholth@gmail.com> wrote:
Is this supposed to go into the stdlib? Simple reference implementations (wsgiref) fare better there than frameworks (distutils). The pluggable installer metaframework belongs on pypi.
I disagree. Having an installer depend on external packages is a practical problem. And having every installer invent (or not) its own means of allowing users to add custom package repositories is also an issue. Having a basic implementation supporting the PSF-supported repository (PyPI) but including simple hooks to allow users to add extra ones gives the benefit of a reference implementation as well as encouraging by example the provision of flexibility.
No-one could try to claim that the sort of web-scraping that easy_install/pip does is a "simple" reference implementation, either. If you take that viewpoint, I'd say the stdlib implementation should *only* use the XMLRPC interface to PyPI. Code to use the "simple" interface and trawl all those links looking for distribution files can't be justified in the stdlib for any *other* reason than to save anyone else ever having to write it again :-)
Paul.
PS If you want to start over-engineering the flexibility, users should have a way of choosing whether to use the webscraper or XMLRPC interfaces to PyPI. The former finds more packages (as I understand it) whereas the latter is much faster. As someone who's never needed a package that can't be found using both interfaces (or neither :-() I deeply resent the speed penalties imposed by the "simple" interface (hence my repeated insistence on quoting the word "simple", as I find it far from simple :-))
PPS If my locator interface ever matures enough, I'm happy to release it on PyPI. But I don't want to compete with Vinay or a stdlib implementation, so I'd rather co-operate on a unified view of how to approach the problem.
Paul Moore <p.f.moore <at> gmail.com> writes:
No-one could try to claim that the sort of web-scraping that easy_install/pip does is a "simple" reference implementation, either. If you take that viewpoint, I'd say the stdlib implementation should *only* use the XMLRPC interface to PyPI. Code to use the "simple" interface and trawl all those links looking for distribution files can't be justified in the stdlib for any *other* reason than to save anyone else ever having to write it again [...] PS If you want to start over-engineering the flexibility, users should have a way of choosing whether to use the webscraper or XMLRPC interfaces to PyPI. The former finds more packages (as I understand it) whereas the latter is much faster. As someone who's never needed a package that can't be found using both interfaces (or neither :-() I [...]
Is that really the case? I'd assumed that the simple pages were generated from the package database created from uploads to PyPI, so I would have expected querying the XML-RPC interface to produce the same results as from scraping the HTML (allowing for the possibility that, if the HTML pages are generated periodically as static files from the database, they might be stale at times). I thought that pip needed to scrape pages because people host distribution archives on servers other than PyPI (e.g. Google code, GitHub, BitBucket or their own servers), with the links to those archives navigable through e.g. the "dependency_links" argument to setup(), or the URLs mentioned in the PyPI metadata. Regards, Vinay Sajip
On 5 October 2012 15:37, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
PS If you want to start over-engineering the flexibility, users should have a way of choosing whether to use the webscraper or XMLRPC interfaces to PyPI. The former finds more packages (as I understand it) whereas the latter is much faster. As someone who's never needed a package that can't be found using both interfaces (or neither :-() I [...]
Is that really the case? I'd assumed that the simple pages were generated from the package database created from uploads to PyPI, so I would have expected querying the XML-RPC interface to produce the same results as from scraping the HTML (allowing for the possibility that, if the HTML pages are generated periodically as static files from the database, they might be stale at times).
Well, yes. But the static files don't make it easy to distinguish the different categories they contain (see below)
I thought that pip needed to scrape pages because people host distribution archives on servers other than PyPI (e.g. Google code, GitHub, BitBucket or their own servers), with the links to those archives navigable through e.g. the "dependency_links" argument to setup(), or the URLs mentioned in the PyPI metadata.
I don't know how true that is these days - I don't think I've ever personally encountered a package that wasn't either available from the PyPI download URL (release_urls() in the XMLRPC interface) or unavailable via pip. But my range of packages tried is fairly limited...
The static pages merge all of the following information:
1. The download URLs you can get from the XMLRPC interface release_urls, but with all releases covered in a single place.
2. release_data[download_url] which is available from the XMLRPC interface.
3. Other URLs from release_data (home_page, project_url).
The first ones are fine, as they point to files. The second is often a file, and seems to frequently duplicate the first. I'm not sure how useful it is. The final one often points to a further webpage - I presume that's what you plan to scrape. That's where the issue lies, though, as at least some of those links time out (lxml's does, IIRC) and as I say, I don't think I know of a case where it's actually worth doing.
But this is based on a very superficial and limited experience. I'll happily bow to better information.
On the other hand, is manually parsing the static page any faster in a practical sense than using XMLRPC?
Paul.
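The three categories of link described above can be kept apart when they are read through XML-RPC rather than from the merged static page. The following is a rough sketch under stated assumptions: `categorise_links` and `make_client` are hypothetical helper names, the `release_urls` and `release_data` methods are the ones named in the message, and the index endpoint is left as a parameter because these methods may no longer be offered by a given index.

```python
import xmlrpc.client


def make_client(index_url):
    """Create an XML-RPC proxy; index_url is supplied by the caller (assumption)."""
    return xmlrpc.client.ServerProxy(index_url)


def categorise_links(client, name, version):
    """Group the links for one release by where they come from."""
    # Category 1: files uploaded to the index itself.
    files = [entry["url"] for entry in client.release_urls(name, version)]
    data = client.release_data(name, version)
    # Category 2: the single download_url metadata field (often a file).
    download_url = data.get("download_url") or None
    # Category 3: other metadata URLs, which usually point at web pages.
    pages = [data[key] for key in ("home_page", "project_url") if data.get(key)]
    return {"files": files, "download_url": download_url, "pages": pages}
```

An installer could then fetch from `files` first and treat `pages` as a last resort (or skip them entirely), which avoids the slow links that time out.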
Paul Moore <p.f.moore <at> gmail.com> writes:
The first ones are fine, as they point to files. The second is often a file, and seems to frequently duplicate the first. I'm not sure how useful it is. The final one often points to a further webpage - I presume that's what you plan to scrape. That's where the issue lies, though, as at least some of those links time out (lxml's does, IIRC) and as I say, I don't think I know of a case where it's actually worth doing.
But this is based on a very superficial and limited experience. I'll happily bow to better information.
On the other hand, is manually parsing the static page any faster in a practical sense than using XMLRPC?
Well, XML-RPC is of course preferable; the current code in distlib is just whatever I copied across from packaging, but the next step will be to look at the releases which are available from the different sources (XML-RPC, PyPI metadata URLs, dependency_links etc.) to see what sorts of things wouldn't be accessible if we restricted to say, just using XML-RPC. Since all the information in the static pages seems to be available via XML-RPC, what is the point of the simple interface, other than for occasional viewing by a human? Regards, Vinay Sajip
On Fri, Oct 5, 2012 at 11:39 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Paul Moore <p.f.moore <at> gmail.com> writes:
The first ones are fine, as they point to files. The second is often a file, and seems to frequently duplicate the first. I'm not sure how useful it is. The final one often points to a further webpage - I presume that's what you plan to scrape. That's where the issue lies, though, as at least some of those links time out (lxml's does, IIRC) and as I say, I don't think I know of a case where it's actually worth doing.
But this is based on a very superficial and limited experience. I'll happily bow to better information.
On the other hand, is manually parsing the static page any faster in a practical sense than using XMLRPC?
Well, XML-RPC is of course preferable; the current code in distlib is just whatever I copied across from packaging, but the next step will be to look at the releases which are available from the different sources (XML-RPC, PyPI metadata URLs, dependency_links etc.) to see what sorts of things wouldn't be accessible if we restricted to say, just using XML-RPC. Since all the information in the static pages seems to be available via XML-RPC, what is the point of the simple interface, other than for occasional viewing by a human?
IIRC the most practical limitation is that the XML-RPC interface doesn't exist on the mirrors.
Daniel Holth <dholth <at> gmail.com> writes:
IIRC the most practical limitation is that the XML-RPC interface doesn't exist on the mirrors.
Of course. It's easier to replicate a static website than an XML-RPC server to Web Scale ;-), but I presume it's only a question of hosting resources and developer time to make mirrors which are XML-RPC capable? Regards, Vinay Sajip
On 5 October 2012 16:40, Daniel Holth <dholth@gmail.com> wrote:
Well, XML-RPC is of course preferable; the current code in distlib is just whatever I copied across from packaging, but the next step will be to look at the releases which are available from the different sources (XML-RPC, PyPI metadata URLs, dependency_links etc.) to see what sorts of things wouldn't be accessible if we restricted to say, just using XML-RPC. Since all the information in the static pages seems to be available via XML-RPC, what is the point of the simple interface, other than for occasional viewing by a human?
IIRC the most practical limitation is that the XML-RPC interface doesn't exist on the mirrors.
That's a good point. Actually, writing a "local PyPI server" is much easier if all you have to implement is the simple static page interface. So I take back some of my objection - both XML-RPC and the static page interface make sense to support. Although it would be nice to have a better definition of precisely what comprises the "simple interface" than looking at the source code of the scraper. (Oh, and I still think the actual PyPI static pages include more links than are necessary, but that's a different issue, and one which would be alleviated by an option to ignore offsite links). Paul.
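As a rough illustration of what a client of the static page interface has to do, and of the "ignore offsite links" option suggested above, here is a small sketch using only the standard library. The function and class names are hypothetical, and this is not how easy_install or pip implement their scraping; "offsite" is assumed to mean "served from a different host than the index page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the absolute target of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def collect_links(page_url, html, allow_offsite=False):
    """Return the links on a project page, optionally dropping offsite ones."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    if allow_offsite:
        return parser.links
    host = urlparse(page_url).netloc
    return [link for link in parser.links if urlparse(link).netloc == host]
```

Because a "local PyPI server" only needs to emit pages of plain links for this to work, the server side stays trivial; all the cost is on the client, which is the trade-off the thread is weighing.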
On Fri, Oct 5, 2012 at 2:38 PM, Paul Moore <p.f.moore@gmail.com> wrote:
That's a good point. Actually, writing a "local PyPI server" is much easier if all you have to implement is the simple static page interface. So I take back some of my objection - both XML-RPC and the static page interface make sense to support. Although it would be nice to have a better definition of precisely what comprises the "simple interface" than looking at the source code of the scraper.
*cough*
http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-avail...
Admittedly, not all of the specific heuristics for determining what kind of distribution a file is are documented; the rules for identifying distutils sdist and bdist_dumb files are unfortunately somewhat arcane, because distutils' filename conventions are inherently ambiguous for parsing. Setuptools' rules for unambiguous filename generation and parsing, however, are documented here:
http://peak.telecommunity.com/DevCenter/EggFormats#filename-embedded-metadat...
http://peak.telecommunity.com/DevCenter/PkgResources#parsing-utilities
(Oh, and I still think the actual PyPI static pages include more links than are necessary, but that's a different issue, and one which would be alleviated by an option to ignore offsite links).
See also: http://peak.telecommunity.com/DevCenter/EasyInstall#restricting-downloads-wi... ;-)
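The filename ambiguity mentioned above can be seen with a short sketch. Since both project names and versions may contain hyphens, a name like `foo-bar-1.0.tar.gz` has more than one plausible reading. The helper below is hypothetical and simply enumerates every possible split; it does not reproduce the heuristics setuptools actually uses, and the list of extensions is an assumption.

```python
def candidate_splits(filename, extensions=(".tar.gz", ".tar.bz2", ".tgz", ".zip")):
    """Return every (name, version) pair a hyphenated archive name could mean."""
    for ext in extensions:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError("unrecognised archive extension: %r" % filename)
    parts = stem.split("-")
    # Any hyphen could be the boundary between the name and the version.
    return [("-".join(parts[:i]), "-".join(parts[i:])) for i in range(1, len(parts))]


print(candidate_splits("foo-bar-1.0.tar.gz"))
# [('foo', 'bar-1.0'), ('foo-bar', '1.0')]
```

A scraper has to pick one of these readings using extra knowledge, such as the project name it was asked for, which is why the rules end up looking arcane.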
On Fri, Oct 5, 2012 at 10:23 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 5 October 2012 14:47, Daniel Holth <dholth@gmail.com> wrote:
PPS If my locator interface ever matures enough, I'm happy to release it on PyPI. But I don't want to compete with Vinay or a stdlib implementation, so I'd rather co-operate on a unified view of how to approach the problem.
"The Problem"
Bootstrapping is kinda annoying because Python doesn't include an installer for pip or buildout or ... and it can be hard to choose between the many excellent installers that are available on and off of pypi.
~1300 of the ~20000 packages on pypi have trouble using setup.py as their build system / metadata source format.
For the ~1300 broken packages, distutils is awful because it is not really extensible, though setuptools tried.
People have to install setuptools against their will because there is only one implementation of the pkg_resources API and 75% of the packages on pypi require setuptools.
Packaging has been in turmoil for years waiting for something.
In my estimation we're not saving the world here. The goal should be to fix 1,300 packages without breaking 19,000, to make bootstrapping easier, and to make setuptools optional but neither required nor prohibited.
It is wonderful to have distlib. I support it. I'm playing with the competing distlib2 implementation so that both APIs can be better, so we can find out which parts provide functionality that does not just have a different name in pkg_resources, and so that it can be possible to replace the implementation without changing the API. If your goal is to avoid "implementation defined behavior" it's a good idea to have two.
On 5 October 2012 17:04, Daniel Holth <dholth@gmail.com> wrote:
~1300 of the ~20000 packages on pypi have trouble using setup.py as their build system / metadata source format.
That's interesting information. Do you know in what way they have trouble with setup.py? Do they not use it at all, do they need features it doesn't provide, or what?
For the ~1300 broken packages, distutils is awful because it is not really extensible, though setuptools tried.
Yeah, that's the common complaint. Plus, "it's too extensible" :-) (From people trying to change it who have to deal with all the fancy hacks people have done).
People have to install setuptools against their will because there is only one implementation of the pkg_resources API and 75% of the packages on pypi require setuptools.
I wish we could separate pkg_resources and setuptools. I'd love to know which packages needed each (but I suspect that's not a question that can be answered without looking at the actual code). Ignoring the egg support aspects, pkg_resources is something that could be replaced - a reasonable proportion of distlib offers alternatives to the pkg_resources code, and more could be added. On the other hand, setuptools per se is almost entirely a build time facility, so shouldn't be needed at runtime (and so using it for build should be relatively unimportant). Paul
On Fri, Oct 5, 2012 at 1:24 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 5 October 2012 17:04, Daniel Holth <dholth@gmail.com> wrote:
~1300 of the ~20000 packages on pypi have trouble using setup.py as their build system / metadata source format.
That's interesting information. Do you know in what way they have trouble with setup.py? Do they not use it at all, do they need features it doesn't provide, or what?
I'm basing this only on Vinay's numbers of how-many-packages-can-generate-a-yaml. It's probably mostly packages that import something he didn't have installed inside setup.py, but I don't have a good way to find out exactly what is wrong with each one. An awful lot of packages do work fine; I don't have an automated way to detect that either.
People have to install setuptools against their will because there is only one implementation of the pkg_resources API and 75% of the packages on pypi require setuptools.
I wish we could separate pkg_resources and setuptools. I'd love to know which packages needed each (but I suspect that's not a question that can be answered without looking at the actual code). Ignoring the egg support aspects, pkg_resources is something that could be replaced - a reasonable proportion of distlib offers alternatives to the pkg_resources code, and more could be added. On the other hand, setuptools per se is almost entirely a build time facility, so shouldn't be needed at runtime (and so using it for build should be relatively unimportant).
And there is a good bit of pkg_resources.py that is only needed at install time, if you don't mind giving up [console_scripts] dependency loading. It would probably be a win to only populate working_set lazily at least. Whatever you did would probably mean the loss of a feature for some users, but that might be OK if they knew what to expect.
Daniel Holth <dholth <at> gmail.com> writes:
I'm basing this only on Vinay's numbers of how-many-packages-can-generate-a-yaml. It's probably mostly packages that import something he didn't have installed inside setup.py, but I don't have a good way to find out exactly what is wrong with each one. An awful lot of packages do work fine; I don't have an automated way to detect that either.
See this gist of the current crop of errors: https://gist.github.com/3841557 There are 1609 lines, after processing 17956 PyPI archives. Of the failures, 306 are because the source archive I generated from package.yaml didn't match the results of "python setup.py sdist". The remaining 1303 are failures to generate package.yaml. A depressing number are because the packagers failed to include files like 'README.rst' or 'CHANGES' in the distribution, but setup.py bails if they aren't there. Some of these will be false positives, i.e. bugs in my code. But many of the ones I've investigated hold up as a problem with the archive on PyPI. Regards, Vinay Sajip
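One simple way to check whether a regenerated source archive "matches" the output of "python setup.py sdist" is to compare the sets of files the two tarballs contain. This is only a sketch of that idea under an assumption about what "match" means; the thread does not say how the comparison was actually done, and the function names are hypothetical.

```python
import tarfile


def member_names(fileobj):
    """Return the names of the regular files in a (possibly compressed) tarball."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        return {member.name for member in tar.getmembers() if member.isfile()}


def compare_archives(expected, actual):
    """Return (missing, unexpected): files only in expected, files only in actual."""
    expected_names = member_names(expected)
    actual_names = member_names(actual)
    return sorted(expected_names - actual_names), sorted(actual_names - expected_names)
```

Both arguments are binary file objects opened by the caller. A missing 'README.rst' or 'CHANGES', as described above, would show up in the first list.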
On Fri, Oct 5, 2012 at 2:40 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Daniel Holth <dholth <at> gmail.com> writes:
I'm basing this only on Vinay's numbers of how-many-packages-can-generate-a-yaml. It's probably mostly packages that import something he didn't have installed inside setup.py, but I don't have a good way to find out exactly what is wrong with each one. An awful lot of packages do work fine; I don't have an automated way to detect that either.
See this gist of the current crop of errors:
https://gist.github.com/3841557
There are 1609 lines, after processing 17956 PyPI archives. Of the failures, 306 are because the source archive I generated from package.yaml didn't match the results of "python setup.py sdist". The remaining 1303 are failures to generate package.yaml. A depressing number are because the packagers failed to include files like 'README.rst' or 'CHANGES' in the distribution, but setup.py bails if they aren't there.
Some of these will be false positives, i.e. bugs in my code. But many of the ones I've investigated hold up as a problem with the archive on PyPI.
Neat. Packages that forgot to include README.rst in the sdist aren't that interesting. Tragic though. What would be really cool would be a web interface to all the setup.py on pypi. Detect whether the sdist definitely does not use conditional dependencies and ignore. If it might use conditional dependencies, interested Pythoneers read setup.py and type them in. Installer checks out of band metadata to install the build deps and make better decisions about the regular deps.
Paul Moore <p.f.moore <at> gmail.com> writes:
- a reasonable proportion of distlib offers alternatives to the pkg_resources code, and more could be added.
The next thing for me to look at is "entry points", which should be relatively straightforward to implement. Regards, Vinay Sajip
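For readers unfamiliar with the term, an entry point is essentially a named reference of the form "name = package.module:attribute" that can be resolved to a Python object at run time. A minimal sketch of parsing and loading such a reference follows. It is not distlib's or pkg_resources' implementation: the function names are hypothetical, and the optional "[extras]" suffix that setuptools allows is deliberately not handled.

```python
import importlib
import re

# name = module.path[:attr.path]   (extras are not supported in this sketch)
ENTRY_POINT_RE = re.compile(
    r"^\s*(?P<name>[^=\s][^=]*?)\s*=\s*"
    r"(?P<module>[\w.]+)\s*(?::\s*(?P<attr>[\w.]+))?\s*$"
)


def parse_entry_point(spec):
    """Split 'name = module:attr' into (name, module, attr); attr may be None."""
    match = ENTRY_POINT_RE.match(spec)
    if match is None:
        raise ValueError("invalid entry point: %r" % spec)
    return match.group("name"), match.group("module"), match.group("attr")


def load_entry_point(spec):
    """Import the module and walk the dotted attribute path, if any."""
    _, module_name, attr = parse_entry_point(spec)
    obj = importlib.import_module(module_name)
    for part in (attr or "").split("."):
        if part:
            obj = getattr(obj, part)
    return obj
```

For example, `load_entry_point("dumps = json:dumps")` returns the standard library's `json.dumps` function.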
On 05.10.2012 19:24, Paul Moore wrote:
On 5 October 2012 17:04, Daniel Holth <dholth@gmail.com> wrote:
~1300 of the ~20000 packages on pypi have trouble using setup.py as their build system / metadata source format.
That's interesting information. Do you know in what way they have trouble with setup.py? Do they not use it at all, do they need features it doesn't provide, or what?
For the ~1300 broken packages, distutils is awful because it is not really extensible, though setuptools tried.
Yeah, that's the common complaint. Plus, "it's too extensible" :-) (From people trying to change it who have to deal with all the fancy hacks people have done).
People have to install setuptools against their will because there is only one implementation of the pkg_resources API and 75% of the packages on pypi require setuptools.
I wish we could separate pkg_resources and setuptools. I'd love to know which packages needed each (but I suspect that's not a question that can be answered without looking at the actual code). Ignoring the egg support aspects, pkg_resources is something that could be replaced - a reasonable proportion of distlib offers alternatives to the pkg_resources code, and more could be added. On the other hand, setuptools per se is almost entirely a build time facility, so shouldn't be needed at runtime (and so using it for build should be relatively unimportant).
Debian and Ubuntu have a (deb) binary package python-pkg-resources. 817 packages use python-setuptools for the build, and 340 binary packages depend on python-pkg-resources, so it's not just a few, but nearly 50%. See "apt-cache rdepends python-pkg-resources" for the full list. Matthias PS: are there really ~20000 packages on pypi, or do you count old versions too?
Matthias Klose <doko <at> ubuntu.com> writes:
PS: are there really ~20000 packages on pypi, or do you count old versions too?
Not counting old versions, my current list of packages where an archive is hosted on PyPI numbers a little under 18,000. While not bang up to date, it's unlikely to be too far out. There are additional packages which are listed on PyPI but not hosted there. I added a file, archives.txt, to the Gist at https://gist.github.com/3841557. This file has all the archives I processed - you can't see the text in the Gist (it's too big, 17956 lines), but the raw file is available via a link. Regards, Vinay Sajip
Daniel Holth <dholth <at> gmail.com> writes:
Bootstrapping is kinda annoying because Python doesn't include an installer for pip or buildout or ... and it can be hard to choose between the many excellent installers that are available on and off of pypi.
That was the point of packaging - to have something better than distutils in the stdlib. The original goal was too ambitious to achieve in the 3.3 timeframe, and perhaps issue could be taken with some use cases which weren't fully considered. Since packaging is an infrastructure concern, I believe (as others do) that there's a place for *something* in the stdlib that's better than distutils, because distutils failed to meet changing needs. That something may be distlib, or something like it, or nothing like it. It's too early to say.
~1300 of the ~20000 packages on pypi have trouble using setup.py as their build system / metadata source format.
I'm not sure what you mean. Packages don't have trouble, people do. For example, it may be possible for me to install a particular package from PyPI on my system (=> "no trouble"), but the package may be hard to package for Linux distros, because some of the installation logic happens in setup.py or code called from it.
For the ~1300 broken packages, distutils is awful because it is not really extensible, though setuptools tried.
Valiant effort by setuptools, but it could be considered a band-aid. Obviously there are different opinions about setuptools, but it's hard to argue against the fact that setuptools and pkg_resources are not considered worthy of inclusion in the stdlib by python-dev.
People have to install setuptools against their will because there is only one implementation of the pkg_resources API and 75% of the packages on pypi require setuptools.
Well, isn't that what packaging was (and distlib is) trying to remedy?
Packaging has been in turmoil for years waiting for something.
Mainly, people with the time and inclination :-)
In my estimation we're not saving the world here.
Just trying to improve the world a teeny little bit is a worthy goal; saving the world is beyond the reach of most of us.
The goal should be to fix 1,300 packages without breaking 19,000, to make bootstrapping easier, and to make setuptools optional but neither required nor prohibited.
I'm not sure that anyone is anticipating, or working towards, breaking thousands of packages. Nothing is "prohibited" - people can use whatever they want. But setuptools (or other third-party package) can't be truly optional while the stdlib lacks functionality which people need, and setuptools provides. So, perhaps the goal is just to offer more choices. People don't like change, but change can't always be avoided. We can be optimistic that strategies will be in place for mitigating the pain of migration for those who choose to migrate (no coercion). Just like 2to3 eases the transition from 2.x to 3.x, while forcing no-one to move over. I certainly don't believe the answer is to keep pkg_resources and setuptools APIs as some kind of fossilised, inviolable standard. I do believe we have to move away from custom installation code, which distutils by its nature forced people to produce, and which leads to problems for some people. Although I haven't published all my results, as it's still work in progress, I managed to extract all the metadata from thousands of packages hosted on PyPI from setup() to package.yaml, and then was able to generate a source archive from package.yaml which was semantically the same as the original. So I'm optimistic that by working on refining the metadata extraction mechanism, most of the packages on PyPI can be represented by an alternative metadata format, which allows the existence of a multiplicity of solutions to build, package and install Python software.
It is wonderful to have distlib. I support it. I'm playing with the competing distlib2 implementation so that both APIs can be better, so we can find out which parts provide functionality that does not just have a different name in pkg_resources, and so that it can be possible to replace the implementation without changing the API. If your goal is to avoid "implementation defined behavior" it's a good idea to have two.
Let a thousand flowers bloom :-) Regards, Vinay Sajip
participants (5): Daniel Holth, Matthias Klose, Paul Moore, PJ Eby, Vinay Sajip