[Distutils] Mirroring PyPI JSON Locally

Paul Moore p.f.moore at gmail.com
Sat Aug 19 14:38:10 EDT 2017


If this were to be done, then IMO yes, a PEP would be the right way to
standardise the JSON API. But tools like pip don't use the JSON API
much, and tools like devpi that expose the index API don't bother with
the JSON API, which makes it less likely that consumers who want to
work with indexes other than PyPI will use it. So you may not get
much interest.

On the other hand, a PEP that simply documents the API and says "Index
providers that choose to support the JSON API must do so with this
interface" would probably be useful, and unlikely to get a lot of
pushback (assuming you document what Warehouse and PyPI provide, and
allow other providers to simply not provide anything).

Paul

On 13 August 2017 at 07:53, Cooper Ry Lees <lists at cooperlees.com> wrote:
> Hi all,
>
> First time emailer, so please be kind. Also, if this is not the right
> mailing list for PyPA talk, I apologize. Please point me in the right
> direction if so (Brett Cannon pointed me here). The main reason I have
> emailed here is that I believe it may be PEP time to standardize the JSON
> metadata that PyPI makes available, like what was done for the 'simple' API
> described in PEP 503.
>
> I've been doing a bit of work on `bandersnatch` (I didn't name it), which is
> a PEP 381 mirroring package and wanted to enhance it to also mirror the
> handy JSON metadata PyPI generates and makes available @
> https://pypi.python.org/pypi/PKG_NAME/json.
>
> I've done a PR on bandersnatch as a POC that mirrors the PyPI directory
> structure (URL/pypi/PKG_NAME/json) and also creates a standardizable
> URL/json/PKG_NAME that the former symlinks to (to be served by NGINX / some
> other proxy). I'm also contemplating naming the directory 'metadata' rather
> than 'json', so that if some new hotness comes along / we want to change the
> format down the line we're not stuck with json as the dirname. This PR can
> be found here:
> https://bitbucket.org/pypa/bandersnatch/pull-requests/33/save-json-metadata-to-mirror
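
A rough sketch of the on-disk layout described above, for illustration only.
This is not the code from the PR: the function name `save_json_metadata` and
the choice of a relative symlink are assumptions. It writes the metadata to
json/PKG_NAME under the mirror root and points pypi/PKG_NAME/json at it.

```python
import json
import os
from pathlib import Path


def save_json_metadata(mirror_root: Path, pkg_name: str, metadata: dict) -> Path:
    """Write metadata to json/PKG_NAME and symlink pypi/PKG_NAME/json to it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The proposed standardizable location: <root>/json/PKG_NAME
    json_file = mirror_root / "json" / pkg_name
    json_file.parent.mkdir(parents=True, exist_ok=True)
    json_file.write_text(json.dumps(metadata, sort_keys=True))

    # The PyPI-compatible location: <root>/pypi/PKG_NAME/json
    pypi_dir = mirror_root / "pypi" / pkg_name
    pypi_dir.mkdir(parents=True, exist_ok=True)
    link = pypi_dir / "json"
    if link.is_symlink() or link.exists():
        link.unlink()

    # Assumption: a relative target, so the tree still resolves after
    # being copied (e.g. rsynced) to a different root.
    link.symlink_to(os.path.relpath(json_file, pypi_dir))
    return link
```

With this layout a web server only needs to follow symlinks for both
URL/pypi/PKG_NAME/json and URL/json/PKG_NAME to return the same document.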
>
> My main use case is to write a very simple async 'verifier' tool that will
> crawl all the JSON files and then ensure the packages directory on each of
> my internal mirrors (I have a mirror per region / datacenter) has all the
> files it should. I sync centrally (to save resources on the PyPI
> infrastructure) and then rsync out all the diffs to each region /
> datacenter, and under some failure scenarios I could miss a file or many. So
> I feel using JSON pulled down from the authoritative source will allow an
> async job to verify the MD5 of all the package files on each mirror.
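
A minimal, synchronous sketch of such a verifier, under several assumptions
that would need checking against a real mirror: each mirrored JSON document
has a "releases" mapping of version to a list of file entries, each entry
carries "url" and "md5_digest" keys, and the local file lives at the path
component of "url" below the mirror root. The function names are
hypothetical, and the async part is left out to keep the idea visible.

```python
import hashlib
import json
from pathlib import Path
from urllib.parse import urlparse


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_package(json_file: Path, mirror_root: Path) -> list:
    """Return (path, problem) tuples for one package's JSON document."""
    metadata = json.loads(json_file.read_text())
    problems = []
    # Assumption: "releases" maps version -> list of file entries.
    for files in metadata.get("releases", {}).values():
        for info in files:
            # Assumption: the local path mirrors the URL's path component.
            local = mirror_root / urlparse(info["url"]).path.lstrip("/")
            if not local.is_file():
                problems.append((local, "missing"))
            elif md5_of(local) != info["md5_digest"]:
                problems.append((local, "md5 mismatch"))
    return problems


def verify_mirror(mirror_root: Path) -> list:
    """Check every JSON document found under <mirror_root>/json."""
    problems = []
    for json_file in sorted((mirror_root / "json").iterdir()):
        problems.extend(verify_package(json_file, mirror_root))
    return problems
```

Running `verify_mirror(Path("/path/to/mirror"))` on each regional mirror would
then give a list of files to re-sync; an empty list means every file named in
the JSON is present with a matching MD5.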
>
> What are people's thoughts here? Is it worth a PEP similar to PEP 503 going
> forward? Can people enhance / share some thoughts on this idea?
>
> Thanks,
> Cooper Ry Lees
> me at cooperlees.com
> https://cooperlees.com/