[Distutils] dump of all PyPI project metadata available?
Brett Cannon
bcannon at gmail.com
Thu Jul 23 00:12:32 CEST 2015
On Wed, Jul 22, 2015 at 2:19 PM Wes Turner <wes.turner at gmail.com> wrote:
> https://github.com/dstufft/pypi-stats
>
> https://github.com/dstufft/pypi-external-stats
>
I'm not quite sure what I'm supposed to get from those links, Wes, as that
code still scrapes every project individually and downloads them, whereas
I'm trying to avoid scraping PyPI at all and instead just download a
single file (plus I don't want the files, just the metadata already
returned by the JSON API).
-Brett
> - [ ] a flat bigquery w/ pandas.io.gbq ala GitHub Archive would be great
> - [ ] it's probably worth it to add RDFa to PyPI and Warehouse pages (in
> addition to the auxiliary executed/extracted JSON) for #search
> On Jul 22, 2015 4:08 PM, "Brett Cannon" <bcannon at gmail.com> wrote:
>
>> When I wrote https://nothingbutsnark.svbtle.com/python-3-support-on-pypi
>> I wrote a script to download every project's JSON metadata by scraping the
>> simple index and then making the appropriate GET request for the JSON
>> metadata. It worked, but it was somewhat of a hassle.
>>
>> Is there some dump somewhere that is built daily, weekly, or monthly of
>> all the metadata on PyPI for offline analysis?
>>
>> _______________________________________________
>> Distutils-SIG maillist - Distutils-SIG at python.org
>> https://mail.python.org/mailman/listinfo/distutils-sig
>>
>>
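
For readers who want to see the scrape-then-GET approach described in the quoted message, here is a minimal sketch. It is not the original script. The URLs (https://pypi.org/simple/ and https://pypi.org/pypi/<name>/json) are today's endpoints and are an assumption relative to the 2015 thread, and every function name below is hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from html.parser import HTMLParser

# Assumed endpoints (current PyPI hosts, not taken from the thread).
SIMPLE_INDEX_URL = "https://pypi.org/simple/"
JSON_URL_TEMPLATE = "https://pypi.org/pypi/{name}/json"


class _AnchorTextParser(HTMLParser):
    """Collect the text content of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor:
            self.names[-1] += data


def project_names(index_html):
    """Return the project names linked from the simple index HTML."""
    parser = _AnchorTextParser()
    parser.feed(index_html)
    parser.close()
    return [name.strip() for name in parser.names if name.strip()]


def json_url(name):
    """Build the JSON metadata URL for one project (hypothetical helper)."""
    return JSON_URL_TEMPLATE.format(name=urllib.parse.quote(name))


def fetch_metadata(name, timeout=30):
    """GET and decode the JSON metadata for a single project."""
    with urllib.request.urlopen(json_url(name), timeout=timeout) as response:
        return json.load(response)


def dump_all(path, limit=None):
    """Write one JSON document per line for each project in the index."""
    with urllib.request.urlopen(SIMPLE_INDEX_URL, timeout=60) as response:
        names = project_names(response.read().decode("utf-8"))
    with open(path, "w", encoding="utf-8") as out:
        for name in names[:limit]:
            try:
                record = fetch_metadata(name)
            except urllib.error.HTTPError:
                continue  # skip projects whose metadata cannot be fetched
            out.write(json.dumps(record) + "\n")
```

Calling dump_all("pypi-metadata.jsonl", limit=100) would fetch the first 100 projects. This still makes one request per project, which is exactly the hassle a prebuilt dump would avoid.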