[Distutils] Data on requirement files on GitHub

Nick Timkovich prometheus235 at gmail.com
Wed Mar 8 11:36:16 EST 2017


Looks like a fun chunk of data. What query did you use? Could you add a
README to the repo with a short description, in case others want to iterate
on it (maybe by looking into setup.py files)?
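
For anyone who wants to iterate on it, here is a minimal sketch of one
possible analysis: counting how many requirements files mention each
package. It assumes each JSON blob stores the file contents under a key
named "content" (an assumption, not confirmed against the dataset; the real
key may differ). The name matching is deliberately naive: it drops version
specifiers and extras, and it does not handle URL-style requirements.

```python
import json
import re
from collections import Counter

# Assumed key name for the file contents; the real dataset may differ.
CONTENT_KEY = "content"

# Leading project name of a requirement line, e.g. "Django" in "Django>=1.10".
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def count_packages(lines):
    """Count how many requirements files mention each package name."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        names = set()  # count each package at most once per file
        for req in (record.get(CONTENT_KEY) or "").splitlines():
            req = req.strip()
            # Skip blanks, comments, and pip options such as "-r base.txt".
            if not req or req.startswith(("#", "-")):
                continue
            match = NAME_RE.match(req)
            if match:
                names.add(match.group(0).lower())
        counts.update(names)
    return counts
```

Passing the open data file to count_packages() and calling .most_common(20)
on the result should list the most frequently required packages.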

Nick

On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer <ja.geb at me.com> wrote:

> Hi,
>
> I ran a couple of queries against GitHub’s public BigQuery dataset [0]
> last week. I’m interested in requirements files in particular, so I ran a
> query extracting all available requirements files.
>
> Since queries against this dataset are rather expensive ($7 on all repos),
> I thought I’d share the raw data here [1]. The data contains the repo name,
> the requirements file path, and the contents of the file. Each line is a
> separate JSON blob; read it with:
>
> import json
>
> with open('data.json') as f:
>     for line in f:  # one JSON blob per line
>         data = json.loads(line)
>
> Maybe that’s of interest to some of you.
>
> If you have any ideas on what to do with the data, please let me know.
>
> Jannis Gebauer
>
>
>
> [0]: https://cloud.google.com/bigquery/public-data/github
> [1]: https://github.com/jayfk/requirements-dataset