https://en.wikipedia.org/wiki/BigQuery

BigQuery Dashboards

- http://bigqueri.es/c/github-archive
- https://redash.io/data-sources/google-bigquery
- https://github.com/getredash/redash
- https://github.com/getredash/redash/blob/master/requirements.txt
- https://github.com/getredash/redash/blob/master/Dockerfile
- https://github.com/docker/docker/blob/master/builder/dockerfile/parser/parse...
- https://github.com/DBuildService/dockerfile-parse/issues
- https://github.com/getredash/redash/blob/master/docker-compose.yml

Software Configuration Management / Dependency Management applications for BigQuery:

- https://opensource.googleblog.com/2017/03/operation-rosehub.html
- "Googlers used BigQuery and GitHub to patch thousands of vulnerable projects"
  https://www.reddit.com/r/bigquery/comments/5x0x5z/googlers_used_bigquery_and...

BigQuery Python Libraries

google-cloud-bigquery

- | Src: https://github.com/GoogleCloudPlatform/google-cloud-python
- | PyPI: https://pypi.python.org/pypi/google-cloud-bigquery
- | Docs: https://cloud.google.com/bigquery/docs/reference/libraries#client-libraries-...

google-api-python-client

- | Src: https://github.com/google/google-api-python-client
- | PyPI: https://pypi.python.org/pypi/google-api-python-client
- pandas.io.gbq uses google-api-python-client:

  - Docs: http://pandas.pydata.org/pandas-docs/stable/io.html#google-bigquery-experime...
  - read_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.read_gbq...
  - to_gbq() http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.to_gbq.h...
Open Source Big Data Components for things like BigQuery:

Apache Drill

- | Wikipedia: https://en.wikipedia.org/wiki/Apache_Drill
- Apache Drill is similar to Google Dremel (which powers Google BigQuery)
- https://pypi.python.org/pypi/drillpy

Apache Beam

- | Wikipedia: https://en.wikipedia.org/wiki/Apache_Beam
- | Src: https://github.com/apache/beam
- | Docs: https://beam.apache.org/documentation/sdks/python/
- | Docs: https://beam.apache.org/get-started/quickstart-py/
- | Docs: https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
- The Google Cloud Dataflow SDK is now part of Apache Beam
- https://cloud.google.com/dataflow/model/bigquery-io

Parsing (and MAINTAINING) Pip Requirements.txt Files:

- | Src: https://github.com/pypa/pip/tree/master/pip/req
- https://github.com/pypa/pip/issues/3884#issuecomment-236454008
- https://github.com/pypa/pip/issues/1479
- -> Pipfile, Pipfile.lock (``pipenv install pkgname --dev``)
- https://github.com/pyupio/safety-db#tools
- https://pyup.io/
- https://libraries.io/github/librariesio/pydeps
- https://github.com/librariesio/pydeps
- https://libraries.io/

Pipfile, Pipfile.lock

- | PyPI: https://pypi.python.org/pypi/pipenv
- | PyPI: https://pypi.python.org/pypi/requirements-parser
- | PyPI: https://pypi.python.org/pypi/pipfile
- | Src: https://github.com/kennethreitz/pipenv
- These save to the Pipfile:

  - ``pipenv install pkgname``
  - ``pipenv install pkgname --dev``

- https://github.com/kennethreitz/pipenv/blob/master/pipenv/utils.py
- pip reqs.txt <--> Pipfile

...

Thought I'd get these together; hopefully they're useful.

Cool Jupyter notebook!
( https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb )

On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer <ja.geb@me.com> wrote:
Hi,
I ran a couple of queries against GitHub's public BigQuery dataset [0] last week. I’m interested in requirements files in particular, so I ran a query extracting all available requirements files.
Since queries against this dataset are rather expensive ($7 on all repos), I thought I’d share the raw data here [1]. The data contains the repo name, the requirements file path and the contents of the file. Every line is a JSON blob; read it with:
    import json

    with open('data.json') as f:
        for line in f:
            # each line is one standalone JSON document
            data = json.loads(line)
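As one idea for what to do with the data, here is a minimal sketch that extends the loop above to count how often each package name shows up across all requirements files. The field name `content` is an assumption (check the dataset for the real key), and `iter_records` and `count_requirement_names` are hypothetical helper names. The name matching is deliberately rough; it is not a full requirements parser.

```python
import json
import re
from collections import Counter

# Leading project name of a requirement line, e.g. "requests" in "requests==2.13.0".
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def iter_records(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_requirement_names(records, content_key="content"):
    """Count package-name occurrences across requirements file contents.

    content_key is an ASSUMED field name, not confirmed by the dataset.
    """
    counts = Counter()
    for record in records:
        for raw in (record.get(content_key) or "").splitlines():
            req = raw.split("#", 1)[0].strip()  # drop trailing comments
            # Skip blanks, pip options (-r, -e, --index-url) and URL requirements.
            if not req or req.startswith("-") or "://" in req:
                continue
            match = NAME_RE.match(req)
            if match:
                counts[match.group(0).lower()] += 1
    return counts
```

Passing the open file object works because it iterates line by line, e.g. `count_requirement_names(iter_records(f)).most_common(10)` inside the `with open('data.json') as f:` block.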
Maybe that’s of interest to some of you.
If you have any ideas on what to do with the data, please let me know.
—
Jannis Gebauer
[0]: https://cloud.google.com/bigquery/public-data/github
[1]: https://github.com/jayfk/requirements-dataset
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig