Hi Thomas,
2017-05-20 13:23 GMT-04:00 Thomas Kluyver
Hi Luis,
Awesome, thanks for this :-). It was me posting before about indexing PyPI.
I'm intrigued: how do you keep it up to date using Travis? When I looked into this, I was pretty sure you need to download every package to index it. Do you have some way to only download the new releases? Or is Travis able to download every package every day? Or have you found another way round it?
I divided the index processing alphabetically, so that each letter is processed in a separate Travis job. I also set memory and time limits to avoid abusing Travis. On the first run, each job downloads packages until it reaches its maximum time limit, which is 40 minutes. On subsequent runs, the script only processes packages that have been updated since the last run.
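To make the scheme concrete, here is a minimal sketch of it, not the actual PyPIContents code. The function names, the `last_seen` and `current_versions` dictionaries, and the `process` callable are all hypothetical, and comparing version strings is just one assumed way to detect that a package "has been updated since the last run".

```python
import time
from collections import defaultdict


def partition_by_letter(names):
    """Group package names by their lowercased first character."""
    groups = defaultdict(list)
    for name in names:
        groups[name[0].lower()].append(name)
    return groups


def run_job(names, last_seen, current_versions, process,
            time_limit=40 * 60, clock=time.monotonic):
    """Process packages whose version changed, stopping at the time limit.

    last_seen: dict of name -> version recorded by the previous run
        (updated in place as packages are processed).
    current_versions: dict of name -> latest version on the index.
    process: hypothetical callable that downloads and indexes one package.
    """
    start = clock()
    done = []
    for name in sorted(names):
        if clock() - start >= time_limit:
            break  # out of time: the rest waits for the next run
        if last_seen.get(name) == current_versions[name]:
            continue  # unchanged since the last run, nothing to download
        process(name)
        last_seen[name] = current_versions[name]
        done.append(name)
    return done
```

Each job would call `run_job` on one group from `partition_by_letter`, and `last_seen` would have to be persisted between runs so that later runs can skip unchanged packages.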
Does the index only include the latest version of each package, or does it also include older versions? The wifi on the train I'm on at the moment isn't fast enough to download 60 MB to find out. ;-)
It only includes the current versions.
Does your indexing tool prefer to use wheels or sdists? Is it capable of using either for packages which don't have both available? Do you do anything to cope with modules which may be included for one platform but not another?
It supports the ['.whl', '.egg', '.zip', '.tgz', '.tar.gz', '.tar.bz2'] formats, and it extracts the data from whichever one is available. I wasn't aware that some modules may be included on one platform and not on another. I guess there's room for improvement.
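As a rough illustration of handling those formats with the standard library only, the sketch below lists the members of an archive and turns '.py' paths into dotted module names. It is an assumption about how this could be done, not a description of what PyPIContents does; `archive_members` and `python_modules` are hypothetical names, and the path-to-module mapping only suits wheel-style layouts where packages sit at the archive root.

```python
import tarfile
import zipfile

ZIP_LIKE = ('.whl', '.egg', '.zip')
TAR_LIKE = ('.tgz', '.tar.gz', '.tar.bz2')


def archive_members(path):
    """Return the file names stored in a supported archive."""
    if path.endswith(ZIP_LIKE):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    if path.endswith(TAR_LIKE):
        with tarfile.open(path) as tf:  # compression is auto-detected
            return tf.getnames()
    raise ValueError('unsupported archive format: %s' % path)


def python_modules(members):
    """Turn '.py' member paths into sorted dotted module names."""
    modules = set()
    for member in members:
        if not member.endswith('.py'):
            continue
        parts = member[:-3].split('/')
        if parts[-1] == '__init__':
            parts = parts[:-1]  # a package is named after its directory
        if parts:
            modules.add('.'.join(parts))
    return sorted(modules)
```

Source distributions usually nest everything under a top-level "name-version/" directory, so a real tool would need to strip that prefix (and skip files such as setup.py) before mapping paths to modules.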
I'm excited to see someone actually doing this!
Thank you. I made this because I wanted an app that guesses Python dependencies from code by scanning module imports and then looking them up in the index. That app is called Pip Sala Bim and you can check it out here: https://github.com/LuisAlejandro/pipsalabim
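The general idea can be sketched with the standard `ast` module. This is not Pip Sala Bim's actual code: `imported_modules` and `guess_packages` are hypothetical names, and `index` is assumed to be a mapping from package name to an entry with a "modules" list, in the shape of the index sample quoted further down.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by Python source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            if node.level == 0 and node.module:  # skip relative imports
                found.add(node.module.split('.')[0])
    return found


def guess_packages(source, index):
    """Map each imported module to the packages in `index` that provide it."""
    guesses = {}
    for module in sorted(imported_modules(source)):
        providers = sorted(
            name for name, entry in index.items()
            if module in entry['modules']
        )
        if providers:
            guesses[module] = providers
    return guesses
```

Modules that no indexed package provides (for example, standard library modules) simply do not appear in the result, and a module offered by several packages maps to all of them, so a real tool would still need some way to pick among candidates.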
Thomas
On Sat, May 20, 2017, at 03:01 AM, Luis Alejandro Martínez Faneyth wrote:
Hi everyone,
I'm new to this list but I've been reading some threads in the archive.
Around February, an idea about indexing modules from PyPI packages was brought up. I've been working on something similar for quite a while.
PyPIContents is an index of PyPI packages that lists their modules and command-line scripts in JSON format, like this:
[
    ...
    "1337": {
        "cmdline": [],
        "modules": [
            "1337",
            "1337.1337"
        ],
        "version": "1.0.0"
    },
    ...
]
You can check it out here:
https://github.com/LuisAlejandro/pypicontents
And some use cases:
https://github.com/LuisAlejandro/pypicontents#use-cases
The actual index lives here; it's around 60 MB:
https://raw.githubusercontent.com/LuisAlejandro/pypicontents/contents/pypi.json
It is updated daily with the help of Travis:
https://github.com/LuisAlejandro/pypicontents/blob/contents/.travis.yml
Anyway, I hope it is useful, and I'll be around for any comments or questions.
Cheers!
Luis Alejandro Martínez Faneyth Blog: http://huntingbears.com.ve Github: http://github.com/LuisAlejandro Twitter: http://twitter.com/LuisAlejandro
CODE IS POETRY
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig