On Mon, Apr 6, 2015 at 5:44 PM, francis <francismb@email.de> wrote:
On 03/23/2015 01:06 PM, anatoly techtonik wrote:
> Hi,
>
> I am doing an exercise as part of an agile UX data mining
> team, and I need to get a list of Python modules:
>
> https://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules
>
> But this gives only the modules that were compiled into a
> specific interpreter, and I need a list of modules that are
> de facto part of the standard library.
>
> I also need this for all Python versions, and to be able to
> fetch it in CSV, JSON or HTML table format over the web, so
> that the results of my work can be validated and the experiment
> repeated as necessary.
>
>
> I see the data as a necessary step to organize work
> around an "externally evolving standard library", so a way
> to query it should be somewhat sustainable and obvious.
>
> It might be possible to generate something from docs, like:
>
> https://docs.python.org/2.7.2/dataset/modules.json
>
> This way you get static information without the ability to
> version or refresh the info (still good to have anyway, to
> compare docs and other sources).

+1 for the idea to publish the final results to avoid "reparsing the wheel".

IMHO it could be interesting for new versions to have some kind
of "sys.stdlib_module_names" (as stated in SO). Why not propose
it on python-ideas?

Done. But I omitted the `sys.stdlib_module_names` part, because for
my use case in the https://bitbucket.org/techtonik/python-stdlib project
I need more data exported than just names. For example, I collect the
paths to the module sources, so that further processing can be done on
the real module files:

https://bitbucket.org/techtonik/python-stdlib/src/tip/stdlib.json?at=default
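
To illustrate the "names plus source paths" idea, here is a minimal
sketch. It assumes an interpreter that provides
`sys.stdlib_module_names` (Python 3.10 and later), and it only sees the
running interpreter, so it does not cover the "all Python versions"
requirement. The helper names `stdlib_modules` and `to_json` and the
flat name-to-path layout are assumptions made for this example; they
are not the format of stdlib.json in the repository above.

```python
import importlib.util
import json
import sys


def stdlib_modules(names=None):
    """Map top-level stdlib module names to their source path (or None).

    `names` defaults to sys.stdlib_module_names, which is only
    available on Python 3.10+ (an assumption of this sketch).
    """
    if names is None:
        names = getattr(sys, "stdlib_module_names", None)
        if names is None:
            raise RuntimeError("sys.stdlib_module_names needs Python 3.10+")
    result = {}
    for name in sorted(names):
        try:
            # For top-level names this locates the module without importing it.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        # Built-in and frozen modules have no file on disk: record None.
        if spec is not None and spec.has_location:
            result[name] = spec.origin
        else:
            result[name] = None
    return result


def to_json(mapping):
    """Serialize the name-to-path mapping as stable, diff-friendly JSON."""
    return json.dumps(mapping, indent=2, sort_keys=True)
```

Calling `to_json(stdlib_modules())` gives one entry per module known to
the running interpreter; modules missing on the current platform, as
well as built-in and frozen ones, get a null path.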

--
anatoly t.