Re: [Python-ideas] PEP x: Static module/package inspection

(Reposting this from the Google Group once more; the previous post missed the mailing list because I was not subscribed in Mailman.)

*Static module/package inspection*

Abstract:
- static: without execution (as opposed to dynamic)
- module/package: .py or __init__.py file
- inspection: get an overview of the contents

*What should this do?*

The proposal is to add a mechanism to the Python interpreter to get an outline of module/package contents without importing or executing the module/package. The outline includes the names of classes, functions and variables. It should also contain values for variables that can be provided without sophisticated calculations (e.g. a string or an integer, but probably not expressions, as evaluating them may lead to security leaks).

*Why?*

*user story PEPx.001:*
As a Python package maintainer, I find it bothersome to repeatedly write boilerplate code (e.g. setup.py) to package my single-file module. The reason I have to write setup.py is to provide version and description info. This info is already available in my module source code. So I need either to copy/paste the info from the module manually, or to import (and hence execute) my module during packaging and installation, which I don't want either, because modules are often installed with root privileges. With this PEP, a packaging tool will be able to extract meta information from my module without executing it and without me manually copying version fields into some 'package configuration file'.

*user story PEPx.002:*
As a Python application developer, I find it really complicated to provide a plugin extension subsystem for my users. Users need a mechanism to switch between different versions of a plugin, and this mechanism is usually provided by an external tool such as setuptools, which manages and installs multiple versions of plugins in a local Python package repository. It is rather hard to create an alternative approach, because you are forced to maintain external meta-data about your plugin modules even when it is already available inside the module. With this PEP, a Python application will be able to inspect meta-data embedded inside plugins before choosing which version to load. This will also provide a standard mechanism for applications to check modules returned by packaging tools without executing them. This will greatly simplify writing and debugging custom plugin loaders on different platforms.

*Feedback goal*

At this stage I'd like a community response to two separate questions:
1. Whether everybody feels this functionality will be useful for Python
2. Whether the solution is technically feasible
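
To make the idea of a static outline concrete, here is a minimal sketch built on the stdlib `ast` module. It is not part of the proposal: the function name `outline` and the shape of the returned dictionary are assumptions made for illustration. It records top-level class, function and variable names, and keeps a value only when `ast.literal_eval` accepts it as a plain literal.

```python
import ast


def outline(source):
    """Return a static outline of module source without executing it.

    Hypothetical helper: the result layout is an illustration only.
    """
    result = {"classes": [], "functions": [], "variables": {}, "unevaluated": []}
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.ClassDef):
            result["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            result["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            names = [t.id for t in node.targets if isinstance(t, ast.Name)]
            try:
                # literal_eval only accepts literals (str, numbers, tuples, ...)
                value = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                # The right-hand side needs real evaluation: keep the name only.
                result["unevaluated"].extend(names)
                continue
            for name in names:
                result["variables"][name] = value
    return result


SAMPLE = '''
__version__ = "1.0"
DEBUG = False
path = os.getcwd()

class A:
    pass

def f(x):
    return x
'''

info = outline(SAMPLE)
```

The `os.getcwd()` call in the sample is never run; `path` ends up in the `unevaluated` list because its value cannot be provided without executing code.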

On 28 December 2011 10:15, anatoly techtonik <techtonik@gmail.com> wrote:
On a simple level, all of this is already "obtainable" by using the ast module, which can parse Python code. I would love to see a "python-object" layer on top of this that will take an ast for a module (or other object) and return something that represents the same object as the ast.

So all module-level objects will have corresponding objects: where they are Python objects (builtin literals) they will be represented exactly. For classes and functions you'll get an object back that has the same attributes plus some metadata (e.g. for functions / methods, what arguments they take etc).

That is certainly doable and would make introspecting-without-executing a lot simpler.

I think your specific use cases are better served by adding functionality to the packaging (distutils2) package however. I'd particularly like to see plugin support in packaging (a cut-down version of setuptools entry points).

All the best,
Michael
-- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html
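
One possible reading of the "python-object" layer described above, as a rough sketch only: the class names `FunctionInfo` and `ClassInfo` and the function `describe_module` are hypothetical, and only positional-or-keyword parameters and simple base-class names are captured.

```python
import ast
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FunctionInfo:
    name: str
    args: List[str]          # positional-or-keyword parameter names only
    lineno: int
    doc: Optional[str]


@dataclass
class ClassInfo:
    name: str
    bases: List[str]         # only bases written as plain names
    methods: List[FunctionInfo] = field(default_factory=list)


def describe_function(node):
    """Build a FunctionInfo from an ast.FunctionDef node."""
    args = [a.arg for a in node.args.args]
    return FunctionInfo(node.name, args, node.lineno, ast.get_docstring(node))


def describe_module(source):
    """Map top-level function and class names to descriptor objects."""
    objects = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            objects[node.name] = describe_function(node)
        elif isinstance(node, ast.ClassDef):
            bases = [b.id for b in node.bases if isinstance(b, ast.Name)]
            methods = [describe_function(n) for n in node.body
                       if isinstance(n, ast.FunctionDef)]
            objects[node.name] = ClassInfo(node.name, bases, methods)
    return objects


SAMPLE = '''
def greet(name, punctuation="!"):
    "Say hello."
    return "Hello " + name + punctuation

class Plugin(Base):
    def run(self, argv):
        pass
'''

objects = describe_module(SAMPLE)
```

Note that `Base` is never resolved: the descriptor only records the name as written, which is all a purely static layer can know.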

On Thu, Dec 29, 2011 at 1:28 AM, Michael Foord <fuzzyman@gmail.com> wrote:
The existing 'pyclbr' (class browser) module in the stdlib also attempts to play in this same space. I wouldn't say it does it particularly *well* (since it's easy to confuse with valid Python constructs), but it tries.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
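
For reference, a small sketch of what the stdlib class browser reports. `pyclbr.readmodule_ex` locates the module on the given path and reads its source without importing it. The demo module name `pyclbr_demo_mod` and the helper `browse` are made up for this example.

```python
import os
import pyclbr
import tempfile

SOURCE = '''\
__version__ = "0.1"

class Greeter:
    def hello(self):
        return "hi"

def main():
    pass
'''


def browse(module_name, directory):
    """Map each name pyclbr finds to the kind of descriptor it returned."""
    found = pyclbr.readmodule_ex(module_name, path=[directory])
    return {name: type(obj).__name__ for name, obj in found.items()}


# Write a throwaway module to disk so pyclbr has something to locate.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "pyclbr_demo_mod.py"), "w") as fh:
        fh.write(SOURCE)
    result = browse("pyclbr_demo_mod", tmp)
```

Only the class and the function show up in `result`; the `__version__` assignment is not reported.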

On Thu, Feb 2, 2012 at 4:35 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Unfortunately http://docs.python.org/library/pyclbr.html lacks info about variables.

In the meantime I've patched my `astdump` module even further:
- the function to query top-level variables changed from get_top_vars() to top_level_vars(), which now accepts a filename as a parameter

Now it will be even more convenient to use it for generating `setup.py` for simple modules. A sample `setup.py` generator is included.

http://pypi.python.org/pypi/astdump/1.0
-- anatoly t.
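
As an illustration of the `setup.py` generation idea, here is a rough sketch that uses only the stdlib `ast` module. It does not use or reproduce the `astdump` API: `literal_top_level_vars`, `generate_setup` and the template are hypothetical, and the choice of `__version__`, `__author__` and the module docstring as metadata sources is an assumption.

```python
import ast

SETUP_TEMPLATE = '''\
from setuptools import setup

setup(
    name={name!r},
    version={version!r},
    author={author!r},
    description={description!r},
    py_modules=[{name!r}],
)
'''


def literal_top_level_vars(tree):
    """Map top-level names to literal values; skip anything needing evaluation."""
    found = {}
    for node in tree.body:
        if isinstance(node, ast.Assign):
            try:
                value = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                continue
            for target in node.targets:
                if isinstance(target, ast.Name):
                    found[target.id] = value
    return found


def generate_setup(module_name, source):
    """Return setup.py text built from metadata found statically in source."""
    tree = ast.parse(source)
    meta = literal_top_level_vars(tree)
    if "__version__" not in meta:
        raise ValueError("no literal __version__ found in module source")
    doc = ast.get_docstring(tree) or ""
    return SETUP_TEMPLATE.format(
        name=module_name,
        version=meta["__version__"],
        author=meta.get("__author__", ""),
        description=doc.splitlines()[0] if doc else "",
    )


SAMPLE = '"""Tiny demo module."""\n__version__ = "1.2"\n__author__ = "someone"\n'
setup_text = generate_setup("demo", SAMPLE)
```

The module source is only parsed, never imported, so the generator can run with elevated privileges without executing the packaged code.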

On Wed, Dec 28, 2011 at 5:15 AM, anatoly techtonik <techtonik@gmail.com> wrote:
I agree this is a pain. I also agree with Michael that this is more of a packager issue. Part of the problem is that I don't believe there is a strong enough convention around writing modules with an eye to making them accessible to packaging tools. If there were a PEP on module metadata for packaging tools to use for introspection, that might motivate package tool authors to support automated packaging :) *HINT HINT* Sphinx could also take advantage of some of it too.

See above. Maintaining the same information twice is definitely a bad thing, but we already have the ability to do everything required. What is missing is good, strong conventions on module metadata annotation that tool creators can write to.

Having more nuanced import behavior is something I can get behind. Sure, I can wrap an import in a try/except and check the __version__ if it is defined (after determining whether it is a string/tuple/etc, and possibly parsing it), but more nuanced behavior would certainly be nice. Being able to specify a version in the import line (and have multiple versions installed), being able to get fine-grained exceptions beyond ImportError (ParseError, anyone?), not having to worry that the same file is being imported twice, that sort of stuff.

I'm +1 on getting a module-level metadata conventions PEP draft started. I'm also +1 on taking a look at import behavior (though that is tangential here).

Nathan
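
The `__version__` check described above (string or tuple, possibly needing parsing) can also be done without importing anything. The sketch below is an assumption-laden illustration: `static_version` and `normalize` are hypothetical names, and the parsing is deliberately crude, keeping only the leading dotted-number part of a string and the integer items of a tuple.

```python
import ast
import re


def normalize(raw):
    """Turn a str or tuple/list version into a tuple of ints (crude)."""
    if isinstance(raw, str):
        match = re.match(r"\d+(?:\.\d+)*", raw)
        if match is None:
            return None
        return tuple(int(part) for part in match.group(0).split("."))
    if isinstance(raw, (tuple, list)):
        return tuple(item for item in raw if isinstance(item, int))
    return None


def static_version(source):
    """Return __version__ as a tuple of ints, or None if absent or not a literal.

    ast.parse raises SyntaxError for unparsable source; that is left to the caller.
    """
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "__version__" not in names:
            continue
        try:
            raw = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            return None  # e.g. __version__ = get_version()
        return normalize(raw)
    return None
```

Because the results are plain tuples, two candidate plugin files can be compared with ordinary tuple comparison before either of them is loaded.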

A rather user-friendly proof of concept using the `ast` module is ready. http://pypi.python.org/pypi/astdump/

`astdump` contains a get_top_vars() function, which extracts enough information from a module's AST to generate setup.py for itself. This capability can already be reused for plugin version discovery mechanisms. ISTM a working library should motivate authors better than a PEP convention. =)

`astdump` doesn't provide complete module introspection capabilities. I've primarily focused on getting the output done, so for a proper API it would be nice to study use case examples first. `astdump` contains a tree walker with filtering capabilities by node type and level. What the "python-object" layer should expose and how to make this convenient is not completely clear to me.

-- anatoly t.
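
A tree walker that filters by node type and level can be sketched with the stdlib alone. This is not the `astdump` implementation: the names `walk` and `select` and the convention that the Module node sits at level 0 are assumptions for the example.

```python
import ast


def walk(node, level=0):
    """Yield (level, node) pairs depth-first; the starting node is level 0."""
    yield level, node
    for child in ast.iter_child_nodes(node):
        yield from walk(child, level + 1)


def select(tree, node_type=None, max_level=None):
    """Yield (level, node) pairs matching an optional type and depth limit."""
    for level, node in walk(tree):
        if max_level is not None and level > max_level:
            continue
        if node_type is not None and not isinstance(node, node_type):
            continue
        yield level, node


tree = ast.parse("x = 1\ndef f():\n    y = 2\n")
# Assignments directly in the module body (level 1), ignoring nested ones.
top_assigns = [node for _, node in select(tree, ast.Assign, max_level=1)]
# All assignments, with the depth at which each one was found.
all_assign_levels = [level for level, _ in select(tree, ast.Assign)]
```

Limiting the level to 1 is what restricts the query to top-level variables; without the limit, the assignment inside `f` is returned as well.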

participants (4)
- anatoly techtonik
- Michael Foord
- Nathan Rice
- Nick Coghlan