Hi all.

Now that 2.4 is out and everything maybe it's about time to start discussing the "use the __source__ Luke" feature which IMO will really boost python into a new domain of exciting possibilities.

I've prepared a pre-PEP which is not very good but it is a base. In short, the feature is good and it enables editing of python code at runtime instead of the runfile-exit-edit-run-exit-edit-run cycle.

We have the following possibilities as to whether __source__ data is marshalled and the feature is always enabled.

[1] Command line switch and not marshalled
[2] Always on and not marshalled
[3] Always on and marshalled

There is also [4] which doesn't make much sense.

If I was BDFL I'd go for [1] so whoever wants it can enable it and whoever doesn't can't complain, and they'll all leave me alone. Phillip J. Eby expressed some concerns that the modules that depend on __source__ will eventually take over and it will become a standard.

Anyway, the PEP is attached. You can mail me with votes on the feature and if you want on your preferred option from 1,2,3. If I get votes I'll post the results later.

If this is accepted I'll try to come up with a good patch vs 2.4

Thanks,

St.

-------------------ATTACHED PYTHON ENHANCEMENT PROPOSAL---
PEP: XXX
Title: The __source__ attribute
Version: $Revision: 1.10 $
Last-Modified: $Date: 2003/09/22 04:51:49 $
Author: Stelios Xanthakis
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 19-Nov-2004
Python-Version: 2.4.1
Post-History:

Abstract

    This PEP suggests the implementation of __source__ attribute for functions and classes. The attribute is a read-only string which is generated by the parser and is a copy of the original source code of the function/class (including comments, indentation and whitespace).

Motivation

    It is generally a tempting idea to use python as an interface to a program.
    The developers can implement all the functionality and instead of designing a user interface, provide a python interpreter to their users. Take for example one of the existing web browsers: they have everything that would be needed to write a script which downloads pages automatically or permutes the letters of web pages before they are displayed, but it is not possible for the user to do these things because the interface of these applications is static.

    A much more powerful approach would be an interface which is dynamically constructed by the user to meet the user's needs. The most common development cycle of python programs is: write .py file - execute .py file - exit - enhance .py file - execute .py file - etc. With the implementation of the __source__ attribute though the development/modification of python code can happen at run-time. Functions and classes can be defined, modified or enhanced while the python shell is running and all the changes can be saved by saving the __source__ attribute of globals before termination. Moreover, in such a system it is possible to modify the "code modification routines" and eventually we have a self-modifying interface. Using a program also means improving its usability.

    The current solution of using 'inspect' to get the source code of functions is not adequate because it doesn't work for code defined with "exec" and it doesn't have the source of functions/classes defined in the interactive mode. Generally, a "file" is something too abstract. What is more real is the data received by the python parser and that is what is stored in __source__.

Specification

    The __source__ attribute is a read-only attribute of functions and classes. Its type is string or None. In the case of None it means that the source was not available.

    The indentation of the code block is the original indentation obeying nested definitions. For example:

        >>> class A:
        ...     def foo (self):
        ...         print """Santa-Clauss
        ... is coming to town"""
        >>> def spam ():
        ...     def closure ():
        ...         pass
        ...     return closure
        >>> print A.foo.__source__
            def foo (self):
                print """Santa-Clauss
        is coming to town"""
        >>> print spam().__source__
            def closure ():
                pass

    The attribute is not marshalled and therefore not stored in ".pyc" files. As a consequence, functions and classes of imported modules have __source__==None.

    We propose that the generation of __source__ will be controlled by a command line option. In the case this feature is not activated by the command line option, the attribute is absent.

Rationale

    Generally, "import" refers to modules that either have a file in a standard location or they are distributed in ".pyc" form only. Therefore in the case of modules, getting the source with "inspect" is adequate. Moreover, it does not make sense saving __source__ in ".pyc" because the point would be to save modifications in the original ".py" file (if available).

    On the issue of the command-line option controlling the generation of __source__, please refer to the section about the overhead of this feature. The rationale is that those applications that do not wish to use this feature can avoid it (cgi scripts in python benchmarked against another language).

Overhead

    Python's parser is not exactly well-suited for such a feature. Execution of python code goes through the stages of lexical analysis, tokenization, generation of AST and execution of bytecode. In order to implement __source__, the tokenizer has to be modified to store the lines of the current translation unit. Those lines are then attached to the root node of the AST. While the AST is compiled we have to keep a reference of the current node in order to be able to find the next node after the node for which we wish to generate __source__, get the first and the last line of our block and then refer to the root node to extract these lines and make a string.

    All these actions add a minor overhead to some heavily optimized parts of python.
    However, once compilation to bytecode is done, this feature no longer affects the performance of the execution of the bytecode.

    There is also the issue of the memory spent to store __source__. In our opinion, this is worth the tradeoff for those who are willing to take advantage of it.

Implementation

    There is a sample implementation at [2] which consists of a patch against python 2.3.4. The patch has to be improved to avoid generating __source__ for the case we are importing modules for the first time (not from .pyc). In the sample implementation there is also included a sample shell that takes advantage of __source__ and demonstrates some aspects that motivated us towards patching python and submitting this PEP.

References

    [1] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton
        http://www.python.org/peps/pep-0001.html

    [2] Sample implementation
        http://students.ceid.upatras.gr/~sxanth/ISYSTEM/python-PIESS.tar.gz

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
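
The line-extraction step described in the PEP's Overhead section (find the first and last line of each definition, then slice those lines out of the stored source) can be approximated in pure Python. The following is only a sketch, not the patch from the PEP: it assumes a modern Python 3 (3.8 or later, for `end_lineno`), and the helper name `collect_sources` is made up for illustration.

```python
import ast


def collect_sources(module_source):
    """Map each function/class name to the raw source lines of its definition.

    Illustrative only: names are not qualified, so two definitions with the
    same name overwrite each other, and decorator lines are not included.
    """
    lines = module_source.splitlines(keepends=True)
    tree = ast.parse(module_source)
    sources = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            sources[node.name] = "".join(lines[node.lineno - 1:node.end_lineno])
    return sources


example = (
    "class A:\n"
    "    def foo(self):\n"
    "        return 'Santa'\n"
    "\n"
    "def spam():\n"
    "    def closure():\n"
    "        pass\n"
    "    return closure\n"
)
sources = collect_sources(example)
print(sources["closure"])
```

Because whole lines are sliced from the original text, nested definitions keep their original indentation, which matches the behaviour shown in the Specification example.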
Hi Stelios,

[Stelios Xanthakis Fri, Dec 03, 2004 at 11:54:25AM +0200]
> Abstract
>
> This PEP suggests the implementation of __source__ attribute for functions and classes. The attribute is a read-only string which is generated by the parser and is a copy of the original source code of the function/class (including comments, indentation and whitespace).
I've had similar ideas in the past as we are doing dynamic code generation in PyPy, as well as in other projects. After some discussion with Armin I think there is another possibility for storing "source" or any other such meta information with code/module objects: make __file__ and co_filename instances of a subclass of 'str' providing an extra attribute. For a simple example, they could have a 'source' attribute, which could be tried first by appropriate inspect functions and traceback related functionality.

We are about to test out this approach with the py lib (http://codespeak.net/py) and want to have it work for Python 2.2, 2.3 and 2.4. I suspect there may be some issues lurking (also in your proposed PEP) especially with respect to encodings. Also we have some use cases where we want to retrieve source code from non-local locations and want to integrate this seamlessly with the introspection facilities of Python, which obviously is an important part of the equation.

I can report back if there is interest.

cheers,

holger
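
A rough sketch of the 'str' subclass idea, written for a modern Python 3 rather than the 2.x versions discussed in the thread. The names `SourceFilename` and `get_source` are hypothetical, and this is not the py lib's implementation; it only shows a filename object carrying a `source` attribute that a lookup helper tries first.

```python
import types


class SourceFilename(str):
    """A filename string that also carries the source text it refers to."""

    def __new__(cls, name, source=None):
        self = super().__new__(cls, name)
        self.source = source
        return self


def get_source(module):
    """Try the 'source' attribute on __file__ first, then read the file."""
    filename = getattr(module, "__file__", None)
    source = getattr(filename, "source", None)
    if source is not None:
        return source
    if filename is None:
        return None
    with open(filename) as f:
        return f.read()


code_text = "ANSWER = 42\n"
mod = types.ModuleType("generated")
mod.__file__ = SourceFilename("<generated>", code_text)
# A plain str is passed to compile(); whether a code object preserves a str
# subclass as co_filename is not assumed here.
exec(compile(code_text, str(mod.__file__), "exec"), mod.__dict__)
print(get_source(mod))
```

Since the object is still a 'str', code that only expects a filename keeps working, while source-aware tools can look for the extra attribute.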
On Fri, 3 Dec 2004, holger krekel wrote:
> ... there is another possibility for storing "source" or any other such meta information with code/module objects: make __file__ and co_filename instances of a subclass of 'str' providing an extra attribute. For a simple example, they could have a 'source' attribute, which could be tried first by appropriate inspect functions and traceback related functionality.

Attaching such info on 'code objects' is indeed a more general case. But, OTOH, AFAIK, a class is not a code object. At least by what I was able to figure out from python sources.

It seems reasonable to make 'source' a dynamic object which will get its info from file/line if available. Now the thing is that if we had __source__ from the start, 'inspect' would have been much different. So the fact that we have some functionality with inspect does not mean that it's good enough. Probably inspect will be rewritten/improved if __source__ is implemented.

> We are about to test out this approach with the py lib (http://codespeak.net/py) and want to have it work for Python 2.2, 2.3 and 2.4.

Do you plan on hacking python? It appears that tok_nextc() is the best place to catch all the source passed to the interpreter.

A patch would be interesting.

Stelios
[Stelios Xanthakis Fri, Dec 03, 2004 at 11:59:30PM +0200]
> On Fri, 3 Dec 2004, holger krekel wrote:
> > We are about to test out this approach with the py lib (http://codespeak.net/py) and want to have it work for Python 2.2, 2.3 and 2.4.
>
> Do you plan on hacking python? It appears that tok_nextc() is the best place to catch all the source passed to the interpreter.

Well, as we want to have the library work on past python versions, modifying CPython 2.5 does not make much sense. It's more about (like Martin pointed out) organizing dynamic code generation so that Python's introspection and traceback logic works as much as possible - with tiny runtime "monkey" patches if needed.

Now Martin also correctly pointed out that you can store source code before/after you pass it to compile/parse. We are doing this already with an external dictionary. This has multithreading issues, though. So we think that hooking onto a code object's co_filename or a module's __file__ might be an interesting idea.

cheers,

holger
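
One way to store source at compile time so that the standard introspection and traceback machinery can find it is to register the text in the `linecache` module under a unique pseudo-filename. This is a sketch for a modern Python 3, not what the py lib did; `compile_with_source` and the `<dynamic-N>` naming scheme are assumptions made for illustration.

```python
import inspect
import itertools
import linecache

_counter = itertools.count()


def compile_with_source(source, mode="exec"):
    """Compile `source` and register its text so inspect/traceback can find it."""
    filename = "<dynamic-%d>" % next(_counter)  # one pseudo-file per snippet
    code = compile(source, filename, mode)
    # Cache entry layout: (size, mtime, lines, fullname). An mtime of None
    # makes linecache.checkcache() leave the entry alone.
    linecache.cache[filename] = (
        len(source),
        None,
        source.splitlines(True),
        filename,
    )
    return code


src = "def greet(name):\n    return 'Hello, ' + name\n"
namespace = {"__name__": "dynamic_demo"}
exec(compile_with_source(src), namespace)
recovered = inspect.getsource(namespace["greet"])
print(recovered)
```

Here the "external dictionary" is the one `linecache` already maintains, keyed by `co_filename`, so tracebacks through the generated code also show source lines.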
Stelios Xanthakis wrote:
> Now that 2.4 is out and everything maybe it's about time to start discussing the "use the __source__ Luke" feature which IMO will really boost python into a new domain of exciting possibilities.

I'm opposed to this idea. It creates overhead in the size of .pyc files, for no additional value that couldn't be obtained otherwise.

As the rationale, the PEP lists:

1.
> It is generally a tempting idea to use python as an interface to a program.

I cannot see how this rationale is related to the PEP. You can use Python as interface to a program with or without __source__.

2.
> The developers can implement all the functionality and instead of designing a user interface, provide a python interpreter to their users.

This does not require __source__, either.

3.
> A much more powerful approach would be an interface which is dynamically constructed by the user to meet the user's needs.

Dynamic code generation doesn't require __source__, either.

4.
> The most common development cycle of python programs is: write .py file - execute .py file - exit - enhance .py file - execute .py file - etc. With the implementation of the __source__ attribute though the development/modification of python code can happen at run-time.

This works just fine as well at the moment; see IDLE for an example.

> Functions and classes can be defined, modified or enhanced while the python shell is running and all the changes can be saved by saving the __source__ attribute of globals before termination.

Currently, you can define classes dynamically, and you can also save the source code of the class to a file in case you need it later.

> Moreover, in such a system it is possible to modify the "code modification routines" and eventually we have a self-modifying interface. Using a program also means improving its usability.

Self-modifying source code is currently also possible. Just read the old source code from a .py file, modify it, and recompile it.

> The current solution of using 'inspect' to get the source code of functions is not adequate because it doesn't work for code defined with "exec" and it doesn't have the source of functions/classes defined in the interactive mode.

I fail to see why it isn't adequate. Anybody who wants to modify source code that was originally passed to exec just needs to preserve a copy of the source code, separately.

> Generally, a "file" is something too abstract. What is more real is the data received by the python parser and that is what is stored in __source__.

Not at all. A file is precisely the level of granularity that is burnt into the Python language. A module is *always* a file, executed from top to bottom. It is not possible to recreate the source code of a module if you have only the source code of all functions, and all classes.

Regards,
Martin
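
Martin's last point can be made concrete with a small experiment: split a module's top-level statements into definitions (whose text a per-object source attribute could hold) and everything else (imports, assignments, calls), which no function or class would carry. This is an illustrative sketch for Python 3.8 or later; the sample module text is invented.

```python
import ast

module_source = (
    "import os\n"
    "\n"
    "DEBUG = os.environ.get('DEBUG') == '1'\n"
    "\n"
    "def helper():\n"
    "    return DEBUG\n"
    "\n"
    "RESULT = helper()\n"
)

tree = ast.parse(module_source)
kept, lost = [], []
for node in tree.body:
    # get_source_segment() returns the exact text of one top-level statement.
    segment = ast.get_source_segment(module_source, node)
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        kept.append(segment)   # would be covered by a per-definition source
    else:
        lost.append(segment)   # module-level code owned by no def or class

print("kept:", kept)
print("lost:", lost)
```

The statements in `lost`, and their order relative to the definitions, are what would be missing when trying to rebuild the module from function and class sources alone.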
> I'm opposed to this idea. It creates overhead in the size of .pyc files,

No it doesn't.

> for no additional value that couldn't be obtained otherwise.

Martin: I know it is possible to do all this with existing python facilities. I did write such a dynamic code framework in python. Specifically I used a function 'dyndef(code)' which was exactly like 'def' but also stored the source string in a dictionary. The key point is that I think this should be the job of the parser, and the functionality should be provided at the interactive prompt w/o the user having to write 'dyndef', store the code of exec's, or force himself to use specific commands to create functions. It should be transparently built into python.
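
A helper in the spirit of the 'dyndef(code)' function described above might look like the following. This is a guess at the general shape, not Stelios's actual code: the explicit `namespace` argument, the `source_registry` dictionary and the way changed names are detected are all assumptions, and the sketch targets Python 3.

```python
_MISSING = object()
source_registry = {}  # name -> source text that last defined it


def dyndef(code, namespace):
    """Execute `code` in `namespace` and remember it as the source of
    every name it binds or rebinds."""
    before = dict(namespace)
    exec(code, namespace)
    for name, value in namespace.items():
        if name == "__builtins__":
            continue  # added by exec(), not by the user's code
        if before.get(name, _MISSING) is not value:
            source_registry[name] = code
    return namespace


ns = {}
dyndef("def double(x):\n    return x + x\n", ns)
dyndef("def double(x):\n    return 2 * x\n", ns)  # redefinition replaces the entry
print(source_registry["double"])
```

This also shows the cost Stelios objects to: every definition has to go through the helper explicitly, instead of the interpreter recording the source on its own.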
> A file is precisely the level of granularity that is burnt into the Python language. A module is *always* a file, executed from top to bottom. It is not possible to recreate the source code of a module if you have only the source code of all functions, and all classes.

That's exactly the rationale for NOT combining __source__ with import. It's in the PEP.

It appears that there are the 'module people' who find this feature irrelevant. Indeed. If we are interested in distributing modules and increasing the number of people who use python programs, then __source__ is redundant. OTOH, programming python is easy and fun and I think the proposed feature will make it even more fun, and it aims at increasing the number of people who program python for their everyday tasks. It'd be interesting to hear if the developers of IDLE/ipython/etc could use this.

Oh well. I guess I'm ahead of my time again :)

St.
Stelios Xanthakis wrote:
> The key point is that I think this should be the job of the parser, and the functionality should be provided at the interactive prompt w/o the user having to write 'dyndef', store the code of exec's, or force himself to use specific commands to create functions. It should be transparently built into python.

For the case of the interactive prompt, if preserving the source code is somehow desirable (which I doubt), it should be the job of the development environment to offer saving interactively-defined methods. Such IDE support is necessary even if __source__ was available, since users probably would not want to write

    open("demo.py", "w").write(myfunc.__source__ + "\n" + myclass.__source__)

> OTOH, programming python is easy and fun and I think the proposed feature will make it even more fun, and it aims at increasing the number of people who program python for their everyday tasks. It'd be interesting to hear if the developers of IDLE/ipython/etc could use this.

I think it is the other way 'round. If this is *only* for interactive mode, then you should *first* change the interactive environments. If you then find you absolutely need this feature to implement the IDEs correctly, I'd like to hear the (new) rationale for the change.

Regards,
Martin
[Resend, since a minor brain explosion caused me to send this to c.l.p instead of python-dev]

Stelios Xanthakis wrote:
> It appears that there are the 'module people' who find this feature irrelevant. Indeed. If we are interested in distributing modules and increasing the number of people who use python programs, then __source__ is redundant. OTOH, programming python is easy and fun and I think the proposed feature will make it even more fun, and it aims at increasing the number of people who program python for their everyday tasks. It'd be interesting to hear if the developers of IDLE/ipython/etc could use this.

The feedback here (and the initial response on py-dev a while back) suggests to me that you should look at making this a feature of the interactive mode. Something that affects both Python's main interactive shell, plus the relevant class in the standard library (CommandInterpreter or whatever it is called).

A late-night-train-of-thought example of what might be handy is below - keep in mind that I haven't looked at what enhanced Python shells like IPython can do, so it may be there are tools out there that do something like this already. It would be handy to have a standard library module that supported "on-the-fly" editing, though (this capability would then be available to anyone embedding Python as a scripting engine).

Cheers,
Nick.

==============================
>>> import source
>>> class bob:
...     def mary(self):
...         pass
...     def tim(self):
...         print 'Tim'
...
>>> print bob.__source__
class bob:
    def mary(self):
        pass
    def tim(self):
        print 'Tim'

>>> print bob.mary.__source__
    def mary(self):
        pass

>>> source.edit(bob.mary)
bob.mary(1)>def mary(self, text): # [1]
bob.mary(2)>    print "Mary:", text
bob.mary(3)>\save
>>> source.edit(bob.tim)
bob.tim(1)>\help
Commands: \help \cancel \save \deleteline
bob.tim(2)>\cancel
>>> print bob.__source__
class bob:
    def mary(self, text):
        print "Mary:", text
    def tim(self):
        print 'Tim'

>>> bob().mary("Hi!")
Mary: Hi!
==============================

The basic ideas of the above:

"import source" triggers the storage of the __source__ attributes (e.g. via installation of appropriate hooks in the class and function definition process).

The 'edit' function is then able to take advantage of the stored source code to present each line of the original source for modification (e.g. to fix a minor bug in one function of a class definition). When the 'edit' is complete, it can be saved or cancelled.

[1] The feature mentioned in the last paragraph is hard to show in the expected output :)

--
Nick Coghlan | ncoghlan@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
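
A non-interactive approximation of the store-edit-save cycle Nick describes is sketched below for Python 3. No 'source' module exists in the standard library; the `SourceStore` class and its `define` and `edit` methods are hypothetical stand-ins, with a dictionary of line replacements taking the place of the interactive line editor.

```python
class SourceStore:
    """Hypothetical stand-in for the proposed 'source' module."""

    def __init__(self):
        self.namespace = {}
        self.sources = {}  # name -> source text of the current definition

    def define(self, name, text):
        """Execute `text` and remember it as the source of `name`."""
        exec(text, self.namespace)
        self.sources[name] = text
        return self.namespace[name]

    def edit(self, name, replacements):
        """Replace the given 1-based lines, then re-execute the result
        (the equivalent of finishing an edit with the save command)."""
        lines = self.sources[name].splitlines()
        for lineno, new_line in replacements.items():
            lines[lineno - 1] = new_line
        return self.define(name, "\n".join(lines) + "\n")


store = SourceStore()
store.define("mary", "def mary(text):\n    return text\n")
mary = store.edit("mary", {2: "    return 'Mary: ' + text"})
print(mary("Hi!"))
```

Cancelling an edit corresponds to simply not calling `define` again, so the stored source and the live function stay in sync.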