Questionable AST wibbles

Jeremy, There are a bunch of mods from the AST branch that got integrated into head. Hopefully, by doing this on python-dev more people will get involved. I'll describe high level things first, but there will be a ton of details later on. If people don't want to see this crap on python-dev, I can take this offline. Highlevel overview of code size (rough decrease of 300 C source lines): * Python/compile.c -2733 (was 6822 now 4089) * Python/Python-ast.c +2281 (new file) * Python/asdl.c +92 (new file) * plus other minor mods symtable.h has lots of changes to structs and APIs. Not sure what needs to be doc'ed. I was very glad to see that ./python compileall.py Lib took virtually the same time before and after AST. Yeah! Unfortunately, I can't say the same for memory usage for running compileall: Before AST: [10120 refs] After AST: [916096 refs] I believe there aren't that many true memory leaks from running valgrind. Though there are likely some ref leaks. Most of this is probably stuff that we are just hanging on to that is not required. I will continue to run valgrind to find more problems. A bunch of APIs changed and there is some additional name pollution. Since these are pretty internal APIs, I'm not sure that part is a big deal. I will try to find more name pollution and eliminate it by prefixing with Py. One API change which I think was a mistake was _Py_Mangle() losing 2 parameters (I think this was how it was a long time ago). See typeobject.c, Python.h, compile.c. pythonrun.h has a bunch of changes. I think a lot of the APIs changed, but there might be backwards compatible macros. I'm not sure. I need to review closely. symtable.h has lots of changes to structs and APIs. Not sure what needs to be doc'ed. Some #defines are history (I think they are in the enum now): TYPE_*. code.h was added, but it mostly contains stuff from compile.h. Should we remove code.h and just put everything in compile.h? This will remove lots little changes. code.h & compile.h are tightly coupled. If we keep them separate, I would like to see some other changes. This probably is not a big deal, but I was surprised by this change: +++ test_repr.py 20 Oct 2005 19:59:24 -0000 1.20 @@ -123,7 +123,7 @@ def test_lambda(self): self.failUnless(repr(lambda x: x).startswith( - "<function <lambda")) + "<function lambda")) This one may be only marginally worse (names w/parameter unpacking): test_grammar.py - verify(f4.func_code.co_varnames == ('two', '.2', 'compound', - 'argument', 'list')) + vereq(f4.func_code.co_varnames, + ('two', '.1', 'compound', 'argument', 'list')) There are still more things I need to review. These were the biggest issues I found. I don't think most are that big of a deal, just wanted to point stuff out. n

On 10/21/05, Neal Norwitz <nnorwitz@gmail.com> wrote:
There are a bunch of mods from the AST branch that got integrated into head. Hopefully, by doing this on python-dev more people will get involved. I'll describe high level things first, but there will be a ton of details later on. If people don't want to see this crap on python-dev, I can take this offline.
Thanks for all the notes and questions, Neal. There were a lot of changes made over a long time, and it's good to discuss some of them.
Highlevel overview of code size (rough decrease of 300 C source lines): * Python/compile.c -2733 (was 6822 now 4089) * Python/Python-ast.c +2281 (new file) * Python/asdl.c +92 (new file) * plus other minor mods
symtable.h has lots of changes to structs and APIs. Not sure what needs to be doc'ed.
The old symtable wasn't well documented and the API it exposed to Python programmers was lousy. We need to figure out a good Python API and document it.
I was very glad to see that ./python compileall.py Lib took virtually the same time before and after AST. Yeah! Unfortunately, I can't say the same for memory usage for running compileall:
Before AST: [10120 refs] After AST: [916096 refs]
That's great news! That is, I expected it to be a lot slower to compile and didn't have any particulary good ideas about how to speed it up. I expected there to be a lot of memory bloat and think we can fix that without undue effort :-).
A bunch of APIs changed and there is some additional name pollution. Since these are pretty internal APIs, I'm not sure that part is a big deal. I will try to find more name pollution and eliminate it by prefixing with Py.
Right. The code isn't binary compatible with Python 2.4 right now, but given the APIs that changed I wasn't too concerned about that. I'm not sure who should make the final decision there.
One API change which I think was a mistake was _Py_Mangle() losing 2 parameters (I think this was how it was a long time ago). See typeobject.c, Python.h, compile.c.
I don't mind this one since it's an _Py function. I don't think code outside the core should use it.
pythonrun.h has a bunch of changes. I think a lot of the APIs changed, but there might be backwards compatible macros. I'm not sure. I need to review closely.
We should double-check. I tried to get rid of the nest of different functions that call each other by replacing the old ones with macros that call the newest ones (the functions that take the most arguments). It's not really a related change, except that it seemed like cleanup of compiler-related code. Also, a bunch of functions started taking const char* instead of char*. I think that's a net win, too.
code.h was added, but it mostly contains stuff from compile.h. Should we remove code.h and just put everything in compile.h? This will remove lots little changes. code.h & compile.h are tightly coupled. If we keep them separate, I would like to see some other changes.
I would like to keep them separate. The compiler produces code objects, but consumers of code objects don't need to know anything about the compiler. You did remind me that I intended to remove the #include "compile.h" lines from a bunch of files that merely consume code objects. What other changes would you like to see?
This probably is not a big deal, but I was surprised by this change:
+++ test_repr.py 20 Oct 2005 19:59:24 -0000 1.20 @@ -123,7 +123,7 @@
def test_lambda(self): self.failUnless(repr(lambda x: x).startswith( - "<function <lambda")) + "<function lambda"))
This one may be only marginally worse (names w/parameter unpacking):
test_grammar.py
- verify(f4.func_code.co_varnames == ('two', '.2', 'compound', - 'argument', 'list')) + vereq(f4.func_code.co_varnames, + ('two', '.1', 'compound', 'argument', 'list'))
There are still more things I need to review. These were the biggest issues I found. I don't think most are that big of a deal, just wanted to point stuff out.
I don't have a strong sense for how important these changes are. I don't think the old behavior was documented, but I can imagine some code depending on these implementation details. Jeremy

On 10/21/05, Jeremy Hylton <jeremy@alum.mit.edu> wrote:
On 10/21/05, Neal Norwitz <nnorwitz@gmail.com> wrote:
This probably is not a big deal, but I was surprised by this change:
+++ test_repr.py 20 Oct 2005 19:59:24 -0000 1.20 @@ -123,7 +123,7 @@
def test_lambda(self): self.failUnless(repr(lambda x: x).startswith( - "<function <lambda")) + "<function lambda"))
if this means that the __name__ attribute of a lambda now says "lambda" instead of "<lambda>", please change it back. The angle brackets make it stand out more, and I imagine people might be checking for this to handle it specially.
This one may be only marginally worse (names w/parameter unpacking):
test_grammar.py
- verify(f4.func_code.co_varnames == ('two', '.2', 'compound', - 'argument', 'list')) + vereq(f4.func_code.co_varnames, + ('two', '.1', 'compound', 'argument', 'list'))
This doesn't bother me. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

At 11:13 AM 10/21/2005 -0400, Jeremy Hylton wrote:
I don't have a strong sense for how important these changes are. I don't think the old behavior was documented, but I can imagine some code depending on these implementation details.
I'm pretty sure I've seen code in the field (e.g. recipes in the online Python cookbook) that checked for a function's name being '<lambda>'. That's also a thing that's likely to show up in people's doctests.

Jeremy Hylton wrote:
I would like to keep them separate. The compiler produces code objects, but consumers of code objects don't need to know anything about the compiler.
Please do keep these separate - the only reason I've ever had to muck with code objects is to check if a function is a generator or not, and including the entire compiler header just for that seemed like overkill. It's not a huge issue for me, but the separate header files do give better 'separation of concerns'. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com

Neal Norwitz wrote:
Jeremy,
There are a bunch of mods from the AST branch that got integrated into head. Hopefully, by doing this on python-dev more people will get involved. I'll describe high level things first, but there will be a ton of details later on. If people don't want to see this crap on python-dev, I can take this offline.
Highlevel overview of code size (rough decrease of 300 C source lines): * Python/compile.c -2733 (was 6822 now 4089) * Python/Python-ast.c +2281 (new file) * Python/asdl.c +92 (new file) * plus other minor mods
FYI, I'm getting these warnings: Python/Python-ast.c: In function `marshal_write_expr_context': Python/Python-ast.c:1995: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_boolop': Python/Python-ast.c:2070: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_operator': Python/Python-ast.c:2085: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_unaryop': Python/Python-ast.c:2130: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_cmpop': Python/Python-ast.c:2151: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_keyword': Python/Python-ast.c:2261: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_alias': Python/Python-ast.c:2270: warning: unused variable `i' -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2005)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
participants (6)
-
Guido van Rossum
-
Jeremy Hylton
-
M.-A. Lemburg
-
Neal Norwitz
-
Nick Coghlan
-
Phillip J. Eby