terminology for "free variables" in Python
The execution model section of the Python reference manual defines free variables as follows: "If a variable is used in a code block but not defined there, it is a free variable." This makes sense and fits the academic definition. The documentation of the symtable module supports this definition: it says about is_free(): "return True if the symbol is referenced in its block but not assigned to".

However, it appears that in the CPython front-end source code (in particular the parts dealing with the symbol table), a free variable has a somewhat stricter meaning. For example, in this chunk of code:

    def some_func(myparam):
        def internalfunc():
            return cc * myparam

CPython infers that in 'internalfunc', while 'myparam' is free, 'cc' is global because 'cc' isn't bound in the enclosing scope, although according to the definitions stated above, both should be considered free. The bytecode generated for loading cc and myparam is different, of course.

Is there a (however slight) inconsistency of terms here, or is it my misunderstanding?

Thanks in advance,
Eli
On Thu, Sep 9, 2010 at 9:43 AM, Eli Bendersky <eliben@gmail.com> wrote:
The execution model section of the Python reference manual defines free variables as follows:
"If a variable is used in a code block but not defined there, it is a free variable"
This makes sense and fits the academic definition. The documentation of the symtable module supports this definition - it says about is_free(): "return True if the symbol is referenced in its block but not assigned to".
However, it appears that in the CPython front-end source code (in particular the parts dealing with the symbol table), a free variable has a somewhat stricter meaning. For example, in this chunk of code:
def some_func(myparam):
    def internalfunc():
        return cc * myparam
CPython infers that in 'internalfunc', while 'myparam' is free, 'cc' is
What exactly do you mean by "infers"? How do you know that it infers that? How does it matter for your understanding of the code?
global because 'cc' isn't bound in the enclosing scope, although according to the definitions stated above, both should be considered free. The bytecode generated for loading cc and myparam is different, of course.
Is there a (however slight) inconsistency of terms here, or is it my misunderstanding?
That remains to be seen (please answer the questions above for a better understanding of your question). Maybe this helps though: global variables are a subset of free variables, and they are treated differently for various reasons (some historic, some having to do with optimizations in the code -- I think you saw the latter in the bytecode).
--
--Guido van Rossum (python.org/~guido)
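The bytecode difference mentioned above can be observed with the dis module. The following is a minimal sketch, not taken from the thread: the `return internalfunc` line and the `load_ops` name are additions for illustration, and the opcode names are CPython implementation details that may differ between versions.

```python
import dis


def some_func(myparam):
    def internalfunc():
        return cc * myparam
    # Returning the inner function is an addition for this sketch,
    # so that it can be inspected from outside.
    return internalfunc


inner = some_func(3)

# Map each of the two names to the opcode that loads it.
load_ops = {
    ins.argval: ins.opname
    for ins in dis.get_instructions(inner)
    if ins.argval in ("cc", "myparam")
}
print(load_ops)
```

On recent CPython releases the name resolved through the module globals and the name resolved through a closure cell are expected to show up under different load opcodes, which is the distinction the symbol table has to record.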
def some_func(myparam):
    def internalfunc():
        return cc * myparam
CPython infers that in 'internalfunc', while 'myparam' is free, 'cc' is
What exactly do you mean by "infers"? How do you know that it infers that? How does it matter for your understanding of the code?
The easiest way I found to see what CPython thinks is to use the 'symtable' module. With its help, it's clear that in the function above, myparam is considered free while cc is considered global. When querying symtable about the symbol myparam, the is_free method returns True while the is_global method returns False, and vice versa for cc.

Of course it can also be seen in the code of symtable.c in function analyze_name, and as Nick showed in his message it also affects the way bytecode is generated for the two symbols.

My intention in this post was to clarify whether I'm misunderstanding something or the term 'free' is indeed used for different things in different places. If it is the latter, IMHO it's an inconsistency, even if a small one. When I read the code and saw 'free', I went to the docs only to read that 'free' is something else. This was somewhat confusing.

Eli
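The symtable query described in this message can be reproduced roughly as follows. This is a sketch under assumptions: the helper names `child` and `classify` and the `"<example>"` filename are made up for illustration; only `symtable.symtable`, `get_children`, `get_name`, `lookup`, `is_free` and `is_global` come from the standard library.

```python
import symtable

SOURCE = """
def some_func(myparam):
    def internalfunc():
        return cc * myparam
"""


def child(table, name):
    """Return the nested symbol table with the given name (hypothetical helper)."""
    for sub in table.get_children():
        if sub.get_name() == name:
            return sub
    raise KeyError(name)


def classify(table, name):
    """Report how the symbol table classifies a name in one block."""
    sym = table.lookup(name)
    return {"free": sym.is_free(), "global": sym.is_global()}


top = symtable.symtable(SOURCE, "<example>", "exec")
internalfunc = child(child(top, "some_func"), "internalfunc")

myparam_flags = classify(internalfunc, "myparam")
cc_flags = classify(internalfunc, "cc")
print("myparam:", myparam_flags)
print("cc:", cc_flags)
```

The two dictionaries should mirror what the message reports: is_free is true only for myparam, and is_global is true only for cc.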
On Fri, Sep 10, 2010 at 12:00 AM, Eli Bendersky <eliben@gmail.com> wrote:
def some_func(myparam):
    def internalfunc():
        return cc * myparam
CPython infers that in 'internalfunc', while 'myparam' is free, 'cc' is
What exactly do you mean by "infers"? How do you know that it infers that? How does it matter for your understanding of the code?
The easiest way I found to see what CPython thinks is to use the 'symtable' module. With its help, it's clear that in the function above, myparam is considered free while cc is considered global. When querying symtable about the symbol myparam, the is_free method returns True while the is_global method returns False, and vice versa for cc.
Of course it can also be seen in the code of symtable.c in function analyze_name, and as Nick showed in his message it also affects the way bytecode is generated for the two symbols.
My intention in this post was to clarify whether I'm misunderstanding something or the term 'free' is indeed used for different things in different places. If it is the latter, IMHO it's an inconsistency, even if a small one. When I read the code and saw 'free', I went to the docs only to read that 'free' is something else. This was somewhat confusing.
I'm still not clear if my explanation that globals are a subset of free variables got rid of the confusion. The full name for what CPython marks as "free" would be "free but not global" but that's too much of a mouthful. Also you're digging awfully deep into the implementation here -- AFAIC CPython could have called them "type A" and "type B" and there would not have been any problem for compliance with the language reference.
--
--Guido van Rossum (python.org/~guido)
My intention in this post was to clarify whether I'm misunderstanding something or the term 'free' is indeed used for different things in different places. If it is the latter, IMHO it's an inconsistency, even if a small one. When I read the code and saw 'free', I went to the docs only to read that 'free' is something else. This was somewhat confusing.
I'm still not clear if my explanation that globals are a subset of free variables got rid of the confusion. The full name for what CPython marks as "free" would be "free but not global" but that's too much of a mouthful.
Yes, I understand it now. The source code of symtable.c has a long comment above the SET_SCOPE macro which says, among other things: "An implicit global is a free variable for which the compiler has found no binding in an enclosing function scope", which is in tune with what you said.
Also you're digging awfully deep into the implementation here --
Indeed, it all started when I set out to understand how symbol tables are implemented in CPython. The inconsistency in the usage of "free" confused me, so I consulted pydev for clarification. I'm no longer confused :-)
Regards,
Eli
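The rule quoted from symtable.c (an implicit global is a free variable with no binding in an enclosing function scope) can be checked by comparing two nearly identical functions. This is an illustrative sketch: the function names `without_binding` and `with_binding` are invented here, and `co_freevars` / `co_names` are CPython code object attributes.

```python
def without_binding():
    # 'cc' is not bound in any enclosing function scope,
    # so the compiler treats it as an implicit global.
    def inner():
        return cc
    return inner


def with_binding():
    # Binding 'cc' here makes it a closure (cell) variable for inner().
    cc = 2
    def inner():
        return cc
    return inner


implicit_global = without_binding().__code__
closure_var = with_binding().__code__

print(implicit_global.co_freevars, implicit_global.co_names)
print(closure_var.co_freevars, closure_var.co_names)
```

Only the second inner function should list cc among its free variables; in the first, cc is looked up by name instead.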
On Fri, Sep 10, 2010 at 2:43 AM, Eli Bendersky <eliben@gmail.com> wrote:
def some_func(myparam):
    def internalfunc():
        return cc * myparam
CPython infers that in 'internalfunc', while 'myparam' is free, 'cc' is global because 'cc' isn't bound in the enclosing scope, although according to the definitions stated above, both should be considered free. The bytecode generated for loading cc and myparam is different, of course.
Is there a (however slight) inconsistency of terms here, or is it my misunderstanding?
There's a slight inconsistency. The names a code object explicitly calls out as free variables (i.e. references to cells in outer scopes) are only a subset of the full set of free variables (every referenced name that isn't a local variable or an attribute).
>>> from dis import show_code
>>> def outer():
...     x, y = 1, 2
...     def inner():
...         print (x, y, a, b, c.e)
...     return inner
...
>>> f = outer()
>>> show_code(f)
Name:              inner
Filename:          <stdin>
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        6
Flags:             OPTIMIZED, NEWLOCALS, NESTED
Constants:
   0: None
Names:
   0: print
   1: a
   2: b
   3: c
   4: e
Free variables:
   0: y
   1: x
a, b, and c are also free variables in the more general sense, but the code object doesn't explicitly flag them as such since it doesn't need to do anything special with them. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
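The same split can be read directly off the code object, without going through show_code. This is a sketch rather than part of Nick's session; the variable names `flagged_free` and `looked_up_by_name` are chosen here for illustration.

```python
def outer():
    x, y = 1, 2
    def inner():
        print(x, y, a, b, c.e)
    return inner


code = outer().__code__

# Names the code object explicitly flags as free (closure cells).
flagged_free = set(code.co_freevars)

# Names resolved by lookup at run time: globals, builtins, attributes.
looked_up_by_name = set(code.co_names)

print(sorted(flagged_free))
print(sorted(looked_up_by_name))
```

Sets are used because the ordering of these tuples is an implementation detail. Here x and y land in co_freevars, while a, b and c appear only in co_names alongside print and the attribute name e.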
There's a slight inconsistency. The names a code object explicitly calls out as free variables (i.e. references to cells in outer scopes) are only a subset of the full set of free variables (every referenced name that isn't a local variable or an attribute).
>>> from dis import show_code
>>> def outer():
...     x, y = 1, 2
...     def inner():
...         print (x, y, a, b, c.e)
...     return inner
...
>>> f = outer()
>>> show_code(f)
Nick, did you know that dis.show_code is neither exported by default from the dis module, nor is it documented in its help() or .rst documentation? Neither is code_info(), which is used by show_code(). I wonder if this is intentional.
Eli
On Fri, Sep 10, 2010 at 5:06 PM, Eli Bendersky <eliben@gmail.com> wrote:
Nick, did you know that dis.show_code is neither exported by default from the dis module, nor is it documented in its help() or .rst documentation? Neither is code_info(), which is used by show_code(). I wonder if this is intentional.
code_info is in the normal documentation. I even remembered the versionadded tag without Georg reminding me ;)

The omission from __all__ (and hence the module help text) was accidental and is now fixed.

The omission of show_code from the documentation was deliberate, and I've now added a comment to that effect. (The history is that dis.show_code has been around, but undocumented, for a while. The fact that it printed directly to stdout rather than producing a formatted string was mildly irritating, so I refactored the formatting part out into code_info, leaving just a single print call in show_code. Since I only kept show_code around for backwards compatibility reasons, I don't see any point in advertising its existence - better for people to just call code_info and print the result themselves.)

Although it *is* somewhat handy for quick introspection at the interpreter prompt... maybe I should document it after all. Thoughts?

Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
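The relationship between the two helpers described here (code_info builds the string, show_code prints it) can be checked with a short sketch. The `sample` function and the variable names are placeholders for illustration; `dis.code_info` and `dis.show_code` are the documented functions.

```python
import contextlib
import dis
import io


def sample(a, b):
    return a + b


# code_info returns the formatted description as a string.
info = dis.code_info(sample)

# show_code prints the same description; capture stdout to compare.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    dis.show_code(sample)
printed = buffer.getvalue()

print(info)
```

Calling code_info and printing the result yourself gives the same text, which is why show_code is mainly a convenience at the interactive prompt.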
On Fri, Sep 10, 2010 at 15:41, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Fri, Sep 10, 2010 at 5:06 PM, Eli Bendersky <eliben@gmail.com> wrote:
Nick, did you know that dis.show_code is neither exported by default from the dis module, nor is it documented in its help() or .rst documentation? Neither is code_info(), which is used by show_code(). I wonder if this is intentional.
code_info is in the normal documentation. I even remembered the versionadded tag without Georg reminding me ;)
When you say "is in the normal documentation", do you mean you added it recently? Although I see it here: http://docs.python.org/dev/py3k/library/dis.html, it's neither in the docs of 3.1.2 (http://docs.python.org/py3k/library/dis.html), nor in 2.7, nor in a build of 3.2 I have lying around from a couple of weeks ago.
Although it *is* somewhat handy for quick introspection at the interpreter prompt... maybe I should document it after all. Thoughts?
I mostly use the dis module for quick-n-dirty exploration of the results of compilation into bytecode, and I'm sure many people use it for the same purpose. Thus show_code seems like a convenient shortcut, although not a necessary one. The string returned by code_info isn't interactive-shell friendly, and show_code saves the print(...).

Personally I think that if it's there, it should be documented. If it's better not to use it, it should be removed or at least marked deprecated in the documentation/docstring.

Eli
On Fri, Sep 10, 2010 at 11:23 PM, Eli Bendersky <eliben@gmail.com> wrote:
When you say "is in the normal documentation", do you mean you added it recently? Although I see it here: http://docs.python.org/dev/py3k/library/dis.html, it's neither in the docs of 3.1.2 (http://docs.python.org/py3k/library/dis.html), nor in 2.7, nor in a build of 3.2 I have lying around from a couple of weeks ago.
The module and docs changes both went in on August 17 as part of the same commit (r84133), so I'm not sure how you could have a local checkout with the module changes but not the doc changes. A checkout from early August wouldn't have either, of course.
I mostly use the dis module for quick-n-dirty exploration of the results of compilation into bytecode, and I'm sure many people use it for the same purpose. Thus show_code seems like a convenient shortcut, although not a necessary one. The string returned by code_info isn't interactive-shell friendly, and show_code saves the print(...).
Personally I think that if it's there, it should be documented. If it's better not to use it, it should be removed or at least marked deprecated in the documentation/docstring.
Yeah, I changed my mind and have now documented it properly. The 3.2 versionadded tag on show_code is currently a little questionable though. Guido actually checked in the original (undocumented) version of show_code before 3.0 was released. The only thing new about it in 3.2 is it being mentioned in the documentation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
I mostly use the dis module for quick-n-dirty exploration of the results of compilation into bytecode, and I'm sure many people use it for the same purpose. Thus show_code seems like a convenient shortcut, although not a necessary one. The string returned by code_info isn't interactive-shell friendly, and show_code saves the print(...).
Personally I think that if it's there, it should be documented. If it's better not to use it, it should be removed or at least marked deprecated in the documentation/docstring.
Yeah, I changed my mind and have now documented it properly. The 3.2 versionadded tag on show_code is currently a little questionable though. Guido actually checked in the original (undocumented) version of show_code before 3.0 was released. The only thing new about it in 3.2 is it being mentioned in the documentation.
Looks good to me. Eli
Yeah, I changed my mind and have now documented it properly. The 3.2 versionadded tag on show_code is currently a little questionable though. Guido actually checked in the original (undocumented) version of show_code before 3.0 was released. The only thing new about it in 3.2 is it being mentioned in the documentation.
versionadded marks the addition of a feature (see docs.python.org/documenting), so it should be removed here. Regards
On 10.09.2010 14:41, Nick Coghlan wrote:
On Fri, Sep 10, 2010 at 5:06 PM, Eli Bendersky <eliben@gmail.com> wrote:
Nick, did you know that dis.show_code is neither exported by default from the dis module, nor is it documented in its help() or .rst documentation? Neither is code_info(), which is used by show_code(). I wonder if this is intentional.
code_info is in the normal documentation. I even remembered the versionadded tag without Georg reminding me ;)
The omission from __all__ (and hence the module help text) was accidental and is now fixed.
The omission of show_code from the documentation was deliberate, and I've now added a comment to that effect. (The history is that dis.show_code has been around, but undocumented, for a while. The fact that it printed directly to stdout rather than producing a formatted string was mildly irritating, so I refactored the formatting part out into code_info, leaving just a single print call in show_code. Since I only kept show_code around for backwards compatibility reasons, I don't see any point in advertising its existence - better for people to just call code_info and print the result themselves.)
Although it *is* somewhat handy for quick introspection at the interpreter prompt... maybe I should document it after all. Thoughts?
IMO show_code() is not a good name, because the only thing it doesn't do is to -- show the code. I'd rather call it "codeinfo" (which also is more in line with current dis module function names). Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
On Sat, Sep 11, 2010 at 6:46 AM, Georg Brandl <g.brandl@gmx.net> wrote:
[me]
Although it *is* somewhat handy for quick introspection at the interpreter prompt... maybe I should document it after all. Thoughts?
IMO show_code() is not a good name, because the only thing it doesn't do is to -- show the code.
I'd rather call it "codeinfo" (which also is more in line with current dis module function names).
And, indeed, the variant I added that just returns the formatted string instead of printing it directly to stdout is called dis.code_info. dis.show_code is the existing helper that Guido added way back in 2007. As the checkin comment from back then put it, it shows you everything the interpreter knows about the code object except the details of the bytecode (which is already covered by dis.dis). So while I agree the name isn't great, I also don't think it is wrong enough to bother changing. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (5)
- Eli Bendersky
- Georg Brandl
- Guido van Rossum
- Nick Coghlan
- Éric Araujo