History on proposals for Macros?
I want Python to have macros. This is obviously a hard sell. I'm willing to do some legwork to demonstrate value. What would a good proposal include? Are there good examples of failed proposals on this topic? Is the core team open to this topic? Thank you for your time, - Matthew Rocklin
On 03/27/2015 05:19 PM, Matthew Rocklin wrote:
I want Python to have macros. This is obviously a hard sell. I'm willing to do some legwork to demonstrate value.
What would a good proposal include? Are there good examples of failed proposals on this topic?
Is the core team open to this topic?
You probably want to have a look at MacroPy [1]. I don't think it was ever seriously proposed for core, though.

Carl

[1] https://github.com/lihaoyi/macropy
On Mar 27, 2015, at 16:23, Carl Meyer <carl@oddbird.net> wrote:
On 03/27/2015 05:19 PM, Matthew Rocklin wrote: I want Python to have macros. This is obviously a hard sell. I'm willing to do some legwork to demonstrate value.
What would a good proposal include? Are there good examples of failed proposals on this topic?
Is the core team open to this topic?
You probably want to have a look at MacroPy [1].
I don't think it was ever seriously proposed for core, though.
I think it might be worth looking at what smaller core changes could allow MacroPy to be better or simpler than it is. Some of the changes to the import mechanism and the ast module that we've already had between 3.0 and 3.4 have done that, and they also have the benefit of making a Python implementation easier to understand and to hack on. I suspect the same would be true for other possible changes. And then, once those changes are in and MacroPy is as simple, pleasant, and wart-free as it can be, it would probably be an easier sell (although still maybe not easy...) to get something like it incorporated into the core compiler and/or importer instead of as an external import hook.

Of course you may look at MacroPy and think, "This is way too big and complicated, I just want to be able to do what Language X had without learning all this other stuff", but then it'll at least help focus the ideas on what you do and don't want, right?

Anyway, I think people who proposed macros in the past have had two major problems. First, they didn't think through the design. Lisp-style macros don't work in a language with syntax more complex than s-expressions. And making them partially text-based doesn't help either, given the way whitespace works in Python. And once you get to the point where you're doing AST transforms with the ast module, that doesn't really feel like macros anymore. If you instead come at it from the perspective of Dylan or one of the ML-with-macros variants, where what you're looking for is a way to write AST transformers in a declarative language that fits into the host language, you might get a lot farther.

Second, they didn't think through the marketing. Ask any famous Lisper why macros are important, and the answer is that they let you reprogram the language first, into a language that's easier to write your actual program in. That's a blatantly anti-Pythonic thing to do. You often can't read good Lisp code until you first learn the new, project-specific language they've built on Lisp, and nobody wants that for Python. Even smaller versions of that, like being able to create new flow control syntax, are an anti-selling-point for Python. So you need to think of a new way to sell macros that explicitly disavows the idea of creating new syntax, but still shows how macros can usefully do things that look and feel Pythonic but can't be done in Python.
On Fri, Mar 27, 2015 at 8:19 PM, Matthew Rocklin <mrocklin@gmail.com> wrote:
I want Python to have macros. This is obviously a hard sell. I'm willing to do some legwork to demonstrate value.
You're probably aware of this "prior art", but anyway, it's worth a link. Looks solid to me:

https://github.com/lihaoyi/macropy

Cheers,
Luciano

-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Professor at: http://python.pro.br | Twitter: @ramalhoorg
Responding to comments off list: I'm not referring to C-style preprocessor macros; I'm referring to the macros historically found in functional languages and commonly found in many user-targeted languages built in the last few years. The goal is to create things that look like functions but have access to the expression that was passed in. Some examples where this is useful:

    plot(year, miles / gallon)            # Plot with labels determined by the input expressions, e.g. miles/gallon
    assertRaises(ZeroDivisionError, 1/0)  # Evaluate the rhs 1/0 within the assertRaises function, not before
    run_concurrently(f(x), f(y), f(z))    # Run f three times in three threads controlled by run_concurrently

Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like thing to manipulate the expression and to control the context in which the evaluation occurs. (A rough sketch of the manual equivalent follows below the quoted message.)

There are lots of arguments against this, mostly focused on potential misuse. I'm looking for the history of such arguments and for a general "Yes, this is theoretically possible" or "Not a chance in hell" from the community. Both are fine.

Cheers, Matthew

On Fri, Mar 27, 2015 at 4:24 PM, Luciano Ramalho <luciano@ramalho.org> wrote:
On Fri, Mar 27, 2015 at 8:19 PM, Matthew Rocklin <mrocklin@gmail.com> wrote:
I want Python to have macros. This is obviously a hard sell. I'm willing to do some legwork to demonstrate value.
You're probably aware of this "prior art", but anyway, it's worth a link. Looks solid to me:
https://github.com/lihaoyi/macropy
Cheers,
Luciano
-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Professor at: http://python.pro.br | Twitter: @ramalhoorg
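Here is that sketch: a minimal, hypothetical rendering of the manual equivalent available today, where the caller builds the syntax tree and the context by hand with the ast module instead of the compiler capturing them. The three-argument assert_raises here is illustrative, not an existing API:

    import ast

    def assert_raises(exc_type, expr_tree, namespace):
        # A function-like thing that receives a syntax tree plus the
        # context to evaluate it in, instead of a pre-evaluated value.
        code = compile(expr_tree, '<expr>', 'eval')
        try:
            eval(code, namespace)
        except exc_type:
            return
        raise AssertionError('%s not raised' % exc_type.__name__)

    # What a macro would capture automatically, built by hand:
    tree = ast.parse('1/0', mode='eval')
    assert_raises(ZeroDivisionError, tree, {})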
On Sat, Mar 28, 2015 at 09:53:48AM -0700, Matthew Rocklin wrote: [...]
The goal is to create things that look like functions but have access to the expression that was passed in.
assertRaises(ZeroDivisionError, 1/0) # Evaluate the rhs 1/0 within assertRaises function, not before
Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like-thing to manipulate the expression and to control the context in which the evaluation occurs.
How will the Python compiler determine that assertRaises should receive the syntax tree rather than the evaluated 1/0 expression (which of course will raise)? The information that assertRaises is a "macro" is not available at compile time. I really like the idea of delaying the evaluation of certain expressions until a time of the caller's choosing, but I don't see how to make that work automatically. Of course we can do it manually by wrapping the expression in a function, or by writing it as a string and compiling it with compile() for later eval'ing, but the sort of thing you show above strikes me as fundamentally incompatible with Python's execution model. Happy to be proven wrong though. -- Steve
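A sketch of that manual route for the assertRaises example; the duplication between the expression and its string form is exactly the boilerplate a macro would remove:

    # Write the expression as a string and compile it for later eval'ing:
    delayed = compile('1/0', '<delayed>', 'eval')
    try:
        eval(delayed)    # evaluation happens at a time of the caller's choosing
    except ZeroDivisionError:
        print('raised, as expected')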
On Mar 28, 2015, at 10:26, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Mar 28, 2015 at 09:53:48AM -0700, Matthew Rocklin wrote: [...] The goal is to create things that look like functions but have access to the expression that was passed in.
assertRaises(ZeroDivisionError, 1/0) # Evaluate the rhs 1/0 within assertRaises function, not before
Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like-thing to manipulate the expression and to control the context in which the evaluation occurs.
How will the Python compiler determine that assertRaises should receive the syntax tree rather than the evaluated 1/0 expression (which of course will raise)? The information that assertRaises is a "macro" is not available at compile time.
Well, it _could_ be available. At the time you're compiling a scope (a function, class, or top-level module code), if it uses an identifier that's the name of a macro in scope, the compiler expands the macro instead of compiling in a function call. Python already has a few things you can do at runtime that affect subsequent compilation, like __future__ statements; this isn't impossible. Of course that doesn't mean it's a good idea. It would require some pretty significant changes that may not be immediately obvious (e.g., the whole .pyc mechanism no longer works as-is if you can import macros from other modules, which isn't an issue for __future__ statements...). And debugging could be a nightmare--you now have to know what names are defined at call time, and also what names were defined at definition time, to trace through a function. And so on.
I really like the idea of delaying the evaluation of certain expressions until a time of the caller's choosing, but I don't see how to make that work automatically. Of course we can do it manually by wrapping the expression in a function,
That doesn't work for all uses of macros--a macro can swap two variables, or break or return, etc., and a higher-order function whose arguments are delayed by wrapping them in a function can't do that. But it does work for _many_ uses of macros, like this one. And if you look at languages with light lambda syntax, a lot of things that you'd naturally write as a macro in Lisp, you instead naturally write as a plain higher-order function, and the result is often more readable, not less. (Explicit is better than implicit, but you sometimes don't realize it if the explicitness forces you to write more boilerplate than actual code...) Consider:

    assertRaises(ZeroDivisionError, :1/0)

That ":1/0" means the same thing as "lambda: 1/0". And now "wrapping the expression in a function" doesn't seem so bad.

If such a light-lambda syntax reduced the desire for macros down to the point where it could be ignored, and if that desire weren't _already_ low enough that it can be ignored, it would be worth adding. I think the second "if" is where it fails, not the first, but I could be wrong.
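For illustration, a minimal sketch (hypothetical, not unittest's actual API) of assertRaises written as a plain higher-order function taking a thunk:

    def assert_raises(exc_type, thunk):
        # thunk is a zero-argument callable wrapping the delayed expression
        try:
            thunk()
        except exc_type:
            return
        raise AssertionError('%s not raised' % exc_type.__name__)

    assert_raises(ZeroDivisionError, lambda: 1/0)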
On Sun, Mar 29, 2015 at 6:50 AM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
Python already has a few things you can do at runtime that affect subsequent compilation, like __future__ statements; this isn't impossible.
To be technically accurate, __future__ directives do not affect "subsequent compilation" - they affect the compilation of the one module they're at the top of. You can't do this:

    def func1(f):
        print "Hello, world!"
        f("Haha")

    from __future__ import print_function

    def func2():
        func1(print)

ChrisA
On Mar 28, 2015, at 15:45, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Mar 29, 2015 at 6:50 AM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
Python already has a few things you can do at runtime that affect subsequent compilation, like __future__ statements; this isn't impossible.
To be technically accurate, __future__ directives do not affect "subsequent compilation" - they affect the compilation of the one module they're at the top of, you can't do this:
    def func1(f):
        print "Hello, world!"
        f("Haha")

    from __future__ import print_function

    def func2():
        func1(print)
Except at the interactive interpreter, where you can write exactly that, and it will do exactly what you'd want/expect.
On Sat, Mar 28, 2015 at 12:50:09PM -0700, Andrew Barnert wrote:
On Mar 28, 2015, at 10:26, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Mar 28, 2015 at 09:53:48AM -0700, Matthew Rocklin wrote: [...] The goal is to create things that look like functions but have access to the expression that was passed in.
assertRaises(ZeroDivisionError, 1/0) # Evaluate the rhs 1/0 within assertRaises function, not before
Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like-thing to manipulate the expression and to control the context in which the evaluation occurs.
How will the Python compiler determine that assertRaises should receive the syntax tree rather than the evaluated 1/0 expression (which of course will raise)? The information that assertRaises is a "macro" is not available at compile time.
Well, it _could_ be available. At the time you're compiling a scope (a function, class, or top-level module code), if it uses an identifier that's the name of a macro in scope, the compiler expands the macro instead of compiling in a function call.
Perhaps I trimmed out too much of Matthew's comment, but he did say he isn't talking about C-style preprocessor macros, so I think that if you are imagining "expanding the macro" C-style, you're barking up the wrong tree. From his description, I don't think Lisp macros are quite the right description either. (As always, I welcome correction if I'm the one who is mistaken.)

C macros are more or less equivalent to a source-code rewriter: if you define a macro for "foo", whenever the compiler sees a token "foo", it replaces it with the body of the macro. More or less.

Lisp macros are different, and more powerful:

http://c2.com/cgi/wiki?LispMacro
http://cl-cookbook.sourceforge.net/macros.html

but I don't think that's what Matthew wants either. The critical phrase is, I think:

"one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context"

which I interpret in this way. Suppose we have this chunk of code creating then using a macro:

    macro mymacro(expr):
        # process expr somehow

    mymacro(x + 1)

That is *roughly* equivalent (ignoring the part about context) to what we can do today:

    def myfunction(expr):
        assert isinstance(expr, ast.Expression)
        # process expr somehow

    tree = ast.parse('x + 1', mode='eval')
    myfunction(tree)

The critical difference being that instead of the author writing code to manually generate the syntax tree from a string at runtime, the compiler automatically generates the tree from the source code at compile time.

This is why I think that it can't be done by Python. What should the compiler do here?

    callables = [myfunction, mymacro]
    random.shuffle(callables)
    for f in callables:
        f(x + 1)

If that strikes you as too artificial, how about a simpler case?

    from mymodule import name
    name(x + 1)

If `name` will refer to a function at runtime, the compiler needs to generate code which evaluates x+1 and passes the result to `name`; but if `name` will refer to a macro, the compiler needs to generate an ast (plus context) and pass it without evaluating it. To do that, it needs to know at compile-time which objects are functions and which are macros, but that is not available until runtime.

But we might be able to rescue this proposal by dropping the requirement that the compiler knows when to pass the syntax tree and when to evaluate it. Suppose instead we had a lightweight syntax for generating the AST plus grabbing the current context:

    x = 23
    spam(x + 1, !(x+1))    # macro syntax !( ... )

Now the programmer is responsible for deciding when to use an AST and when to evaluate it, not the compiler, and "macros" become regular functions which just happen to expect an AST as their argument.

[...]
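To sketch what !( ... ) might hand to the callee--quoted() here is a hypothetical helper standing in for the proposed syntax, with the capture done by hand:

    import ast

    def quoted(source, loc, glb):
        # What the compiler might emit for !(x+1): the parsed tree
        # plus the namespaces needed to evaluate it later.
        return ast.parse(source, mode='eval'), loc, glb

    def spam(value, q):
        tree, loc, glb = q
        # The callee decides when (and whether) to evaluate the AST.
        print(value, eval(compile(tree, '<quoted>', 'eval'), glb, loc))

    x = 23
    spam(x + 1, quoted('x + 1', locals(), globals()))    # prints: 24 24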
If such a light-lambda syntax reduced the desire for macros down to the point where it could be ignored, and if that desire weren't _already_ low enough that it can be ignored, it would be worth adding. I think the second "if" is where it fails, not the first, but I could be wrong.
I presume that Matthew wants the opportunity to post-process the AST, not merely evaluate it. If all you want is to wrap some code and an environment in a bundle for later evaluation, you are right, a function will do the job. But it's hard to manipulate byte code, hence the desire for a syntax tree. -- Steve
On Sun, Mar 29, 2015 at 12:51 PM, Steven D'Aprano <steve@pearwood.info> wrote:
But we might be able to rescue this proposal by dropping the requirement that the compiler knows when to pass the syntax tree and when to evaluate it. Suppose instead we had a lightweight syntax for generating the AST plus grabbing the current context:
    x = 23
    spam(x + 1, !(x+1))    # macro syntax !( ... )
Now the programmer is responsible for deciding when to use an AST and when to evaluate it, not the compiler, and "macros" become regular functions which just happen to expect an AST as their argument.
This is actually a lot more plausible than most of the other theories (except the preprocessor, but anyone can do that, and it's not necessarily going to help you any). If the magic macro operator is given precedence equivalent to lambda, it would often be possible to do it without the parens, too. In the same way that "lambda: expr" yields an object (a function) rather than evaluating its argument, the macro syntax would yield an object (an AST tree or somesuch) without actually evaluating the expression. ChrisA
On Mar 28, 2015, at 18:51, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Mar 28, 2015 at 12:50:09PM -0700, Andrew Barnert wrote:
On Mar 28, 2015, at 10:26, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Mar 28, 2015 at 09:53:48AM -0700, Matthew Rocklin wrote: [...] The goal is to create things that look like functions but have access to the expression that was passed in.
assertRaises(ZeroDivisionError, 1/0) # Evaluate the rhs 1/0 within assertRaises function, not before
Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like-thing to manipulate the expression and to control the context in which the evaluation occurs.
How will the Python compiler determine that assertRaises should receive the syntax tree rather than the evaluated 1/0 expression (which of course will raise)? The information that assertRaises is a "macro" is not available at compile time.
Well, it _could_ be available. At the time you're compiling a scope (a function, class, or top-level module code), if it uses an identifier that's the name of a macro in scope, the compiler expands the macro instead of compiling in a function call.
Perhaps I trimmed out too much of Matthew's comment, but he did say he isn't talking about C-style preprocessor macros, so I think that if you are imagining "expanding the macro" C-style, you're barking up the wrong tree.
No, I'm not imagining C-style expansion. What I'm imagining is closer to Lisp-style, and closer still to Dylan-style.* I didn't want to get into details because they're complicated, there may even be multiple complicated ways to do things that have to be chosen from, and none of that is relevant to your question. But if you're curious, let me give a more specific explanation:

A macro is compiled by transforming its AST into AST-transforming bytecode, similar to a function, but then instead of embedding a MAKE_FUNCTION opcode into the defining scope's bytecode, you do that and _also_ effectively call it in the current scope and bind the macro to the name there.** A macro is expanded by parsing the arguments into ASTs, calling the AST-transforming function on those ASTs, and substituting the result into the tree at the point of the macro call.*** You don't need to explicitly "pass" a context; the context in which the function is called (the compile-time scope, etc.) is implicitly available, just as for runtime functions.

As I mentioned before, there are a number of additional issues you'd have to resolve (again, think of import and .pyc files, for an example), some of which may make the feature undesirable once you think them through, but I don't think any of them are relevant to your question, and I think something like this design is what he was asking about.

* In Lisp, because there is no syntax and hence no separate step to turn a parenthesized token stream into an AST, it's ambiguous at which stage--before or after that non-existent step--macros are applied. In languages with syntax and grammars, the usual answer is to do it after parsing to AST. You can conceptually define macros at any stage in the pipeline, but that's where they turn out to be most useful. Also, I'm ignoring the issue of hygiene, but I think most people want macros to be hygienic by default and unhygienic only on explicit demand, rather than what Lisp does.

** The details here are tricky because the compiler's notion of "current scope" isn't defined by the language anyway and doesn't correspond to anything defined at runtime, but the intuitive idea is clear: while compiling a module or other scope, if it reaches a def or class, a compiler has to do the equivalent of recursively calling itself or pushing a scope onto a stack manually; that stack defines the compile-time scope. So the name bound to the macro goes away when you exit that recursive call/pop that scope from the stack. The compiler currently keeps track of all variables assigned to in the current scope to determine local variables and closures; I believe macro assignments can be piled on top of that. If not, this is something completely new that you have to bolt on.

*** In most languages with syntax and macros, a macro can only take an expression, and must return an expression--which is fine for most of those languages, where almost everything (flow control, variable binding, etc.) is an expression, but not so much in Python, where many of those things can only be done in a statement. But allowing a statement or an expression (or any arbitrary AST node), and allowing a macro to likewise "return" either (or any) type, isn't straightforward once you think through some examples.
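As a toy illustration of just the expansion step (all names here--MACROS, macro, Expander, log_expr--are hypothetical, and ast.unparse only exists in newer Pythons): a "macro" is a function from AST to AST, substituted into the tree before compilation. This only shows the AST-to-AST substitution; it runs as a separate pass rather than inside the compiler, which is the hard part being discussed:

    import ast

    MACROS = {}

    def macro(fn):
        MACROS[fn.__name__] = fn
        return fn

    @macro
    def log_expr(arg):
        # Rewrite log_expr(expr) into print('expr', '=', expr)
        return ast.Call(func=ast.Name(id='print', ctx=ast.Load()),
                        args=[ast.Constant(ast.unparse(arg)),
                              ast.Constant('='), arg],
                        keywords=[])

    class Expander(ast.NodeTransformer):
        def visit_Call(self, node):
            self.generic_visit(node)
            if isinstance(node.func, ast.Name) and node.func.id in MACROS:
                return MACROS[node.func.id](*node.args)
            return node

    tree = Expander().visit(ast.parse('log_expr(x + 1)'))
    code = compile(ast.fix_missing_locations(tree), '<expanded>', 'exec')
    exec(code, {'x': 41})    # prints: x + 1 = 42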
From his description, I don't think Lisp macros are quite the right description either. (As always, I welcome correction if I'm the one who is mistaken.)
C macros are more or less equivalent to a source-code rewriter: if you define a macro for "foo", whenever the compiler sees a token "foo", it replaces it with the body of the macro. More or less.
Lisp macros are different, and more powerful:
http://c2.com/cgi/wiki?LispMacro http://cl-cookbook.sourceforge.net/macros.html
but I don't think that's what Matthew wants either. The critical phrase is, I think:
"one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context"
which I interpret in this way:
Suppose we have this chunk of code creating then using a macro:
    macro mymacro(expr):
        # process expr somehow

    mymacro(x + 1)
That is *roughly* equivalent (ignoring the part about context) to what we can do today:
    def myfunction(expr):
        assert isinstance(expr, ast.Expression)
        # process expr somehow

    tree = ast.parse('x + 1', mode='eval')
    myfunction(tree)
The critical difference being that instead of the author writing code to manually generate the syntax tree from a string at runtime, the compiler automatically generates the tree from the source code at compile time.
This is why I think that it can't be done by Python. What should the compiler do here?
    callables = [myfunction, mymacro]
    random.shuffle(callables)
    for f in callables:
        f(x + 1)
Well, that depends on how much of Python you want to be available at compile time. One possibility is that only the defmacro statement is executed at compile time. You may want to add imports to that. You may want to add assignments. You may want to add some large, ridiculously-complex, but well-defined subset of Python (I would ask you to think of C++ constexpr rules, but that request might be construed as torture). Or you may even want the entire language. And any of the above could be modified by making some or all of those constructs compile-time only when explicitly required (again, like constexpr). Any of these is conceptually sensible (although none of them may be desirable...), and they give you different answers here. For example, if you only execute defmacro and import at compile time, then at compile time f is not expanded as a macro; it's just called as a function, which will probably raise a TypeError at runtime (because the current value of x + 1 is probably not an AST node...).
If that strikes you as too artificial, how about a simpler case?
    from mymodule import name
    name(x + 1)
If `name` will refer to a function at runtime, the compiler needs to generate code which evaluates x+1 and passes the result to `name`; but if `name` will refer to a macro, the compiler needs to generate an ast (plus context) and pass it without evaluating it. To do that, it needs to know at compile-time which objects are functions and which are macros, but that is not available until runtime.
This is exactly what I was referring to when I said that you need to make significant changes to other parts of the language, such as revising the import machinery and .pyc files, and that this problem does not apply to future statements. I don't want to go over all the details in as much depth as the last question, so hopefully you'll just accept that the answer is the same: it's not available today, but it could be available, in a variety of different ways that you'd have to choose between, all of which would have different knock-on effects.
But we might be able to rescue this proposal by dropping the requirement that the compiler knows when to pass the syntax tree and when to evaluate it. Suppose instead we had a lightweight syntax for generating the AST plus grabbing the current context:
    x = 23
    spam(x + 1, !(x+1))    # macro syntax !( ... )
That's a lot closer to what MacroPy does. And notice that if the !() syntax were added to the grammar, MacroPy or something like it could be significantly simpler.* Which means it might be easier to integrate it directly into the builtin compiler--but also means it might be less desirable to do so, as leaving it as an externally-supplied import hook has all the benefits of externally-supplied modules in general.

(However, there is still the disadvantage that you have to apply an import hook before importing, effectively meaning you can't use or define macros in your top-level script. If necessary, you could fix that as well with special syntax, which must come before anything but comments and future statements, and which adds an import hook. This is the kind of thing I was talking about in my first message, about finding smaller changes to the language that make MacroPy or something like it simpler and/or more flexible.)

* I believe OCaml recently added something similar for similar purposes, but I haven't used it in a few years, so I may be misinterpreting what I saw skimming the what's new for the last major version.
Now the programmer is responsible for deciding when to use an AST and when to evaluate it, not the compiler, and "macros" become regular functions which just happen to expect an AST as their argument.
No, not quite. What do you _do_ with the AST returned by the macro? And when do you do it? You still have to substitute it into the tree of the current compilation target in place of the macro call, which means it still has to be available at compile time. It does provide a simpler way to resolve some of the other issues, but it doesn't resolve the most fundamental one.
[...]
If such a light-lambda syntax reduced the desire for macros down to the point where it could be ignored, and if that desire weren't _already_ low enough that it can be ignored, it would be worth adding. I think the second "if" is where it fails, not the first, but I could be wrong.
I presume that Matthew wants the opportunity to post-process the AST, not merely evaluate it. If all you want is to wrap some code and an environment in a bundle for later evaluation, you are right, a function will do the job. But it's hard to manipulate byte code, hence the desire for a syntax tree.
Sure, but most of the examples people give for wanting macros--including his example that you quoted--don't actually do anything that can't be done with a higher-order function. Which means they may not actually want macros; they just think they do. Ask a Haskell lover why Haskell doesn't need macros, and he'll tell you that it's because you don't need them, you only think you do because of your Lisp prejudices. Of course that isn't 100% true,* but it's true enough that most people are happy without macros in Haskell. Similarly, while it would be even farther from 100% true in Python,** it might still be true enough that most people are happy without macros in Python. (Except, as I said, most people are _already_ happy without macros in Python, which means we may have an even simpler option: just do nothing.)

* For one reasonably well-known example (although I may be misremembering, so take this as a "this kind of thing" rather than "exactly this..."): if Haskell98 had macros, you could use them to simulate GADTs, which didn't exist until a later version of the language. For an equivalent example in Python: you could use macros to simulate with statements in Python 2.5. As long as non-silly cases for macros are rare enough, people are satisfied with evaluating them at language design time (a discussion on the list followed by an update to the language and a patch to GHC/CPython) instead of compile time. :)

** The main reason it would be less true in Python is eager evaluation; a lazy language like Haskell (or, even better, a dataflow language) can replace even more uses of macros with HOFs than an eager language. For example, in Haskell, the equivalent of "def foo(x, y): return y if x else 0" doesn't need the value of y unless x is true, so it doesn't matter that y is a value rather than an expression. But OCaml, for example, also doesn't have lazy evaluation, and people seem to have the same attitude toward macros there too. (Although it does have a powerful preprocessor, it's not that much different from what Python has with import hooks.)

Well, despite trying to skim over some parts, I still wrote a whole book here; apologies for that, to anyone who's still reading. :)
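To spell out that footnote in Python terms: wrapping the argument in a thunk recovers the laziness by hand:

    def foo(x, y):
        # y is a zero-argument callable; it is only invoked when needed,
        # which is what lazy evaluation would give you for free.
        return y() if x else 0

    print(foo(False, lambda: 1/0))    # 0 -- the division never runs
    print(foo(True, lambda: 42))      # 42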
On 03/28/2015 09:51 PM, Steven D'Aprano wrote:
But we might be able to rescue this proposal by dropping the requirement that the compiler knows when to pass the syntax tree and when to evaluate it. Suppose instead we had a lightweight syntax for generating the AST plus grabbing the current context:
    x = 23
    spam(x + 1, !(x+1))    # macro syntax !( ... )
Now the programmer is responsible for deciding when to use an AST and when to evaluate it, not the compiler, and "macros" become regular functions which just happen to expect an AST as their argument.
Something related to this that I've wanted to experiment with, but that is hard to do in Python: to be able to split a function signature and body, and be able to use them independently. (But in a well-defined way.)

A signature object could have a default body that returns the closure. And a body (or code) could have a default signature that *takes* a namespace. Then a function becomes ...

    code(sig(...)) <---> function(...)

The separate parts could be created with a decorator.

    @signature
    def sig_x(x): pass

    @code
    def inc_x(): x += 1

    @code
    def dec_x(): x -= 1

In most cases it's best to think of applying code bodies to namespaces.

    names = sig_x(0)
    inc_x(names)
    dec_x(names)

That is nicer than continuations, as each code block is a well-defined unit that executes to completion and doesn't require suspending the frame.

(Yes, it can be done with dictionaries, but that wouldn't give the macro like functionality (see below) this would. And there may be other benefits to having it at a lower, more efficient level.)

To allow macro like ability a code block needs to be executable in the current scope. That can be done just by doing...

    code(locals())    # Dependable?

And sugar to do that could be...

    if x < 10:
        ^^ inc_x    # just an example syntax.
    else:
        ^^ dec_x    # Note the ^^ looks like the M in Macro. ;-)

Possibly the decorators could be used with lambda directly to get inline functionality.

    code(lambda : x + 1)

And a bit of sugar to shorten the common uses if needed.

    spam(x + 1, code(lambda : x + 1))
    spam(x + 1, ^^: x + 1)

Cheers, Ron
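A rough approximation of the above that runs today--using source strings instead of decorated defs, which sidesteps (for now) the local-variable compilation issues discussed later in the thread:

    def code(src):
        # Compile a statement fragment once; apply it to any namespace.
        compiled = compile(src, '<block>', 'exec')
        def run(ns):
            exec(compiled, ns)    # the block reads and writes names in ns
            return ns
        return run

    inc_x = code('x += 1')
    dec_x = code('x -= 1')

    names = {'x': 0}
    inc_x(names)
    inc_x(names)
    dec_x(names)
    print(names['x'])    # 1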
On 03/29/2015 01:21 PM, Ron Adam wrote:
To allow macro like ability a code block needs to be executable in the current scope. That can be done just by doing...
code(locals()) # Dependable?
Just to clarify, this should have been...

    code_obj(locals())

Since I used "code" as the decorator name, this may have been a bit confusing. Cheers, Ron
On Mar 29, 2015, at 10:21, Ron Adam <ron3200@gmail.com> wrote:
On 03/28/2015 09:51 PM, Steven D'Aprano wrote: But we might be able to rescue this proposal by dropping the requirement that the compiler knows when to pass the syntax tree and when to evaluate it. Suppose instead we had a lightweight syntax for generating the AST plus grabbing the current context:
    x = 23
    spam(x + 1, !(x+1))    # macro syntax !( ... )
Now the programmer is responsible for deciding when to use an AST and when to evaluate it, not the compiler, and "macros" become regular functions which just happen to expect an AST as their argument.
Something related to this that I've wanted to experiment with, but that is hard to do in Python: to be able to split a function signature and body, and be able to use them independently. (But in a well-defined way.)
Almost everything you're asking for is already there. A function object contains, among other things, a sequence of closure cells, a local and global environment, default parameter values, and a code object. A code object contains, among other things, parameter names, a count of locals, and a bytecode string.

You can see the attributes of these objects at runtime, and the inspect module docs describe what they mean. You can also construct these objects at runtime by using their constructors (you have to use types.FunctionType and types.CodeType; the built-in help can show you the parameters). You can also compile source code (or an AST) to a code object with the compile function. You can call a code object with the exec function, which takes a namespace (or, optionally, separate local and global namespaces--and, in a slightly hacky way, you can also override the builtin namespace).

There are also Signature objects in the inspect module, but they're not "live" usable objects, they're nicely-organized-for-human-use representations of the signature of a function. So practically you'd use a dummy function object or just a dict or something, and create a new function from its attributes/members and the new code.

So, except for minor spelling differences, that's exactly what you're asking for, and it's already there. But if I've guessed right about _why_ you want this, it doesn't do what you'd want it to, and I don't think there's any way it could.

Bytecode accesses locals (including arguments), constants, and closure cells by index from the frame object, not by name from a locals dict (although the frame has one of those as well, in case you want to debug or introspect, or call locals()). So, when you call a function, Python sets up the frame object, matching positional and keyword arguments (and default values in the function object) up to parameters and building up the sequence of locals. The frame is also essential for returning and uncaught exceptions (it has a back pointer to the calling frame).

The big thing you can't do directly is to create new closure cells programmatically from Python. The compiler has to know which of your locals will be used as closure variables by any embedded functions; it then stores these specially within your code object, so the MAKE_CLOSURE bytecode that creates a function object out of each embedded function can create matching closure cells to store in the embedded function object. This is the part that you need to add into what Python already has, and I'm not sure there's a clean way to do it.

But you really should learn how all the existing stuff works (the inspect docs, the dis module, and the help for the constructors in the types module are actually sufficient for this, without having to read the C source code, in 3.4 and later) and find out for yourself, because if I'm wrong, you may come up with something cool. (Plus, it's fun and useful to learn about.)

That only gets you through the first half of your message--enough to make inc_x work as a local function (well, almost--without a "nonlocal x" statement it's going to compile x as a local variable rather than a closure variable, and however you execute it, you're just going to get UnboundLocalError). What about the second part, where you execute code in an existing frame? That's even trickier. A frame already has its complete lists of locals, cells, and constants at construction time.
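A concrete peek at those per-frame indices, using inc_x as the fragment (exact bytecode and constant ordering vary by version):

    import dis

    def inc_x():
        x += 1    # with no "nonlocal x", x is compiled as a local

    print(inc_x.__code__.co_varnames)   # ('x',) -- locals referenced by index
    print(inc_x.__code__.co_consts)     # the constant 1 lives in here, by index
    dis.dis(inc_x)                      # LOAD_FAST 0 (x) ... STORE_FAST 0 (x)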
If your code objects never used any new locals or constants, and never touched any nonlocal variables that weren't already touched by the calling function, all you need is some kind of "relocation" step that remaps the indices compiled into the bytecode into the indices in the calling frame (there's enough info in the code objects to do the mapping; for actually creating the relocated bytecode from the original you'll want something like the byteplay module, which unfortunately doesn't exist for 3.4--although I have an incomplete port that might be good enough to play with if you're interested).

You can almost get away with the "no new locals or nonlocal cells" part, but "no new constants" is pretty restrictive. For example, if you compile inc_x into a fragment that can be executed inline, the number 1 is going to be constant #0 in its code object. And now, you try to "relocate" it to run in a frame with a different code object, and (unless that different code object happened to refer to 1 as a constant as well) there's nothing to match it up to. And again, I don't see a way around this without an even more drastic rearchitecting of how Python frames work--but again, I think it's worth looking for yourself in hopes that I'm wrong.

There's another problem: every function body compiled to code ends with a return bytecode. If you didn't write one explicitly, you get the equivalent of "return None". That means that, unless you solve that in some way, executing a fragment inline is always going to return from the calling function. And what do you want to do about explicit return? Or break and continue?

Of the two approaches, I think the first one seems cleaner. If you can make a closure cell out of x and then wrap inc_x's code in a normal closure that references it, that still feels like Python. (And having to make "nonlocal x" explicit seems like a good thing, not a limitation.) Fixing up fragments to run in a different frame, and modifying frames at runtime to allow them to be fixed up, seems a lot hackier. And the whole return issue is pretty serious, too.

One last possibility to consider is something between the two: a different kind of object, defined differently, like a proc in Ruby (which is defined with a block rather than a def), might solve some of the problems with either approach. And stealing from Ruby again, procs have their own "mini-frames"; there's a two-level stack where every function stack frame has a proc stack frame, which allows a solution to the return-value problem that wouldn't be available with either closures or fragments. (However, note that the return-value problem is much more serious in Ruby, where everything is supposed to be an expression, with a value; in Python you can just say "fragment calls are statements, so they don't have values" if that's what you want.)

One last note inline:
A signature object could have a default body that returns the closure.
And a body (or code) could have a default signature that *takes* a namespace.
Then a function becomes ...
code(sig(...)) <---> function(...)
The separate parts could be created with a decorator.
    @signature
    def sig_x(x): pass

    @code
    def inc_x(): x += 1

    @code
    def dec_x(): x -= 1
In most cases it's best to think of applying code bodies to namespaces.

    names = sig_x(0)
    inc_x(names)
    dec_x(names)

That is nicer than continuations, as each code block is a well-defined unit that executes to completion and doesn't require suspending the frame.

(Yes, it can be done with dictionaries, but that wouldn't give the macro like functionality (see below) this would. And there may be other benefits to having it at a lower, more efficient level.)
To allow macro like ability a code block needs to be executable in the current scope. That can be done just by doing...
code(locals()) # Dependable?
And sugar to do that could be...
    if x < 10:
        ^^ inc_x    # just an example syntax.
    else:
        ^^ dec_x    # Note the ^^ looks like the M in Macro. ;-)
Possibly the decorators could be used with lambda directly to get inline functionality.
code(lambda : x + 1)
This is a very different thing from what you were doing above. A function that modifies a closure cell's value, like inc_x, can't be written as a lambda (because assignments are statements). And this lambda is completely pointless if you're going to use it in a context where you ignore its return value (like the way you used inc_x above). So, I'm not sure what you're trying to do here, but I think you may have another problem to solve on top of the ones I already mentioned.
And a bit of sugar to shorten the common uses if needed.
spam(x + 1, code(lambda : x + 1))
spam(x + 1, ^^: x + 1)
Cheers, Ron
On 03/29/2015 08:36 PM, Andrew Barnert wrote:
Something related to this that I've wanted to experiment with, but that is hard to do in Python: to be able to split a function signature and body, and be able to use them independently. (But in a well-defined way.)
Almost everything you're asking for is already there.
Yes, I have looked into most of what you mention here.
A function object contains, among other things, a sequence of closure cells, a local and global environment, default parameter values, and a code object.
A code object contains, among other things, parameter names, a count of locals, and a bytecode string.
You can see the attributes of these objects at runtime, and the inspect module docs describe what they mean. You can also construct these objects at runtime by using their constructors (you have to use types.FunctionType and types.CodeType; the built-in help can show you the parameters).
You can also compile source code (or an AST) to a code object with the compile function.
You can call a code object with the exec function, which takes a namespace (or, optionally, separate local and global namespaces--and, in a slightly hacky way, you can also override the builtin namespace).
There are also Signature objects in the inspect module, but they're not "live" usable objects, they're nicely-organized-for-human-use representations of the signature of a function. So practically you'd use a dummy function object or just a dict or something, and create a new function from its attributes/members and the new code.
So, except for minor spelling differences, that's exactly what you're asking for, and it's already there.
I think it's more than minor spelling differences. :-) I've played around with de-constructing functions and using the constructors to put them back together again, enough to know it's actually quite hard to get everything right with anything other than the original parts.

The exec function may be a good start to experimenting with this. I haven't used it with code objects enough to be familiar with what limits it has. It may not be that difficult to copy the C source for exec and create a new function more specific to this idea (as a test). Even if that's slow, it may be good enough as a start.
But if I've guessed right about_why_ you want this, it doesn't do what you'd want it to, and I don't think there's any way it could.
Bytecode accesses locals (including arguments), constants, and closure cells by index from the frame object, not by name from a locals dict (although the frame has one of those as well, in case you want to debug or introspect, or call locals()). So, when you call a function, Python sets up the frame object, matching positional and keyword arguments (and default values in the function object) up to parameters and building up the sequence of locals. The frame is also essential for returning and uncaught exceptions (it has a back pointer to the calling frame).
This isn't a problem if the callable_code object creates a new frame. It is an issue when running a code block in the current frame. But I think there may be a way to get around that.
The big thing you can't do directly is to create new closure cells programmatically from Python. The compiler has to know which of your locals will be used as closure variables by any embedded functions; it then stores these specially within your code object, so the MAKE_CLOSURE bytecode that creates a function object out of each embedded function can create matching closure cells to store in the embedded function object. This is the part that you need to add into what Python already has, and I'm not sure there's a clean way to do it.
But you really should learn how all the existing stuff works (the inspect docs, the dis module, and the help for the constructors in the types module are actually sufficient for this, without having to read the C source code, in 3.4 and later) and find out for yourself, because if I'm wrong, you may come up with something cool. (Plus, it's fun and useful to learn about.)
I'm familiar with most of how Python works, and have even hacked a bit on ceval.c (for fun). I haven't played much with the AST side of things, but I do know generally how Python is put together and works.
That only gets you through the first half of your message--enough to make inc_x work as a local function (well, almost--without a "nonlocal x" statement it's going to compile x as a local variable rather than a closure variable, and however you execute it, you're just going to get UnboundLocalError).
What about the second part, where you execute code in an existing frame?
That's even trickier.
Yes, I definitely agree. I think one of the tests of a good idea is that it makes something that is normally hard (or tricky), simple and easy. But actually doing that may be quite hard. (or even not possible.)
A frame already has its complete lists of locals, cells, and constants at construction time. If your code objects never used any new locals or constants, and never touched any nonlocal variables that weren't already touched by the calling function, all you need is some kind of "relocation" step that remaps the indices compiled into the bytecode into the indices in the calling frame (there's enough info in the code objects to do the mapping; for actually creating the relocated bytecode from the original you'll want something like the byteplay module, which unfortunately doesn't exist for 3.4--although I have an incomplete port that might be good enough to play with if you're interested).
You can almost get away with the "no new locals or nonlocal cells" part, but "no new constants" is pretty restrictive. For example, if you compile inc_x into a fragment that can be executed inline, the number 1 is going to be constant #0 in its code object. And now, you try to "relocate" it to run in a frame with a different code object, and (unless that different code object happened to refer to 1 as a constant as well) there's nothing to match it up to.
And again, I don't see a way around this without an even more drastic rearchitecting of how Python frames work--but again, I think it's worth looking for yourself in hopes that I'm wrong.
Think of these things as non_local_blocks. The difference is they would use dynamic scope instead of static scope. Or to put it another way, they would inherit the scope they are executed in.
There's another problem: every function body compiler to code ends with a return bytecode. If you didn't write one explicitly, you get the equivalent of "return None". That means that, unless you solve that in some way, executing a fragment inline is always going to return from the calling function. And what do you want to do about explicit return? Or break and continue?
As a non_local_block, a return could be a needed requirement. It would return the value to the current location in the current frame. Break and continue are harder; probably they should give an error, the same as what happens when they are used outside a loop. A break or return would need to be local to the loop, so you can't have a break in a non_local_block unless the block also has the loop in it. That keeps the associated parts local to each other.
Of the two approaches, I think the first one seems cleaner. If you can make a closure cell out of x and then wrap inc_x's code in a normal closure that references it, that still feels like Python. (And having to make "nonlocal x" explicit seems like a good thing, not a limitation.)
Agree... I picture the first approach as a needed step to get to the second part.
Fixing up fragments to run in a different frame, and modifying frames at runtime to allow them to be fixed up, seems a lot hackier. And the whole return issue is pretty serious, too.
One last possibility to consider is something between the two: a different kind of object, defined differently, like a proc in Ruby (which is defined with a block rather than a def) might solve some of the problems with either approach. And stealing from Ruby again, procs have their own "mini-frames"; there's a two-level stack where every function stack frame has a proc stack frame, which allows a solution to the return-value problem that wouldn't be available with either closures or fragments.
This may be closer to how I am thinking it would work. :-)
(However, note that the return-value problem is much more serious in Ruby, where everything is supposed to be an expression, with a value; in Python you can just say "fragment calls are statements, so they don't have values" if that's what you want.)
It seems to me they can be either. Python ignores None when it's returned by a function and not assigned to anything. And if a value is returned, then it's returned to the current position in the current frame. The return in this case is a non-local-block return, so I think it wouldn't be an issue.
One last note inline:
A signature object could have a default body that returns the closure.
And a body (or code) could have a default signature that*takes* a namespace.
Then a function becomes ...
code(sig(...)) <---> function(...)
The separate parts could be created with a decorator.
    @signature
    def sig_x(x): pass

    @code
    def inc_x(): x += 1

    @code
    def dec_x(): x -= 1
In most cases it's best to think of applying code bodies to namespaces.

    names = sig_x(0)
    inc_x(names)
    dec_x(names)

That is nicer than continuations, as each code block is a well-defined unit that executes to completion and doesn't require suspending the frame.

(Yes, it can be done with dictionaries, but that wouldn't give the macro like functionality (see below) this would. And there may be other benefits to having it at a lower, more efficient level.)
To allow macro like ability a code block needs to be executable in the current scope. That can be done just by doing...
code(locals()) # Dependable?
And sugar to do that could be...
    if x < 10:
        ^^ inc_x    # just an example syntax.
    else:
        ^^ dec_x    # Note the ^^ looks like the M in Macro. ;-)
Possibly the decorators could be used with lambda directly to get inline functionality.
code(lambda : x + 1)
This is a very different thing from what you were doing above. A function that modifies a closure cell's value, like inc_x, can't be written as a lambda (because assignments are statements). And this lambda is completely pointless if you're going to use it in a context where you ignore its return value (like the way you used inc_x above). So, I'm not sure what you're trying to do here, but I think you may have another problem to solve on top of the ones I already mentioned.
It was an incomplete example. It should have been...

    add_1_to_x = code(lambda: x + 1)

and then later you could use it in the same way as above.

    x = ^^ add_1_to_x

This is just an example to show how the first option above connects to the examples below, with "^^: x + 1" being equivalent to "code(lambda: x + 1)". Which would also be equivalent to ...

    @code
    def add_1_to_x(): return x + 1

    x = ^^ add_1_to_x
And a bit of sugar to shorten the common uses if needed.
spam(x + 1, code(lambda : x + 1))
spam(x + 1, ^^: x + 1)
On Mar 29, 2015, at 21:12, Ron Adam <ron3200@gmail.com> wrote:
On 03/29/2015 08:36 PM, Andrew Barnert wrote:
Something related to this that I've wanted to experiment with, but that is hard to do in Python: to be able to split a function signature and body, and be able to use them independently. (But in a well-defined way.)
Almost everything you're asking for is already there.
Yes, I have looked into most of what you mention here.
A function object contains, among other things, a sequence of closure cells, a local and global environment, default parameter values, and a code object.
A code object contains, among other things, parameter names, a count of locals, and a bytecode string.
You can see the attributes of these objects at runtime, and the inspect module docs describe what they mean. You can also construct these objects at runtime by using their constructors (you have to use types.FunctionType and types.CodeType; the built-in help can show you the parameters).
You can also compile source code (or an AST) to a code object with the compile function.
You can call a code object with the exec function, which takes a namespace (or, optionally, separate local and global namespaces--and, in a slightly hacky way, you can also override the builtin namespace).
There are also Signature objects in the inspect module, but they're not "live" usable objects, they're nicely-organized-for-human-use representations of the signature of a function. So practically you'd use a dummy function object or just a dict or something, and create a new function from its attributes/members and the new code.
So, except for minor spelling differences, that's exactly what you're asking for, and it's already there.
I think it's more than minor spelling differences. :-)
The objects you're asking for already exist, but in some cases with slightly different names. There isn't neat calling syntax--things like cloning a function object with a different code object but the same other attributes are a bit painful--but it's just a few two-line wrapper functions that you only have to write once (I think you can even subclass CodeType and FunctionType to add the syntax; if not, you can wrap and delegate). So, what you're asking for really is already part of Python, except for minor spelling differences. The problem is that what you're asking for doesn't give you what you want, because other things (like "the environment") don't work the way you're assuming they do, or because of consequences you probably haven't thought of (like the return issue).
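For instance, one such wrapper might look like this (a sketch, not a polished API; with_code is a hypothetical name):

    import types

    def with_code(fn, code):
        # Clone fn, swapping in a different code object but keeping its
        # globals, name, defaults, and closure.
        return types.FunctionType(code, fn.__globals__, fn.__name__,
                                  fn.__defaults__, fn.__closure__)

    def f(a, b=2):
        return a + b

    def g(a, b=2):
        return a * b

    print(with_code(f, g.__code__)(3))    # 6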
I've played around with de-constructing functions and using the constructors to put them back together again, enough to know it's actually quite hard to get everything right with anything other than the original parts.
It really isn't, once you learn how they work. Getting the values to put into them (assembling bytecode, creating the lnotab, etc.) is a different story, but if your intention is to do that just by compiling function (or fragment) source, it's all done for you.
The exec function may be a good start to experimenting with this. I haven't used it with code objects enough to be familiar with what limits it has. It may not be that difficult to copy the C source for exec and create a new function more specific to this idea (as a test).
Even if that's slow, it may be good enough as a start.
Why would it be slow? Also, what do you want your function to do differently from exec? What you described is wanting to run a code object in a specified environment; that's what exec does.

I think you're thinking of Python environments as if they were Scheme environments, but they aren't. In particular, Python closures don't work in terms of accessing variables from the environment by name; they work in terms of accessing cells from the function object.
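A quick demonstration of that point--the closed-over variable lives in a cell hanging off the function object, not in any name-keyed environment:

    def make_counter():
        n = 0
        def bump():
            nonlocal n
            n += 1
            return n
        return bump

    c = make_counter()
    print(c.__code__.co_freevars)            # ('n',)
    print(c.__closure__[0].cell_contents)    # 0
    c()
    print(c.__closure__[0].cell_contents)    # 1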
But if I've guessed right about_why_ you want this, it doesn't do what you'd want it to, and I don't think there's any way it could.
Bytecode accesses locals (including arguments), constants, and closure cells by index from the frame object, not by name from a locals dict (although the frame has one of those as well, in case you want to debug or introspect, or call locals()). So, when you call a function, Python sets up the frame object, matching positional and keyword arguments (and default values in the function object) up to parameters and building up the sequence of locals. The frame is also essential for returning and uncaught exceptions (it has a back pointer to the calling frame).
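You can watch this index-based access happen with the dis module (a small illustration; the exact output varies by version):

import dis

def f(a):
    b = a + 1
    return b

dis.dis(f)
#   LOAD_FAST     0 (a)     <-- locals are fetched by index, not name
#   LOAD_CONST    1 (1)
#   BINARY_ADD
#   STORE_FAST    1 (b)
#   LOAD_FAST     1 (b)
#   RETURN_VALUE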
This isn't a problem if the callable code object creates a new frame.
Yes, it is, as I explained before. There's the minor problem that your code object needs to use nonlocal if it wants to reassign to the calling frame's x variable, and the major problem that there is no way to give it a closure that gives it access to that frame's x variable unless, when that calling frame's code was compiled, there was already an embedded function that needed to close over x. I suppose you could hack up the compiler to generate cells for every local variable instead of just those that are actually needed. Then you could sort of do what you want--instead of exec, you construct a function with the appropriate cells in the closure (by matching up the callee code's freevars with the calling code's cellvars).
It is an issue when running a code block in the current frame. But I think there may be a way to get around that.
The big thing you can't do directly is to create new closure cells programmatically from Python. The compiler has to know which of your locals will be used as closure variables by any embedded functions; it then stores these specially within your code object, so the MAKE_CLOSURE bytecode that creates a function object out of each embedded function can create matching closure cells to store in the embedded function object. This is the part that you need to add into what Python already has, and I'm not sure there's a clean way to do it.
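The one well-known workaround is to let the compiler make cells for you and then rewire them by hand (a sketch; make_cell and rebind_closure are invented names, and the target code object must already have matching free variables):

import types

def make_cell(value):
    # The compiler creates a cell because the lambda closes over
    # 'value'; we simply steal that cell.
    return (lambda: value).__closure__[0]

def rebind_closure(f, *values):
    # Rebuild f with a freshly made closure.  Only works if f's code
    # already has free variables (co_freevars) to bind the cells to.
    cells = tuple(make_cell(v) for v in values)
    return types.FunctionType(f.__code__, f.__globals__,
                              f.__name__, f.__defaults__, cells)

def outer():
    x = 10
    def inner():
        return x + 1
    return inner

inner = outer()
inner2 = rebind_closure(inner, 100)
print(inner(), inner2())   # --> 11 101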
But you really should learn how all the existing stuff works (the inspect docs, the dis module, and the help for the constructors in the types module are actually sufficient for this, without having to read the C source code, in 3.4 and later) and find out for yourself, because if I'm wrong, you may come up with something cool. (Plus, it's fun and useful to learn about.)
I'm familiar with most of how Python works, and even hacked a bit on ceval.c (for fun). I haven't played much with the AST side of things, but I do know generally how Python is put together and works.
That only gets you through the first half of your message--enough to make inc_x work as a local function (well, almost--without a "nonlocal x" statement it's going to compile x as a local variable rather than a closure variable, and however you execute it, you're just going to get UnboundLocalError).
What about the second part, where you execute code in an existing frame?
That's even trickier.
Yes, I definitely agree. I think one of the tests of a good idea is that it makes something that is normally hard (or tricky) simple and easy. But actually doing that may be quite hard (or even not possible).
A frame already has its complete lists of locals, cells, and constants at construction time. If your code objects never used any new locals or constants, and never touched any nonlocal variables that weren't already touched by the calling function, all you need is some kind of "relocation" step that remaps the indices compiled into the bytecode into the indices in the calling frame (there's enough info in the code objects to do the mapping; for actually creating the relocated bytecode from the original you'll want something like the byteplay module, which unfortunately doesn't exist for 3.4--although I have an incomplete port that might be good enough to play with if you're interested).
You can almost get away with the "no new locals or nonlocal cells" part, but "no new constants" is pretty restrictive. For example, if you compile inc_x into a fragment that can be executed inline, the number 1 is going to be constant #0 in its code object. And now, you try to "relocate" it to run in a frame with a different code object, and (unless that different code object happened to refer to 1 as a constant as well) there's nothing to match it up to.
And again, I don't see a way around this without an even more drastic rearchitecting of how Python frames work--but again, I think it's worth looking for yourself in hopes that I'm wrong.
Think of these things as non_local_blocks. The difference is they would use dynamic scope instead of static scope. Or to put it another way, they would inherit the scope they are executed in.
That's a bit of a weird name, since it's a block that's executed locally on the current frame, as opposed to the normal non-local way, but... OK. It makes sense conceptually, but practically it doesn't fit in with Python's execution model.

First, the scope is partly defined at compilation time--that's where the list of constants comes from, and the lists of local and free names. If the frame's code is still the calling function, none of these things are available in the frame. If, on the other hand, it's the non_local_block (what I was calling the "fragment"), then the calling scope's variables aren't available. (And just copying the caller's code's stuff into a new frame only gives you copies of the caller's variables, which doesn't let you do the main thing you wanted to do.) Unless you somehow make them both available (e.g., the Ruby-style two-level call stack I mentioned), I don't see a way around that.

Second, remember that Python bytecode accesses locals by index; unless you do something like the relocation I described above, you have no way to access the calling scope's variables. That's doable, but not at all trivial, and doesn't seem very Pythonic. Also, even if you do that, you've still got a problem. Python's LEGB rule that decides which scope to find a name in is handled mostly at compile time. The compiler decides whether to emit LOAD_FAST or one of the other LOAD_*, and likewise for STORE_*, based on the static lexical scope the name is defined in. That's going to make it very hard to hack in dynamic scope on top of Python bytecode--the "relocation" has to involve not just renumbering locals, but reproducing all the work the compiler does to decide what's local/global/etc. and replacing instructions as appropriate.

As an alternative, maybe you don't want code objects at all, but rather AST objects. Then they can be compiled to bytecode in a given scope (with the builtin compile function) and then executed there (with exec). This solves most of the new problems added by the dynamic scoping/non_local_block/fragment idea while still allowing most of the benefits. The downside, of course, is that you're compiling stuff all over the place--but that's how dynamic code works in most Lisps, so...
There's another problem: every function body compiled to code ends with a return bytecode. If you didn't write one explicitly, you get the equivalent of "return None". That means that, unless you solve that in some way, executing a fragment inline is always going to return from the calling function. And what do you want to do about explicit return? Or break and continue?
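You can see the implicit return with dis (output abridged):

import dis

def fragment():
    x = 1          # no explicit return anywhere

dis.dis(fragment)
#   ...
#   LOAD_CONST    0 (None)
#   RETURN_VALUE            <-- every compiled body ends this way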
As a non_local_block, a return could be a needed requirement. It would return the value to the current location in the current frame.
Do you mean that a call to a non_local_block is an expression, and the value of the expression is the value returned by the block? If so, that makes sense, but then the RETURN_VALUE handler in the interpreter has to be sensitive to the context it's in and do two different things, or the compiler has to be sensitive to what it's compiling--which seems impossible, given that you want to def a regular function and then decorate it after the fact--and issue a different opcode for return (and for the implicit return None).
Break and continue are harder. Probably give an error the same as what would happen if they are used outside a loop. A break or return would need to be local to the loop. So you can't have a break in a non_local_block unless the block also has the loop in it. That keeps the associated parts local to each other.
That definitely makes things simpler--and, I think, better. It will piss off people who expect (from Lisp or Ruby) to be able to do flow control across the boundaries of a non_local_block, but too bad for them. :)
Of the two approaches, I think the first one seems cleaner. If you can make a closure cell out of x and then wrap inc_x's code in a normal closure that references it, that still feels like Python. (And having to make "nonlocal x" explicit seems like a good thing, not a limitation.)
Agree... I picture the first approach as a needed step to get to the second part.
I don't think it really is; most of what you need to solve for the first becomes irrelevant for the second. For example, figuring out a way to dynamically generate closure cells to share variables across frames is useless when you switch to running with the parent's variables as locals in the same frame.
Fixing up fragments to run in a different frame, and modifying frames at runtime to allow them to be fixed up, seems a lot hackier. And the whole return issue is pretty serious, too.
One last possibility to consider is something between the two: a different kind of object, defined differently, like a proc in Ruby (which is defined with a block rather than a def) might solve some of the problems with either approach. And stealing from Ruby again, procs have their own "mini-frames"; there's a two-level stack where every function stack frame has a proc stack frame, which allows a solution to the return-value problem that wouldn't be available with either closures or fragments.
This may be closer to how I am thinking it would work. :-)
It sounds like it might be. I think it's a really clumsy solution, but obviously it can work or Ruby wouldn't work. :) Notice that Ruby also avoids a lot of the problems below by having completely different syntax and semantics for defining procs vs. functions, and not allowing you to convert them into each other. But if you don't want that, then you have to solve all the problems they got around this way.
(However, note that the return-value problem is much more serious in Ruby, where everything is supposed to be an expression, with a value; in Python you can just say "fragment calls are statements, so they don't have values" if that's what you want.)
It seems to me they can be either. Python ignores None when it's returned by a function and not assigned to anything. And if a value is returned, then it's returned to the current position in the current frame.
No, that's not how things work. Python doesn't do anything special with None. A value is _always_ returned, whether None or otherwise. That value always becomes the value of the calling expression. And Python certainly doesn't care whether you assign it to something--e.g., you can use it inside a larger expression or return it or yield it without assigning it to anything. Of course if the outermost expression is part of an expression statement, the value that it evaluated to is ignored, but that has nothing to do with the value being None, or coming from a function call, or anything else; an expression statement just means "evaluate this expression, then throw away the results".

Also, you're missing the bigger point: look at how RETURN_VALUE actually works. If you're running inside the caller's scope, it's going to return from the caller's scope unless you do something (which you need to figure out) to make that not true.
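A minimal illustration of that point:

def f():
    pass           # falls off the end: an implicit "return None" runs

result = f()       # None is a real value, delivered to the caller
print(result)      # --> None
f()                # an expression statement just discards the value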
The return in this case is a non-local-block return. So I think it wouldn't be an issue.
What does "a non-local-block return" mean at the implementation level? A different opcode from RETURN_VALUE? A RETURN_VALUE executed from within a non_local_block object's code? A RETURN_VALUE executed from a frame that has a non_local_block running? Whatever the answer, how does the compiler or interpreter (as appropriate) know to do that?
One last note inline:
A signature object could have a default body that returns the closure.
And a body (or code) could have a default signature that *takes* a namespace.
Then a function becomes ...
code(sig(...)) <---> function(...)
The separate parts could be created with a decorator.
@signature
def sig_x(x): pass

@code
def inc_x(): x += 1

@code
def dec_x(): x -= 1
In most cases it's best to think of applying code bodies to name spaces.
names = sig_x(0)
inc_x(names)
dec_x(names)
That is nicer than continuations as each code block is a well defined unit that executes to completion and doesn't require suspending the frame.
(Yes, it can be done with dictionaries, but that wouldn't give the macro-like functionality (see below) this would. And there may be other benefits to having it at a lower, more efficient level.)

To allow macro-like ability a code block needs to be executable in the current scope. That can be done just by doing...
code(locals()) # Dependable?
And sugar to do that could be...
if x < 10:
    ^^ inc_x    # just an example syntax
else:
    ^^ dec_x    # Note the ^^ looks like the M in Macro. ;-)
Possibly the decorators could be used with lambda directly to get inline functionality.
code(lambda : x + 1)
This is a very different thing from what you were doing above. A function that modifies a closure cell's value, like inc_x, can't be written as a lambda (because assignments are statements). And this lambda is completely pointless if you're going to use it in a context where you ignore its return value (like the way you used inc_x above). So, I'm not sure what you're trying to do here, but I think you may have another problem to solve on top of the ones I already mentioned.
It was an incomplete example. It should have been...
add_1_to_x = code(lambda: x + 1)
and then later you could use it in the same way as above.
x = ^^ add_1_to_x
OK, it sounds like what you're really looking for here is that code(spam) returns a function that's just like spam, but all of its variables (although you still have to work out what that means--remember that Python has already decided local vs. cell vs. global at compile time, before you even get to this code function) will use dynamic rather than lexical scoping. All of the other stuff seems to be irrelevant.

In fact, maybe it would be simpler to just do what Lisp does: explicitly define individual _variables_ as dynamically scoped, effectively the same way we can already define variables as global or nonlocal, instead of compiling a function and then trying to turn some of its variables into dynamic variables after the fact.

And the good news is, I'm 99% sure someone already did this and wrote a blog post about it. I don't know where, and it may be a few years and versions out of date, but it would be nice if you could look at what he did, see that you're 90% of the way to what you want, and just have to solve the last 10%.

Plus, you can experiment with this without hacking up anything, with a bit of clumsiness. It's pretty easy to create a class whose instances dynamically scope their attributes with an explicit stack. (If it isn't obvious how, let me know and I'll write it for you.) Then you just instantiate that class (globally, if you want), and have both the caller and the callee use an attribute of that instance instead of a normal variable whenever you want a dynamically-scoped variable, and you're done. You can write nice examples that actually work in Python today to show how this would be useful, and then compare to how much better it would look with real dynamic variable support.
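For the record, here is roughly what that class could look like (a minimal sketch; the DynamicScope name and let() API are invented for illustration):

import contextlib

class DynamicScope:
    """Attribute lookups walk a stack of bindings, so the innermost
    active let() wins--dynamic scoping, faked with an explicit stack."""
    def __init__(self):
        object.__setattr__(self, '_stack', [{}])
    def __getattr__(self, name):
        for bindings in reversed(self._stack):
            if name in bindings:
                return bindings[name]
        raise AttributeError(name)
    def __setattr__(self, name, value):
        self._stack[-1][name] = value
    @contextlib.contextmanager
    def let(self, **bindings):
        self._stack.append(dict(bindings))
        try:
            yield
        finally:
            self._stack.pop()

dyn = DynamicScope()
dyn.x = 1

def callee():
    return dyn.x            # resolved at call time, not definition time

with dyn.let(x=42):
    print(callee())         # --> 42
print(callee())             # --> 1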
This is just an example to show how the first option above connects to the examples below, with "^^: x + 1" being equivalent to "code(lambda: x + 1)".
Which would also be equivalent to ...
@code
def add_1_to_x(): return x + 1
x = ^^ add_1_to_x
And a bit of sugar to shorten the common uses if needed.
spam(x + 1, code(lambda : x + 1))
spam(x + 1, ^^: x + 1)
On Mon, Mar 30, 2015 at 12:03:47AM -0700, Andrew Barnert wrote:
On Mar 29, 2015, at 21:12, Ron Adam <ron3200@gmail.com> wrote: [snip stuff about dynamic scoping]
Raymond Hettinger's ChainMap (now also in the std lib) may possibly be used to emulate different scoping rules, including dynamic.

http://code.activestate.com/recipes/577434-nested-contexts-a-chain-of-mappin...
http://code.activestate.com/recipes/305268-chained-map-lookups/

If you can build a function object with __globals__ set to a chained map, you may be able to accomplish most of what you are discussing. By memory, __globals__ needs to be a dict, so you have to use a dict subclass, something like this:

class Context(ChainMap, dict):
    pass

-- Steve
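For a quick taste of the idea without touching __globals__ at all (a minimal sketch; in Python 3, exec requires a real dict for globals but accepts any mapping for locals):

from collections import ChainMap

outer = {'x': 1}
inner = {}
scope = ChainMap(inner, outer)

exec("y = x + 1", {}, scope)   # reads fall through the chain to outer
print(inner)                   # --> {'y': 2}; writes land in the first map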
On 03/30/2015 03:03 AM, Andrew Barnert wrote:
Break and continue are harder. Probably give an error the same as what would happen if they are used outside a loop. A break or return would need to be local to the loop. So you can't have a break in a non_local_block unless the block also has the loop in it. That keeps the associated parts local to each other.
That definitely makes things simpler--and, I think, better. It will piss off people who expect (from Lisp or Ruby) to be able to do flow control across the boundaries of a non_local_block, but too bad for them.
Or in my own language I wrote to test ideas like this. In it, the keywords are objects, so you can return a statement/keyword from a function and have it execute at that location. So for example a function can increment a counter and return a "nil" keyword (a no-op like Python's pass keyword), and once a limit is reached, return a "break" keyword, which breaks the loop.

The memory model is by static hash-map lookup. Each defined function object gets a reference to the parent's hash-map. So name resolution walks the hash-map tree until it finds the name. There are a number of ways to make that more efficient, but for now it keeps the language simple.

Python on the other hand does many more checks at compile time, and pulls the name references into the code object. That allows for smaller and faster byte code execution. This seems to be where the main issues are. But as you noted, many of the pieces are already there. So it shouldn't be that difficult to come up with a working test implementation.

For now, experimenting with exec may be the best way to test the idea. And possibly come back with some real code to discuss and see where that goes. I think until some real code is written we will go in circles pointing out things wrong with the idea. So let's wait a bit for some real code and examples.

When people think of macros they may be thinking of several different concepts. One is to redefine a statement or expression in a way that makes it easier to use. In that case the expression is transposed to another form at compile time. Lisp macros work in this way. Another is to in-line a block of code defined in a single location to numerous other locations. Generally this is done with pre-processors. I think in-lining python functions has been discussed before, and that idea would overlap this one.

My interest in functions that can be taken apart and reused with signature/scope objects is a bit different: the idea of mutating the name space by applying code to it, rather than calling code by applying values to it (a normal function call). This of course is what objects do; they have a state, and methods are used to alter the state. But there are other things I think may be of interest in this idea that may relate back to other areas of python.

(NOTE: Just read Steven's message about Raymond's chainmap. So I'm going to see if it is useful.)

And the reason for bringing this idea up here was I think it could be used to implement the lite macro behaviour that was suggested, with a bit of added syntax. On the other hand it appears to me that Python is going in the direction of making it easier to compile to C code. More dynamic features may not be helpful in the long run.
It was an incomplete example. It should have been...
add_1_to_x = code(lambda: x + 1)
and then later you could use it in the same way as above.
x = ^^ add_1_to_x

OK, it sounds like what you're really looking for here is that code(spam) returns a function that's just like spam, but all of its variables (although you still have to work out what that means--remember that Python has already decided local vs. cell vs. global at compile time, before you even get to this code function) will use dynamic rather than lexical scoping. All of the other stuff seems to be irrelevant.
In fact, maybe it would be simpler to just do what Lisp does: explicitly define individual _variables_ as dynamically scoped, effectively the same way we can already define variables as global or nonlocal, instead of compiling a function and then trying to turn some of its variables into dynamic variables after the fact.
It's something to try, but if a code block needs boilerplate to work, or the function it's put in needs it, it really isn't going to be very nice.
And the good news is, I'm 99% sure someone already did this and wrote a blog post about it. I don't know where, and it may be a few years and versions out of date, but it would be nice if you could look at what he did, see that you're 90% of the way to what you want, and just have to solve the last 10%.
Plus, you can experiment with this without hacking up anything, with a bit of clumsiness. It's pretty easy to create a class whose instances dynamically scope their attributes with an explicit stack. (If it isn't obvious how, let me know and I'll write it for you.) Then you just instantiate that class (globally, if you want), and have both the caller and the callee use an attribute of that instance instead of a normal variable whenever you want a dynamically-scoped variable, and you're done. You can write nice examples that actually work in Python today to show how this would be useful, and then compare to how much better it would look with real dynamic variable support.
I'm going to play with the idea a bit over the next few days. :-) Cheers, Ron
This is just an example to show how the first option above connects to the examples below, with "^^: x + 1" being equivalent to "code(lambda: x + 1)".
Which would also be equivalent to ...
@code
def add_1_to_x(): return x + 1
x = ^^ add_1_to_x
And a bit of sugar to shorten the common uses if needed.

spam(x + 1, code(lambda : x + 1))

spam(x + 1, ^^: x + 1)
On Mar 30, 2015, at 11:55, Ron Adam <ron3200@gmail.com> wrote:
On 03/30/2015 03:03 AM, Andrew Barnert wrote:
[snip]
And the reason for bringing this idea up here was I think it could be used to implement the lite macro behaviour that was suggested with a bit of added syntax.
On the other hand it appears to me, that Python is going in the direction of making it easier to compile to C code. More dynamic features may not be helpful in the long run.
I don't think this is true. There definitely isn't any momentum in that direction--shedskin and its two competitors are dead, while PyPy and numba are alive and kicking ass. And I don't think it's desirable, or that the core developers think it's desirable. If you think Guido's and Jukka's static type annotations are a step in the direction of static compilation, read the two PEPs; that's very explicitly not a goal, at least in the near term. (And the impression I get from the list discussions is that Guido is skeptical that it will be in the long term, but he's willing to keep an open mind.)

Anyway, usually, macros are used to generate a few functions that are called many times, not to generate a whole lot of functions that are only called once, so they don't cause any problems for a decent tracing JIT.
It was an incomplete example. It should have been...
add_1_to_x = code(lambda: x + 1)
and then later you could use it in the same way as above.
x = ^^ add_1_to_x

OK, it sounds like what you're really looking for here is that code(spam) returns a function that's just like spam, but all of its variables (although you still have to work out what that means--remember that Python has already decided local vs. cell vs. global at compile time, before you even get to this code function) will use dynamic rather than lexical scoping. All of the other stuff seems to be irrelevant.
In fact, maybe it would be simpler to just do what Lisp does: explicitly define individual _variables_ as dynamically scoped, effectively the same way we can already define variables as global or nonlocal, instead of compiling a function and then trying to turn some of its variables into dynamic variables after the fact.
It's something to try, but if a code block needs boilerplate to work, or the function it's put in needs it, it really isn't going to be very nice.
I think I didn't explain this very well.

Take a step back. Instead of having functions whose variables are all lexically scoped, and blocks whose variables are all dynamically scoped, what if you had just one kind of function, but two kinds of variables? Exactly like defvar in Lisp (although hopefully with a better name...). That's obviously more flexible. And it should be much simpler to implement. (And it's dead simple to fake in standard CPython 3.4, without hacking anything--as Steven pointed out, ChainMap makes it even easier.)

Once you have dynamic variables, it's not that hard to add a new defblock statement that acts just like def except its variables are dynamic by default, instead of lexical. That skips over all the problems with your first version and immediately gets you your second version, and it avoids most of the problems there as well.

The only thing you're missing in the end is the ability to convert functions to blocks and vice-versa at runtime. That's the part that causes all the problems, and I don't think is actually necessary for what you want (but if I'm wrong, I guess never mind :).
On 03/30/2015 04:54 PM, Andrew Barnert wrote:
That skips over all the problems with your first version and immediately gets you your second version, and it avoids most of the problems there as well. The only thing you're missing in the end is the ability to convert functions to blocks and vice-versa at runtime. That's the part that causes all the problems, and I don't think is actually necessary for what you want (but if I'm wrong, I guess never mind:).
I don't think it's necessary either, and converting function blocks to an insertable code block isn't what I had in mind. If you can run a function block in a specified frame, then it isn't needed. Just grab the current frame and execute the function block with it, as a function. Of course the programmer would be responsible for making sure the names in the code block are available in the name space used with it. Name errors should propagate normally.

result = call_with_frame(function_code_object, sys._getframe(0))

But I think it will still run into the name issues you mentioned earlier. And as you also mentioned, it's quite possible someone has already done a call_with_frame function. I just haven't found it yet.

Cheers, Ron
On 03/30/2015 04:54 PM, Andrew Barnert wrote:
[snip]
And the reason for bringing this idea up here was I think it could be used to implement the lite macro behaviour that was suggested with a bit of added syntax.
On the other hand it appears to me, that Python is going in the direction of making it easier to compile to C code. More dynamic features may not be helpful in the long run.
I don't think this is true. There definitely isn't any momentum in that direction--shedskin and its two competitors are dead, while PyPy and numba are alive and kicking ass. And I don't think it's desirable, or that the core developers think it's desirable.
Good to know.

BTW... here is a first try. Works barely. Don't use the decorators inside a function yet. I'm sure there's quite a few byte code changes that need to be done to make this dependable, but it does show it's possible and is something interesting to play with. ;-)

This is closely related to continuations, but rather than try to capture a frame and then continue the code, it just applies code objects to a name space. That in itself isn't new, but getting the parts from decorated functions is a nice way to do it.

import inspect

def fix_code(c):
    """ Fix code object.

    This attempts to make a code object from a function
    be more like what you would get if you gave exec a string.

    * There is more that needs to be done to make this
      work dependably.  But it's a start.
    """
    varnames = c.co_varnames
    names = c.co_names
    xchg = [(124, 0x65),    # LOAD_FAST to LOAD_NAME
            (125, 0x5a)]    # STORE_FAST to STORE_NAME
    bcode = []
    bgen = iter(c.co_code)
    for b in bgen:
        for bx1, bx2 in xchg:
            if b == bx1:
                i1 = next(bgen)
                i2 = next(bgen)
                index = i1 + i2 * 256
                if b in [124, 125]:
                    b = bx2
                    char = varnames[index]
                    names = names + (char,)
                    index = names.index(char)
                    i2 = index // 256
                    i1 = index - i2
                bcode.append(b)
                bcode.append(i1)
                bcode.append(i2)
                break
        else:
            bcode.append(b)
    co_code = bytes(bcode)

    Code = type(c)
    co = Code(0,                # co_argcount
              0,                # co_kwonlyargcount
              0,                # co_nlocals
              c.co_stacksize,
              64,               # co_flags
              co_code,
              c.co_consts,
              names,            # co_names
              (),               # co_varnames
              c.co_filename,
              c.co_name,
              c.co_firstlineno,
              c.co_lnotab)
    return co


class fn_signature:
    """ Hold a signature from a function.

    When called with arguments it will bind them
    and return a mapping.
    """
    def __init__(self, fn):
        self.sig = inspect.signature(fn)

    def __call__(self, *args, **kwds):
        return dict(self.sig.bind(*args, **kwds).arguments)


class fn_code:
    """ Create a relocatable code object that can
    be applied to a name space.
    """
    def __init__(self, fn):
        self.co = fix_code(fn.__code__)

    def __call__(self, ns):
        return eval(self.co, ns)


def get_sig(fn):
    """ Decorator to get a signature. """
    return fn_signature(fn)

def get_code(fn):
    """ Decorator to get a code object. """
    return fn_code(fn)


# Example 1
# Applying code to a namespace created by a signature.

@get_sig
def foo(x): pass

@get_code
def inc_x(): x += 1

@get_code
def dec_x(): x -= 1

@get_code
def get_x(): return x

ns = foo(3)        # creates a namespace

inc_x(ns)          # Apply code to namespace
inc_x(ns)
print(get_x(ns))   # --> 5

dec_x(ns)
dec_x(ns)
print(get_x(ns))   # --> 3


# Example 2
# Code + signature <---> function

def add_xy(x, y):
    return x + y

sig = get_sig(add_xy)
co = get_code(add_xy)
print(co(sig(3, 5)))   # --> 8
On Mar 31, 2015, at 22:11, Ron Adam <ron3200@gmail.com> wrote:
On 03/30/2015 04:54 PM, Andrew Barnert wrote: [snip]
And the reason for bringing this idea up here was I think it could be used to implement the lite macro behaviour that was suggested with a bit of added syntax.
On the other hand it appears to me, that Python is going in the direction of making it easier to compile to C code. More dynamic features may not be helpful in the long run.
I don't think this is true. There definitely isn't any momentum in that direction--shedskin and its two competitors are dead, while PyPy and numba are alive and kicking ass. And I don't think it's desirable, or that the core developers think it's desirable.
Good to know.
BTW... here is a first try. Works barely. Don't use the decorators inside a function yet. I'm sure there's quite a few byte code changes that need to be done to make this dependable, but it does show it's possible and is something interesting to play with. ;-)
This is closely related to continuations, but rather than try to capture a frame and then continue the code, it just applies code objects to a name space. That in itself isn't new, but getting the parts from decorated functions is a nice way to do it.
import inspect
def fix_code(c):
    """ Fix code object.
This attempts to make a code object from a function be more like what you would get if you gave exec a string.
* There is more that needs to be done to make this work dependably. But it's a start.
""" varnames = c.co_varnames names = c.co_names
xchg = [(124, 0x65), #LOAD_FAST to LOAD_NAME (125, 0x5a)] #STORE_FAST to STORE_NAME
The first problem here is that any 124 or 125 in an operand to any opcode except 124 or 125 will be matched and converted (although you'll usually probably get an IndexError trying to treat the next two arbitrary bytes as an index...).

To solve this, you need to iterate opcode by opcode, not byte by byte. The dis module gives you the information to tell how many bytes to skip for each opcode's operands. (It also maps between opcode numbers and names, so you don't have to use magic numbers with comments.) Using it will solve this problem (and maybe others I didn't spot) and also make your code a lot simpler.

Another problem is that this will only catch local variables. Anything you don't assign to in the function but do reference is liable to end up a global or cell, which will still be a global or cell when you try to run it later. I'm not sure exactly how you want to handle these (if you just convert them unconditionally, then it'll be a bit surprising if someone writes "global spam" and doesn't get a global...), but you have to do something, or half your motivating examples (like "lambda: x+1") won't work. (Or is that why you're doing the explicit namespace thing instead of using the actual scope later on, so that this won't break, because the explicit namespace is both your locals and your globals?)
    bcode = []
    bgen = iter(c.co_code)
    for b in bgen:
        for bx1, bx2 in xchg:
            if b == bx1:
                i1 = next(bgen)
                i2 = next(bgen)
                index = i1 + i2 * 256
                if b in [124, 125]:
                    b = bx2
                    char = varnames[index]
                    names = names + (char,)
                    index = names.index(char)
                    i2 = index // 256
                    i1 = index - i2
                bcode.append(b)
                bcode.append(i1)
                bcode.append(i2)
                break
        else:
            bcode.append(b)
    co_code = bytes(bcode)

    Code = type(c)
    co = Code(0,                # co_argcount
              0,                # co_kwonlyargcount
              0,                # co_nlocals
              c.co_stacksize,
              64,               # co_flags
              co_code,
              c.co_consts,
              names,            # co_names
              (),               # co_varnames
              c.co_filename,
              c.co_name,
              c.co_firstlineno,
              c.co_lnotab)
    return co
class fn_signature:
    """ Hold a signature from a function.

    When called with arguments it will bind them
    and return a mapping.
    """
    def __init__(self, fn):
        self.sig = inspect.signature(fn)

    def __call__(self, *args, **kwds):
        return dict(self.sig.bind(*args, **kwds).arguments)

class fn_code:
    """ Create a relocatable code object that can
    be applied to a name space.
    """
    def __init__(self, fn):
        self.co = fix_code(fn.__code__)

    def __call__(self, ns):
        return eval(self.co, ns)
Why are you applying these to a dict? I thought the whole point was to be able to run it inside a scope and affect that scope's variables? If you just leave out the ns, won't that be closer to what you want?

(And it also means you don't need the get_sig thing in the first place, and I'm not sure what that adds. Using a function signature plus a call expression as a fancy way of writing a dict display seems like just obfuscation. Maybe with default values, *args, binding partials and then calling get_sig on them, etc. is interesting for something, but I'm not sure what?)
def get_sig(fn):
    """ Decorator to get a signature. """
    return fn_signature(fn)

def get_code(fn):
    """ Decorator to get a code object. """
    return fn_code(fn)

# Example 1
# Applying code to a namespace created by a signature.

@get_sig
def foo(x): pass

@get_code
def inc_x(): x += 1

@get_code
def dec_x(): x -= 1

@get_code
def get_x(): return x
ns = foo(3) # creates a namespace
inc_x(ns)          # Apply code to namespace
inc_x(ns)
print(get_x(ns))   # --> 5

dec_x(ns)
dec_x(ns)
print(get_x(ns))   # --> 3
# Example 2
# Code + signature <---> function

def add_xy(x, y):
    return x + y

sig = get_sig(add_xy)
co = get_code(add_xy)
print(co(sig(3, 5)))   # --> 8
On 04/01/2015 10:25 AM, Andrew Barnert wrote:
xchg = [(124, 0x65),    # LOAD_FAST to LOAD_NAME
        (125, 0x5a)]    # STORE_FAST to STORE_NAME

The first problem here is that any 124 or 125 in an operand to any opcode except 124 or 125 will be matched and converted (although you'll usually probably get an IndexError trying to treat the next two arbitrary bytes as an index...).
Yes, it will only work for very simple cases. It was just enough to get the initial examples working.
To solve this, you need to iterate opcode by opcode, not byte by byte. The dis module gives you the information to tell how many bytes to skip for each opcode's operands. (It also maps between opcode numbers and names, so you don't have to use magic numbers with comments.) Using it will solve this problem (and maybe others I didn't spot) and also make your code a lot simpler.
Unfortunately dis is written to give human output for Python bytecode, not to edit bytecode. But it can help. It needs a function to go back to a code object after editing the instruction list.
Another problem is that this will only catch local variables. Anything you don't assign to in the function but do reference is liable to end up a global or cell, which will still be a global or cell when you try to run it later. I'm not sure exactly how you want to handle these (if you just convert them unconditionally, then it'll be a bit surprising if someone writes "global spam" and doesn't get a global...), but you have to do something, or half your motivating examples (like "lambda: x+1") won't work. (Or is that why you're doing the explicit namespace thing instead of using the actual scope later on, so that this won't break, because the explicit namespace is both your locals and your globals?)
Why are you applying these to a dict? I thought the whole point was to be able to run it inside a scope and affect that scope's variables? If you just leave out the ns, won't that be closer to what you want? (And it also means you don't need the get_sig thing in the first place, and I'm not sure what that adds. Using a function signature plus a call expression as a fancy way of writing a dict display seems like just obfuscation. Maybe with default values, *args, binding partials and then calling get_sig on them, etc. is interesting for something, but I'm not sure what?)
This was just a step in that direction. It obviously needs more work.

There are a number of interesting aspects and directions this can go.

* Ability to decompose functions into separate signature and body parts. Not useful (or hard) in itself, but being able to reuse those parts may be good.

* Use those parts together. This is a basic test that should just work. Again it's not really that useful in itself, but it's a nice to have equivalency and helps to demonstrate how it works.

* Use one body with different signatures. For example you might have signatures with different default values, or signatures that interface to different data formats. Rather than having a function that converts different data formats to fit a single signature format, we can just use different signatures to interface with the data more directly. This is one of the things macros do in other languages.

* Use code objects as blocks to implement continuation-like behaviours. This is done by breaking an algorithm into composable parts, then applying them to data. It's not quite the same as continuations, or generators, but has some of the same benefits. If the blocks avoid parsing signatures and creating/destroying frames, it can be a fast way to translate data. Of course, it's very limited, as you need to have strict naming conventions to do this. So it would be limited to within a scope that follows those conventions. (Doable now with compile and exec, but it's awkward in my opinion.)

* Use a body as a block in another function. Yes, this requires getting the live namespace from the frame it's used in. f = sys._getframe(1) may work, but it's not that easy to do in this case. When exec is given a code object, it calls PyEval_EvalCodeEx in ceval.c directly with local and global dictionaries. (That should answer some of your comments as to why I used the dictionary.) It may be possible to call directly to PyEval_EvalFrameEx (like generators do), with a frame object.

Some or all of these may/will require the code object to be in a form that is more easily relocatable, but as you noted, it's not easy to do.

There are a lot of 'ifs' here, but I think it may be worth exploring. I'm going to try and make the bytecode fixer function work better (using dis or parts of it.) And then put it up on github where this idea can be developed further. (The other utilities I've found so far for editing bytecode aren't ported to python3 yet.)

I don't think editing the bytecode is the ideal solution in the long run, but it will identify the parts that need addressing, and then other solutions for those could be looked into, such as doing the needed alterations in the AST rather than the bytecode.

Cheers, Ron
On Wednesday, April 1, 2015 12:40 PM, Ron Adam <ron3200@gmail.com> wrote:
On 04/01/2015 10:25 AM, Andrew Barnert wrote:
xchg = [(124, 0x65),    # LOAD_FAST to LOAD_NAME
        (125, 0x5a)]    # STORE_FAST to STORE_NAME

The first problem here is that any 124 or 125 in an operand to any opcode except 124 or 125 will be matched and converted (although you'll usually probably get an IndexError trying to treat the next two arbitrary bytes as an index...).
Yes, it will only work for very simple cases. It was just enough to get the initial examples working.
To solve this, you need to iterate opcode by opcode, not byte by byte. The dis module gives you the information to tell how many bytes to skip for each opcode's operands. (It also maps between opcode numbers and names, so you don't have to use magic numbers with comments.) Using it will solve this problem (and maybe others I didn't spot) and also make your code a lot simpler.
Unfortunately dis is written to give human output for Python bytecode, not to edit bytecode. But it can help. It needs a function to go back to a code object after editing the instruction list.
No; even without doing all the work for you, dis still provides more than enough information to be useful. For example:

xchg = {'LOAD_FAST': 'LOAD_NAME', 'STORE_FAST': 'STORE_NAME'}

b = bytearray(c.co_code)
for instr in dis.Bytecode(c):
    try:
        newop = xchg[instr.opname]
    except KeyError:
        pass
    else:
        index = instr.arg
        char = varnames[index]
        names = names + (char,)
        index = names.index(char)
        b[instr.offset] = dis.opmap[newop]
        b[instr.offset+1:instr.offset+3] = struct.pack('<H', index)

That does everything your big loop did, except that it doesn't generate crashing bytecode if you have 257 names or if any of your arguments are 124, 125, or [31744, 32256). (It still doesn't handle > 65536 names via EXTENDED_ARG, but last time I tested, albeit in 2.earlyish, neither did the interpreter itself, so that should be fine... If not, it's not too hard to add that too.)
Why are you applying these to a dict? I thought the whole point was to be able to run it inside a scope and affect that scope's variables? If you just leave out the ns, won't that be closer to what you want? (And it also means you don't need the get_sig thing in the first place, and I'm not sure what that adds. Using a function signature plus a call expression as a fancy way of writing a dict display seems like just obfuscation. Maybe with default values, *args, binding partials and then calling get_sig on them, etc. is interesting for something, but I'm not sure what?)
This was just a step in that direction. It obviously needs more work.
There are a number of interesting aspects and directions this can go.
* Ability to decompose functions into separate signature and body parts.
Not useful (or hard) in itself, but being able to reuse those parts may be good.
Again, why? If your goal is to be able to declare a body to be used inline in another scope, what will you ever need these signature objects for?
* Use those parts together.
This is a basic test that should just work. Again it's not really that useful in itself, but it's a nice to have equivalency and helps to demonstrate how it works.
* Use one body with different signatures.
For example you might have signatures with different default values, or signatures that interface to different data formats. Rather than having a function that converts different data formats to fit a single signature format, we can just use different signatures to interface with the data more directly. This is one of the things macros do in other languages.
All your signature objects can do is return a dict that you can eval a code block in, instead of evaling it in the current frame. If your goal is to eval it in the current frame, what good does that dict do you?
* Use code objects as blocks to implement continuation-like behaviours.

This is done by breaking an algorithm into composable parts, then applying them to data. It's not quite the same as continuations, or generators, but has some of the same benefits. If the blocks avoid parsing signatures and creating/destroying frames, it can be a fast way to translate data. Of course, it's very limited, as you need to have strict naming conventions to do this. So it would be limited to within a scope that follows those conventions. (Doable now with compile and exec, but it's awkward in my opinion.)
Sure, but again, the (transformed) code object alone already does exactly that. If the signature object added something (like being able to avoid the strict naming conventions, maybe?), it would be helpful, but it doesn't; you can do exactly the same thing with just the code object that you can do with both objects.
* Use a body as a block in another function.
Yes, this requires getting the live namespace from the frame it's used in.
That's trivial. If you just call eval with the default arguments, it gets the live namespace from the frame it's used in. If you want to wrap up eval in a function that does the exact same thing eval does, then you need to manually go up one frame. I'm not sure why you want to do that, but it's easy.
f = sys._getframe(1) may work, but it's not that easy to do in this case.
def run_code_obj(code):
    loc = sys._getframe(1).f_locals
    return eval(code, loc)

How is that not easy?
When exec is given a code object, it calls PyEval_EvalCodeEx in ceval.c directly with local and global dictionaries. (That should answer some of your comments as to why I used the dictionary.)
Yes, and running the block of code directly with the local and global dictionaries is exactly what you want it to do, so why are you telling it not to do that?

For example:

def macro():
    x += 1

code = fix_code(macro.__code__)

def f(code_obj):
    x = 1
    loc = locals()
    eval(code_obj)
    return loc['x']

(Or, if you prefer, use "run_code_obj" instead of "eval".)

The problem here is that if you just "return x" at the end instead of "return loc['x']", you will likely see 1 instead of 2. It's the same problem you get if you "exec('x += 1')", exactly as described in the docs.

That happens because f was compiled to look up x by index in the LOAD_FAST locals array, instead of by name in the locals dict, but your modified code objects mutate only the dict, not the array. That's the big problem you need to solve. Adding more layers of indirection doesn't get you any closer to fixing it.
It may be possible to call directly to PyEval_EvalFrameEx (like generators
do), with a frame object.
No. You've already got access to exactly the same information that you'd have that way. The problem is that you converted all the STORE_FAST instructions to STORE_NAME, and that means you're ignoring the array of fast locals and only using the dict, which means that the calling code won't see your changes.

One way you could solve this is by applying the same code conversion to any function that wants to _use_ a code block that you apply to one that wants to be _used as_ a code block (except the former would return wraps(types.FunctionType(code, f.__globals__, ...), ...) instead of just returning code). It's ugly, and it's a burden on the user, and it makes everything slower (and it may break functions that use normal closures plus your code block things), but it's the only thing that could possibly work. If you want to use STORE_NAME, the caller has to use LOAD_NAME.
Some or all of these may/will require the code object to be in a form that
is more easily relocatable, but as you noted, it's not easy to do.
If you want to be able to run these code blocks in unmodified functions (without radically changing the interpreter), then yes, you need to affect the caller's LOAD_FAST variables, which means you need to do a STORE_FAST with the caller's index for the variable, and you don't have the caller's index until call time, which means you need relocation. It isn't really _necessary_ to make the code easily relocatable, it just makes the relocation (which is necessary) easier and more efficient.

For example, at definition time, you can build a table like:

{'x': (1, 8)}

So at call time, all you have to do is:

names = {name: index for index, name
         in enumerate(sys._getframe(1).f_code.co_names)}
b = bytearray(c.co_code)
for name, offsets in relocs.items():
    index = names[name]
    for offset in offsets:
        b[offset:offset+2] = struct.pack('<H', index)
code = types.CodeType(blah, blah, bytes(b), blah)

(Note that this will raise a KeyError if the called code block references a variable that doesn't exist in the calling scope; you may want to catch that and reraise it as a different exception. Also note that, as I explained before, you may want to map NAME/GLOBAL/CELL lookups to FAST lookups--almost the exact opposite of what you're doing--so that code like "def f(): return x+1" sees the calling function's local x, not the global or cell x at definition time, but that's tricky because you probably want "def f(): global x; return x+1" to see the global x...)

_Now_ you face the problem that you need to run this on the actual calling frame, rather than what exec does (effectively, run it on a temporary frame with the same locals and globals dicts). And I think that will require extending the interpreter. But all the stuff you wrote above isn't a step in the direction of doing that, it's a step _away_ from that. Once you have that new functionality, you will not want a code object that's converted all FAST variables to NAME variables, or a signature object that gives you a different set of locals to use than the ones you want, or anything like that; you will want a code object that leaves FAST variables as FAST variables but renumbers them, and uses the frame's variables rather than a different namespace.
There are a lot of 'ifs' here, but I think it may be worth exploring.
I'm going to try and make the bytecode fixer function work better (using dis or parts of it.) And then put it up on github where this idea can be developed further.
(The other utilities I've found so far for editing bytecode aren't ported to python3 yet.)
As I said in my previous message, there are at least three incomplete ports of byteplay to 3.x. I think https://github.com/serprex/byteplay works on 3.2, but not 3.3 or 3.4. https://github.com/abarnert/byteplay works on 3.4, but mishandles certain constructions where 2.7+/3.3+ optimizes try/with statements (try running it on the stdlib, and you'll see exceptions on three modules) that I'm pretty sure won't affect your examples. At any rate, while fixing and using byteplay (or replacing it with something new that requires 3.4+ dis, or 2.7/3.3 with the dis 3.4 backport, and avoids all the hacky mucking around trying to guess at stack effects) might make your code nicer, I don't think it's necessary; what you need is a small subset of what it can do (e.g., you're not inserting new instructions and renumbering all the jump offsets, or adding wrapping statements in try blocks, etc.), so you could just cannibalize it to borrow the parts you need and ignore the rest.
I don't think editing the bytecode is the ideal solution in the long run, but it will identify the parts that need addressing, and then other solutions for those could be looked into, such as doing the needed alterations in the AST rather than the bytecode.
If you want to be able to convert functions to code blocks at runtime (which is inherent in using a decorator), the bytecode is all you have. If you want to capture the AST, you need to do it at import/compile time. If you're going to do that, MacroPy already does an amazing job of that, so why reinvent the wheel? (If there's some specific problem with MacroPy that you don't think can be solved without a major rearchitecture, I'll bet Haoyi Li would like to know about it...)

And, more importantly, why put all this work into something completely different, which has a completely different set of problems to solve, if you're just going to throw it out later? For example, all the problems with renumbering variable indices or converting between different kinds of variables that you're solving here won't help you identify anything relevant to an AST-based solution, where variables are still just Name(id='x').
On Apr 1, 2015 3:02 PM, "Andrew Barnert" <abarnert@yahoo.com.dmarc.invalid> wrote:
On Wednesday, April 1, 2015 12:40 PM, Ron Adam <ron3200@gmail.com> wrote:
When exec is given a code object, it calls PyEval_EvalCodeEx in ceval.c directly with local and global dictionaries. (That should answer some of your comments as to why I used the dictionary.)
Yes, and running the block of code directly with the local and global dictionaries is exactly what you want it to do, so why are you telling it not to do that?
For example:
def macro():
    x += 1

code = fix_code(macro.__code__)

def f(code_obj):
    x = 1
    loc = locals()
    eval(code_obj)
    return loc['x']
(Or, if you prefer, use "run_code_obj" instead of "eval".)
The problem here is that if you just "return x" at the end instead of "return loc['x']", you will likely see 1 instead of 2. It's the same problem you get if you "exec('x += 1')", exactly as described in the docs.
That happens because f was compiled to look up x by index in the LOAD_FAST locals array, instead of by name in the locals dict, but your modified code objects mutate only the dict, not the array. That's the big problem you need to solve. Adding more layers of indirection doesn't get you any closer to fixing it.

You can propagate changes to the dict back to the array by calling the C API function PyFrame_LocalsToDict. It's pretty easy to do via ctypes, see e.g. http://pydev.blogspot.com/2014/02/changing-locals-of-frame-frameflocals.html...

I guess you could append some byte code to do this to your modified function bodies.

-n
On Apr 1, 2015, at 15:17, Nathaniel Smith <njs@pobox.com> wrote:
On Apr 1, 2015 3:02 PM, "Andrew Barnert" <abarnert@yahoo.com.dmarc.invalid> wrote:
On Wednesday, April 1, 2015 12:40 PM, Ron Adam <ron3200@gmail.com> wrote:
When exec is given a code object, it calls PyEval_EvalCodeEx in ceval.c directly with local and global dictionaries. (That should answer some of your comments as to why I used the dictionary.)
Yes, and running the block of code directly with the local and global dictionaries is exactly what you want it to do, so why are you telling it not to do that?
For example:
def macro():
    x += 1

code = fix_code(macro.__code__)

def f(code_obj):
    x = 1
    loc = locals()
    eval(code_obj)
    return loc['x']
(Or, if you prefer, use "run_code_obj" instead of "eval".)
The problem here is that if you just "return x" at the end instead of "return loc['x']", you will likely see 1 instead of 2. It's the same problem you get if you "exec('x += 1')", exactly as described in the docs.
That happens because f was compiled to look up x by index in the LOAD_FAST locals array, instead of by name in the locals dict, but your modified code objects mutate only the dict, not the array. That's the big problem you need to solve. Adding more layers of indirection doesn't get you any closer to fixing it.
You can propagate changes to the dict back to the array by calling the C API function PyFrame_LocalsToDict. It's pretty easy to do via ctypes, see e.g.
You mean PyFrame_LocalsToFast, not the other way around, right? That's a good idea. There might be problems executing two code blocks (or a code block and a normal eval/exec statement) in the same function, but for a prototype that's fine...
http://pydev.blogspot.com/2014/02/changing-locals-of-frame-frameflocals.html...
I guess you could append some byte code to do this to your modified function bodies.
That would be painful without byteplay--you have to insert the new instructions before every return and raise bytecode, which means renumbering jumps, etc. But do you really need to? Can you do it in the wrapper?

import sys
from ctypes import pythonapi, py_object, c_int

def call_code(code):
    frame = sys._getframe(1)
    # copy the fast-locals array out to the frame's f_locals dict
    pythonapi.PyFrame_FastToLocals(py_object(frame))
    try:
        return eval(code, frame.f_globals, frame.f_locals)
    finally:
        # ...and copy any changes back into the array
        pythonapi.PyFrame_LocalsToFast(py_object(frame), c_int(0))

(Doing the matched pair like this might avoid the problem with multiple code blocks in one function. I'm not sure, but... worth a try, right?)

I think there will still be a problem with cell vars (that is, updating a local in the caller which is used in a closure by a local function in the caller). And there's definitely still the problem of magically guessing which variables are meant to be local, closure, or global. But again, for a prototype, that all may be fine.
On Wed, Apr 01, 2015 at 10:00:21PM +0000, Andrew Barnert wrote: [mass snippage deployed]

While I do have some interest in this subject, I think at the point you are doing detailed code reviews of experimental software, it's probably no longer on-topic for this mailing list and possibly should be taken off-list until you have something concrete to report.

Also, if you want to explore this further:

(1) Hacking the byte-code is not portable. It won't work in non-CPython implementations, and bytecode is not a stable part of the CPython API either. Hacking the AST may be better.

(2) If you must hack the bytecode, there is at least one library for bytecode manipulations out there, possibly on PyPI. Google for "python byte-code hacking" for more.

-- Steve
On Sat, Mar 28, 2015 at 9:53 AM, Matthew Rocklin <mrocklin@gmail.com> wrote:
Responding to comments off list:
I'm not referring to C-style preprocessor macros, I'm referring to macros historically found in functional languages and commonly found in many user-targeted languages built in the last few years.
Do you have examples and references? IIRC there's something named macros in Scheme but Scheme, unlike Python, completely unifies code and data, and there is a standard in-memory representation for code.
The goal is to create things that look like functions but have access to the expression that was passed in.
Some examples where this is useful:
plot(year, miles / gallon) # Plot with labels determined by input-expressions, e.g. miles/gallon
assertRaises(ZeroDivisionError, 1/0) # Evaluate the rhs 1/0 within assertRaises function, not before
run_concurrently(f(x), f(y), f(z)) # Run f three times in three threads controlled by run_concurrently
Generally one constructs something that looks like a function but, rather than receiving a pre-evaluated input, receives a syntax tree along with the associated context. This allows that function-like-thing to manipulate the expression and to control the context in which the evaluation occurs.
None of the examples need the syntax tree though. The first wants the string, the last probably just wants a way to turn an argument into a lambda.
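[A minimal sketch of the lambda-based alternative just described; run_concurrently is the hypothetical name from the example above, here implemented with plain threads:

    import threading

    def run_concurrently(*thunks):
        # Each argument is a zero-argument callable, so evaluation is
        # delayed until this function decides to run it -- no macro needed.
        threads = [threading.Thread(target=thunk) for thunk in thunks]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    # Instead of run_concurrently(f(x), f(y), f(z)), which evaluates the
    # calls eagerly, the caller wraps each expression in a lambda:
    # run_concurrently(lambda: f(x), lambda: f(y), lambda: f(z))
]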
There are lots of arguments against this, mostly focused around potential misuse. I'm looking for history of such arguments and for a general "Yes, this is theoretically possible" or "Not a chance in hell" from the community. Both are fine.
I don't think this is a mainline need in Python, so it's probably both. :-) -- --Guido van Rossum (python.org/~guido)
Lisps like Scheme do indeed have an easier time with these due to the whole code-is-data thing; it's quite doable in languages with real syntax though. R and Julia would be good examples of syntactic languages with full macros. The Julia implementation might be a good model for what Python could do. Their docs <http://julia.readthedocs.org/en/latest/manual/metaprogramming/> are also a nice read if anyone isn't familiar with the topic.

Macropy represents unevaluated expressions with the objects from the ast module. This seems like a sane choice.

To be a little pedantic I'll give a brief example loosely showing what a macro is, then I'll talk about assert statements as a use case where macros might help with a pain point in normal Python programming.

*Brief Educational Blurb*

We write code as text

    defmacro f(x):
        ...

    f(a + b) * sin(c)

We then parse parts of that text into syntax trees.

[image: parse tree of the expression f(a + b) * sin(c)]

Usually we translate these trees into byte-code and evaluate bottom-up, starting with a + b, then applying f, etc... Macros stop this process. They capture the subtrees beneath them before execution. Whenever we see a macro (f), we don't evaluate its subtree (a + b). Instead we transform the subtree into an in-memory representation (perhaps ast.BinOp(a, ast.Add(), b)) and hand that to f to do with as it will. Let's see an example with assertions.

*Use case with Assertions*

When testing we often want to write statements like the following

    assert x == y
    assert x in y
    etc...

When these statements fail we want to emit statements that are well informed of the full expression, e.g.

    5 != 6
    5 was not found in {1, 2, 3}

In Python we can't do this; assert only gets True or False and doesn't understand what generated that value. We've come up with a couple of workarounds. The first is the venerable unittest.TestCase methods that take the two sides of the comparison explicitly, e.g. assertEqual(a, b), assertIn(a, b). This was sufficiently uncomfortable that projects like py.test arose and gained adoption. Py.test goes through the trouble of parsing the Python test_*.py files in order to generate nicer error messages.

Having macros around would allow users to write this kind of functionality directly in Python rather than resorting to full text parsing and code transformation. Macros provide an escape out of pure bottom-up evaluation.
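[To sketch how this could look without py.test's source rewriting: here is a hypothetical smart_assert helper. It takes the expression as a string plus a namespace, since only a real macro could capture the AST at the call site; note both sides get re-evaluated, which is fine for a sketch but not for expressions with side effects:

    import ast

    def smart_assert(expr, namespace):
        node = ast.parse(expr, mode='eval').body
        if eval(expr, namespace):
            return
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            # Evaluate each side separately to build a useful message.
            lhs = eval(compile(ast.Expression(body=node.left),
                               '<lhs>', 'eval'), namespace)
            rhs = eval(compile(ast.Expression(body=node.comparators[0]),
                               '<rhs>', 'eval'), namespace)
            raise AssertionError('%r vs %r' % (lhs, rhs))
        raise AssertionError(expr)

    smart_assert('x == y', {'x': 5, 'y': 6})  # AssertionError: 5 vs 6
]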
Also, just to credentialize myself, I am not a huge Lisp lover. I don't want macros to do crazy logic programming or whatever. I write numeric code in the scientific Python ecosystem. I want macros to build better interfaces for downstream users. This seems to be the modern use case in user-focused languages rather than lisp-magic-hell.
Matthew,

Is something stopping you from exploring this? Do you have specific ideas on how to improve on macropy?

It sounds almost as if you would like to implement this but you want some kind of promise ahead of time that your work will be incorporated into the language. But that's just not how it works. When you want to explore a big idea like this, at some point you have to be willing to take the risk of writing code without a guaranteed pay off. Haoyi didn't ask for macropy to be incorporated into Python -- in fact he was surprised at the amount of uptake it got.

You've received quite a bit of feedback (and, may I say, push back :-) from a small number of python-ideas veterans -- you can take this or leave it, but at this point I think you've gotten about as much mileage out of the list as can be expected.

Good luck!

--Guido
Is something stopping you from exploring this? Do you have specific ideas on how to improve on macropy?
Macropy is great but it requires an import-hook. Many scientific users work interactively.
It sounds almost as if you would like to implement this but you want some kind of promise ahead of time that your work will be incorporated into the language. But that's just not how it works. When you want to explore a big idea like this, at some point you have to be willing to take the risk of writing code without a guaranteed pay off. Haoyi didn't ask for macropy to be incorporated into Python -- in fact he was surprised at the amount of uptake it got.
The hard problem isn't building macros, it's deciding whether or not macros are good for Python. I'm trying to start a discussion. If this isn't the right place for that then I apologize.
You've received quite a bit of feedback (and, may I say, push back :-) from a small number of python-ideas veterans -- you can take this or leave it, but at this point I think you've gotten about as much mileage out of the list as can be expected.
My apologies. I didn't realize that I was misusing this list. I also didn't realize that I was receiving push-back; the comments here seemed friendly and encouraging.

Last year at SciPy the message I heard was "If you want to convince the core team then come to python-ideas armed with motivating use cases." Here I am :)

Anyway, if there isn't any interest then I'll leave off. Thank you all for your time,

-Matt
Whoops, my apologies. Apparently I don't get e-mails sent only to python-ideas and not also to me. There was a lot of conversation of which I was ignorant.
The !(x + y) solution would release some pressure. People have considered using lambda: to delay execution in Pandas queries. The result is a bit odd: https://github.com/pydata/pandas/issues/9229#issuecomment-69691738
Macros would be an extremely useful feature for pandas, the main data analysis library for Python (for which I'm a core developer).

Why? Well, right now, R has better syntax than Python for writing data analysis code. The difference comes down to two macros that R developers have written within the past few years.

Here's an example borrowed from the documentation for the dplyr R package [1]:

    flights %>%
      group_by(year, month, day) %>%
      select(arr_delay, dep_delay) %>%
      summarise(
        arr = mean(arr_delay),
        dep = mean(dep_delay)
      ) %>%
      filter(arr > 30 | dep > 30)

Here "flights" is a dataframe, similar to a table in a spreadsheet. It is also the only global variable in the analysis -- variables like "year" and "arr_delay" are actually columns in the dataframe. R evaluates variables lazily, in the context of the provided frame. In Python, functions like group_by would need to be macros.

The other macro is the "pipe" or chaining operator %>%. This operator is used to avoid the need for many temporary or highly nested expressions. The result is quite readable, but again, it needs to be a macro, because group_by and filter are simply functions that take a dataframe as their first argument. The fact that chaining works with plain functions means that it works even on libraries that weren't designed for it. We could do function chaining in Python by abusing an existing binary operator like >> or |, but all the objects on which it works would need to be custom types.

What does this example look like using pandas? Well, it's not as nice, and there's not much we can do about it because of the limitations of Python syntax:

    (flights
     .group_by('year', 'month', 'day')
     .select('arr_delay', 'dep_delay')
     .summarize(
         arr=lambda df: mean(df.arr_delay),
         dep=lambda df: mean(df.dep_delay))
     .filter(lambda df: (df.arr > 30) | (df.dep > 30)))

(Astute readers will note that I've taken a few liberties with pandas syntax to make it more similar to dplyr.)

Instead of evaluating expressions in the delayed context of a dataframe, we use strings or functions. With all the lambdas there's a lot more noise than the R example, and it's harder to keep track of what's going on. In principle we could simplify the lambda expressions to not use any arguments (Matthew linked to the GitHub comment where I showed what that would look like [2]), but the code remains awkwardly verbose.

For chaining, instead of using functions and the pipe operator, we use methods. This works fine as long as users are only using pandas, but it means that unlike R, the Python dataframe is a closed ecosystem. Python developers (rightly) frown upon monkey-patching, so there's no way for external libraries to add their own functions (e.g., for custom plotting or file formats) on an equal footing to the methods built into pandas.

I hope these use cases are illustrative. I don't have strong opinions on the technical merits of particular proposals. The "light lambda" syntax described by Andrew Barnert would at least solve the delayed evaluation use-case nicely, though the colon character is not ideal because it would rule out using light lambdas inside indexing brackets.

Best,
Stephan

[1] http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html#chain...
[2] https://github.com/pydata/pandas/issues/9229#issuecomment-69691738
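[For what it's worth, the chaining half can be approximated today with a small helper; a minimal sketch, where group_by and select as free functions taking a dataframe first are hypothetical:

    from functools import partial, reduce

    def pipe(value, *funcs):
        # Thread a value through one-argument functions, left to right:
        # a plain-Python stand-in for dplyr's %>%. It gives the chaining,
        # but not the lazy column references (those need the macro).
        return reduce(lambda acc, func: func(acc), funcs, value)

    # Hypothetical usage:
    # result = pipe(flights,
    #               partial(group_by, columns=('year', 'month', 'day')),
    #               partial(select, columns=('arr_delay', 'dep_delay')))
]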
On Mar 30, 2015, at 23:21, Stephan Hoyer <shoyer@gmail.com> wrote:
Macros would be an extremely useful feature for pandas, the main data analysis library for Python (for which I'm a core developer).
Why? Well, right now, R has better syntax than Python for writing data analysis code. The difference comes down to two macros that R developers have written within the past few years.
Here's an example borrowed from the documentation for the dplyr R package [1]:
flights %>%
  group_by(year, month, day) %>%
  select(arr_delay, dep_delay) %>%
  summarise(
    arr = mean(arr_delay),
    dep = mean(dep_delay)
  ) %>%
  filter(arr > 30 | dep > 30)

Here "flights" is a dataframe, similar to a table in a spreadsheet. It is also the only global variable in the analysis -- variables like "year" and "arr_delay" are actually columns in the dataframe. R evaluates variables lazily, in the context of the provided frame. In Python, functions like group_by would need to be macros.

The other macro is the "pipe" or chaining operator %>%. This operator is used to avoid the need for many temporary or highly nested expressions. The result is quite readable, but again, it needs to be a macro, because group_by and filter are simply functions that take a dataframe as their first argument. The fact that chaining works with plain functions means that it works even on libraries that weren't designed for it. We could do function chaining in Python by abusing an existing binary operator like >> or |, but all the objects on which it works would need to be custom types.

What does this example look like using pandas? Well, it's not as nice, and there's not much we can do about it because of the limitations of Python syntax:

(flights
 .group_by('year', 'month', 'day')
 .select('arr_delay', 'dep_delay')
 .summarize(
     arr=lambda df: mean(df.arr_delay),
     dep=lambda df: mean(df.dep_delay))
 .filter(lambda df: (df.arr > 30) | (df.dep > 30)))

(Astute readers will note that I've taken a few liberties with pandas syntax to make it more similar to dplyr.)

Instead of evaluating expressions in the delayed context of a dataframe, we use strings or functions. With all the lambdas there's a lot more noise than the R example, and it's harder to keep track of what's going on. In principle we could simplify the lambda expressions to not use any arguments (Matthew linked to the GitHub comment where I showed what that would look like [2]), but the code remains awkwardly verbose.

For chaining, instead of using functions and the pipe operator, we use methods. This works fine as long as users are only using pandas, but it means that unlike R, the Python dataframe is a closed ecosystem. Python developers (rightly) frown upon monkey-patching, so there's no way for external libraries to add their own functions (e.g., for custom plotting or file formats) on an equal footing to the methods built into pandas.
One way around this is to provide a documented, clean method for hooking your types--e.g., a register classmethod that then makes the function appear as a method in all instances. Functionally this is the same as monkeypatching, but it looks a lot more inviting to the user. (And it also allows you to rewrite things under the covers in ways that would break direct monkeypatching, if you ever want to.) There are more examples of opening up modules this way than classes, but it's the same idea.
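[A minimal sketch of the register hook just described; DataFrame and to_fancy_plot are hypothetical names here, not pandas API:

    class DataFrame(object):
        @classmethod
        def register(cls, func):
            # Documented extension hook: expose func as a method on every
            # instance. Same effect as monkeypatching, but invited, and the
            # class keeps the freedom to change how methods are stored.
            setattr(cls, func.__name__, func)
            return func

    @DataFrame.register
    def to_fancy_plot(self):
        # An external library's function, now callable as df.to_fancy_plot()
        pass
]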
I hope these use cases are illustrative. I don't have strong opinions on the technical merits of particular proposals. The "light lambda" syntax described by Andrew Barnert would at least solve the delayed evaluation use-case nicely, though the colon character is not ideal because it would rule out using light lambdas inside indexing brackets.
The bare colon was just one of multiple suggestions that came up last time the idea was discussed (last February or so), and in previous discussions (going back to the long-abandoned PEP 312). In many cases, it looks very nice, but in others it's either ugly or (at least to a human) ambiguous without parens (which obviously defeat the whole point). I don't think anyone noticed the indexing issue, but someone (Nick Coghlan, I think) pointed out essentially the same issue in dict displays (e.g., for making a dynamic jump table).

If you seriously want to revive that discussion--which might be worth doing, if your use cases are sufficiently different from the ones that were discussed (Tkinter and RPC server callbacks were the primary motivating case)--somewhere I have notes on all the syntaxes people suggested that I could dump on you.

For the strings, another possibility is a namespace object that can effectively defer attribute lookup and have it done by the real table when you get it, as done by various ORMs, appscript, etc. Instead of this:
    (flights
     .group_by('year', 'month', 'day')
     ...

You write:

    (flights
     .group_by(c.year, c.month, c.day)
     ...
That "c" is just an instance of a simple type that wraps its __getattr__ argument up so you can access it later--when get an argument to group_by that's one of those wrappers, you look it up on the table. It's the same number of characters, and looks a little more magical, but arguably it's more readable, at least once people are used to your framework. For example, here's a query expression to find broken tracks (with missing filenames) in iTunes via appscript that was used in a real application (until iTunes Match made this no longer work...): playlist.tracks[its.location == k.missing] Both its and k are special objects you import from appscript; its.__getattr__ returns a wrapper that's used to look up "location" in the members of whatever class playlist.tracks instances turn out to be at runtime, and k.missing returns a wrapper that's used to look up "missing" in the global keywords of whatever app playlist turns out to be part of at runtime. You can take this even further: if mean weren't a function that takes a column and returns its mean, but instead a function that takes a column name and returns a function that takes a table and returns the mean of the column with that name, then you could just write this:
    .summarize(
        arr = mean(c.arr_delay)
But of course that isn't what mean is. But it can be what, say, f.mean is, if f is another one of those attribute-delaying objects: f.mean(c.arr_delay) returns a wrapper that summarize can use to call the function named "mean" on the column named "arr_delay". So, the whole thing reduces to:
    (flights
     .group_by(c.year, c.month, c.day)
     .select(c.arr_delay, c.dep_delay)
     .summarize(
         arr = f.mean(c.arr_delay),
         dep = f.mean(c.dep_delay))
     .filter((c.arr > 30) | (c.dep > 30)))
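[A minimal sketch of the deferred-lookup objects described above; all names are illustrative, and a real framework would support more operators:

    class _Expr(object):
        # A deferred expression node: nothing is computed until the query
        # framework walks the tree against a real table.
        def __init__(self, op, left, right):
            self.op, self.left, self.right = op, left, right
        def __or__(self, other):
            return _Expr('|', self, other)
        def __repr__(self):
            return '(%r %s %r)' % (self.left, self.op, self.right)

    class _Ref(object):
        def __init__(self, ns, name):
            self.ns, self.name = ns, name
        def __gt__(self, other):
            return _Expr('>', self, other)
        def __call__(self, *args):
            # For f.mean(c.arr_delay): a deferred call by name.
            return _Expr('call', self, args)
        def __repr__(self):
            return '%s.%s' % (self.ns, self.name)

    class _Namespace(object):
        def __init__(self, label):
            self._label = label
        def __getattr__(self, name):
            return _Ref(self._label, name)

    c = _Namespace('c')   # deferred column references
    f = _Namespace('f')   # deferred function references

    print((c.arr > 30) | (c.dep > 30))   # ((c.arr > 30) | (c.dep > 30))
]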
isn't this:

    lambda df: mean(df.arr_delay)

the same as..

    functools.partial(mean, df.arr_delay)

I kind of like the idea of a pipe operator (tho %>% looks just terrible and | is already taken). But consider: if we could get functools.compose that too would be alleviated. (something like https://mathieularose.com/function-composition-in-python/ )
On Mar 31, 2015, at 00:22, Joonas Liik <liik.joonas@gmail.com> wrote:
isn't this:

    lambda df: mean(df.arr_delay)

the same as..

    functools.partial(mean, df.arr_delay)
No, because in the first one df is a parameter (which gets the value of whatever DataFrame this is run on) while in the second it's a free variable (which just raises NameError, if you're lucky).

You could do this with operator.attrgetter and composition, of course:

    compose(mean, attrgetter('arr_delay'))

But I don't think that's more readable. Even if composition were an infix operator:

    (mean . attrgetter('arr_delay'))
I kind of like the idea of a pipe operator (tho %>% looks just terrible and | is already taken). But consider: if we could get functools.compose that too would be alleviated. (something like https://mathieularose.com/function-composition-in-python/ )
Why do you need to "get functools.compose"? If you just want the trivial version from that blog post, it's two lines that any novice can write himself. If you want one of the more complex versions that he dismisses, then there might be a better argument, but the post you linked argues against that, not for it. And if you don't trust yourself to write the two lines yourself, you can always pip install funcy or toolz or functional3 or more-functools. As far as using compose again for piping... Well, it's backward from what you want, and it also requires you to stack up enough parens to choke a Lisp guru, not to mention all the repetitions of compose itself (that's why Haskell has infix compose and apply operators with the precedence and associativity they have, so you can avoid all the parens).
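[For reference, a sketch of the trivial compose alluded to above, plus how it relates to the lambda under discussion:

    from functools import reduce

    def compose(*funcs):
        # compose(f, g)(x) == f(g(x)); right-to-left, like math notation.
        return reduce(lambda f, g: lambda *a, **kw: f(g(*a, **kw)), funcs)

    # With this, compose(mean, attrgetter('arr_delay')) behaves like
    # lambda df: mean(df.arr_delay), modulo readability.
]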
On Mon, Mar 30, 2015 at 9:26 PM, Matthew Rocklin <mrocklin@gmail.com> wrote:
Is something stopping you from exploring this? Do you have specific ideas on how to improve on macropy?
Macropy is great but it requires an import-hook. Many scientific users work interactively.
It sounds almost as if you would like to implement this but you want some kind of promise ahead of time that your work will be incorporated into the language. But that's just not how it works. When you want to explore a big idea like this, at some point you have to be willing to take the risk of writing code without a guaranteed pay off. Haoyi didn't ask for macropy to be incorporated into Python -- in fact he was surprised at the amount of uptake it got.
The hard problem isn't building macros, it's deciding whether or not macros are good for Python. I'm trying to start a discussion. If this isn't the right place for that then I apologize.
This is the right place, and we're now at the point where it's your job to either show a concrete design spec that can actually be implemented, and have its tires kicked, or just go off and build something. In the latter case you'll probably learn about some practical issues that nobody might have thought of yet.
You've received quite a bit of feedback (and, may I say, push back :-) from a small number of python-ideas veterans -- you can take this or leave it, but at this point I think you've gotten about as much mileage out of the list as can be expected.
My apologies. I didn't realize that I was misusing this list. I also didn't realize that I was receiving push-back, the comments here seemed friendly and encouraging.
You weren't misusing the list. Maybe (based on your next message) you weren't reading it though. :-)
Last year at SciPy the message I heard was "If you want to convince the core team then come to python-ideas armed with motivating use cases." Here I am :)
Anyway, if there isn't any interest then I'll leave off. Thank you all for your time,
I think you misunderstand. There's interest but there are also real concerns. I really do think that an implementation (if, as you say, that isn't the hard part) would be very helpful to judge whether it is a desirable feature. (Maybe you discover you can do it in a way that can be distributed via PyPI -- "pip install macros".)

-- --Guido van Rossum (python.org/~guido)
participants (11)

- Andrew Barnert
- Carl Meyer
- Chris Angelico
- Guido van Rossum
- Joonas Liik
- Luciano Ramalho
- Matthew Rocklin
- Nathaniel Smith
- Ron Adam
- Stephan Hoyer
- Steven D'Aprano