Symbolic expressions (or: partials and closures from the inside out)

Greetings,

I have been writing a lot of code lately that involves creating symbolic expressions of one form or another, which are then fully evaluated at a later time. Examples of this include Elementwise, where I create expressions that act on every member of an iterable (there is a much improved new version coming soon, by the way), and a design by contract/validation lib I'm working on (which shall remain nameless :D) that uses symbolic expressions in the *args of the metaclass __new__ method to generate a constraint class which validates input using __instancecheck__. I do most of this with lambdas, a little hacking with closures and FunctionType(), and chainable objects. I am very impressed that Python is this flexible, but there are some issues with the approach that I would like to rectify, namely:

1. Because of the early binding behavior of most things in Python, if I want to include isinstance(X, someclass) in a symbolic expression, I have to wrap it in a lambda (or use .apply(), in the case of Elementwise). This is not a huge deal for me, but it forces me to create wrappers for lots of functions (e.g. isinstance_(X, someclass)) and/or have users wrap every such function they want to use in a symbolic expression. Having to do this also bloats the code a lot; the github version of Elementwise is over 3,000 LoC at this point (including prodigious documentation, but still...).

2. Python expects that certain functions, such as int(), str(), etc., will have a specific return type. While in general I agree with this, it makes Elementwise somewhat inconsistent (and it will do the same to anything else that wants to work with symbolic expressions).

I'm interested in fixing both issues. I believe both could be solved by having a robust "symbolic object". These objects would basically be usable like ordinary objects; however, upon any attribute access or other form of interaction, the object would short circuit the calling function and return a symbolic object directly to the outer scope. The symbolic object would behave like a generator function frozen at the point of attribute access, and upon send()-ing (or whatever method), it would behave exactly as if the values sent had been the ones passed in originally (ideally without consuming the generator).

I have thought about ways to approximate this behavior in Python currently, and while I could hack something together using inspect to pull relevant info from the stack, then break out using a special exception (potentially passing a generator with state as an exception arg), this approach strikes me as VERY brittle, implementation dependent, ugly and difficult to work with. Additionally, you would need to catch the special exception somewhere in the stack, so this trick wouldn't work on the first thing in an expression to be evaluated.

As an aside, I'd like to solicit some feedback on the validation syntax I've been working on. Currently, I have code that supports things like:

    X = SymbolicObject()

    const = Constraints(X * 2 + 1 >= 5, X % 2 != 0)
    const2 = Constraints(X[-1] == "h")
    const3 = Constraints(X[-1].upper() == "H")
    print isinstance(3, const)
    True
    print isinstance(2, const)
    False
    print isinstance(1, const)
    False
    print isinstance("bleh", const2)
    True
    print isinstance("bleh", const3)
    True
Callables are supported as well, so if you wanted to do something like:

    Constraints(isinstance(X.attr, someclass), somefunc(X[-2].attr, args))

You could approximate that with:

    Constraints(lambda x: isinstance(x.attr, someclass),
                lambda x: somefunc(x[-2].attr, args))

As I mentioned in the first paragraph, Constraints is a metaclass, so your validations are checked using __instancecheck__. I'm also considering having __init__ generate mock objects (for certain straight-forward cases, anyhow).

Thanks for your time,

Nathan
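P.S. For the curious, the mechanism is roughly this (a heavily simplified toy, with plain lambdas standing in for the symbolic expressions the real Constraints receives):

    class Constraints(type):
        """Toy metaclass: isinstance(obj, const) runs the stored predicates."""
        def __new__(mcls, *predicates):
            return super(Constraints, mcls).__new__(
                mcls, 'constraint', (object,), {})

        def __init__(cls, *predicates):
            cls._predicates = predicates

        def __instancecheck__(cls, obj):
            return all(pred(obj) for pred in cls._predicates)

    const = Constraints(lambda x: x * 2 + 1 >= 5, lambda x: x % 2 != 0)
    print isinstance(3, const)   # True
    print isinstance(2, const)   # False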

On 1/12/2012 3:45 PM, Nathan Rice wrote:
Python is designed for concrete, rather than symbolic computation. But the latter has been done.
I think you may be confusing name-resolution time with execution time. They are distinct, though somewhat tied together in Python. For example: if I hand you a book and say "Read this", I intend that you immediately resolve 'this' to the book I just handed you and immediately 'read' it. If I instead say, "Tomorrow morning, read this", I still intend that you immediately resolve 'this' to a particular object, but now intend that you package the object with the action for later execution. I believe you want this mixture of resolution now with action later. If so, your problem is that executing 'read(this)' or 'this.read()' does both things now, while "def f(): read(this)" or "lambda: read(this)" puts off both things until later.

Python has several ways to bind objects now with actions later. Let <book> be a reference to 'the book Terry handed me, which I stored wherever'.

0: rewrite the text -- a bit awkward in Python.

    action = compile("read({this})".format(this=<book>), 'xxx', 'eval')

1. default args -- has to be done when the function is defined.

    def action(this=<book>): read(this)

2. closures (nested functions) -- also requires a planned-ahead definition.

    def make_read(x): return lambda: read(x)
    action = make_read(<book>)

3. bound methods -- only works for classes with methods.

    class Book:
        def read(self): pass
    action = Book(<book>).read

4. partial binding of function params -- generalizes bound methods; works for any function and argument.

    from functools import partial
    action = partial(read, <book>)
if I want to include isinstance(X, someclass) in a symbolic expression,
Consider using partial, which can totally bind all needed args *now* for *later* action.
I have to wrap it in a lambda
Are you sure that partial will not work for you? Since partial is written in Python, you can grab and adjust the code to your needs. It amounts to adding default args after the fact by using a generic closure.
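For example, something along these lines (the flipped-argument isinstance_ helper here is just an illustration, not an existing function):

    from functools import partial

    def isinstance_(cls, obj):
        # argument order flipped so partial can pre-bind the class
        return isinstance(obj, cls)

    is_int = partial(isinstance_, int)   # binds the class *now*
    print is_int(3)      # the check itself runs *later* -> True
    print is_int("a")    # False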
People expect that class constructors produce an instance of the class. It is surprising when one does otherwise ;-). Builtin classes like int, bool, and str are classes just like ones you write.
Using a special class is a standard way to delay concrete execution.
print isinstance(3, const)
True

A Constraints instance defines a set; 'const' is the set 'odd_ge_3'. It would look better if you used standard syntax and did the inclusion check in a __contains__ method:

    3 in odd_ge_3
    True

print isinstance(2, const)
False

    2 in odd_ge_3
    1 in odd_ge_3

    "bleh" in ends_in_h
    "bleh" in ends_in_h_or_H
As I mentioned in the first paragraph, Constraints is a metaclass,

This is your first mention, actually.
so your validations are checked using __instancecheck__.
But it is a fake check in that 3 is not really an instance of the class, which has no instances. It is hard for me to believe that you cannot put the same constraint data in an instance of Constraint as a class and the inclusion logic in __contains__. If you customize __instancecheck__ for each instance of Constraints, then write

    def __contains__(self, ob): return self._check(ob)

where _check does the same as your current __instancecheck__. Even if you *have* to make Constraints a metaclass, for other reasons, I believe you could still give it the same __contains__ method. A metaclass *is* a class, and if its class instances represent collections, inclusion in the collection should be tested in the standard way.
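Concretely, a toy version of what I mean (illustrative only, not your actual library):

    class Constraints(type):
        def __new__(mcls, *predicates):
            return super(Constraints, mcls).__new__(
                mcls, 'constraint', (object,), {})

        def __init__(cls, *predicates):
            cls._predicates = predicates

        def _check(cls, ob):
            return all(pred(ob) for pred in cls._predicates)

        def __instancecheck__(cls, ob):
            return cls._check(ob)

        def __contains__(cls, ob):
            # inclusion tested the standard way, delegating to the same logic
            return cls._check(ob)

    odd_ge_3 = Constraints(lambda x: x * 2 + 1 >= 5, lambda x: x % 2 != 0)
    print 3 in odd_ge_3   # True
    print 2 in odd_ge_3   # False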
-- Terry Jan Reedy

On Thu, Jan 12, 2012 at 3:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:
But what are types but abstract sets of values? Phrasing it as a typecheck is perfectly sensible from a type-theoretic standpoint. Also, the problem of representing `isinstance(X.attr, someclass)` [for non-Constraint someclass, e.g. str] in a Constraint would still remain, since there's no "__lcontains__" (thus precluding `X.attr in str`). <snip>
The same can be true for abstract base classes, which have been sufficiently accepted to warrant adding __instancecheck__() in the first place, and to be added to the std lib themselves (witness the `abc` and `collections` modules). It may seem unfamiliar, but then again it was only made possible starting with v2.6, which is pretty recent.

Cheers,
Chris
--
http://rebertia.com

Being able to create abstract expressions that are later realized is super useful and neat. You can do this decently right now, but life would be better if you didn't have to jump through so many hoops. Having symbolic variables override *anything* would also make lambdas obsolete and greatly increase the potential for lazy evaluation.
0: rewrite the text -- a bit awkward in Python.
action = compile("read({this})".format(this=<book>), 'xxx', 'eval')
Yeah, that is something I'd expect to see in Perl code :)
I use this extensively in Elementwise.
The issue isn't so much that I *can't* do things as that they are more trouble than they should be, and it makes the end user interface for the stuff I write less elegant. For example, if I want to work with a symbolic object but include a function that is not well behaved, or if I am creating a constraint on the result of a function applied to a symbolic object, I have to know ahead of time everything I want to do, so I can wrap the whole expression in a lambda. Once I wrap it, the nice generative chaining property disappears and I'm stuck with a callable.
In some cases it would, in some cases it wouldn't. Since I basically never do */** expansion on wrappers, lambdas tend to be my go-to more often.
Treating type/str/int/etc. as types is definitely only semi-coherent, since the language really treats them more like functions. They are treated that way all over the docs as well.
From the data model page:
"object.__str__(self) Called by the str() built-in function and by the..." "object.__nonzero__(self) Called to implement truth value testing and the built-in operation bool()" "object.__complex__(self) object.__int__(self) object.__long__(self) object.__float__(self) Called to implement the built-in functions complex(), int(), long(), and float(). Should return a value of the appropriate type." So clearly this is an area that needs some polish :)
Standard, and currently a pain in the butt, starting from the fact that operators don't hook into __getattribute__ and getting progressively worse from there.
Used standard syntax? Can you elaborate please? Also, a set is one of many things a Constraints instance could logically be represented with, as well as a discontinuous interval, a class in the colloquial sense, etc. The nice thing about __instancecheck__ is that every possible constraint reduces to a type check. Of course, you could rephrase all type checking in terms of set membership easily, but I don't think it is quite as intuitive to most folks. Your preference has been noted though.
As I mentioned in the first paragraph, Constraints is a metaclass,
This is your first mention, actually.
Feeling feisty? I'm actually curious to see what sort of rationale you come up with to back up that statement; should be interesting :)
It is a fake check, like any abstract base class instancecheck is a fake check.
It could be any sort of callable. __instancecheck__ is the only reason it is a metaclass. Otherwise, I probably wouldn't bother with classes at all; returning a check inner function with constraints in the closure would be easy. Cheers, Nathan

On Thu, Jan 12, 2012 at 3:45 PM, Nathan Rice <nathan.alexander.rice@gmail.com> wrote:
I find the way you've formalized this a bit "weird". It looks like you're suggesting adding laziness to Python. If that's what you want, maybe you should try PyPy and the thunk object space: http://doc.pypy.org/en/latest/objspace-proxies.html#the-thunk-object-space -- Devin

On Fri, Jan 13, 2012 at 8:45 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
While thunk is neat, it doesn't accomplish precisely what I'm describing in this instance. When a function starts to run under thunk, the computations take place as soon as the function gets to the object inside its scope. What I'm after is the ability to basically create functions using only expressions, in a generative manner. Lambda does accomplish this, but in an extremely clunky manner... for example:

    X = lambda x: x + 1
    Y = lambda y: y * 2
    Z = lambda z: z % 3

(or XYZ = lambda x: (((x + 1) * 2) % 3))

If I want to perform a second step after this, I need to create another lambda, because they don't chain/aren't generative. The thing that Elementwise does that is very appealing is that most operations are generative, so if you have an ElementwiseProxy object x,

    x2 = (((x + 1) * 2) % 3)

basically does the same thing as the above, but you can then do x3 = x2 ** 3, and so on until you are ready to get your results. Once you have your results, you can use the same ElementwiseProxy again with different inputs. It is just a function generation technique which has some elegant properties. Having a native object type designed to facilitate this would let people do a lot of interesting things, and make things like Elementwise a lot more "consistent". I bet you could do some really neat things by subclassing such an object as well. This is certainly not a common programming paradigm, but I think most people could reap a lot of benefits from it while being oblivious to its presence.

Nathan
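P.S. A stripped-down toy of the generative style (nothing like Elementwise's real internals, just the shape of the idea):

    class LazyExpr(object):
        """Each operator returns a new LazyExpr wrapping a composed function."""
        def __init__(self, func=lambda x: x):
            self.func = func

        def _chain(self, op):
            f = self.func
            return LazyExpr(lambda x: op(f(x)))

        def __add__(self, n):
            return self._chain(lambda v: v + n)

        def __mul__(self, n):
            return self._chain(lambda v: v * n)

        def __mod__(self, n):
            return self._chain(lambda v: v % n)

        def __pow__(self, n):
            return self._chain(lambda v: v ** n)

        def __call__(self, value):
            return self.func(value)

    x = LazyExpr()
    x2 = ((x + 1) * 2) % 3    # nothing is computed yet
    x3 = x2 ** 3              # chaining keeps working
    print x2(5)               # (((5 + 1) * 2) % 3) -> 0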

On Friday 13.01.2012 08:45:38 Devin Jeanpierre wrote:
I find the way you've formalized this a bit "weird". It looks like you're suggesting adding laziness to Python.
IMHO this is quoting like in Lisp: He stores a bit of code in an un-evaluated form. The bit of code is then interpreted somewhere else, in special contexts, that he can choose with his hacked version of "isinstance". Eike.

On Fri, Jan 13, 2012 at 03:53:35PM +0100, Eike Welk wrote:
Symbolic algebra is a widely-deployed technique in the Python world. It is popular, e.g., in object-relational mappers. IMHO Python is powerful enough in that area; there is no need to change the language.

Oleg.
--
Oleg Broytman http://phdru.name/ phd@phdru.name
Programmers don't die, they just GOSUB without RETURN.

Oleg Broytman wrote:
The existing techniques for this get a bit awkward at times, though, for example when you want boolean operations, since 'and' etc. can't be overridden. My Overloadable Boolean Operators PEP was one attempt to address this problem, but it hasn't met with a very enthusiastic reception.

Recently I've been working on another approach that I've been calling "deferred expressions" (although I'll probably change that name because "deferred" seems to have taken on a different meaning already). The idea is that you write something like this (sorry about the ugly syntax, I haven't thought of anything better yet):

    e = <(a + b * c)>

This gives you something that behaves in one respect like a parameterless lambda -- you can evaluate it, and any free variables are interpreted in their original environment. However, its internal structure is effectively an AST that you can pick apart and process however you want, including selectively evaluating sub-expressions and building up new expressions.

Currently they're not generative the way the OP wants, i.e. you can't do 'e1 + e2' and get a new deferred expression, but that could be arranged, and would provide a nice way of building new expressions. The only problem would be attribute access, since they currently have attributes modelled after those of the corresponding AST nodes, whereas under a generative regime, 'e.a' ought to return a new expression node representing the access of attribute 'a'.

I have about 90% of this implemented. I'll make a release as soon as I have something fit for public consumption.

-- Greg

On Fri, Jan 13, 2012 at 9:53 AM, Eike Welk <eike.welk.lists1@gmx.de> wrote:
I wanted to think he wanted quote, because quote is awesome, but that doesn't gel with his desired behavior inside functions.

    def foo(A): return (A + 2) * 5

    (define (foo A) (* (+ A 2) 5))

If in Python one calls foo(SymbolicExpr + 3), then it returns a SymbolicExpr that can be lazily evaluated to compute the expression. This is fairly general and could work even with mutating statements and all sorts of nasty things.

If, in Scheme, one calls (foo '(+ X 3)), it is an error. There's no automatic laziness associated with quoted expressions -- they're just data. If one called (foo '3) then it would outright eagerly evaluate the function, returning 25, because '3 == 3. OTOH, you can define a foo that operates on quoted expressions and would return '(* (+ (+ X 3) 2) 5), as follows:

    (define (foo A) `(* (+ ,A 2) 5))

That might be sufficient for his use-case, can't tell. It isn't specifically what he asked for, though. (See also: XY problem).

-- Devin

I think something like your SymbolicObject class should be in Python's standard library. Quite a few projects implement similar classes. If there is a basic infrastructure for symbolic computation in the standard library, symbolic algorithms can be reused much more simply across projects.

The pattern you employ with SymbolicObject is a nice way to mix code that is executed immediately with code that is executed later. Symbolic algebra languages, like Maple and Mathematica, function this way too.

I propose however a slightly different interface: There should be a special object "quoted" to create instances of "SymbolicObject". The objects should be created in the "__getattribute__" method. This way the objects know their name, which is quite useful. A symbolic object would be created like this:

    X = quoted.X

You could also write:

    const = Constraints(quoted.X * 2 + 1 >= 5, quoted.X % 2 != 0)

The (tree of) symbolic objects should have a method that translates it to an abstract syntax tree (ast.AST from the standard library). Subsequent algorithms should operate on this AST. This way symbolic algorithms can also operate on all possible Python code (SymbolicObject can only create expressions).

    my_ast = (quoted.X * 2 + 1).to_ast()

For convenience I think that all symbolic algorithms should also accept SymbolicObject, and convert them to ast.AST internally. The standard library should contain at least these symbolic algorithms:

evalsym(tree, environment, return_type="any")

    Evaluates an expression and returns the result. Works similar to:

        eval(compile(tree, "", "eval"), environment)

    This call would return 7:

        evalsym(quoted.X * 2 + 1, {"X": 3})

    However it should do partial evaluation if some symbols are unknown, and return an ast.AST in this case (similar to Maple and Mathematica). This expression should be True:

        evalsym(quoted.X + quoted.Y, {"Y": 3}) == quoted.X + 3

    The optional argument "return_type" specifies the type of the result. This simplifies algorithms using "eval". Argument "return_type" can be: "any", "ast", or "object".

substitute(tree, substitutions)

    Replaces a symbol with a value (a sub-tree of ast.AST). This expression should be True:

        substitute(quoted.X + quoted.Y, {"Y": quoted.X * 2}) \
            == (quoted.X + quoted.X * 2).to_ast()

    Maybe this algorithm should also be able to match general fragments of the tree and replace them.

Eike.
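PS: A minimal sketch of the "quoted" factory itself (toy code; to_ast(), evalsym() and substitute() are left unimplemented here):

    class SymbolicObject(object):
        def __init__(self, name):
            self.name = name

    class Quoted(object):
        # every attribute access manufactures a symbol that knows its name
        def __getattribute__(self, name):
            return SymbolicObject(name)

    quoted = Quoted()
    X = quoted.X
    print X.name   # 'X'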

You guys know this can more or less be done in Python today, right?
I leave further extensions as an exercise for the reader, but it's not especially tricky to add various boolean tests and whatnot. I don't claim any originality here. I'm fairly sure I read about this idea in a blog post somewhere years ago.

On Sat, Jan 14, 2012 at 1:27 AM, Carl M. Johnson <cmjohnson.mailinglist@gmail.com> wrote:
You guys know this can more or less be done in Python today, right?
I did mention in the first post that I am the author of a library that does this, right? :D

The issue is one of consistency. Some functions can break out of this behavior. Some "type" functions also want to behave a certain way regardless of input (which makes sense in most cases, but breaks chaining symbolic expressions). Additionally, the amount of code required to do that sort of object proxying is really obnoxious. Things like operator methods should be hooked into __getattribute__ on symbolic objects.

There are also a LOT of neat possibilities that come out of symbolic objects, for instance, function templates, fairly easy currying and generation of closed form solutions by performing calculations on functions rather than data.

Nathan

On Jan 14, 2012, at 10:13 AM, Nathan Rice wrote:
I did mention in the first post that I am the author of a library that does this, right? :D
Sorry, my mistake for only skimming the thread. I just kept wondering why none of the later emails mentioned that it was possible…

I find it to be useful quite frequently. I think making generative SQL expressions with SQL Alchemy got me hooked.
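For anyone who hasn't played with it: in SQLAlchemy, operators on columns build symbolic clause objects rather than evaluating to booleans, roughly like this (using the list-style select() of current releases; details may differ by version):

    from sqlalchemy import Table, Column, Integer, String, MetaData, select

    users = Table('users', MetaData(),
                  Column('id', Integer, primary_key=True),
                  Column('name', String))

    # == does not return a bool; it builds a clause object that is
    # only rendered to SQL (and executed) later.
    clause = users.c.name == 'ed'
    print clause                        # users.name = :name_1
    print select([users]).where(clause)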
The factory there seems to add complexity. The Class(args) constructor style is less magic, and I can imagine wanting to pass things to the symbolic object, like a name, default value, a docstring, etc.
I think a method call leads to more readable code than a function, because they grow left to right and are structured like English. Something along the lines of:

    expr = X ** (M * (N + 1))
    expr.realize({N: 2, M: 3, X: 4})
    expr.realize((N, 2)).realize((M, 3)).realize((X, 4))

Tasty curry. As a side note, perhaps PyPy is a good place to prototype something like this? :)

Nathan

I did some additional thinking about how the interface could be designed, and I think there might be a good middle ground with regard to your quoted object. I still feel like creating variables should be independent, and use the standard class constructor method. I feel like a Context object would be a good place to keep variable values, and the expression evaluation method. Thus, you might have:

    context = Context()
    expr = SymbolicObject(name="A") + SymbolicObject(name="B") * SymbolicObject(name="C", default=5)

    context.A = 1
    context.B = 2

    print context.eval(expr)
    11

    context.C = 3

    print context.eval(expr)
    7
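Under the hood, that could be wired up something like this (the class layout here is purely a speculative sketch):

    MISSING = object()

    class Expr(object):
        def __add__(self, other):
            return BinOp(self, other, lambda a, b: a + b)
        def __mul__(self, other):
            return BinOp(self, other, lambda a, b: a * b)

    class SymbolicObject(Expr):
        def __init__(self, name, default=MISSING):
            self.name, self.default = name, default
        def _eval(self, ctx):
            # look the name up in the context, falling back to any default
            val = getattr(ctx, self.name, self.default)
            if val is MISSING:
                raise NameError(self.name)
            return val

    class BinOp(Expr):
        def __init__(self, left, right, op):
            self.left, self.right, self.op = left, right, op
        def _eval(self, ctx):
            return self.op(self.left._eval(ctx), self.right._eval(ctx))

    class Context(object):
        def eval(self, expr):
            return expr._eval(self)

    context = Context()
    expr = SymbolicObject("A") + SymbolicObject("B") * SymbolicObject("C", default=5)
    context.A = 1
    context.B = 2
    print context.eval(expr)   # 11 (C falls back to its default of 5)
    context.C = 3
    print context.eval(expr)   # 7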
I don't think it would be too bad to create an AST representation inside the symbolic expression while building it.
Having the evaluate method return an AST with incomplete variable information would be confusing to me. I would expect to get a new symbolic expression with the specified parameters replaced.
If you build up the AST representation while creating the symbolic expression, replacing a symbolic expression object should have the effect of updating the tree. I think having clearly delineated AST and SymbolicExpr representations might be easier (possibly via converter functions). The actual difference may be negligible, but it would serve as an explicit way of declaring what your expectations are with regard to return values, etc. As a side note, it appears that between PyPy object spaces and continulets, it should be possible to create this without any weird interpreter hacking. Hopefully I'll have time to put something together this weekend.

Nathan
