proposal to add rowexpr as a keyword

ABSTRACT:

I propose the following syntax:

    rowexpr: convert_to_euros(salary) > 50000 and deptno = 10

Rowexpr returns a function object that can be called with one argument, row, and the items "salary" and "deptno" are used to dereference row.

RATIONALE:

A lot of the time, people write Python code that works with other small interpreters, such as SQL engines, math libraries, etc., that have an expression syntax of their own. To use SQL as an example, I may want my SQL engine to use this Python expression:

    where convert_to_euros(salary) > 50000 and deptno = 10

To interact with SQL engines, I could imagine code like the following:

    sql(..., lambda row: convert_to_euros(row['salary']) > 50000 and row['deptno'] == 10) # not concise

    sql(..., sql_parse('convert_to_euros(salary) > 50000 and deptno = 10')) # requires sophisticated library

    sql(..., [and_op, [ge_op(func_call_op(convert_to_euros, 'salary'), 50000), eq_op(itemgetter_op('deptno'), 10)]]) # UGLY!!!

SOLUTION:

Let the Python interpreter build expression objects for you. You write:

    rowexpr: convert_to_euros(salary) > 50000 and deptno = 10

Python translates that into the same bytecode as:

    lambda row: convert_to_euros(row['salary']) > 50000 and row['deptno'] == 10

Benefits:

1) Allows smaller code.
2) Having Python translate the rowexpr creates an opportunity for better error messages, catching more things at startup time, etc.
3) In the particular case of SQL, it enables a development methodology in which you start off by writing code that processes data in memory, but with mostly SQL-like syntax, and then switch over to a real SQL database as scalability becomes a bigger concern.

COMMON OBJECTIONS:

1) More syntax, of course.
2) Folks used to SQL would still need to adjust to differences between Python expressions (==) and SQL (=).
3) In the case of SQL, people may already be happy enough with current tools, and of course, interpreters external to Python can cache expressions, etc.

OPEN QUESTIONS:

Would a rowexpr be limited to dictionaries, or could it also try to access object attributes?

Steve Howell wrote:
I'm sorry, why would that not be translated into:

    lambda row: row['convert_to_euros'](row['salary']) > row['50000'] row['and'] row['deptno'] == row['10']

?

Specifically, how would python know what to dereference and what not to? What if there were two things, named the same, one in the row and one in the namespace? (i.e. a variable named 'salary') How would you escape things which would have been dereferenced but you didn't want to be? (i.e. "rowexpr: convert_to_euros(salary) > salary")

I guess I'm kind of wasting my time here, since the introduction of a new keyword for this application really isn't going to happen, based on other decisions I've seen Guido make, but I do think that you need to think a little more about how the implementation of this feature would work. (Or perhaps you've done that thinking, and you just need to fill in the proposal with that information.)

Thanks,
Blake.

On 5/29/07, Blake Winton <bwinton@latte.ca> wrote:
Indeed, such half-baked ideas have no chance of being taken seriously as language additions. Take a look at packages such as buzhug [1] and (especially) SQLAlchemy [2], not only because they might end up solving your problem but also to appreciate the complexity involved in supporting arbitrary SQL-like expressions at the language level, since you clearly underestimate it or haven't quite thought about it.

George

[1] http://buzhug.sourceforge.net/
[2] http://www.sqlalchemy.org/docs/

--
"If I have been able to see further, it was only because I stood on the shoulders of million monkeys."

"BJörn Lindqvist" <bjourne@gmail.com> wrote:
Indeed, but when they are called out as half-baked (explicitly or implicitly), and are offered no love by others in the list, perhaps it's time to let them die. On the other hand, there is the other argument that comp.lang.python is the right place to post half-baked ideas, which are then baked, brought to python-ideas for another round of "you didn't put in blueberries", before making it to python-dev/python-3000 . - Josiah

Guys, please don't judge me too harshly on the basis of a quick sketch of an idea. I'm obviously skimming over a lot of ideas in a quick pass. FWIW I've built virtual machines before, and way back I worked at Oracle and at times had to debug their PL/SQL implementation, so it's not like I don't understand what I'm asking for. I am also fully aware of Python's culture and have fought against frivolous language additions myself, although I'm biased like everybody else by my own experience, so one person's frivolous is another person's can't-live-without-it. I will respond to the main technical objection--how to scope these--in a separate reply. Thanks.

--- Blake Winton <bwinton@latte.ca> wrote:
My language here was imprecise. I should have said something to the effect of this: The interpreter would translate the "rowexpr" into bytecode that would execute the same way as the "lambda" example given the scope of the method convert_to_euros in the example above. Also, the bytecode would not be the "same" as the lambda; it would just have similar complexity.
I am suggesting that in a rowexpr expression, instead of the typical local scope being the default scope, "row" would be the local scope for dereferencing. Since I am proposing an expression syntax, does not the local scope go away? If you look at where salary, row, and deptno sit in the expression, it seems pretty clear to me that a well-defined scoping scheme would try to dereference row by default for all those tokens, and probably give up there, throwing an exception.

The convert_to_euros token in the expression obviously poses a thornier problem. Do you prefer row.convert_to_euros, if such a thing exists, or do you go right to what would be the typical scope of a similar lambda expression, and prefer, say, the definition of "def convert_to_euros" that sits at the module level? I would think the latter. The interpreter surely can evaluate such an expression and determine that convert_to_euros is going to be used as a callable. Now, whether convert_to_euros is actually callable at run time is another story, but if it isn't, an exception is thrown, and at least you're not debugging somebody else's library to figure out why. Python would just tell you.

I don't mean to oversimplify this, which is why I'm batting it around on a forum with a lot of smart people. But clearly it could be done. Whether it *should* be done is, of course, a question for debate, and I accept this.
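The scoping idea above can be approximated today without new syntax. The following is only a sketch, written for Python 3 (an assumption; the thread itself predates it), and the helper name evaluate_in_row is hypothetical. It relies on the documented behaviour of eval(), which looks names up in the locals mapping first, then in globals, then in builtins. Note that this gives the row priority for every name, including callables, which is not quite the "prefer the module-level def" rule suggested above.

```python
def evaluate_in_row(source, row, namespace):
    """Evaluate `source` with `row` as the innermost scope.

    Names are looked up in `row` first and fall back to `namespace`.
    `namespace` is copied because eval() inserts __builtins__ into the
    globals dictionary it is given.
    """
    return eval(source, dict(namespace), row)


def convert_to_euros(amount):
    # Hypothetical conversion, mirroring the thread's later example.
    return amount / 2


# An outer 'salary' exists too, to show which one wins on a name clash.
outer_scope = {"convert_to_euros": convert_to_euros, "salary": 1}
row = {"salary": 120000, "deptno": 10}

if __name__ == "__main__":
    expr = "convert_to_euros(salary) > 50000 and deptno == 10"
    print(evaluate_in_row(expr, row, outer_scope))
    # The row's 'salary' shadows the outer one.
    print(evaluate_in_row("salary", row, outer_scope))
```

A name found in neither mapping raises NameError at call time, which matches the "give up there, throwing an exception" behaviour described above.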

Steve Howell <showell30@yahoo.com> wrote:
You are proposing to insert a mini-language into Python so as to make certain relational data processing tasks easier. As you say, your particular proposal is pretty much just an alias for lambda where you don't specify the 'row' argument, names are acquired from the passed dictionary, and maybe you get some free enclosing scope goodies.

Honestly, without explicit stating of where data is coming from, either via row['column'] or row.column, and the explicit naming of the argument to be passed (as in lambda argument: expression), it smells far too magical for Python. Never mind the fact that you will tend to get better performance by not using a lambda/your hacked lambda, as the direct execution will have fewer function calls... That is...

    pred = lambda row: (convert_to_euros(row['salary']) > 50000
                        and row['deptno'] == 10)
    a = [f(i) for i in rows if pred(i)]

will run slower than

    b = [f(i) for i in rows
         if convert_to_euros(i['salary']) > 50000 and i['deptno'] == 10]

While the former *may* be easier to read (depending on who you are), the latter will be faster in all cases. The only way for the pred() version to be faster is if your rowexpr: variant was actually a macro expansion, but we don't do macro expansions in Python, so it will be slower.

So, what do we have? Possible minor clarity win (but certainly not over lambdas), small length reduction over lambdas, slower execution compared to inlining the code (equal to lambda), more syntax. Put me down for -1.

- Josiah
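The performance claim above is easy to check on one's own machine. The sketch below is written for Python 3 and uses made-up data; convert_to_euros, f, and the row contents are placeholders, not anything from a real billing system. It only measures; no particular result is asserted here, and the size of any difference will depend on the interpreter and the data.

```python
import timeit


def convert_to_euros(amount):
    # Hypothetical conversion rate, for illustration only.
    return amount * 0.9


def f(row):
    # Stand-in for whatever per-row work the caller does.
    return row["salary"]


rows = [{"salary": 30000 + 20 * i, "deptno": i % 20} for i in range(2000)]

pred = lambda row: (convert_to_euros(row["salary"]) > 50000
                    and row["deptno"] == 10)


def with_predicate():
    # One extra function call (pred) per row.
    return [f(i) for i in rows if pred(i)]


def inlined():
    # The same test written directly in the comprehension.
    return [f(i) for i in rows
            if convert_to_euros(i["salary"]) > 50000 and i["deptno"] == 10]


def compare(number=200):
    """Return (seconds for predicate version, seconds for inlined version)."""
    return (timeit.timeit(with_predicate, number=number),
            timeit.timeit(inlined, number=number))


if __name__ == "__main__":
    slow, fast = compare()
    print("predicate: %.3fs  inlined: %.3fs" % (slow, fast))
```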

Josiah wrote: http://mail.python.org/pipermail/python-ideas/2007-May/000835.html

Josiah, you are correct about all the following, and I apologize for any incorrect paraphrase:

1) I am proposing to add a mini-language within Python.
2) I want relational data processing tasks to be easier within Python.
3) I want rowexpr to acquire names from the passed first argument (with sensible scoping rules, of course).
4) Point #4 essentially amounts to free enclosing scope goodies.
5) My proposal does indeed smell of magic.
6) I do want the code to be easier to read for me (with the full caveat that YMMV).
7) I want the small clarity win.
8) I think the only way to achieve this particular clarity win is to add syntax.

You may be incorrect about the following, or I may just be understanding your points, but I'm not wrong about any of these, since I'm just characterizing my own feelings:

1) I don't want rowexpr to be pretty much an alias for lambda; I actually want it to be more powerful.
2) I'm not sweating performance here; I'm already using an interpreted language. I don't want Python to be faster for my day-to-day tasks; I want it to be more expressive. I even--and I'll dare say--want the language to be BIGGER.
3) I'm not ignorant of the resistance to macro expansions in Python, but I do think a year-2010 Python interpreter could compile SQL syntax directly into bytecode.

--- Steve Howell <showell30@yahoo.com> wrote:
er, point #3
You may be incorrect about the following, or I may just be understanding your points [...]
er, s/understanding/misunderstanding/ That wasn't a Freudian slip. See careless typing above (where I disrespectfully agree with you ;)

Steve Howell <showell30@yahoo.com> wrote:
Technically speaking, lambda is sufficient for Turing completeness, so the only thing to be gained is a reduction in what you write. As you offer...

    predicate = rowexpr: manipulation(column1) > value1 and \
                         column2 < value2

    predicate = lambda row: manipulation(row['column1']) > value1 and \
                            row['column2'] < value2

Is that reduction worthwhile? I personally don't think so, but my list comprehensions tend to have fairly minimal predicates. One thing to take into consideration is that Guido has previously shot down 'order by' syntax in list comprehensions and generator expressions because he didn't think that they were Pythonic (I seem to remember 'ugly' and 'worthless', but maybe that was my response to them).
That's fine, but realize that Python 3.0 is moving towards a smaller, clearer language; see the dictionary changes, range/xrange, exception handling, etc.
It could, but I would happily bet you $100 that it won't.

If you are really intent on getting this, despite the arguments against it (from everyone so far), you could add your own syntax with logix, which will handle arbitrary SQL statements and compile them into Python bytecode (given sufficient information on how to do so). Yes, that's a cop-out, but sometimes the only way people get the language that they want is if they can add syntax at their whim.

I've personally found that while I disagree with Guido on very many things related to Python, I'm usually too lazy to bother to add syntax I think I may need when I'm able to (with very minimal effort) write a helper function or two to do basically everything I need in lieu of syntax.

- Josiah

--- Josiah Carlson <jcarlson@uci.edu> wrote:
Technically speaking, lambda is sufficient for turing completeness [...]
...as is Perl, or machine code, to pick sort of opposite ends of the spectrum :)
:) I am only expressing my own aesthetics, and I would certainly defer to Guido on most matters aesthetic, since he's written an aesthetically beautiful language. But having said that, I don't want my proposal automatically lumped in with every proposal that Guido has found unaesthetic, or rejected, and I believe he has even been known to change his mind from time to time.

For all the real-world warts of SQL, I think SQL is a very aesthetically pleasing way to express transformations of relational data structures, and Python contains relational data structures, and therefore I think Python can benefit from using SQL as just one way of expressing relational transformations (and I'm still a little bit TIMTOWTDI from my Perl days, I fully admit).

I fully concede all the obvious objections--more syntax, more ways to do it, difficulty of implementing it within the VM, ability of people to already manipulate relational data structures more cleanly than me in Python, etc.

I'm not asking for this in Fall 2007, BTW, I'm expressing this as a vision for a bigger, better Python, maybe year 2010, even though smaller is usually better. And syntactically, I am only extending the language by one keyword, or one new way of triple-quoting. For my own use, native SQL would benefit the clarity of my (already working, but sometimes ugly) code more than some other additions proposed in Py3k, but YMMV.

On Tue, May 29, 2007, Steve Howell wrote:
Aside from the standard featuritis objection, my objection stems almost entirely from the difficulty of defining appropriate data structures on which to operate. SQL works partly because data in SQL tables is already *by definition* in a relational format -- which won't be true in Python, causing all kinds of runtime errors that IMO are inappropriate for SQL. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha

--- Aahz <aahz@pythoncraft.com> wrote:
In theory, I obviously agree, as a list can be full of all kinds of heterogeneous structures, and the fact that it can be is one of the beauties of Python, and trying to apply native, byte-code interpreted SQL to such structures would certainly lead to run-time errors that unit tests couldn't catch even in theory, much less in practice.

But, in my own day-to-day practice (I've recently been working on a billing system, which is not rocket science, just tedious), I find myself constantly calling into my DB API, which returns me a list of dictionaries, and the transformation from database to network to API to Python doesn't diminish the relational perfectness of the data whatsoever. Then I find myself transforming the SQL result set in many ways in Python. Some of those transformations are non-relational, which is the whole reason to bring the data into Python in the first place. But other transforms are relational, and that's where I want SQL.

Which raises the natural question--to the extent that I want to do more relational transforms on the data that I already have, why don't I just farm that back out to my relational database? The two-part answer is that 1) of course I can, but 2) I don't want to, because I already have the data in Python. Answer #3 is PeopleSoft. If you've never worked with a really awkward database structure in the real world, please count yourself lucky, and I'll buy you drinks at your 30th birthday party.

Do you understand at least my motivation, if not necessarily agreeing with the wisdom of my overall proposal?
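For readers who have not worked this way, the kind of relational transform being described, applied to a list of dictionaries such as a DB API might return, looks roughly like this in plain Python 3. The rows and column names are invented for illustration; each step is annotated with the SQL clause it stands in for.

```python
from collections import defaultdict

# Hypothetical result set, shaped like a list of dictionaries.
rows = [
    {"name": "ann", "deptno": 10, "salary": 120000},
    {"name": "bob", "deptno": 10, "salary": 40000},
    {"name": "cat", "deptno": 20, "salary": 90000},
    {"name": "dan", "deptno": 10, "salary": 75000},
]

# WHERE deptno = 10 AND salary > 50000
selected = [r for r in rows if r["deptno"] == 10 and r["salary"] > 50000]

# SELECT name ... ORDER BY salary DESC
names = [r["name"]
         for r in sorted(selected, key=lambda r: r["salary"], reverse=True)]

# SELECT deptno, SUM(salary) ... GROUP BY deptno
totals = defaultdict(int)
for r in rows:
    totals[r["deptno"]] += r["salary"]

if __name__ == "__main__":
    print(names)
    print(dict(totals))
```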

On Tue, May 29, 2007, Steve Howell wrote:
Do you understand at least my motivation, if not necessarily agreeing with the wisdom of my overall proposal?
Sort of. I still think that your proposal deserves to be shot down for the same reasons that including regexes as part of the language should be shot down. I'm slightly less opposed to Talin's DSL idea, though. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha

--- Aahz <aahz@pythoncraft.com> wrote:
The more I think about the general themes in this discussion, the more I think PyPy is gonna be the proving ground for those kinds of ideas.

--- Josiah Carlson <jcarlson@uci.edu> wrote:
It could, but I would happily bet you $100 that it won't.
Maybe not by 2010, but I'll make a gentlemen's bet that native SQL is in Python by 2015.
Ok, the logix approach to introducing new syntax is good to know. I thought my pain would at least resonate with a few people, but it obviously has not so far.
As I said in another reply, I have working code, so I don't NEED the syntax, just want it, think it's a good idea, and I hope I've picked the correct forum in expressing a sort of wouldn't-it-be-great-if-Python-did-this suggestion. My progression in programming has always been to think that the current paradigm was brilliant (even in my C++ days!), and then only to discover there were even better paradigms.

--- Josiah Carlson <jcarlson@uci.edu> wrote:
[...] you could add your own syntax with logix [...]
Or PyPy. Despite our disagreement, you are helping me to expand my thinking. Thank you.

Steve Howell wrote:
My objection to this idea is that it's going in the opposite direction to what I'd like to see. In my opinion, having to embed one programming language inside another leads to ugly code. I would rather have an elegant way to deal with relational databases by writing Python code *instead* of SQL, than yet another mechanism for embedding SQL in Python.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

--- Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
You're wanting to solve a slightly different problem than I was, but can you elaborate on this a bit? How do you currently interact with relational databases? Are you objecting to some ugliness of SQL itself, or do you want a more powerful abstraction?

To the extent that I have to work with awkward legacy database structures, I find that my strategy is usually this:

1) Write minimal SQL to get most of the data that I need up front.
2) Manipulate the data many ways in Python.
3) Write minimal SQL to put data back in the database.

Steve Howell wrote:
You're wanting to solve a slightly different problem than I was, but can you elaborate on this a bit?
I know, it's more or less the complementary problem to what you're talking about. My point is that I don't like embedding SQL in my Python even when I'm dealing with a relational database, so I'm even less inclined to do so when dealing with Python data structures. I find that the existing features of the Python language do that well enough already.
Are you objecting to some ugliness of SQL itself, or do you want a more powerful abstraction?
Some of both. There's the general awkwardness involved whenever one language is embedded in another, plus the fact that I don't particularly like some aspects of SQL in particular. But often I also want a more powerful abstraction. The SQL that I need at a given point in the program isn't always fixed, and I need to generate it dynamically. As an example, in a recent project I needed to extract data from a number of tables to produce a report. The user has a variety of choices as to which fields are included in the report, and can optionally specify selection criteria for various fields, either a single value or a range of values. To accommodate this efficiently, I have to dynamically generate an SQL statement which includes or excludes 'where' clause terms of various sorts, and joins as necessary to pull in requested data from auxiliary tables. In this kind of application, there are few or no complete SQL statements written into the source, only fragments that get combined by an SQL-generation framework of some kind. So a facility like the one you suggest wouldn't help with this kind of problem. -- Greg
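To make the report scenario concrete, here is a minimal sketch of that style of SQL generation in Python 3, using the standard sqlite3 module only so that the example is runnable. The table name employees, the column whitelist, and the function build_report_query are all hypothetical, and joins to auxiliary tables are left out. Values are passed as bound parameters; column names cannot be bound, so they are checked against a whitelist instead.

```python
import sqlite3

# Hypothetical set of columns the user may report on or filter by.
ALLOWED_COLUMNS = {"name", "deptno", "salary"}


def build_report_query(fields, criteria):
    """Build (sql, params) from user choices.

    fields   -- list of column names to include in the report
    criteria -- dict mapping column -> single value, or (low, high) range
    """
    for column in list(fields) + list(criteria):
        if column not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: %r" % (column,))

    clauses, params = [], []
    for column, wanted in criteria.items():
        if isinstance(wanted, tuple):
            low, high = wanted
            clauses.append("%s BETWEEN ? AND ?" % column)
            params.extend([low, high])
        else:
            clauses.append("%s = ?" % column)
            params.append(wanted)

    sql = "SELECT %s FROM employees" % ", ".join(fields)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, deptno INTEGER, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("ann", 10, 120000), ("bob", 10, 40000), ("cat", 20, 90000)],
    )
    sql, params = build_report_query(
        ["name", "salary"], {"deptno": 10, "salary": (50000, 200000)}
    )
    return sql, conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    print(demo())
```

As the message says, no complete SQL statement appears in the source here, only fragments, which is why a rowexpr-style facility would not apply to this case.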

--- Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I can definitely relate to that sort of pain.

Blake Winton wrote:
Sticking to your logic, why would it be > instead of row['>']? This is, I guess, particularly simple with Python's tokenizer because neither 50000 nor and can be valid identifiers.
That's a harder one, indeed. Perhaps, nothing should be dereferenced (at ``compile'' time, don't know if that's a valid term in CPython's interpreter chain tho), as it is in a lambda-expression.
ACK. If included (and I have major doubts it has a chance over 0.001%), it could just support "easy" expressions, such as lambdas do today.
Well, discussing in a mature manner can IMO never be a waste of time but I have to agree, I would not want this as a keyword either. A common idiom (aka stdlib module) *could* be handy, though. Regards, Stargaming
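The point about the tokenizer can be seen with today's standard library. This is a Python 3 sketch (the ast module in this form did not exist when the thread was written), and the helper name names_in is made up. Operators, numbers, and keywords never show up as names, so only identifiers are even candidates for dereferencing; deciding which of them come from the row is the part that remains open.

```python
import ast
import keyword


def names_in(source):
    """Return the sorted identifiers that appear as names in an expression."""
    tree = ast.parse(source, mode="eval")
    return sorted({node.id for node in ast.walk(tree)
                   if isinstance(node, ast.Name)})


if __name__ == "__main__":
    print(names_in("convert_to_euros(salary) > 50000 and deptno == 10"))
    # Neither of these can ever be a candidate for row lookup:
    print("50000".isidentifier(), keyword.iskeyword("and"))
```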

--- Stargaming <stargaming@gmail.com> wrote:
Admitting to not fully understanding the Python innards, here's some food for thought on how this would be done:

1) Not to diminish the parsing-into-AST stage, but I won't cover it here, as other problems are trickier. Despite pretty different semantics, I think rowexpr would look a lot like lambda at the syntactic level.

2) You'd add a case to compiler_visit_expr for Rowexpr_kind in compile.c.

3) There's a pretty small method in compile.c called compiler_lambda that would be the model for compiler_rowexpr. The compiler_rowexpr would generate an opcode that caused the single (implied, C++-"this"-like) arg to be popped, and possibly that opcode would have to be different than a normal opcode to pop one argument off the stack, so that ceval.c can do the right thing, but I'm not so sure.

4) You might need to add something to compiler_unit that is analogous to u_varnames, u_cellvars, u_freevars, etc., so that when you recursively call down to evaluate the expression in the rowexpr, the lower methods know to look in a different place to scope the words "salary," "convert_to_euros," and "dept" in the example expression 'where convert_to_euros(salary) > 50000 and dept = "software development"'.

5) The behavior of compiler_expr would NOT change for most operators, such as Num_kind, Str_kind, Attribute_kind, Subscript_kind, etc., as rowexpr doesn't change how any of those are interpreted compared to an expression in other scopes.

6) Compiler_expr eventually gets to building opcodes that give you salary, convert_to_euros, and dept, and those pieces of code need to look into compiler_unit to determine if they're part of the rowexpr, and do the right thing.

7) I'm sure there's more. I'm here to learn.
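The steps above describe a change inside CPython's C compiler. A rough Python-level analogue, not the change itself, can be sketched with the ast module, assuming Python 3.9 or later. The names RowNames and rowexpr are hypothetical, and the rule used here (bare names become row[...] lookups unless they are the function being called) is just one reading of the scoping discussed in this thread.

```python
import ast


class RowNames(ast.NodeTransformer):
    """Rewrite bare names to row['name'], except names used as callees."""

    def visit_Call(self, node):
        # Leave a plain-name callee (e.g. convert_to_euros) to normal
        # scoping, but still rewrite names inside the arguments.
        if not isinstance(node.func, ast.Name):
            node.func = self.visit(node.func)
        node.args = [self.visit(arg) for arg in node.args]
        for kw in node.keywords:
            kw.value = self.visit(kw.value)
        return node

    def visit_Name(self, node):
        if not isinstance(node.ctx, ast.Load):
            return node
        return ast.Subscript(
            value=ast.Name(id="row", ctx=ast.Load()),
            slice=ast.Constant(value=node.id),
            ctx=ast.Load(),
        )


def rowexpr(source, namespace):
    """Compile `source` into the equivalent of `lambda row: <rewritten source>`."""
    body = RowNames().visit(ast.parse(source, mode="eval").body)
    func = ast.Expression(
        body=ast.Lambda(
            args=ast.arguments(
                posonlyargs=[], args=[ast.arg(arg="row")],
                kwonlyargs=[], kw_defaults=[], defaults=[],
            ),
            body=body,
        )
    )
    ast.fix_missing_locations(func)
    # Copy the namespace because eval() adds __builtins__ to its globals.
    return eval(compile(func, "<rowexpr>", "eval"), dict(namespace))


def convert_to_euros(amount):
    # Hypothetical conversion, as in the thread's examples.
    return amount / 2


if __name__ == "__main__":
    pred = rowexpr("convert_to_euros(salary) > 50000 and deptno == 10",
                   {"convert_to_euros": convert_to_euros})
    print(pred({"salary": 120000, "deptno": 10}))
```

Because the rewrite happens once, up front, the resulting function is an ordinary lambda and costs the same to call as a hand-written one.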

--- Stargaming <stargaming@gmail.com> wrote:
Just tried a few examples; almost everything interesting happens at runtime:

======== PYTHON 2.3

Valid Python:

    salary = 2
    print (lambda: salary)() # prints 2

---
Valid, but useless, Python (no errors)

    lambda: salary

---
Run-time error:

    x = lambda: salary
    x()

    NameError: global name 'salary' is not defined

---
Run-time error:

    x = lambda: convert_to_euros(salary)
    x()

    NameError: global name 'convert_to_euros' is not defined

---
Run-time error:

    def convert_to_euros(amt):
        return amt.impossible
    salary = None
    x = lambda: convert_to_euros(salary)
    x()

    AttributeError: 'NoneType' object has no attribute 'impossible'

---
Run-time error:

    def convert_to_euros(amt):
        raise 'this does not get here'
    row = None
    x = lambda row: convert_to_euros(row['salary'])
    x(row)

    TypeError: 'NoneType' object is unsubscriptable

======= THEORETICAL

Valid Python:

    row = {'salary': 50000}
    def convert_to_euros(amt):
        return amt / 2
    x = rowexpr: convert_to_euros(salary)
    print x(row) # prints 25000

---
Run-time error:

    def convert_to_euros(amt):
        raise 'this does not get here'
    row = None
    x = rowexpr: convert_to_euros(salary)
    x(row)

    TypeError: 'NoneType' object is unsubscriptable

---

--- Steven Bethard <steven.bethard@gmail.com> wrote:
Thanks. I tried it out with a slightly more involved expression.

    x = rowexpr(
        'convert_to_euros(salary) '
        'if dept == "foo" else 0')
    rows = [dict(dept='foo', salary=i) for i in range(10000)]
    transform = [x(row) for row in rows]

Here were my findings:

1) Not horribly slow. 3.4 seconds on my box for 10000 calls.

2) I introduced a syntax error, and it was more clear than I thought it would be. It happens at runtime, of course, which is less than ideal, and it doesn't directly point me to the line of code with the syntax error, but it does suggest the error:

      File "sql.py", line 14, in <module>
        transform = [x(row) for row in rows]
      File "sql.py", line 5, in evaluate
        exec '__result__ = ' + expr in d
      File "<string>", line 1
        __result__ = convert_to_euros(salary) if dept = "foo" else 0
                                                      ^
    SyntaxError: invalid syntax

On 5/30/07, Steve Howell <showell30@yahoo.com> wrote:
You can get the syntax error a little earlier (at the time of the rowexpr() call) by using compile()::
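The code that followed in the original message is not preserved in this archive. As a stand-in, here is a minimal sketch of the idea in Python 3, not a reconstruction of what was posted: the string is compiled once when rowexpr() is called, so a malformed expression fails there rather than on the first row. The namespace parameter is an assumption added so the example is self-contained.

```python
def rowexpr(source, namespace=None):
    """Return a one-argument function that evaluates `source` against a row.

    compile() runs here, at construction time, so a SyntaxError surfaces
    at the rowexpr() call instead of inside the loop over rows.
    """
    code = compile(source, "<rowexpr>", "eval")
    # Copy, because eval() inserts __builtins__ into the globals it is given.
    outer = dict(namespace or {})

    def evaluate(row):
        # Names resolve in the row first, then in the outer namespace.
        return eval(code, outer, row)

    return evaluate


def convert_to_euros(amount):
    # Hypothetical conversion, as in the earlier examples.
    return amount / 2


if __name__ == "__main__":
    x = rowexpr('convert_to_euros(salary) if dept == "foo" else 0',
                {"convert_to_euros": convert_to_euros})
    rows = [dict(dept="foo", salary=i) for i in range(5)]
    print([x(row) for row in rows])

    try:
        rowexpr('convert_to_euros(salary) if dept = "foo" else 0')
    except SyntaxError as exc:
        print("caught at construction time:", exc.msg)
```

Compiling once also avoids re-parsing the string for every row, which is likely where much of the per-call cost in an exec-per-row approach goes.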
STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steve Howell wrote:
I'm sorry, why would that not be translated into: lambda row: row['convert_to_euros'](row['salary'] > row['50000'] row['and'] row['deptno'] == row['10'] ? Specifically, how would python know what to dereference and what not to? What if there were two things, named the same, one in the row and one in the namespace? (i.e. a variable named 'salary') How would you escape things which would have been dereferenced but you didn't want to be? (i.e. "rowexpr: convert_to_euros(salary) > salary") I guess I'm kind of wasting my time here, since the introduction of a new keyword for this application really isn't going to happen, based on other decisions I've seen Guido make, but I do think that you need to think a little more about how the implementation of this feature would work. (Or perhaps you've done that thinking, and you just need to fill in the proposal with that information.) Thanks, Blake.

On 5/29/07, Blake Winton <bwinton@latte.ca> wrote:
Indeed, such half baked ideas have no chance of being taken seriously as language additions. Take a look at packages such as buzhug [1] and (especially) SqlAlchemy [2], not only because they might end up solving your problem but also to appreciate the complexity involved in supporting arbitrary SQL-like expressions at the language level, since you clearly underestimate it or haven't quite thought about it. George [1] http://buzhug.sourceforge.net/ [2] http://www.sqlalchemy.org/docs/ -- "If I have been able to see further, it was only because I stood on the shoulders of million monkeys."

"BJörn Lindqvist" <bjourne@gmail.com> wrote:
Indeed, but when they are called out as half-baked (explicitly or implicitly), and are offered no love by others in the list, perhaps it's time to let them die. On the other hand, there is the other argument that comp.lang.python is the right place to post half-baked ideas, which are then baked, brought to python-ideas for another round of "you didn't put in blueberries", before making it to python-dev/python-3000 . - Josiah

Guys, please don't judge me too harshly on the basis of a quick sketch of an idea. I'm obviously skimming over a lot of ideas in a quick pass. FWIW I've built virtual machines before, and way back I worked at Oracle and at times had to debug their PL/SQL implementation, so it's not like I don't understand what I'm asking for. I am also fully aware of Python's culture and have fought against frivolous language additions myself, although I'm biased like everybody else by own experience, so one person's frivolous is another person's can't-live-without-it. I will respond to the main technical objection--how to scope these--in a separate reply. Thanks. ____________________________________________________________________________________Be a better Globetrotter. Get better travel answers from someone who knows. Yahoo! Answers - Check it out. http://answers.yahoo.com/dir/?link=list&sid=396545469

--- Blake Winton <bwinton@latte.ca> wrote:
My language here was imprecise. I should have said something to the effect of this: The interpreter would translate the "rowexpr" into bytecode that would execute the same way as the "lambda" example given the scope of the method convert_to_euros in the example above. Also, the bytecode would not be the "same" as the lambda; it would just have similar complexity.
I am suggesting that in a rowexpr expression, instead of the typical local scope being the default scope, "row" would be the local scope for dereferencing. Since I am proposing an expression syntax, does not the local scope go away? If you look at where salary, row, and deptno sit in the expression, it seems pretty clear to me that a well-defined scoping scheme would try to dereference row by default for all those tokens, and probably give up there, throwing an exception. The convert_to_euros token in the expression obviously poses a thornier problem. Do you prefer row.convert_to_euros, if such a think exists, or do you go right to what would be the typical scope of a similar lambda expression, and prefer, say, the definition of "def convert_to_euros" that sits at the module level? I would think the latter. The interpreter surely can evaluate such an expression and determine that convert_to_euros is going to be used a callable. Now, whether convert_to_euros is actually callable at run time is another story, but if it isn't, an exception is thrown, and at least you're not debugging somebody else's library to figure out why. Python would just tell you. I don't mean to oversimplify this, which is why I'm batting it around on a forum with a lot of smart people. But clearly it could be done. Whether it *should* be done is, of course, a question for debate, and I accept this. ____________________________________________________________________________________Be a better Globetrotter. Get better travel answers from someone who knows. Yahoo! Answers - Check it out. http://answers.yahoo.com/dir/?link=list&sid=396545469

Steve Howell <showell30@yahoo.com> wrote:
You are proposing to insert a mini-language into Python so as to make certain relational data processing tasks easier. As you say, your particular proposal is pretty much just an alias for lambda where you don't specify the 'row' argument, names are acquired from the passed dictionary, and maybe you get some free enclosing scope goodies.

Honestly, without explicitly stating where the data is coming from, either via row['column'] or row.column, and without explicitly naming the argument to be passed (as in lambda argument: expression), it smells far too magical for Python. Never mind the fact that you will tend to get better performance by not using a lambda/your hacked lambda, as the direct execution will have fewer function calls... That is...

    pred = lambda row: (convert_to_euros(row['salary']) > 50000
                        and row['deptno'] == 10)
    a = [f(i) for i in rows if pred(i)]

will run slower than

    b = [f(i) for i in rows
         if convert_to_euros(i['salary']) > 50000 and i['deptno'] == 10]

While the former *may* be easier to read (depending on who you are), the latter will be faster in all cases. The only way for the pred() version to be faster is if your rowexpr: variant was actually a macro expansion, but we don't do macro expansions in Python, so it will be slower.

So, what do we have? Possible minor clarity win (but certainly not over lambdas), small length reduction over lambdas, slower execution compared to inlining the code (equal to lambda), more syntax. Put me down for -1.

- Josiah
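The performance point can be checked rather than taken on faith. This is a rough sketch using the standard `timeit` module; the row data, the conversion rate, and the function `f` are placeholders invented for the measurement, and the absolute numbers will vary with interpreter version and machine.

```python
import timeit


def convert_to_euros(amount):
    return amount * 0.75  # placeholder rate, not from the thread


def f(row):
    return row["salary"]  # placeholder for whatever is done with each match


# Synthetic rows: 2000 dictionaries, one in twenty belongs to deptno 10.
rows = [{"salary": 1000 * n, "deptno": n % 20} for n in range(2000)]

pred = lambda row: convert_to_euros(row["salary"]) > 50000 and row["deptno"] == 10


def with_predicate():
    # One extra function call per row to reach the test.
    return [f(i) for i in rows if pred(i)]


def inlined():
    # Same test written directly in the comprehension.
    return [f(i) for i in rows
            if convert_to_euros(i["salary"]) > 50000 and i["deptno"] == 10]


if __name__ == "__main__":
    for fn in (with_predicate, inlined):
        seconds = timeit.timeit(fn, number=200)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

The two functions return identical lists; the only difference being measured is the per-row call through `pred`.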

Josiah wrote: http://mail.python.org/pipermail/python-ideas/2007-May/000835.html

Josiah, you are correct about all the following, and I apologize for any incorrect paraphrase:

1) I am proposing to add a mini-language within Python.
2) I want relational data processing tasks to be easier within Python.
3) I want rowexpr to acquire names from the passed first argument (with sensible scoping rules, of course).
4) Point #4 essentially amounts to free enclosing scope goodies.
5) My proposal does indeed smell of magic.
6) I do want the code to be easier to read for me (with the full caveat that YMMV).
7) I want the small clarity win.
8) I think the only way to achieve this particular clarity win is to add syntax.

You may be incorrect about the following, or I may just be understanding your points, but I'm not wrong about any of these, since I'm just characterizing my own feelings:

1) I don't want rowexpr to be pretty much an alias for lambda; I actually want it to be more powerful.
2) I'm not sweating performance here; I'm already using an interpreted language. I don't want Python to be faster for my day-to-day tasks; I want it to be more expressive. I even--and I'll dare say--want the language to be BIGGER.
3) I'm not ignorant of the resistance to macro expansions in Python, but I do think a year-2010 Python interpreter could compile SQL syntax directly into bytecode.

--- Steve Howell <showell30@yahoo.com> wrote:
er, point #3
You may be incorrect about the following, or I may just be understanding your points [...]
er, s/understanding/misunderstanding/ That wasn't a Freudian slip. See careless typing above (where I disrespectfully agree with you ;)

Steve Howell <showell30@yahoo.com> wrote:
Technically speaking, lambda is sufficient for Turing completeness, so the only thing to be gained is a reduction in what you write. As you offer...

    predicate = rowexpr: manipulation(column1) > value1 and \
                         column2 < value2

    predicate = lambda row: manipulation(row['column1']) > value1 and \
                            row['column2'] < value2

Is that reduction worthwhile? I personally don't think so, but my list comprehensions tend to have fairly minimal predicates. One thing to take into consideration is that Guido has previously shot down 'order by' syntax in list comprehensions and generator expressions because he didn't think that they were Pythonic (I seem to remember 'ugly' and 'worthless', but maybe that was my response to them).
That's fine, but realize that Python 3.0 is moving towards a smaller, clearer language; see the dictionary changes, range/xrange, exception handling, etc.
It could, but I would happily bet you $100 that it won't. If you are really intent on getting this, despite the arguments against it (from everyone so far), you could add your own syntax with logix, which will handle arbitrary SQL statement and compile it into Python bytecode (given sufficient information on how to do so). Yes, that's a cop-out, but sometimes the only way people get the language that they want is if they can add syntax at their whim. I've personally found that while I disagree with Guido on very many things related to Python, I'm usually too lazy to bother to add syntax I think I may need when I'm able to (with very minimal effort) write a helper function or two to do basically everything I need in lieu of syntax. - Josiah

--- Josiah Carlson <jcarlson@uci.edu> wrote:
Technically speaking, lambda is sufficient for turing completeness [...]
...as is Perl, or machine code, to pick sort of opposite ends of the spectrum :)
:) I am only expressing my own aesthetics, and I would certainly defer to Guido on most matters aesthetic, since he's written an aesthetically beautiful language. But having said that, I don't want my proposal automatically lumped in with every proposal that Guido has found unaesthetic, or rejected, and I believe he has even been known to change his mind from time to time. For all the real-world warts of SQL, I think SQL is a very aesthetically pleasing way to express transformations of relational data structures, and Python contains relational data structures, and therefore I think Python can benefit from using SQL as just one way of expressing relational transformations (and I'm still a little bit TIMTOWTDI from my Perl days, I fully admit). I fully concede all the obvious objections--more syntax, more ways to do it, difficulty of implementing it within the VM, ability of people to already manipulate relational data structures more cleanly than me in Python, etc. I'm not asking for this in Fall 2007, BTW, I'm expressing this as a vision for a bigger, better Python, maybe year 2010, even though smaller is usually better. And syntactically, I am only extending the language by one keyword, or one new way of triple-quoting. For my own use, native SQL would benefit the clarity of my (already working, but sometimes ugly) code more than some other additions proposed in Py3k, but YMMV.

On Tue, May 29, 2007, Steve Howell wrote:
Aside from the standard featuritis objection, my objection stems almost entirely from the difficulty of defining appropriate data structures on which to operate. SQL works partly because data in SQL tables is already *by definition* in a relational format -- which won't be true in Python, causing all kinds of runtime errors that IMO are inappropriate for SQL. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha

--- Aahz <aahz@pythoncraft.com> wrote:
In theory, I obviously agree, as a list can be full of all kinds of heterogeneous structures, and the fact that it can be is one of the beauties of Python, and certainly trying to apply native, byte-code interpreted SQL to such structures would certainly lead to run-time errors that even unit tests couldn't even catch in theory, much less practice. But, in my own day-to-day practice (I've recently been working on a billing system, which is not rocket science, just tedious), I find myself constantly calling into my DB API, which returns me a list of dictionaries, and the transformation from database to network to API to Python doesn't diminish the relational perfectness of the data whatsoever. Then I find myself transforming the SQL result set in many ways in Python. Some of those transformations are non-relational, which is the whole reason to bring the data into Python in the first place. But other transforms are relational, and that's where I want SQL. Which raises the natural question--to the extent that I want to do more relational transforms on the data that I already have, why don't I just farm that back out to my relational database? The two-part answer is that 1) of course I can, but 2) I don't want to, because I already have the data in Python. Answer #3 is Peoplesoft. If you've never worked with a really awkward database structure in the real world, please count yourself lucky, and I'll buy you drinks at your 30th birthday party. Do you understand at least my motivation, if not necessarily agreeing with the wisdom of my overall proposal?

On Tue, May 29, 2007, Steve Howell wrote:
Do you understand at least my motivation, if not necessarily agreeing with the wisdom of my overall proposal?
Sort of. I still think that your proposal deserves to be shot down for the same reasons that including regexes as part of the language should be shot down. I'm slightly less opposed to Talin's DSL idea, though. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha

--- Aahz <aahz@pythoncraft.com> wrote:
The more I think about the general themes in this discussion, the more I think PyPy is gonna be the proving ground for those kinds of ideas.

--- Josiah Carlson <jcarlson@uci.edu> wrote:
It could, but I would happily bet you $100 that it won't.
Maybe not by 2010, but I'll make a gentlemen's bet that native SQL is in Python by 2015.
Ok, the logix approach to introducing new syntax is good to know. I thought my pain would at least resonate with a few people, but it obviously has not so far.
As I said in another reply, I have working code, so I don't NEED the syntax, just want it, think it's a good idea, and I hope I've picked the correct forum in expressing a sort of wouldn't-it-be-great-if-Python-did-this suggestion. My progression in programming has always been to think that the current paradigm was brilliant (even in my C++ days!), and then only to discover there were even better paradigms.

--- Josiah Carlson <jcarlson@uci.edu> wrote:
[...] you could add your own syntax with logix [...]
Or PyPy. Despite our disagreement, you are helping me to expand my thinking. Thank you.

Steve Howell wrote:
My objection to this idea is that it's going in the opposite direction to what I'd like to see. In my opinion, having to embed one programming language inside another leads to ugly code. I would rather have an elegant way to deal with relational databases by writing Python code *instead* of SQL, than yet another mechanism for embedding SQL in Python.

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg.ewing@canterbury.ac.nz
"Carpe post meridiem!" (I'm not a morning person.)

--- Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
You're wanting to solve a slightly different problem than I was, but can you elaborate on this a bit? How do you currently interact with relational databases? Are you objecting to some ugliness of SQL itself, or do you want a more powerful abstraction? To the extent that I have to work with awkward legacy database structures, I find that my strategy is usually this: 1) Write minimal SQL to get most of the data that I need up front. 2) Manipulate the data many ways in Python. 3) Write minimal SQL to put data back in the database.

Steve Howell wrote:
You're wanting to solve a slightly different problem than I was, but can you elaborate on this a bit?
I know, it's more or less the complementary problem to what you're talking about. My point is that I don't like embedding SQL in my Python even when I'm dealing with a relational database, so I'm even less inclined to do so when dealing with Python data structures. I find that the existing features of the Python language do that well enough already.
Are you objecting to some ugliness of SQL itself, or do you want a more powerful abstraction?
Some of both. There's the general awkwardness involved whenever one language is embedded in another, plus the fact that I don't particularly like some aspects of SQL in particular. But often I also want a more powerful abstraction. The SQL that I need at a given point in the program isn't always fixed, and I need to generate it dynamically. As an example, in a recent project I needed to extract data from a number of tables to produce a report. The user has a variety of choices as to which fields are included in the report, and can optionally specify selection criteria for various fields, either a single value or a range of values. To accommodate this efficiently, I have to dynamically generate an SQL statement which includes or excludes 'where' clause terms of various sorts, and joins as necessary to pull in requested data from auxiliary tables. In this kind of application, there are few or no complete SQL statements written into the source, only fragments that get combined by an SQL-generation framework of some kind. So a facility like the one you suggest wouldn't help with this kind of problem. -- Greg
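The report scenario described above, where 'where' clause terms are included or left out depending on what the user asked for, might look roughly like this. It is only an illustrative sketch: `build_report_query`, the `employees` table, and its columns are hypothetical, joins are left out, and the `?` placeholders assume the qmark parameter style used by the standard `sqlite3` module.

```python
import sqlite3


def build_report_query(table, fields, criteria):
    """Return (sql, params) for a report query.

    `criteria` maps a column name to either a single value or a
    (low, high) tuple meaning an inclusive range. Table and column
    names are interpolated directly, so they must come from a trusted
    list, never from raw user input.
    """
    where, params = [], []
    for column, wanted in criteria.items():
        if isinstance(wanted, tuple):
            where.append(f"{column} BETWEEN ? AND ?")
            params.extend(wanted)
        else:
            where.append(f"{column} = ?")
            params.append(wanted)
    sql = f"SELECT {', '.join(fields)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# Small in-memory database to exercise the generated statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER, deptno INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ann", 60000, 10), ("bob", 40000, 10), ("cy", 70000, 20)],
)

sql, params = build_report_query(
    "employees", ["name", "salary"], {"deptno": 10, "salary": (50000, 90000)}
)
result = conn.execute(sql, params).fetchall()
```

Nothing resembling a complete SQL statement appears in the source here, only fragments, which is why a fixed rowexpr-style expression would not help with this kind of code.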

--- Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I can definitely relate to that sort of pain.

Blake Winton wrote:
Sticking to your logic, why would it be > instead of row['>']? This is, I guess, particularly simple with Python's tokenizer because neither 50000 nor and can be valid identifiers.
That's a harder one, indeed. Perhaps, nothing should be dereferenced (at ``compile'' time, don't know if that's a valid term in CPython's interpreter chain tho), as it is in a lambda-expression.
ACK. If included (and I have major doubts it has a chance over 0.001%), it could just support "easy" expressions, such as lambdas do today.
Well, discussing in a mature manner can IMO never be a waste of time but I have to agree, I would not want this as a keyword either. A common idiom (aka stdlib module) *could* be handy, though. Regards, Stargaming

--- Stargaming <stargaming@gmail.com> wrote:
Admitting to not fully understanding the Python innards, here's some food for thought on how this would be done:

1) Not to diminish the parsing-into-AST stage, but I won't cover that here, as other problems are trickier. Despite pretty different semantics, I think rowexpr would look a lot like lambda at the syntactic level.
2) You'd add a case to compiler_visit_expr for Rowexpr_kind in compile.c.
3) There's a pretty small method in compile.c called compiler_lambda that would be the model for compiler_rowexpr. The compiler_rowexpr would generate an opcode that caused the single (implied, C++-"this"-like) arg to be popped, and possibly that opcode would have to be different than a normal opcode to pop one argument off the stack, so that ceval.c can do the right thing, but I'm not so sure.
4) You might need to add something to compiler_unit that is analogous to u_varnames, u_cellvars, u_freevars, etc., so that when you recursively call down to evaluate the expression in the rowexpr, the lower methods know to look in a different place to scope the words "salary," "convert_to_euros," and "dept" in the example expression 'where convert_to_euros(salary) > 50000 and dept = "software development"'.
5) The behavior of compiler_expr would NOT change for most operators, such as Num_kind, Str_kind, Attribute_kind, Subscript_kind, etc., as rowexpr doesn't change how any of those are interpreted compared to an expression in other scopes.
6) Compiler_expr eventually gets to building opcodes that give you salary, convert_to_euros, and dept, and those pieces of code need to look into compiler_unit to determine if they're part of the rowexpr, and do the right thing.
7) I'm sure there's more. I'm here to learn.

--- Stargaming <stargaming@gmail.com> wrote:
Just tried a few examples, almost everything interesting happens at runtime:

======== PYTHON 2.3

Valid Python:

    salary = 2
    print (lambda: salary)() # prints 2

---
Valid, but useless, Python (no errors)

    lambda: salary

---
Run-time error:

    x = lambda: salary
    x()

    NameError: global name 'salary' is not defined

---
Run-time error:

    x = lambda: convert_to_euros(salary)
    x()

    NameError: global name 'convert_to_euros' is not defined

---
Run-time error:

    def convert_to_euros(amt):
        return amt.impossible
    salary = None
    x = lambda: convert_to_euros(salary)
    x()

    AttributeError: 'NoneType' object has no attribute 'impossible'

---
Run-time error:

    def convert_to_euros(amt):
        raise 'this does not get here'
    row = None
    x = lambda row: convert_to_euros(row['salary'])
    x(row)

    TypeError: 'NoneType' object is unsubscriptable

======= THEORETICAL

Valid Python:

    row = {'salary': 50000}
    def convert_to_euros(amt):
        return amt / 2
    x = rowexpr: convert_to_euros(salary)
    print x(row) # prints 25000

---
Run-time error:

    def convert_to_euros(amt):
        raise 'this does not get here'
    row = None
    x = rowexpr: convert_to_euros(salary)
    x(row)

    TypeError: 'NoneType' object is unsubscriptable

---

--- Steven Bethard <steven.bethard@gmail.com> wrote:
Thanks. I tried it out with a slightly more involved expression.

    x = rowexpr(
        'convert_to_euros(salary) '
        'if dept == "foo" else 0')
    rows = [dict(dept='foo', salary=i) for i in range(10000)]
    transform = [x(row) for row in rows]

Here were my findings:

1) Not horribly slow. 3.4 seconds on my box for 10000 calls
2) I introduced a syntax error, and it was more clear than I thought it would be. It happens at runtime, of course, which is less than ideal, and it doesn't directly point me out to the line of code with the syntax error, but it does suggest the error:

      File "sql.py", line 14, in <module>
        transform = [x(row) for row in rows]
      File "sql.py", line 5, in evaluate
        exec '__result__ = ' + expr in d
      File "<string>", line 1
        __result__ = convert_to_euros(salary) if dept = "foo" else 0
                                                      ^
    SyntaxError: invalid syntax

On 5/30/07, Steve Howell <showell30@yahoo.com> wrote:
You can get the syntax error a little earlier (at the time of the rowexpr() call) by using compile()::
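The snippet that followed this sentence is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of the idea in Python 3, not the code originally posted: `rowexpr` and its `scope` parameter are hypothetical, the row dictionary is assumed to serve as the local namespace, and `eval` replaces the Python 2 `exec` statement seen in the traceback above.

```python
def rowexpr(expr, scope=None):
    """Hypothetical helper: build a one-argument function from `expr`."""
    # compile() parses the text immediately, so a malformed expression
    # raises SyntaxError here rather than on the first call per row.
    code = compile(expr, "<rowexpr>", "eval")
    env = dict(scope or {})  # names such as convert_to_euros live here

    def evaluate(row):
        # The row dictionary acts as the local namespace for column names.
        return eval(code, env, row)

    return evaluate


def convert_to_euros(amount):
    return amount / 2  # placeholder conversion


x = rowexpr(
    'convert_to_euros(salary) if dept == "foo" else 0',
    {"convert_to_euros": convert_to_euros},
)
rows = [dict(dept="foo", salary=i) for i in range(10)]
transform = [x(row) for row in rows]

# The same typo as in the traceback above (= instead of ==) now fails
# at construction time, before any row is touched.
caught = None
try:
    rowexpr('convert_to_euros(salary) if dept = "foo" else 0')
except SyntaxError as exc:
    caught = exc
```

Compiling once up front also avoids re-parsing the expression text on every call.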
STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy
participants (9)
- Aahz
- BJörn Lindqvist
- Blake Winton
- George Sakkis
- Greg Ewing
- Josiah Carlson
- Stargaming
- Steve Howell
- Steven Bethard