[Python-ideas] Proposal: Query language extension to Python (PythonQL)
Kyle Lahnakoski
klahnakoski at mozilla.com
Sat Mar 25 11:40:40 EDT 2017
Pavel,
I like PythonQL. I perform a lot of data transformation, and often find
Python's list comprehensions too limiting; leaving me wishing for
LINQ-like language features.
As an alternative to extending Python with PythonQL, Terry Reedy
suggested interpreting a DSL string, and Pavel Velikhov alluded to using
magic method tricks found in ORM libraries. I can see how both these are
not satisfactory.
A third alternative could be to encode the query clauses as JSON
objects. For example:
result = [ select (x, sum_y)
for x in range(1,8),
y in range(1,7)
where x % 2 == 0 and y % 2 != 0 and x > y
group by x
let sum_y = sum(y)
where sum_y % 2 != 0
]
result = pq([
{"select":["x", "sum_y"]},
{"for":{"x": range(1,8), "y": range(1,7)}},
{"where": lambda x,y: x % 2 == 0 and y % 2 != 0 and x > y},
{"groupby": "x"},
{"with":{"sum_y":{"SUM":"y"}},
{"where": {"neq":[{"mod":["sum_y", 2]}, 0]}}
])
This representation does look a little lispy, and it may resemble
PythonQL's parse tree. I think the benefits are:
1) no python language change
2) easier to parse
3) better than string-based DSL for catching syntax errors
4) {"clause": parameters} format is flexible for handling common query
patterns **
5) works in javascript too
6) easy to compose with automation (my favorite)
It is probably easy for you to see the drawbacks.
** The `where` clause can accept a native lambda function, or an
expression tree
"If you are writing a loop, you are doing it wrong!" :)
On 2017-03-24 11:10, Pavel Velikhov wrote:
> Hi folks!
>
> We started a project to extend Python with a full-blown query
> language about a year ago. The project is call PythonQL, the links are
> given below in the references section. We have implemented what is
> kind of an alpha version now, and gained some experience and insights
> about why and where this is really useful. So I’d like to share those
> with you and gather some opinions whether you think we should try to
> include these extensions in the Python core.
>
> *Intro*
>
> What we have done is (mostly) extended Python’s comprehensions with
> group by, order by, let and window clauses, which can come in any
> order, thus comprehensions become a query language a bit cleaner and
> more powerful than SQL. And we added a couple small convenience
> extensions, like a We have identified three top motivations for folks
> to use these extensions:
>
> *Our Motivations*
>
> 1. This can become a standard for running queries against database
> systems. Instead of learning a large number of different SQL dialects
> (the pain point here are libraries of functions and operators that are
> different for each vendor), the Python developer needs only to learn
> PythonQL and he can query any SQL and NoSQL database.
>
> 2. A single PythonQL expression can integrate a number of
> databases/files/memory structures seamlessly, with the PythonQL
> optimizer figuring out which pieces of plans to ship to which
> databases. This is a cool virtual database integration story that can
> be very convenient, especially now, when a lot of data scientists use
> Python to wrangle the data all day long.
>
> 3. Querying data structures inside Python with the full power of SQL
> (and a bit more) is also really convenient on its own. Usually folks
> that are well-versed in SQL have to resort to completely different
> means when they need to run a query in Python on top of some data
> structures.
>
> *Current Status*
>
> We have PythonQL running, its installed via pip and an encoding hack,
> that runs our preprocessor. We currently compile PythonQL into Python
> using our executor functions and execute Python subexpressions via
> eval. We don’t do any optimization / rewriting of queries into
> languages of underlying systems. And the query processor is basic too,
> with naive implementations of operators. But we’ve build DBMS systems
> before, so if there is a good amount of support for this project,
> we’ll be able to build a real system here.
>
> *Your take on this*
>
> Extending Python’s grammar is surely a painful thing for the
> community. We’re now convinced that it is well worth it, because of
> all the wonderful functionality and convenience this extension offers.
> We’d like to get your feedback on this and maybe you’ll suggest some
> next steps for us.
>
> *References*
>
> PythonQL GitHub page: https://github.com/pythonql/pythonql
> PythonQL Intro and Tutorial (this is all User Documentation we have
> right
> now): https://github.com/pythonql/pythonql/wiki/PythonQL-Intro-and-Tutorial
> A use-case of querying Event Logs and doing Process Mining with
> PythonQL: https://github.com/pythonql/pythonql/wiki/Event-Log-Querying-and-Process-Mining-with-PythonQL
> PythonQL demo site: www.pythonql.org <http://www.pythonql.org>
>
> Best regards,
> PythonQL Team
>
>
>
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20170325/6703e2c0/attachment-0001.html>
More information about the Python-ideas
mailing list