Query Language extension to Python
Hi Folks,
We have released PythonQL, a query language extension to Python (we have extended Python’s comprehensions with a full-fledged query language, drawing from the useful features of SQL, XQuery and JSONiq). Take a look at the project here: http://www.pythonql.org and let us know what you think!
The way PythonQL currently works is that you mark PythonQL files with a special encoding, and the system runs a preprocessor for all such files. We have an interactive interpreter and Jupyter support planned.
Best regards!
PythonQL team
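For readers unfamiliar with the "special encoding" trick, here is a minimal sketch of how a source-rewriting step can be attached to an encoding name in plain Python. It is not PythonQL's actual implementation: the codec name `demo_ql`, the `SHOUT(` keyword and the `rewrite_source` helper are all made up for illustration, and a hook that works for real source files would need additional plumbing that is omitted here.

```python
import codecs


def rewrite_source(text: str) -> str:
    """Stand-in preprocessor; a real one would parse and translate query syntax."""
    # Hypothetical rule, for illustration only: expand a made-up keyword.
    return text.replace("SHOUT(", "str.upper(")


def _decode(data, errors="strict"):
    # Decode the raw bytes as UTF-8 first, then rewrite the resulting text.
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return rewrite_source(text), consumed


def _search(name: str):
    # Codec search functions receive the normalized (lowercase) encoding name.
    if name != "demo_ql":
        return None
    utf8 = codecs.lookup("utf-8")
    return codecs.CodecInfo(
        name="demo_ql",
        encode=utf8.encode,
        decode=_decode,
        incrementalencoder=utf8.incrementalencoder,
        streamwriter=utf8.streamwriter,
    )


# Make the hypothetical "demo_ql" encoding known to the codec machinery.
codecs.register(_search)

raw = b'greeting = SHOUT("hello")\n'
translated = raw.decode("demo_ql")  # the preprocessor runs during decoding

namespace = {}
exec(translated, namespace)  # the rewritten text is ordinary Python
```

The point of the design is that the rewritten text is ordinary Python by the time the interpreter compiles it, so no changes to the interpreter itself are required.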
On 1 November 2016 at 08:33, Pavel Velikhov wrote: We have released PythonQL, a query language extension to Python [...]
Nice! Paul
Cool!
https://github.com/pythonql/pythonql/wiki/PythonQL-Intro-and-Tutorial
How do I determine how much computation is pushed to the data? (Instead of pulling all the data and running the computation with one local node) ...
https://en.wikipedia.org/wiki/Bulk_synchronous_parallel (MapReduce,)
- http://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
- http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql_query.html
- https://github.com/yhat/pandasql/
- http://docs.ibis-project.org/sql.html#common-column-expressions
- https://github.com/cloudera/ibis/blob/master/ibis/sql/alchemy.py
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Hi Wes!
Right now we don’t push anything; we fetch everything into the Python runtime. But going forward, the current idea is to push as much computation to the database as possible (most of the time the database will do a better job than our engine).
If we run on top of PySpark/Hadoop, I think we should be able to translate 100% of PythonQL into these jobs.
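To make the "fetch everything" versus "push down" distinction concrete, here is a small sketch using the standard-library `sqlite3` module. The `sales` table, its rows and both function names are invented for the example; it does not show how PythonQL talks to a database, only the two strategies being discussed.

```python
import sqlite3

# In-memory database with a made-up table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("west", 25), ("east", 40), ("north", 5)],
)


def total_fetch_everything(conn, region):
    """Pull every row into Python, then filter and aggregate locally."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    return sum(amount for row_region, amount in rows if row_region == region)


def total_pushed_down(conn, region):
    """Let the database filter and aggregate; only one value comes back."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    return total


east_local = total_fetch_everything(conn, "east")
east_pushed = total_pushed_down(conn, "east")
```

Both functions return the same total, but the first transfers every row to the client while the second transfers a single number, which is where a database can be expected to do better than a client-side engine on large tables.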
On 1 Nov 2016, at 19:42, Wes Turner wrote: [...] How do I determine how much computation is pushed to the data? (Instead of pulling all the data and running the computation with one local node) [...]
On Tuesday, November 1, 2016, Pavel Velikhov wrote: [...] If we run on top of PySpark/Hadoop I think we should be able to completely translate 100% of PythonQL into these jobs.
That would be great; and fast.
A few more links that may be of use (in addition to ops._ in alchemy.py):
- https://github.com/pythonql/pythonql/blob/master/Grammar.md#query-expression...
- https://github.com/cloudera/ibis/blob/master/ibis/expr/window.py
- http://www.ibis-project.org/faq.html#ibis-and-spark-pyspark
- https://github.com/cloudera/ibis/tree/master/ibis/spark
- http://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-...
- https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apac...
- http://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-...
How do you see this as different from Blaze (http://blaze.readthedocs.io/en/latest/index.html)?
Hi David!
I haven’t used Blaze, but it looks quite similar to pandas, at least conceptually. Thanks for the reference!
The big difference with PythonQL is that we actually extend the syntax of Python with a few constructs that are typically used in query languages (group by, order by, window, let clause). The language extension is quite small and easy to grasp, but it’s very powerful: you can use it to formulate pretty complex queries in a rather simple way.
So traditionally, a query language (PythonQL) is good at expressing complex things easily, but then you need a lot of work from the optimizer and the database to turn the query into an efficient plan. A library like Blaze or pandas is more of an “algebra”: it’s really a plan specification. It will usually take much longer to memorize all the operators and ways of doing things in such a library, and typically you have to go back to the documentation to do things that differ slightly from what you usually do.
Oh yeah, so far our execution engine is pretty simple and not too efficient, but we plan to fix this in the future and be at least comparable to pandas in performance (we need to look at what’s under the hood in Blaze).
Of course this is my take (although I heard a few similar things from our early users). It would be interesting to see how other folks compare the two approaches.
Btw., we have built a library for working with pandas DataFrames; we could do it for Blaze too, I suppose.
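As a rough illustration of the bookkeeping that group by, let and order by clauses are meant to absorb, here is the same kind of query written in plain Python. This is deliberately not PythonQL syntax; the `orders` data and the `totals_by_customer` function are hypothetical, and the comments only mark which query-language clause each step corresponds to.

```python
from collections import defaultdict

# Made-up input rows: (customer, amount).
orders = [("ann", 30), ("bob", 15), ("ann", 20), ("cy", 45), ("bob", 10)]


def totals_by_customer(rows):
    """Total amount per customer, largest total first."""
    # Corresponds to "group by customer": collect amounts per key by hand.
    groups = defaultdict(list)
    for customer, amount in rows:
        groups[customer].append(amount)

    # Corresponds to "let total = sum(amounts)": one derived value per group.
    totals = [(customer, sum(amounts)) for customer, amounts in groups.items()]

    # Corresponds to "order by total descending".
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


ranking = totals_by_customer(orders)
```

Each commented step is an explicit operation the programmer has to choose and sequence, which is the "plan specification" flavour described above; a query language would let the same intent be stated declaratively and leave the sequencing to the engine.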
On 1 Nov 2016, at 21:17, David Mertz wrote: How do you see this as different from Blaze (http://blaze.readthedocs.io/en/latest/index.html)? [...]
participants (4)
- David Mertz
- Paul Moore
- Pavel Velikhov
- Wes Turner