[Python-ideas] Proposal: Query language extension to Python (PythonQL)

Mark E. Haase mehaase at gmail.com
Sat Mar 25 12:54:08 EDT 2017


Hi Pavel,

This is a really impressive body of work. I had looked at this project in
the past but it is great to get back up to speed and see all the progress
made.

I use Python + databases almost every day, and the major unanswered
question is: what benefit does dedicated language syntax have over using a
DBAL/ORM with a builder-style API? It obviously has huge costs (as all
syntax changes do), but the benefit is not obvious to me: I have never found
myself wanting built-in syntax for writing database queries.
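
As a point of reference, a builder-style API of the kind mentioned above
might look roughly like the sketch below. The Select class and its where,
order_by and build methods are hypothetical names invented for this
illustration; they are not the API of SQLAlchemy or of any other real
library.

```python
class Select:
    """Hypothetical minimal query builder, for illustration only."""

    def __init__(self, table):
        self._table = table
        self._conditions = []
        self._params = []
        self._order = None

    def where(self, column, op, value):
        # Keep the value out of the SQL text; it is passed as a parameter.
        self._conditions.append(f"{column} {op} ?")
        self._params.append(value)
        return self  # returning self is what makes chaining possible

    def order_by(self, column):
        self._order = column
        return self

    def build(self):
        sql = f"SELECT * FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        if self._order:
            sql += f" ORDER BY {self._order}"
        return sql, tuple(self._params)


sql, params = (
    Select("users")
    .where("age", ">", 30)
    .where("city", "=", "Oslo")
    .order_by("name")
    .build()
)
print(sql)
print(params)
```

The query is assembled with ordinary method calls, so no grammar change is
involved; that is the baseline any dedicated syntax would have to beat.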

My second thought is that every database layer I've ever used was
unavoidably leaky or incomplete. Database functionality (even if we
constrain "database" to mean RDBMS) is too diverse to be completely
abstracted away. This is why so many different abstractions already exist,
e.g. low-level like DBAPI and high-level like SQLAlchemy. You're not going
to find much support for cementing an imperfect abstraction right into the
Python grammar. In order to make the abstraction relatively complete, you'd
need to almost completely merge the ANSI SQL grammar into the Python
grammar, which sounds terrifying.

Third thought: is the implementation of a "Python query language" as
generic as the name implies? The docs mention support for document
databases, but can I run Redis queries? LDAP queries? DNS queries?

> We haven't built a real SQL Database wrapper yet, but in the meantime
> you can use libraries like psycopg2 or SQLAlchemy to get data from the
> database into an iterator, and then PythonQL can run on top of such
> an iterator.

Fourth thought: until PythonQL can abstract over a real database, it's far
too early to consider putting it into the language itself. These kinds of
"big change" projects typically need to stabilize on their own for a long
time before anybody will even consider putting them into the core language.
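
The workaround quoted above (pull rows out through a driver, then query the
resulting iterator in Python) can be sketched with the standard library
alone. This is an assumption-laden illustration: an in-memory SQLite
database stands in for a real server, the orders table and its rows are
invented, and plain itertools code stands in for PythonQL itself.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# In-memory SQLite stands in for a real database; table and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 7.5), ("bob", 1.5)],
)

# A DBAPI cursor is itself an iterator over row tuples.
rows = conn.execute("SELECT customer, amount FROM orders")

# From here on, all work happens in the Python process, not in the database:
# the whole table is fetched, sorted and grouped client-side.
by_customer = itemgetter(0)
totals = {
    customer: sum(amount for _, amount in group)
    for customer, group in groupby(sorted(rows, key=by_customer), key=by_customer)
}
print(totals)
conn.close()
```

Note that the database only performs a full scan here; none of the grouping
is pushed down to it, which is the gap between running on top of an
iterator and abstracting over a real database.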

Finally, to end on a positive note, the coolest part of this project from
my point of view is using SQL as an abstraction over in-memory objects or
raw files. I can see how somebody who is comfortable with SQL would prefer
this declarative approach. I could see myself using an API like this to
search a Pandas dataframe, for example.
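
To make the "SQL over in-memory objects" idea concrete, here is a rough
sketch that is possible today with only the standard library: copy the
objects into an in-memory SQLite table and query them declaratively. This
is not how PythonQL works; the Employee class and the sample data are
invented for the example.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Employee:  # hypothetical record type, for illustration only
    name: str
    dept: str
    salary: int


staff = [
    Employee("Ann", "eng", 120),
    Employee("Bo", "eng", 100),
    Employee("Cy", "ops", 90),
]

# Load the in-memory objects into a throwaway SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", map(astuple, staff))

# The query states what is wanted; SQLite decides how to compute it.
result = conn.execute(
    "SELECT dept, AVG(salary) FROM staff GROUP BY dept ORDER BY dept"
).fetchall()
print(result)
conn.close()
```

The cost of this approach is the copy into SQLite and the loss of the
original object identity, which is the kind of friction a query layer that
works directly on Python data structures would remove.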

Cheers,
Mark

On Fri, Mar 24, 2017 at 11:10 AM, Pavel Velikhov <pavel.velikhov at gmail.com>
wrote:

> Hi folks!
>
>   We started a project to extend Python with a full-blown query language
> about a year ago. The project is called PythonQL, the links are given below
> in the references section. We have implemented what is kind of an alpha
> version now, and gained some experience and insights about why and where
> this is really useful. So I’d like to share those with you and gather some
> opinions whether you think we should try to include these extensions in the
> Python core.
>
> *Intro*
>
>   What we have done is (mostly) extended Python’s comprehensions with
> group by, order by, let and window clauses, which can come in any order,
> thus comprehensions become a query language a bit cleaner and more powerful
> than SQL. And we added a couple small convenience extensions, like a  We
> have identified three top motivations for folks to use these extensions:
>
> *Our Motivations*
>
> 1. This can become a standard for running queries against database
> systems. Instead of learning a large number of different SQL dialects (the
> pain points here are libraries of functions and operators that are different
> for each vendor), the Python developer needs only to learn PythonQL and he
> can query any SQL and NoSQL database.
>
> 2. A single PythonQL expression can integrate a number of
> databases/files/memory structures seamlessly, with the PythonQL optimizer
> figuring out which pieces of plans to ship to which databases. This is a
> cool virtual database integration story that can be very convenient,
> especially now, when a lot of data scientists use Python to wrangle the
> data all day long.
>
> 3. Querying data structures inside Python with the full power of SQL (and
> a bit more) is also really convenient on its own. Usually folks that are
> well-versed in SQL have to resort to completely different means when they
> need to run a query in Python on top of some data structures.
>
> *Current Status*
>
> We have PythonQL running; it's installed via pip and an encoding hack that
> runs our preprocessor. We currently compile PythonQL into Python using our
> executor functions and execute Python subexpressions via eval. We don’t do
> any optimization / rewriting of queries into languages of underlying
> systems. And the query processor is basic too, with naive implementations
> of operators. But we’ve built DBMS systems before, so if there is a good
> amount of support for this project, we’ll be able to build a real system
> here.
>
> *Your take on this*
>
> Extending Python’s grammar is surely a painful thing for the community.
> We’re now convinced that it is well worth it, because of all the wonderful
> functionality and convenience this extension offers. We’d like to get your
> feedback on this and maybe you’ll suggest some next steps for us.
>
> *References*
>
> PythonQL GitHub page: https://github.com/pythonql/pythonql
> PythonQL Intro and Tutorial (this is all User Documentation we have right
> now): https://github.com/pythonql/pythonql/wiki/PythonQL-Intro-and-Tutorial
> A use-case of querying Event Logs and doing Process Mining with PythonQL:
> https://github.com/pythonql/pythonql/wiki/Event-Log-Querying-and-Process-Mining-with-PythonQL
> PythonQL demo site: www.pythonql.org
>
> Best regards,
> PythonQL Team