DB-API support for sets?

The DB-API doesn't support sets directly, so you wind up having to manually expand them:
curs.execute("select count(*) from sometable where somecol in (11638, 4126)") <sqlite3.Cursor object at 0x7f8ef5f6c570>
In the example above, I'm able to skip input validation because the element values in the set are ints, but programmers being the lazy sort that they are, when such values are strings there's probably also the tendency to skip argument escaping. A quick check on bugs.python.org as well as Google yielded no obvious earlier discussion. Is it simply that underlying database drivers are unable to support this or was DB-API 2.0 frozen (no pun intended) before sets were available or in wide use in Python? Set support and the SQL "in" operator would seem to be handy, both as a convenience and as a barrier to SQL injection errors. Skip

Placeholders may be substituted by server. So we can't force to allow multiple values in one placeholder. (For example, MySQL allow only single value for one ?, while MySQLdb doesn't use MySQL's prepared statement at the moment.) DB-API is low level interface. Higher level library (like SQLAlchemy) provides better API for building query. Regards, -- Inada Naoki <songofacandy@gmail.com>

On Wed, May 22, 2019, at 09:58, Inada Naoki wrote:
Placeholders may be substituted by server. So we can't force to allow multiple values in one placeholder.
Can the placeholder be substituted for multiple placeholders by the client side, which are then further individually substituted by the server? Like, the pattern I would consider correct for this with today's api is: myset = { ... } cur.execute(f"... x in ({','.join('?'*len(myset))}) ...", (..., *myset, ...)) [this may not behave correctly on some databases when the set is empty, but on many databases (null) will not match any values including null, and therefore could be substituted instead] The main sticking point for this sort of thing in other languages is that DB layers often have a prepare method, and prepared queries as first-class objects, which often does not allow the actual query text executed to change based on the values of the parameters. But no such concept exists in DB-API (executemany does say it prepares once, but that is explicitly an implementation detail).

Hi Skip, On 22.05.2019 15:46, Skip Montanaro wrote:
The DB-API provides freedom to drivers to support many different data types, so it's really up to the driver whether can deal with sets, lists, generators, complex numbers, geo data objects, etc. etc. in the data and how they map these to the binding variables. This is intentional to allow database drivers to expose as much functionality support by the database backend as they can. In your use case, the above would be a request to raise with the sqlite3 driver maintainers. It certainly sounds like a useful addition, if SQLite's API has support for binding variables to IN clauses and can accept arrays with values in the API. In many cases, it should be possible to convert sets or lists to comma separated quoted value strings and then pass these to the database. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, May 22 2019)
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

On Wed, May 22, 2019 at 11:47 PM Skip Montanaro <skip.montanaro@gmail.com> wrote:
I don't know whether sqlite3 supports this kind of operation. PostgreSQL/psycopg2 does, in a slightly different form:
With current psycopg2 builds, you have to convert the set into a list. It would be a relatively simple feature request (I hope!) to add support for sets the same way. Are you able to do an "any" operation with sqlite3? That might be easier to represent (PostgreSQL sees this as an ARRAY value, which can be compared in this way). ChrisA

Placeholders may be substituted by server. So we can't force to allow multiple values in one placeholder. (For example, MySQL allow only single value for one ?, while MySQLdb doesn't use MySQL's prepared statement at the moment.) DB-API is low level interface. Higher level library (like SQLAlchemy) provides better API for building query. Regards, -- Inada Naoki <songofacandy@gmail.com>

On Wed, May 22, 2019, at 09:58, Inada Naoki wrote:
Placeholders may be substituted by server. So we can't force to allow multiple values in one placeholder.
Can the placeholder be substituted for multiple placeholders by the client side, which are then further individually substituted by the server? Like, the pattern I would consider correct for this with today's api is: myset = { ... } cur.execute(f"... x in ({','.join('?'*len(myset))}) ...", (..., *myset, ...)) [this may not behave correctly on some databases when the set is empty, but on many databases (null) will not match any values including null, and therefore could be substituted instead] The main sticking point for this sort of thing in other languages is that DB layers often have a prepare method, and prepared queries as first-class objects, which often does not allow the actual query text executed to change based on the values of the parameters. But no such concept exists in DB-API (executemany does say it prepares once, but that is explicitly an implementation detail).

Hi Skip, On 22.05.2019 15:46, Skip Montanaro wrote:
The DB-API provides freedom to drivers to support many different data types, so it's really up to the driver whether can deal with sets, lists, generators, complex numbers, geo data objects, etc. etc. in the data and how they map these to the binding variables. This is intentional to allow database drivers to expose as much functionality support by the database backend as they can. In your use case, the above would be a request to raise with the sqlite3 driver maintainers. It certainly sounds like a useful addition, if SQLite's API has support for binding variables to IN clauses and can accept arrays with values in the API. In many cases, it should be possible to convert sets or lists to comma separated quoted value strings and then pass these to the database. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, May 22 2019)
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

On Wed, May 22, 2019 at 11:47 PM Skip Montanaro <skip.montanaro@gmail.com> wrote:
I don't know whether sqlite3 supports this kind of operation. PostgreSQL/psycopg2 does, in a slightly different form:
With current psycopg2 builds, you have to convert the set into a list. It would be a relatively simple feature request (I hope!) to add support for sets the same way. Are you able to do an "any" operation with sqlite3? That might be easier to represent (PostgreSQL sees this as an ARRAY value, which can be compared in this way). ChrisA
participants (5)
-
Chris Angelico
-
Inada Naoki
-
M.-A. Lemburg
-
Random832
-
Skip Montanaro