[DB-SIG] Test suite update for format conversion accuracy.

Mon May 20 19:57:38 CEST 2013

On Mon, 2013-05-20 at 10:46 -0600, Vernon D. Cole wrote:
> On Mon, May 20, 2013 at 8:12 AM, Daniele Varrazzo <
> daniele.varrazzo at gmail.com> wrote:
> 
> > On Mon, May 20, 2013 at 12:51 PM, Vernon D. Cole <vernondcole at gmail.com>
> > wrote:
> > > Stuart:
> > >
> > > I have just uploaded a branch for your consideration for
> > > https://code.launchpad.net/dbapi-compliance.  This version adds a second
> > > field to the second test table (barflies) and loads that field with the
> > > literal 'thi%s :may cau%(e)s? troub:1e'.  The value of this literal is a
> > > trap for unwary format conversion routines which fail to notice that they
> > > are re-formatting a literal value.
> >
> > You are requiring the sql string to be parsed, literal string
> > identified and placeholders inside them not treated. This is not a
> > DBAPI2 requirement.
> >
> 
> Indeed, the DB API2 does not address parameter reformatting at all.  We are
> just starting to play with that.  I am merely making sure that a reformat
> operation does not step on the contents of an ANSI standard literal. And,
> yes, when I am reformatting a string I do have to parse it.  How else would
> I find my format markers?

This would restrict the DBAPI to only deal with ANSI standard literal,
not what whatever the server is ready to accept, limiting the usability
scope of the DBAPI3 (unless you are asking to amend the DBAPI2
introducing the restriction back, as it seem you want fixing the dbapi2
acceptance test suite without renaming to dbapi30.py)

> > Without this requirement a driver can be implemented just passing the
> > string to the client library, an extremely reasonable way to implement
> > a client library wrapper. If you add it every driver will need to know
> > the parsing rule of the server (and of its version). This is not a
> > task as trivial as counting the quotes
> 
> 
> As a matter of fact, counting quotes handles this problem quite nicely.
> Actually, in my code I don't really count them.  I do a .split("'") [that
> is using a single quote] and then only parse the odd-numbered tokens.  Then
> do a "'".join().

Probably that's enough for ADO (is that a client library? I've worked
with ODBC/ADO/DAO/Jet too much time ago). It's not for Postgres in my
experience.

> > or applying a regexp: take a
> > look at the Postgres literals grammar:
> > <http://www.postgresql.org/docs/9.2/static/sql-syntax-lexical.html>.
> >
> v v v v v v v v
>  4.1.2.1. String Constants
> 
> A string constant in SQL is an arbitrary sequence of characters bounded by
> single quotes ('), for example 'This is a string'. To include a
> single-quote character within a string constant, write two adjacent single
> quotes, e.g., 'Dianne''s horse'. Note that this is not the same as a
> double-quote character (").
> ^ ^ ^ ^ ^ ^
> 
> That's what I am using for this test: an arbitrary sequence of characters
> bounded by single quotes.

Look at the scroll bar: the page doesn't stop there. You are skipping
beasts as:

   U&'d\0061t\+000061'
   U&'d!0061t!+000061' UESCAPE '!'
   $$Dianne's horse$$
   $SomeTag$Dianne's horse$SomeTag$

the latter can also be nested:

   $function$
   BEGIN
       RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
   END;
   $function$

Of course a postgres table could also be called ' (single quote): it's a
matter of applying the syntax for identifiers quoting, so I think you
can write

    insert into "'" values (?, '?')

without the server complaining. But that should be translated into

    insert into "'" values ($1, '?')

while I think your algorithm would generate

    insert into "'" values (?, '$1')

And there are the comments too:

-- comment'd

Which can also be /*
   multiline
   /* and nest'd */ */

> > Of course PG developers are free to complicate that as much as they
> > want: a client driver is expected to be agnostic and just forward the
> > query.
> >
> 
> Yes, that is the point: forward the query, not mess it up.

In this case I would be only bound to support $n placeholder, with no
possibility whatsoever to be DBAPI compliant. If instead I don't have to
make the distinction of the placeholder being inside or outside the
string, placeholders replacement is an affordable operation (e.g. ? can
be replaced with $1 and ?? can be replaced with ?. And ??? can be
replaced alternatively by $1? or by ?$1, according to the current lunar
phase).

> > I don't think any currently implemented driver behaves as you wish.
> >
> 
> I happen to _know_ that at least _one_ does.  ( My code is GPL, feel free
> to copy. I copied from yours yesterday morning.)

Thank you for the offer, but really I wouldn't like to put myself in the
position of writing a client program to parse postgres grammar.

> ProgrammingError: Incorrect number of bindings supplied. The current
> > statement uses 1, and there are 2 supplied.
> >
> > That is, indeed, a programming error. "h" and "i" are two values. The line
> should have another comma:
> In [6]: cur.execute("insert into test values (?, 'wat?')", ('hi',))
> I usually pass a single parameter in a list simply to avoid that pattern:
> In [6]: cur.execute("insert into test values (?, 'wat?')", ['hi'])
> 
> (for extra fun, I don't think sqlite3 even contemplates the
> > possibility to use ? into a literal and neither ?? nor \? appear to
> > work, see <http://www.sqlite.org/lang_expr.html#varparam>)
> >
> > Here is what the document you linked says:
> v v v v
> 4.1.2.1. String Constants
> [...]
> Nothing in there about avoiding question marks.

That's the wrong link, it's postgres grammar, not sqlite (about which
I've already ack'd it was the wrong example, so no need to go there and
check). sqlite is not extensible so a ? can either be a placeholder or
be in a string literal, otherwise it's a parse error. Furthermore,
sqlite parser is in the client library, not in the server.

Question marks are not special for postgres. Actually they are valid
operators:
<http://www.postgresql.org/docs/9.2/static/sql-createoperator.html>. In
your model it should be the driver's responsibility to convert

    select * from foo where a ?? ? and what = '?'
    select * from "'" where a ?? ? and what = '?'

into

    select * from foo where a ? $1 and what = '?'
    select * from "'" where a ? $1 and what = '?'

> Now let's try it....

> Seems to me that counting quotes works.

Sorry, I don't have windows to test it. Does it work with a table called
"a'b" too? Does

    cur.execute("insert into table values (? || $$?$$)", ('hello',))

do what expected? Does it deal with comments in the query string ok?

-- 
Daniele