[DB-SIG] Would like to challenge DBAPI on callproc's return value

Milosz Kosmider milosz at milosz.ca
Mon Dec 7 22:51:41 EST 2015


As far as I can tell, the PG C API does not give direct access to stored
procedures at all, instead only giving access via exec function with
sanitized bound parameters. The current implementation of callproc in
psycopg2 binds variables in a SELECT statement in order to execute a stored
procedure, e.g. "SELECT some_procedure(...)". I took a look at
py-postgresql, where equivalent functionality is implemented as db.proc,
and there, too, a SELECT statement is used to invoke a stored procedure.

In fact, the way the API is described in
http://www.postgresql.org/docs/9.1/static/libpq-exec.html, it would appear
that C programmers are to communicate with PostgreSQL exclusively through
text commands. Outright access to internal objects like stored procedures
and tables is not given.

The solution in which the parameters tuple is returned directly works when
a tuple is passed in, but would be difficult in the dictionary case. In the
dictionary case, to return an ordered tuple, it would be necessary to query
the database to determine the order of parameters for the matched overload
of the given procedure, which would introduce some unwanted overhead.

One possible interpretation of the spec in the dictionary case would be to
return the passed-in dictionary, but it wouldn't be a tuple as stated by
the spec.

As I see it, there are two ways to implement the spec, both of which are
not 100% compliant. The first is to return the parameters object, whether
it is an ordered tuple or a dictionary. The second is to return Py_None.
These are equally easy to deal with for the programmer, computer, and API
user. I am leaning torwards Py_None because PG users don't expect a return
value anyway. I will talk this over with the psycopg2 team and reference
this email chain.

Cheers,
Milosz

On Mon, Dec 7, 2015 at 1:35 PM M.-A. Lemburg <mal at egenix.com> wrote:

> On 06.12.2015 20:10, Milosz Kosmider wrote:
> > Hi Marc-Andre, Mike,
> >
> > Thanks for responding and setting me straight on the spec! Being that
> > PostgreSQL is a little bit unique in that it always returns its INOUT/OUT
> > parameters in the result set and provides no other way to access their
> > values, would it be an acceptable compromise for callproc to return a
> null
> > value when backed by PostgreSQL?
> >
> > Even the solution of copying inputs to outputs is not trivial. PostgreSQL
> > procedures are invoked using the input parameters only, so one possible
> > solution would be to store the input parameters in a map, then take the
> > first row of the result set and add the values to that map, replacing any
> > existing (INOUT) values. Lastly we would order the parameters according
> to
> > the function prototype, which I am not even sure is unambiguous, because
> > proc(IN a, OUT b) and proc(IN a) are equal as prototypes. This is very
> > hacky and probably impossible. Moreover it provides no value in
> PostgreSQL.
> > That's why I think a null return value would be best. It best reflects
> what
> > PostgreSQL stored procedures actually return: nothing, directly.
>
> If I understand correctly the PG C API doesn't support
> IN/OUT and OUT parameters, right ?
>
> If that's the case, I'd simply copy the input parameter
> tuple and return this as output - this is what other interfaces
> do too when they don't support IN/OUT and OUT parameters.
> You already have all the values in Python, so the copy
> operation usually is a mere Py_INCREF on the parameters
> tuple.
>
> If PG documents that IN/OUT and OUT parameters are sent
> back via result sets, using this approach is what the PG
> users expect, so it seems like the most natural way to
> do this.
>
> I'm not sure whether using None as place holder is a good
> solution, but then again: if they are not updated by the
> driver anyway, this may work as well.
>
> Cheers,
> --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, Dec 07 2015)
> >>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
> >>> Python Database Interfaces ...           http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
> ________________________________________________________________________
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
>    eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
>     D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>            Registered at Amtsgericht Duesseldorf: HRB 46611
>                http://www.egenix.com/company/contact/
>                       http://www.malemburg.com/
>
>
> > Cheers,
> > Milosz
> >
> > On Fri, Dec 4, 2015, 16:05 Mike Bayer <mike_mp at zzzcomputing.com> wrote:
> >
> >>
> >>
> >> On 12/01/2015 12:40 AM, Milosz Kosmider wrote:
> >>> Hi folks,
> >>>
> >>> As PostgreSQL has had named parameters in stored procedures since
> >>> version 9.0, I
> >>> have taken to adding support for the feature in the DBAPI-complicant
> >> Python
> >>> wrapper, psycopg2: https://github.com/psycopg/psycopg2/pull/16/files.
> >>>
> >>> The feature allows the "callproc" method to be invoked, in addition to
> >> the
> >>> standard, tuple-y way, like this: cur.callproc('some_proc', {
> some_param:
> >>> 'some_value', ... })
> >>>
> >>> While implementing the feature I stumbled upon the DBAPI spec for
> >> callproc:
> >>> http://legacy.python.org/dev/peps/pep-0249/#callproc. It states that
> >>> "the result
> >>> of the call is returned as modified copy of the input sequence. Input
> >>> parameters
> >>> are left untouched, output and input/output parameters replaced with
> >>> possibly
> >>> new values." It then goes onto say that "The procedure may also provide
> >>> a result
> >>> set as output."
> >>>
> >>> I would like to challenge the former requirement, and make the latter
> >>> statement a strict requirement.
> >>>
> >>> The former requirement of returning the modified input sequence
> >> effectively
> >>> forces cur.callproc to examine its result set and copy it over the
> input
> >>> sequence.
> >>
> >> I believe the term "input/output parameters" refers to OUT parameters
> >> specifically.   If Postgresql's stored procedure system is entirely
> >> result-set based, then "output parameters" just wouldn't apply here;
> >> you'd only need return the same list of inputs that was passed.
> >>
> >>
> >>> This is not only redundant, but in fact undefined for result
> >>> sets with
> >>> more than one row.
> >>
> >> again, this language refers to OUT parameters, which are not result
> >> rows, they are parameters that are writable, as in:
> >>
> >>
> >>
> https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:325417134618
> >>
> >>
> >>> The requirement's definition is moreover
> >>> self-contradictory,
> >>> because output parameters are not *in* the input sequence.
> >>
> >> In your case of Postgresql, they're not. In the case of Oracle, MS SQL
> >> Server, DB2, and others, they are (or can be). ;)
> >>
> >>
> >>
> >>
> >>> All stored procedures yield result sets.
> >>
> >> that is entirely untrue for many databases.
> >>
> >> SQL Server stored procedures for example usually return an integer
> >> return code.   They can also use OUT parameters, return multiple result
> >> sets, or a detached cursor object:
> >>
> >> https://technet.microsoft.com/en-us/library/aa174792%28v=sql.80%29.aspx
> >>
> >>
> >>
> >>> Some queries yield empty result
> >>> sets.
> >>> Some stored procedures yield empty result sets. As such, I would argue
> >> that,
> >>> like a query (e.g. cur.execute on a SELECT statement) cur.callproc
> >>> *must* endow
> >>> the cursor with a result set.
> >>>
> >>> I propose changing the DBAPI spec on callproc to be as follows:
> >>>
> >>> (This method is optional since not all databases provide stored
> >>> procedures.)
> >>>
> >>> Call a stored database procedure with the given name. The sequence of
> >>> parameters
> >>> must contain one entry for each argument that the procedure expects.
> >>> Overloaded
> >>> procedures are supported. If the underlying database supports named
> >>> parameters
> >>> in stored procedures, the sequence of parameters may be given as a
> >>> dictionary-like object mapping parameter names to values.
> >>>
> >>> The procedure must provide a result set as output. This is then made
> >>> available
> >>> through the standard fetch* methods.
> >>>
> >>> Return values are not defined.
> >>>
> >>> Thank you,
> >>> Milosz
> >>>
> >>>
> >>> _______________________________________________
> >>> DB-SIG maillist  -  DB-SIG at python.org
> >>> https://mail.python.org/mailman/listinfo/db-sig
> >>>
> >> _______________________________________________
> >> DB-SIG maillist  -  DB-SIG at python.org
> >> https://mail.python.org/mailman/listinfo/db-sig
> >>
> >
> >
> >
> > _______________________________________________
> > DB-SIG maillist  -  DB-SIG at python.org
> > https://mail.python.org/mailman/listinfo/db-sig
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/db-sig/attachments/20151208/34ba0a82/attachment.html>


More information about the DB-SIG mailing list