[DB-SIG] API 3.0 limiting paramstyle to ['named', 'qmark'] is okay. ('format' is not desirable)

Carl Karsten carl at personnelware.com
Sun May 19 21:45:35 CEST 2013


All,

I have been following this parmamstyle issue.  I'll admit, I don't see
an easy answer so don't get too defensive if it seems like I am
attacking anyone's position.  I really don't have strong feelings for
or against anything.   But I do have some ideas that might help direct
progress out of this circle things seem to be stuck in.

I don't think it is reasonable to expect python's dbapi to be as db
agnostic as we would like it to be.  I think anytime multi db support
is coded, this pattern will emerge:
for each server (MySql, PostgreSQL, MsSql...) there will be code
specific to that server.  If only 2 servers are coded for, and a 3rd
is desired, it will require additional code.
(I bet this pattern has a name.  Any design pattern experts here?)

Python's dbapi exists to make it easy to write that additional code.
It does not exist to eliminate the need to write additional code.

To help support or dispute my opinions and I think this whole thread,
I think it would help to define the dbapi users.  I see 4 or so groups

1.  single app/db/instance.  I have had plenty of jobs where I worked
on an app that had only 1 install, so I controlled everything: I got
to pick the whole stack: OS, db server, application language,
framework, db module.

2. multi db/platform framework developer.  I have a bit of experience here too:
http://www.f1tech.com/products/visual-foxexpress and
http://dabodev.com.  I have hacked on those code bases at the core
developer's houses.

3. framework user.  Like when I use Django to develop a website for a
client.  I only use the ORM's interface and don't use .raw('select
....)

4. single app (like #1) but mulit db.  App gets installed in various
environments and needs to support more than one db engine.  Doesn't
use an existing ORM or abstraction because they need to use features
of the SQL engine that are too hard to work with using the
abstractions.    I have no experience here.   I am wondering if anyone
in this thread does, and if this group actually has any members.   I
think most of the disagreement in this thread is trying to address the
needs of this group.


Here is my personal case study of what happens when there is nothing like dbapi.

The code I am going to reference is all here
https://github.com/CarlFK/veyepar/tree/master/dj/scripts
but I would not recommend diving in.  It is scary and the details are
not really relevant to this discussion.

My video processing app interfaces with 4 video hosting services
(youtube, vimeo, blip, archive). Each service is basically the same:
they provide a place to store a video and some metadata, plus have an
html interface to a player.

They also have unique features:  blip and archive host the original
file and provide a public url anyone can download that file from.
blip lets me upload an image to use for a thumb nail.  youtube
generates one on it's own, archive doesn't have thumbs.     All of
them support tags, but youtube does not allow a slash char in the tag.

Each has an API.  They all use http over tcp/ip, So I could use the
requests module to interface with them, but that would be hard.  Lucky
all 4 have python modules that use who knows what modules to do the
http.  But there is no consistency how these modules are called.  They
expose the features of each service, but each one has a unique way of
doing the same thing, so I wrote 4 wrappers that give me a constant
interface:

Each wrapper has:
class Uploader(object):
  def upload(self):

and then the following lines that calls into the provided api:

youtube_uploader.py
yt_service.InsertVideoEntry(video_entry, pf)

archive_uploader.py
key.set_contents_from_file(pf)

vimeo_uploader.py
vup.upload(self.files[0]['pathname'])

blip_uploader.py
(OK, my blip code is a mess.  This was the first one I coded for, and
I had no idea what the separation was going to look like.
class Blip(object):
  def PostMultipart
self.PostMultipart(self.BLIP_UPLOAD_URL, fields, files)

So now my app uses the wrappers and regardless of what service I am
calling, I do:

post_arc.py:        ret = uploader.upload()
post_vimeo.py:        ret = uploader.upload()
post_yt.py:        youtube_success = uploader.upload()

post_blip.py:             response_obj = blip_cli.Upload(video_id,
blip_user, blip_pw, files, meta, thumb)

Like i said, the blip code is a mess, but I think my mess supports
what I see the value of dbapi is.

All 4 foo_uploader's are very similar.  In fact to make a new one I
copy the last one I worked on and adjust it to make it work.  Yes, I
expect I could have created an abstract class.. that may happen
someday.

Same goes post_foo.py.

I have a dream of consolidating all 4 post_foo's into one that takes a
parameter and uses the corresponding foo_uploader module.    I am not
sure if that will ever happen, or if it should.

Here is how this relates to dbapi:

foo_upload.py does what dbapi modules do: make it easier to write each
new post_foo.

If a new video service called bar comes along that I need to post to,
I fully expect to first create the bar_upload wrapper, and then
post_bar.py.

If a new SQL server came along, I would expect there would be a client
lib written in C.  A small group of people (likely just one person) to
write a new dbapi module for it, and then all the developers who
wanted to add support for that server would write code.    All the
developers would thank the dbapi module writer for providing a
pythonic interface to the server's client lib..


I hope you were able to follow all of this, and get something out of
it.  I think this group should examine how things are currently being
used, and give some concrete examples of how they think things could
be different.


On Sun, May 19, 2013 at 11:10 AM, Vernon D. Cole <vernondcole at gmail.com> wrote:
> Dear Daniele:
>
>  I am a bit confused.  Two posts ago you said that supporting other
> parmamstyle values would be "*way, way* more complicated".  Now you say: "
> qmark or not qmark syntax is a non-issue: converting across types is a
> trivial operation."
>
> Clearly you are against the "two paramstyles are mandatory, others are
> optional" rule.  But I cannot quite make out whether you are against
> implementing it because it is too hard, or because it is too easy.
>
> My own experience, as a developer who has already done this, is that the job
> is somewhere in the middle -- somewhat complex.  It took me a day or two to
> write the code and the tests.  Not a job to give a beginner, but not a real
> head breaker either.
>
> You also state that the modification is not needed because it has never been
> requested by one of *your* users.
>
> <<Please avert your eyes while I remove my Windows-driver-developer shirt
> and put on my Linux-django-developer shirt>>
> ----
> Dear Daniele:
>   I am a software engineer who is developing a system to aid in the
> eradication of polio from the earth. Our company uses django, and PostgreSQL
> databases on Ubuntu servers and workstations, using psycopg2.  We now have a
> need to interface our data with SQL Server databases and analysis tools used
> by other agencies.  In order to do so, I am updating a db-api driver I
> maintained when I worked for a Microsoft-only shop, and have begun working
> with the core developers of django to better inter-operate with SQL Server.
>   A discussion on the subject of 'format' paramstyle with my mentors on the
> django-developers group revealed that they do _not_ favor the style, and in
> fact it causes problems by its interaction with the Python '%' operator.
> They would welcome the opportunity to simply their lives by gradually moving
> to 'qmark' and 'named' styles, which do not suffer from '%' pollution.
>   Please use your time and influence to make this possible for our premier
> database system.  Will this method of communication be sufficient, or should
> I open a trouble ticket addressing the issue?
>   Your sincere user,
> Vernon Cole
> software engineer
> eHealthAfrica.org
> ------
> <<\shirt>>
> Now, you have had a user request.
>
> Please don't just take your ball and go home.  That will not stop the ball
> game, but it will rain on it.  Work with us.
> --
> Vernon Cole
>
>
> On Sat, May 18, 2013 at 8:22 PM, Daniele Varrazzo
> <daniele.varrazzo at gmail.com> wrote:
>>
>> On Sat, May 18, 2013 at 9:29 AM, Vernon D. Cole <vernondcole at gmail.com>
>> wrote:
>> > So, here is the thing.  I want you to take the time to make your
>> > database
>> > driver "*way, way* more complicated" by supporting the SQL format I must
>> > use, which is 'qmark'.
>>
>> Aaahhh... is it that what you want? If what you wanted is a psycopg
>> talking qmark, why on the earth haven't you asked for that? Why
>> instead are you asking the DBAPI to rule that every db driver must
>> implement qmark?
>>
>> There: just released qmarkpg, allowing using qmark and named
>> parameters with psycopg:
>>
>>  - Source: https://github.com/dvarrazzo/qmarkpg
>>  - PyPI: https://pypi.python.org/pypi/qmarkpg/
>>
>> This is not a mock. The module passes the DBAPI test suite (2PC included).
>>
>> Nobody in my about 10 years of frequentation of the psycopg mailing
>> list has ever asked psycopg to implement qmark/named. I repeat it:
>> this is a need nobody has ever expressed. And believe me we have been
>> asked for a lot of things.
>>
>> This module is the proof that qmark or not qmark syntax is a
>> non-issue: converting across types is a trivial operation. Writing a
>> wrapper module is a solution infinitely superior to any
>> module/connection/cursor writable attribute dancing; perfectly thread
>> safe and 100% backward compatible.
>>
>> My impression from this long thread is that we are in front of
>> bikeshedding in its purest form: I haven't read a rationale yet to
>> justify the rewrite of *every query* written in Python against MySQL,
>> PostgreSQL and god knows what other database server. The justification
>> has been just a tad more than "we like the ?". The feature
>> specification is "let's copy Java".
>>
>> Well, I've looked into that too: JDBC specifies *only* qmark, not
>> named. Named is only supported by *a different* object (a wrapper, I
>> understand): JDBC does not define a grammar where both qmark and named
>> can be used. And unsurprisingly so: the grammar is ambiguous:
>>
>> cur.execute("select $$a ? and a :foo$$", args)
>>
>> This is a valid postgres query. If args is an object implementing
>> Mapping and Sequence ABCs there's no way to disambiguate that. Now go
>> on and tell me that we could double the ? even when the :foo are used:
>> we are making up that syntax now, because JDBC hasn't event tried. Oh,
>> and in the placeholder with unary "?" operator (valid in postgres)
>> "???": is the operator prefix or suffix?
>>
>> Any driver implementing both positional and named arguments will face
>> the ambiguities of a grammar the proponents have not specified: a
>> problem that format/pyformat don't have.
>>
>> My bottom line is that it is provable that a spec mandating
>> qmark/named is broken,  and psycopg will never implement it: not only
>> for its technical inferiority but above all because there is no real
>> request for it, and people who believe they need that can obtain the
>> same result in trivial ways without bothering the rest of the world.
>>
>> -- Daniele
>
>
>
> _______________________________________________
> DB-SIG maillist  -  DB-SIG at python.org
> http://mail.python.org/mailman/listinfo/db-sig
>



-- 
Carl K


More information about the DB-SIG mailing list