[DB-SIG] annotated 1.1 spec / feedback

Thu, 04 Mar 1999 14:08:34 +0100

Greg Stein wrote:
> 
> I've prepared an annotated 1.1 spec (it lists changes since 1.0). It is
> located at:
> 
>   http://www.lyra.org/greg/anno-DatabaseAPI-1.1.html
> 
> Please let me know if I've missed any changes. I'll update the page to
> follow Marc-Andre's releases of the API spec. Just watch for the version
> string at the top to see if I've updated yet.

Thanks.

> While annotating, I've also prepared some feedback items:
> 
> * Connect() should be lower-cased to connect(). Most Python module-level
> functions are all lower case; we should follow suit.

Hmm, this is probably a religous question: I tend to always use names
starting with a capital letter for classes and constructors.

> * the connect() parameters should be called "examples" rather than
> "usually". I've seen a db module or two where they try to follow suit
> and munge together parameters into a string, simply to have to tear it
> apart. We should not try to avoid imposing on the connection parameters.
> For example: MySQL should take four params: host, user, password,
> database. "dsn" is not "typical" for it, so we should confuse people
> into attempting to abide by it.

Ok.

> * the parameters should be positional, rather than keyword-based.
> keywords can be optional, but we should not require the params to be
> keyword-based.

But keywords are so much simpler to maintain: the order doesn't
matter, only the wording. They also serve as nice way to document
what the parameter stands for, e.g.

connect('degas','miro','kline')

isn't too easy to comprehend without first looking at the interface's
docs, yet

connect(host='degas',user='miro',password='kline')

is.

If you fear the programming overhead, I can post a few C macros
that make dealing with keywords in C really simple.

> * I think the exception should only be available thru dbi. Having them
> in two places may lead to using different classes, thus breaking the
> inheritance/catching of the exceptions. I understand that the module
> implementor can get the other class object and expose it as a value, but
> there'd be hard-to-find problems if they skipped that step.

It is actually quite simple to add references to those object
in dbi. The reason for having them exposed through the module is
that when using multiple databases, execption catching is likely
to fail (the dbi approach doesn't handle this case too well anyway:
packages might help here).

> * maybe we should discuss punting dbi. module distribution may be easier
> without it. While it is handy for clients that use more than one DB
> module,

Actually, it makes things harder for those cases (see above). mxODBC
supports opening connections to many different databases and each
has its own version of the dbi module because each subpackage uses
its own set of exception classes.

> I can easily envision a python module that exports the
> appropriate set for the DB being used.

Right: stick it all into the database interface package.

> * cursor.rowcount is new, and is extraneous. The result of execute()
> should be used.

The execute() method returns None in case there are rows available.
It does not return the number of rows in the result set.

> * the params to callproc should be: callproc(procname [, params]).

Right.

> * I think we should clean up the description attribute if possible. The
> 7-tuple is/was very Oracle-specific.

Ok, things like display_size and internal_size are not really
relevant, but why break existing code expecting these 7 entries (even
if it probably does not use all of them) ?

> * The old description for execute() returning None was clearer. Note
> that a SELECT can return no columns/rows (where your description says >
> 0).

The None return value is to indicate that data is available
to be fetched vie the fetchXXX() methods. If no rows where produced,
something else should be returned, e.g. 1.

The reason I changed the description was that the previous one
wasn't easy to implement: it is not clear whether a statement
is DDL, DML or DQL without parsing it...

> * For the >=0 return value from execute(), it should be clarified with
> the word "affect" since you could delete them or insert them (which are
> not "touching" rows).

Ok.

> * in general, I think the 1.1 change to the return value description
> didn't really clear things up. Basing the descriptions on returning
> columns/rows doesn't make much sense to me. Instead, I liked the
> description based on the *type* of statement that was performed. Under
> this style, a SELECT that doesn't produce columns/rows would return 1,
> but that is just wrong.

See above. Without parsing there isn't much you can do as module
implementor to get the return value right. Maybe we should drop the
return value altogether: rowcount gives the number of rows in the
result set (even 0 for SELECTs that didn't produce any rows) or the
number of affected rows.

> * execute()'s description should change back to the terms "sequence" and
> "sequence of sequences" for the parameters. There is no reason to force
> them to be tuples or lists.

I thought about this: there is no way for the method to tell whether 
it was passed a sequence of sequences that is meant to be a
sequence of parameter sequences or just a single sequence with
sequence entries, e.g. ('abc',) would break under that definition.

Is there really such a big need to be able to pass *any* sequence
as parameter ?

> * the part about cursor movement in fetchone() should be totally
> removed. fetchone() is about the CLIENT fetching a single row. It should
> NOT affect how the underlying module performs its fetches.

That renders cursors useless... are you sure you want this ?

> * the fetch* methods should say they return a sequence or a sequence of
> sequences. We should not force them to be tuples or lists of tuples.
> Some DB modules may want to have additional semantics or capabilities on
> their results. As long as it obeys sequence behavior, then the spec
> should be happy. The arraysize attribute's description also refers to a
> list of tuples, and should be updated.

Just thought it would be a good idea to keep execute() parameter
input and fetchXXX() output in sync.

> * dbiDate values probably shouldn't *contain* mxDateTime values.
> Instead, mxDateTime should be used in place of dbiDate. The point of
> dbiDate was to provide a way to discriminate between an integer and a
> date (unix ticks). mxDateTime can fill the same role.

I'll clarify that...

> * we may want to strike the "ending in db" comment. seems silly in
> retrospec and (empirically) has not been followed anyhow.

Ok.

Note: I've updated the spec to 1.1a6.

-- 
Marc-Andre Lemburg                               Y2000: 302 days left
---------------------------------------------------------------------
          : Python Pages >>> http://starship.skyport.net/~lemburg/  :
           ---------------------------------------------------------