SQLite module for Python 2.5
Hi python-dev-elopers,

Last December, we had a short thread discussing the integration of PySQLite into Python 2.4. At the time, I was against inclusion, because I thought PySQLite was not ripe for it, mostly because I thought the API was not stable.

Now, I have started writing a new PySQLite module, which has the following key features:

- Uses the iterator-style SQLite 3.x API: sqlite3_prepare(), sqlite3_step(), etc. This way, it is possible to use prepared statements, and for large resultsets it requires less memory, because the whole resultset is no longer fetched into memory at once.

- Completely incompatible with the SQLite 0.x/1.x API: I'm free to create a much better API now.

- "In the face of ambiguity, refuse the temptation to guess." PySQLite 1.x tries to "guess" which Python type to convert to. It's pretty good at it, because it queries the column type information. This works for, I'd say, at least 90 % of all cases. But as soon as you use anything fancy like functions, aggregates or expressions in SQL, the _typeless_ nature of SQLite breaks through, and it will tell us nothing about the declared column type (of course, because the data is not coming from a database column). So I decided to change the default behaviour and make PySQLite typeless by default, too. Everything will be returned as a Unicode string (the default might be user-configurable per connection). Unless, of course, the user explicitly activates the "guess-mode" ;-) But to do so, she must read the docs, and then she will be aware of the fact that it only works in 90 % of all cases.

So why am I bothering you about this? I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
Especially those of you that have used PySQLite in the past: do you have any suggestions that would make the rewrite a better candidate for inclusion into Python? One problem I see is that even the new PySQLite will grow and try to wrap parts of the SQLite API that are not directly related to the DB-API. If such a thing is too complicated/big for the standard library, then maybe it would be better to produce a much simpler PySQLite, especially for the Python standard library, that leaves all the fancy stuff out. My codename would be "embsql". So, what would you like to see? "import sqlite", "import embsql", or "pypi.install('pysqlite')"? -- Gerhard
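Gerhard's 90 %-guessing point can be demonstrated with the sqlite3 module that Python ships today (a later descendant of PySQLite); this is a sketch, and the table/column names are invented. Conversion driven by the declared column type works for a plain column read, but an expression carries no declared type, so the raw value comes back:

```python
import datetime
import sqlite3

# Register an explicit TIMESTAMP converter (the stock date/timestamp
# converters are deprecated in recent Pythons, so we spell it out).
sqlite3.register_converter(
    "TIMESTAMP", lambda b: datetime.datetime.fromisoformat(b.decode()))

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE t (ts TIMESTAMP)")
con.execute("INSERT INTO t VALUES (?)", ("2004-10-20 12:00:00",))

# Reading the column directly: the declared type is known, so the
# converter fires and a datetime object comes back.
print(type(con.execute("SELECT ts FROM t").fetchone()[0]).__name__)       # datetime

# Reading an expression: SQLite reports no declared type, so the raw
# text comes back -- the 10 % where guessing has nothing to go on.
print(type(con.execute("SELECT max(ts) FROM t").fetchone()[0]).__name__)  # str
```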
Gerhard Haering wrote:
Hi python-dev-elopers,
Last December, we had a short thread discussing the integration of PySQLite into Python 2.4. At the time, I was against inclusion, because I thought PySQLite was not ripe for it, mostly because I thought the API was not stable.
[...]
I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
SQLite is a gem and PySQLite works great, but I don't see why we should start adding third-party tools of this size (>38k LOC C code) to the standard Python distribution. Perhaps we should consider adding only the Python interface and then ship a DLL with the Windows installer like we do for expat and the Sleepycat DBM ?! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 20 2004)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
On Wed, 2004-10-20 at 13:05, M.-A. Lemburg wrote:
SQLite is a gem and PySQLite works great, but I don't see why we should start adding third-party tools of this size (>38k LOC C code) to the standard Python distribution.
Perhaps we should consider adding only the Python interface and then ship a DLL with the Windows installer like we do for expat and the Sleepycat DBM ?!
Oh, maybe I misread Gerhard's post, but I definitely didn't expect him to do anything other than this! I'd be -1 on adding the SQLite code to Python, but +1 on shipping the wrapper module with the source code, and the DLL on Windows. -Barry
On Oct 20, 2004, at 13:05, M.-A. Lemburg wrote:
Gerhard Haering wrote:
Hi python-dev-elopers, Last December, we had a short thread discussing the integration of PySQLite into Python 2.4. At the time, I was against inclusion, because I thought PySQLite was not ripe for it, mostly because I thought the API was not stable. [...]
I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
SQLite is a gem and PySQLite works great, but I don't see why we should start adding third-party tools of this size (>38k LOC C code) to the standard Python distribution.
I don't think he ever said that the SQLite source tree should go into Python. By default can mean that Python builds a SQLite wrapper if SQLite is available, just like it does for bsddb, readline, etc. Binary builds for Win32 and Mac should of course ship with a copy of SQLite for use by the PySQLite extension (w/ a dll or just statically linked in). Heck, Mac OS X 10.4 will be shipping with SQLite anyway <http://www.apple.com/macosx/tiger/unix.html>!
Perhaps we should consider adding only the Python interface and then ship a DLL with the Windows installer like we do for expat and the Sleepycat DBM ?!
Python includes expat, doesn't it? -bob
Bob Ippolito wrote:
On Oct 20, 2004, at 13:05, M.-A. Lemburg wrote:
Gerhard Haering wrote:
Hi python-dev-elopers, Last December, we had a short thread discussing the integration of PySQLite into Python 2.4. At the time, I was against inclusion, because I thought PySQLite was not ripe for it, mostly because I thought the API was not stable. [...]
I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
SQLite is a gem and PySQLite works great, but I don't see why we should start adding third-party tools of this size (>38k LOC C code) to the standard Python distribution.
I don't think he ever said that the SQLite source tree should go into Python. By default can mean that Python builds a SQLite wrapper if SQLite is available, just like it does for bsddb, readline, etc. Binary builds for Win32 and Mac should of course ship with a copy of SQLite for use by the PySQLite extension (w/ a dll or just statically linked in). Heck, Mac OS X 10.4 will be shipping with SQLite anyway <http://www.apple.com/macosx/tiger/unix.html>!
If that's what Gerhard meant, no objections.
Perhaps we should consider adding only the Python interface and then ship a DLL with the Windows installer like we do for expat and the Sleepycat DBM ?!
Python includes expat, doesn't it?
True, but it didn't use to be included. The fact that our Fred Drake maintains it made the difference, I guess.

-- Marc-Andre Lemburg
On Wednesday 20 October 2004 01:48 pm, M.-A. Lemburg wrote:
True, but it didn't use to be included. The fact that our Fred Drake maintains it made the difference, I guess.
That might have something to do with it, but that's certainly not the only thing, and not reason enough. The biggest reason to include at least basic XML support in the standard library is that new XML file formats are being used for supplemental data by a variety of applications, and it's reasonable for many of them to be handled behind the scenes by libraries that don't expose an XML-related API. If the application using the library itself doesn't require XML support, it really shouldn't need to worry about the fact that one of the libraries does. Making an XML parser and some basic APIs available in the Python standard library (SAX and DOM) works out to make life easier for people putting together applications that may end up touching XML indirectly (via some other library that hides it). -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>
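Fred's point in miniature, using the stdlib DOM API that came out of this bundling (a sketch; the file format and element names here are invented): an application can read a small XML file some other tool produced without shipping its own parser.

```python
import xml.dom.minidom

# A made-up supplemental data file, parsed behind the scenes with the
# bundled parser -- the application never had to "add XML support".
doc = xml.dom.minidom.parseString("<config><db path='app.db'/></config>")
print(doc.getElementsByTagName("db")[0].getAttribute("path"))  # app.db
```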
Making an XML parser and some basic APIs available in the Python standard library (SAX and DOM) works out to make life easier for people putting together applications that may end up touching XML indirectly (via some other library that hides it).
Yes, yes, yes! And it should support XML 1.1 -- apparently the currently available Python tools don't (I'm told). Bill
On Wednesday 20 October 2004 08:25 pm, Bill Janssen wrote:
Yes, yes, yes! And it should support XML 1.1 -- apparently the currently available Python tools don't (I'm told).
That's correct; no one has had time to update Expat to support the new specification. Patches welcome. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>
Bob> By default can mean that Python builds a SQLite wrapper if SQLite
Bob> is available, just like it does for bsddb, readline, etc.

Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.

Skip
+1 from my point of view. The autobuild for wrappers is one of the very nice features of Python. On Wed, 20 Oct 2004, Skip Montanaro wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite Bob> is available, just like it does for bsddb, readline, etc.
Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.
Skip
At 07:14 PM 10/20/04 -0500, Skip Montanaro wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite Bob> is available, just like it does for bsddb, readline, etc.
Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.
Well, one difference is that none of the databases you just listed are embeddable. There has to be a separate database server process. SQLite, like other "database" modules in the stdlib, just stores data in a disk file.
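The no-server property Phillip describes can be sketched with the sqlite3 module that Python later shipped (file and table names here are invented): the whole "database administration" story is creating and reopening one disk file.

```python
import os
import sqlite3
import tempfile

# No server process, no accounts, no setup: the database is one file.
path = os.path.join(tempfile.mkdtemp(), "app.db")

con = sqlite3.connect(path)
con.execute("CREATE TABLE notes (body TEXT)")
con.execute("INSERT INTO notes VALUES ('hello')")
con.commit()
con.close()

# Reopening the file is the entire "reconnect to the database" story.
con = sqlite3.connect(path)
print(con.execute("SELECT body FROM notes").fetchone()[0])  # hello
```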
Bob> By default can mean that Python builds a SQLite wrapper if SQLite
Bob> is available, just like it does for bsddb, readline, etc.

>> Then why not MySQLdb, psycopg and sybase-python also? No slight
>> intended against PySQLite, but those other wrapper modules have been
>> around quite a bit longer I think.

Phillip> Well, one difference is that none of the databases you just
Phillip> listed are embeddable. There has to be a separate database
Phillip> server process. SQLite, like other "database" modules in the
Phillip> stdlib, just stores data in a disk file.

It seems people misunderstood my comment. I should have been more clear. I see no reason PySQLite should be accorded better status than any of the other relational database wrappers. If MySQLdb, etc aren't included with the distribution I don't think PySQLite should be either. I realize it's easier to administer a PySQLite database than a PostgreSQL database, but from a pure client standpoint there's nothing really easier about it. By including PySQLite we'd somehow be blessing it as a better SQL solution than the other options. That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.

Skip
On Wed, 20 Oct 2004 22:53:30 -0500, Skip Montanaro <skip@pobox.com> wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite Bob> is available, just like it does for bsddb, readline, etc.
>> Then why not MySQLdb, psycopg and sybase-python also? No slight >> intended against PySQLite, but those other wrapper modules have been >> around quite a bit longer I think.
Phillip> Well, one difference is that none of the databases you just Phillip> listed are embeddable. There has to be a separate database Phillip> server process. SQLite, like other "database" modules in the Phillip> stdlib, just stores data in a disk file.
It seems people misunderstood my comment. I should have been more clear. I see no reason PySQLite should be accorded better status than any of the other relational database wrappers. If MySQLdb, etc aren't included with the distribution I don't think PySQLite should be either. I realize it's easier to administer a PySQLite database than a PostgreSQL database, but from a pure client standpoint there's nothing really easier about it. By including PySQLite we'd somehow be blessing it as a better SQL solution than the other options. That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.
I think that I understand your argument -- in fact, that was my first impression when the thread started. It sounds perfectly reasonable, but it really doesn't hold upon closer inspection. In a very similar situation, the presence of SimpleHTTPServer in the library hasn't stopped anyone from using Apache, or from writing their own web server engines -- some as extensions of the standard module, some as replacements written from scratch. Of course, webservers and database engines are different beasts, and Apache is what it is, a true benchmark -- but can't the same thing be said about MySQL or PostgreSQL (not to mention Oracle and other commercial offerings)? -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com
On Oct 20, 2004, at 23:53, Skip Montanaro wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite Bob> is available, just like it does for bsddb, readline, etc.
Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.
Phillip> Well, one difference is that none of the databases you just Phillip> listed are embeddable. There has to be a separate database Phillip> server process. SQLite, like other "database" modules in the Phillip> stdlib, just stores data in a disk file.
It seems people misunderstood my comment. I should have been more clear. I see no reason PySQLite should be accorded better status than any of the other relational database wrappers. If MySQLdb, etc aren't included with the distribution I don't think PySQLite should be either. I realize it's easier to administer a PySQLite database than a PostgreSQL database, but from a pure client standpoint there's nothing really easier about it. By including PySQLite we'd somehow be blessing it as a better SQL solution than the other options. That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.
By including expat are we blessing it as somehow a better solution than libxml2? PySQLite *is* a better choice for inclusion than the others: the license permits it, it's standalone, easy to use, and can be reasonably included with binary distributions of Python (it can even be linked statically into the extension). More or less any database module that's not embedded (except for ODBC, perhaps) is on shakier ground because the protocol can change between database versions, though I suppose that's not expected to happen very often for something like PostgreSQL. Also, MySQLdb is especially tricky because of the license. I can't imagine how that rather contrived scenario could reflect badly on Python or SQLite.. it certainly wouldn't be any worse than Python's standard library support for networking or XML, or the interpreter's inability to scale with threads. -bob
>> By including PySQLite we'd somehow be blessing it as a better SQL
>> solution than the other options. That means it will almost certainly
>> be stretched beyond its limits and used in situations where it isn't
>> appropriate (multiple writers, writers that hold the database for a
>> long time, etc). That will reflect badly on both SQLite and Python.

Bob> I can't imagine how that rather contrived scenario could reflect
Bob> badly on Python or SQLite.

You assume it was contrived, but it wasn't at all. We hit exactly these problems almost upon first use. We were in the process of copying a large amount of data from our corporate Sybase database. Because SQLite's lock granularity is the entire file, the SQLite database was unusable until the entire update process was complete, even though many tables were completely updated long before the update process finished. We also encountered a major performance problem almost immediately. It seems that using BETWEEN is much worse (order of magnitude worse) than two comparison clauses using >=, <, etc.

We are in the process of deciding which server-based SQL solution to move to.

Skip
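For anyone hitting the same issue, the rewrite Skip alludes to looks like this (a sketch with invented table data; whether it actually helps depends on the SQLite version and indexes involved):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# The two spellings select exactly the same rows...
between = con.execute(
    "SELECT count(*) FROM t WHERE x BETWEEN 3 AND 6").fetchone()[0]
# ...but Skip reports the explicit comparisons were an order of
# magnitude faster on the SQLite 2.x he was using.
explicit = con.execute(
    "SELECT count(*) FROM t WHERE x >= 3 AND x <= 6").fetchone()[0]
print(between, explicit)  # 4 4
```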
On Oct 21, 2004, at 10:23, Skip Montanaro wrote:
By including PySQLite we'd somehow be blessing it as a better SQL solution than the other options. That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.
Bob> I can't imagine how that rather contrived scenario could reflect Bob> badly on Python or SQLite.
You assume it was contrived, but it wasn't at all. We hit exactly these problems almost upon first use. We were in the process of copying a large amount of data from our corporate Sybase database. Because SQLite's lock granularity is the entire file, the SQLite database was unusable until the entire update process was complete, even though many tables were completely updated long before the update process finished. We also encountered a major performance problem almost immediately. It seems that using BETWEEN is much worse (order of magnitude worse) than two comparison clauses using >=, <, etc.
We are in the process of deciding which server-based SQL solution to move to.
The concurrency problem makes it sound like you were using SQLite 2.x, not SQLite 3.x. If it was SQLite 3.x, then you could've used separate files for each table: """ A limited form of table-level locking is now also available in SQLite. If each table is stored in a separate database file, those separate files can be attached to the main database (using the ATTACH command) and the combined databases will function as one. But locks will only be acquired on individual files as needed. So if you redefine "database" to mean two or more database files, then it is entirely possible for two processes to be writing to the same database at the same time. To further support this capability, commits of transactions involving two or more ATTACHed database are now atomic. """ ( from http://www.sqlite.org/version3.html -- see also http://www.sqlite.org/lockingv3.html ) -bob
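The ATTACH arrangement Bob quotes can be tried directly (a sketch; the file and table names are invented): each attached file carries its own lock, which is the "limited table-level locking" the SQLite 3 notes describe.

```python
import os
import sqlite3
import tempfile

d = tempfile.mkdtemp()
con = sqlite3.connect(os.path.join(d, "main.db"))

# Keep a second table in its own database file; SQLite then takes
# locks per file as needed, not one lock over everything.
con.execute("ATTACH DATABASE ? AS orders", (os.path.join(d, "orders.db"),))
con.execute("CREATE TABLE orders.log (msg TEXT)")
con.execute("INSERT INTO orders.log VALUES ('ok')")
con.commit()

print(con.execute("SELECT msg FROM orders.log").fetchone()[0])  # ok
```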
On Thu, 21 Oct 2004 09:23:52 -0500, Skip Montanaro <skip@pobox.com> wrote:
You assume it was contrived, but it wasn't at all. We hit exactly these problems almost upon first use. We were in the process of copying a large amount of data from our corporate Sybase database.
Getting very off-topic here, but I'm surprised you considered SQLite in this situation, as an "equivalent" to Sybase. I'd certainly never seen it as catering for that sort of application. I see it more related to something like MS Access (which I hope no-one would consider for serious sized corporate applications - even though I know some people do :-() Of course, if that perception is common, then I think that the library documentation should be very clear about where SQLite is appropriate, and where it is not... Paul.
"Paul" == Paul Moore <p.f.moore@gmail.com> writes:
Paul> On Thu, 21 Oct 2004 09:23:52 -0500, Skip Montanaro <skip@pobox.com> wrote:
>> You assume it was contrived, but it wasn't at all. We hit exactly
>> these problems almost upon first use. We were in the process of
>> copying a large amount of data from our corporate Sybase database.

Paul> Getting very off-topic here, but I'm surprised you considered
Paul> SQLite in this situation, as an "equivalent" to Sybase.

Again, people assume lots about what we are doing. I'm not interested in getting into all the details here for many reasons, but I don't believe I said anything about SQLite/Sybase equivalency. I said we were copying a large amount of data from Sybase to SQLite.

Paul> Of course, if that perception is common, then I think that the
Paul> library documentation should be very clear about where SQLite is
Paul> appropriate, and where it is not...

Here we are coming back around to what I initially indicated. SQLite will be stretched beyond its limits very quickly. We certainly did (yes, we were using v2, but there is no v3 Python binding yet, right?). The absence of any indication of what those limits are will shine a bad light on both SQLite and Python when things don't work as expected.

I'm done with this thread. I've registered my concerns about adding PySQLite to the standard distribution.

Skip
Skip Montanaro wrote:
It seems people misunderstood my comment. I should have been more clear. I see no reason PySQLite should be accorded better status than any of the other relational database wrappers. If MySQLdb, etc aren't included with the distribution I don't think PySQLite should be either. I realize it's easier to administer a PySQLite database than a PostgreSQL database, but from a pure client standpoint there's nothing really easier about it. By including PySQLite we'd somehow be blessing it as a better SQL solution than the other options. That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.
While I like the idea of SQLite wrappers in the standard library, I think this is a good point -- and indeed, a lot of people run up against the limits of SQLite at some point (e.g. PyPI). I think SQLite is a good transitional database, and as such it will often be a sufficient long-term choice, but for a lot of applications it will ultimately be too limiting. I am particularly concerned if the SQLite bindings become less like the other DB-API bindings, so that it is hard to port applications away from SQLite. Specifically, while the type coercion isn't perfect, it makes SQLite *much* more like other RDBMS's; I'd be bothered if by default SQLite acted significantly different than other databases. While the DB-API doesn't address this issue of return types, it's only an issue for SQLite, since all the other databases are typed. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
On Thu, Oct 21, 2004 at 02:42:19AM -0500, Ian Bicking wrote:
Skip Montanaro wrote:
[Putting PySQLite into stdlib] That means it will almost certainly be stretched beyond its limits and used in situations where it isn't appropriate (multiple writers, writers that hold the database for a long time, etc). That will reflect badly on both SQLite and Python.
While I like the idea of SQLite wrappers in the standard library, I think this is a good point -- and indeed, a lot of people run up against the limits of SQLite at some point (e.g. PyPI).
Off-topic here, but that must have been PySQLite < 0.5, because since then, concurrent readers have no longer been a problem. Btw., with SQLite 3, SQLite has much better concurrency support. And when used right, it can scale up a lot better now. But that's irrelevant here, IMO. My point is to include a usable DB-API 2.0 implementation that people can use as a starting point when developing applications that need a relational database. Other languages do the same, btw.: Java (win32?) includes a JDBC-ODBC bridge driver, and PHP 5 includes a SQLite module.
[...] I am particularly concerned if the SQLite bindings become less like the other DB-API bindings, so that it is hard to port applications away from SQLite. Specifically, while the type coercion isn't perfect, it makes SQLite *much* more like other RDBMS's; I'd be bothered if by default SQLite acted significantly different than other databases. While the DB-API doesn't address this issue of return types, it's only an issue for SQLite, since all the other databases are typed.
That's an important issue for me. And because I believe you guys here are good at creating good API designs, I'd like to hear suggestions. (*)

- Worse is better - stay with the old scheme that works in 90 % of all cases, but in 10 % lets the users be surprised and complain?

- Stupid by default, which works 100%. If people want the "smart mode", then they need to read the docs and thus know its limitations. OTOH, the "stupid" behaviour is probably surprising too, but at least coherent.

For those not so used to (Py)SQLite: All in all, SQLite *is* still typeless.

(*) PySQLite builds all the type guessing on top of SQLite, but because of the limitations in the engine, it can't always guess right.

WAIT! I *can* implement something that is smarter than always converting to unicode/string, and that is, I can ask the SQLite engine which type a column has, but the limitation is it will only return its internal types:

#define SQLITE_INTEGER  1
#define SQLITE_FLOAT    2
#define SQLITE_TEXT     3
#define SQLITE_BLOB     4
#define SQLITE_NULL     5

As soon as you want anything more fancy, like DATE or TIMESTAMP, or BOOLEAN, or whatever, you need PySQLite support again. Would it be a good default for the standard library if the module only knew about these SQLite internal types? -- Gerhard
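The five internal types Gerhard lists are observable from SQL via typeof(), which reports roughly the same information sqlite3_column_type() would give a binding. A quick sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLite's own view of the values -- its five internal storage classes.
row = con.execute(
    "SELECT typeof(1), typeof(1.5), typeof('x'), typeof(x'00'), typeof(NULL)"
).fetchone()
print(row)  # ('integer', 'real', 'text', 'blob', 'null')
```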
Gerhard Haering wrote:
My point is to include a usable DB-API 2.0 implementation that people can use as a starting point when developing applications that need a relational database. Other languages do the same btw. Java (win32?) includes a JDBC driver or ODBC, and PHP5 includes a SQLite module.
Note that JDBC and ODBC are database driver interfaces much like the Python DB API, not database drivers. You still need to add a JDBC or ODBC driver in order to talk to the database backend of your choice (just like you have to do with the DB API). Adding an SQLite interface goes beyond that since it is a database driver for a specific database backend. If you are just after a "usable database driver", then I have to agree with Skip: any of the other available drivers would fit in just as well. Please clarify this.

-- Marc-Andre Lemburg
On Thu, Oct 21, 2004 at 11:26:12AM +0200, M.-A. Lemburg wrote:
Gerhard Haering wrote:
My point is to include a usable DB-API 2.0 implementation that people can use as a starting point when developing applications that need a relational database. Other languages do the same btw. Java (win32?) includes a JDBC driver or ODBC, and PHP5 includes a SQLite module. [...] If you are just after a "usable database driver", then I have to agree with Skip: any of the other available drivers would fit in just as well. Please clarify this.
I'm aiming at a usable DB-API implementation in the stdlib that does not need a server. I want Python to have an RDBMS interface that works OOTB, no administration required. SQLite seems the obvious choice to me, haven't looked at Gadfly in a while, and MySQLdb/MySQL embedded (GPL) has licensing issues (and adds megabytes to the Python binary download, instead of ca. 270 kB uncompressed as for SQLite). -- Gerhard
On Thu, 21 Oct 2004 11:41:30 +0200, Gerhard Haering <gh@ghaering.de> wrote:
On Thu, Oct 21, 2004 at 11:26:12AM +0200, M.-A. Lemburg wrote:
Gerhard Haering wrote:
My point is to include a usable DB-API 2.0 implementation that people can use as a starting point when developing applications that need a relational database. Other languages do the same btw. Java (win32?) includes a JDBC driver or ODBC, and PHP5 includes a SQLite module. [...] If you are just after a "usable database driver", then I have to agree with Skip: any of the other available drivers would fit in just as well. Please clarify this.
I'm aiming at a usable DB-API implementation in the stdlib that does not need a server. I want Python to have an RDBMS interface that works OOTB, no administration required. SQLite seems the obvious choice to me, haven't looked at Gadfly in a while, and MySQLdb/MySQL embedded (GPL) has licensing issues (and adds megabytes to the Python binary download, instead of ca. 270 kB uncompressed as for SQLite).
I'm +1 on including PySQLite in the core. It would fit in the same space as Berkeley DB, *not* client-server databases like MySQL, PostgreSQL, Oracle, etc. However, it conforms to 2 important standards, SQL and the Python DB API, where Berkeley DB does not. This matters where people are looking for a more "portable" solution (whether that means scaling up to a full RDBMS at a later stage, or scaling *down* from such a thing, for a more standalone application, or just leveraging existing expertise).

I don't think that the issue of batteries included vs easier package installation is relevant here - at the moment, Python *is* "batteries included". While a better package management solution is a laudable goal, until someone comes up and produces something, it doesn't affect the situation - when such a thing exists, I would assume that it would be appropriate to *un*bundle parts of the current stdlib (BSDDB, XML come to mind as "big" areas). Having to also unbundle PySQLite again shouldn't be too much of a chore. (If we're not willing to unbundle, the message is that the packaging solution is good enough for others, but not for "us" - a message I wouldn't feel happy supporting...)

Paul.
On Thu, 2004-10-21 at 10:23, Paul Moore wrote:
I'm +1 on including PySQLite in the core. It would fit in the same space as Berkeley DB, *not* client-server databases
I don't think that the issue of batteries included vs easier package installation is relevant here - at the moment, Python *is* "batteries included". While a better package management solution is a laudable goal, until someone comes up and produces something, it doesn't affect the situation
I agree with both points. -Barry
On Thu, Oct 21, 2004 at 03:23:28PM +0100, Paul Moore wrote:
On Thu, 21 Oct 2004 11:41:30 +0200, Gerhard Haering <gh@ghaering.de> wrote:
On Thu, Oct 21, 2004 at 11:26:12AM +0200, M.-A. Lemburg wrote:
Gerhard Haering wrote:
My point is to include a usable DB-API 2.0 implementation that people can use as a starting point when developing applications that need a relational database. Other languages do the same btw. Java (win32?) includes a JDBC driver or ODBC, and PHP5 includes a SQLite module. [...] If you are just after a "usable database driver", then I have to agree with Skip: any of the other available drivers would fit in just as well. Please clarify this.
I'm aiming at a usable DB-API implementation in the stdlib that does not need a server. I want Python to have an RDBMS interface that works OOTB, no administration required. SQLite seems the obvious choice to me, haven't looked at Gadfly in a while, and MySQLdb/MySQL embedded (GPL) has licensing issues (and adds megabytes to the Python binary download, instead of ca. 270 kB uncompressed as for SQLite).
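For readers unfamiliar with the DB-API, the zero-administration usage Gerhard is aiming at looks roughly like the sketch below. The module name `sqlite3` is an assumption here; any DB-API 2.0 compliant driver follows the same shape.

```python
import sqlite3  # module name assumed; any DB-API 2.0 driver works the same way

# No server process, no accounts: the "database" is just a file
# (or, as here, a throwaway in-memory store).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE people (name TEXT, age INTEGER)")
cur.execute("INSERT INTO people VALUES (?, ?)", ("Alice", 30))
conn.commit()
cur.execute("SELECT name, age FROM people")
rows = cur.fetchall()
conn.close()
```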
I'm +1 on including PySQLite in the core. It would fit in the same space as Berkeley DB, *not* client-server databases like MySQL, PostgreSQL, Oracle, etc. However, it conforms to 2 important standards, SQL and the Python DB API, where Berkeley DB does not.
Agreed. Hopefully including it would encourage the random people who have found the undocumented bsddb.dbtables module to use something saner. :) Along the same lines of including PySQLite it'd also be nice to consider a good database object abstraction module such as SQLObject (http://sqlobject.sf.net/). Anything to encourage people -not- to write raw SQL inline in their code is a good thing (and makes the app much more readable and even more portable, as SQL is only partially so).
I don't think that the issue of batteries included vs easier package installation is relevant here - at the moment, Python *is* "batteries included".
Also agreed. I personally think the pysqlite module bundled would get more use by more people than bsddb. -g
On Thu, Oct 28, 2004 at 01:40:01AM -0700, Gregory P. Smith wrote:
Along the same lines of including PySQLite it'd also be nice to consider a good database object abstraction module such as SQLObject (http://sqlobject.sf.net/).
-1 for now. SQLObject shows great promise, but the promise has to be fulfilled (yes, I did send a few patches)... Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
Gregory P. Smith wrote:
Along the same lines of including PySQLite it'd also be nice to consider a good database object abstraction module such as SQLObject (http://sqlobject.sf.net/). Anything to encourage people -not- to write raw SQL inline in their code is a good thing (and makes the app much more readable and even more portable, as SQL is only partially so).
SQLObject isn't really mature enough, and seems too complex to really be right for the standard library. Maybe something like dbrow, though. But most low-level database libraries or extensions really belong as part of the DB API, and the DB API is implemented on a per-backend basis. Most high-level libraries are outside the scope of the standard library. dbrow is kind of an anomaly, being both low-level and fairly database-neutral. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
M.-A. Lemburg wrote:
| Gerhard Haering wrote:
|> My point is to include a usable DB-API 2.0 implementation that people
|> can use as a starting point when developing applications that need a
|> relational database. Other languages do the same btw. Java (win32?)
|> includes a JDBC driver or ODBC, and PHP5 includes a SQLite module.
|
| Note that JDBC and ODBC are database driver interfaces much like
| the Python DB API, not database drivers. You still need to add
| a JDBC or ODBC driver in order to talk to the database backend
| of your choice (just like you have to do with the DB API).
|
| Adding an SQLite interface goes beyond that since it is a
| database driver for a specific database backend.
|
| If you are just after a "usable database driver", then I have to
| agree with Skip: any of the other available drivers would fit in
| just as well. Please clarify this.

I doubt that anything except Gadfly or SQLite could be built from a Python source distribution. The others, such as PostgreSQL or Oracle drivers, would require a chunk of the database distribution or licences to build, and would lock you into using a particular release of the database backend. The other product suitable for inclusion would be SQLRelay, if its maintainers and its release cycle agree.

If I remember correctly, Gadfly would have been in 2.4 if anybody had been able to commit to the long-term maintenance of it, so I thought the 'should we' question had already been answered.

- -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
On Thu, 2004-10-21 at 04:46, Gerhard Haering wrote:
WAIT!
I *can* implement something that is smarter than always converting to unicode/string, and that is, I can ask the SQLite engine which type a column has, but the limitation is it will only return its internal types:
#define SQLITE_INTEGER 1
#define SQLITE_FLOAT   2
#define SQLITE_TEXT    3
#define SQLITE_BLOB    4
#define SQLITE_NULL    5
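As a sketch of what "smarter than always converting to Unicode" could mean, the five fundamental type codes might map to Python constructors as below. The mapping itself is an illustration, not the actual PySQLite design.

```python
# SQLite's five fundamental type codes, as in the #defines above.
SQLITE_INTEGER, SQLITE_FLOAT, SQLITE_TEXT, SQLITE_BLOB, SQLITE_NULL = 1, 2, 3, 4, 5

# Hypothetical default casts keyed by the engine-reported type code.
DEFAULT_CASTS = {
    SQLITE_INTEGER: int,
    SQLITE_FLOAT: float,
    SQLITE_TEXT: str,
    SQLITE_BLOB: lambda v: v,      # blobs pass through untouched
    SQLITE_NULL: lambda v: None,   # SQL NULL becomes Python None
}

def cast(type_code, raw):
    """Convert a raw column value using the engine-reported type code."""
    return DEFAULT_CASTS[type_code](raw)
```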
I think that would be a neat idea as a default. Still, I want what the MySQL python binding has -- a way to provide a mapping of column names to converters. IIRC, the interface for that was a bit clunky, but it was definitely usable, so it might be better to be consistent, than better. :) -Barry
On Wed, 20 Oct 2004 20:52:35 -0400, Phillip J. Eby <pje@telecommunity.com> wrote:
At 07:14 PM 10/20/04 -0500, Skip Montanaro wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite
Bob> is available, just like it does for bsddb, readline, etc.
Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.
Well, one difference is that none of the databases you just listed are embeddable. There has to be a separate database server process. SQLite, like other "database" modules in the stdlib, just stores data in a disk file.
Not to mention that it's a snap to install & manage -- no need for administrative accounts and complex daemon setup. Although this is really part of the 'embeddable' concept, it's still something worth noting on its own. It's also worth noting that its license is *much* more Python-friendly than almost every one of the other options, as far as the db engine itself is concerned. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com
Skip Montanaro wrote:
Bob> By default can mean that Python builds a SQLite wrapper if SQLite
Bob> is available, just like it does for bsddb, readline, etc.
Then why not MySQLdb, psycopg and sybase-python also? No slight intended against PySQLite, but those other wrapper modules have been around quite a bit longer I think.
SQLite plays in a different league: it is much more like Sleepycat DBM than a full-blown multi-user database engine where you'd use one of the many other database modules. But I understand what you're saying: by placing one of the many possible solutions into the distribution we would be playing Microsoft, in a sense, by branding this one solution as "better" simply because it's easier to use. However, I don't see this happening in the Python world. E.g. take a look at PyXML vs. the vast and healthy set of tools available through third-parties. Placing PyXML into the core hasn't killed off these external projects. Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2004)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
On Thu, Oct 21, 2004 at 10:16:01AM +0200, M.-A. Lemburg wrote:
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution
-1
or should we focus more on making installation of third-party tools easier ?
+1 BTW, just installing is not enough, even when it comes with the Python distribution. Installing a newer version of BerkeleyDB breaks older databases due to incompatible file formats. IMO, we should focus on installing AND upgrading. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
M.-A. Lemburg wrote: ...
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I think you hit the nail on the head. For myself, I think we want to make it *much* easier to install 3rd-party tools. I also think we need to make it *much* easier to *update* the modules that come with Python without having to wait for a new Python release (much as it is easy to update packages in a linux distribution).

While I sympathise with the "batteries included" philosophy, it has a number of drawbacks:

- It makes technical decisions very risky. After all, once something is added to the Python distribution, it's hard to take it out and it's hard to change it (see below).

- It can favor some technologies unfairly.

- It can cause things to be included that shouldn't be. My favorite examples of this are asyncore and the Berkeley DB extensions. The former is no longer supported by its original author and causes an undue burden on the Python developers. The latter is inadequately supported and causes instability. (For example, I can't use "make test" in a CVS checkout or the beta release, as Python seg-faults when it gets to the bdb tests.)

- It actually stifles development of the library. It's hard to be motivated to improve library modules when the time between releases is sooooo long. Either:

  - You develop features that you won't be able to use for a year or more. (Large systems like Zope and Twisted can't rely on current versions of Python.) Or:

  - You have to use copies of future library modules. We are doing this now with doctest. We've had to do this in the past, with considerable difficulty, for cPickle and asyncore.

Part of the problem is that library modules evolve at a different rate than the language. Often this is because library modules are new. Newer systems typically evolve faster than mature systems. Sometimes, mature modules, like doctest, suddenly experience a growth spurt.

Finally, I think that the *real* needs that drive "batteries included" would be better served by a packaging system.
A packaging system (a la CPAN or RPM) would make it much easier for people to get the featureful Pythons they need than the current system. (It could be argued that we should just use native packaging systems, like RPM. This approach has 2 serious problems. First, it doesn't work on Windows. Second, it raises the packaging bar much higher, as packagers have to create separate packages for each target system. A Python-based packaging system would allow people to create packages that are usable on any platform that Python runs on.)

IMO, one of the (if not *the*) most important Python development projects is the development of a packaging system. I think there are a number of good starts toward this, such as PEP 262 and various efforts such as the Mac-based packaging system and some work Fred Drake has done here at Zope Corp. It would be great to follow through with this and get to the point where it's far less important what happens to be included in a distribution.

I'm committed to making this happen. ZC will eventually build something if no one else does. But it can happen *much* sooner if we all work together.

Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
Jim Fulton wrote:
M.-A. Lemburg wrote: ...
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I think you hit the nail on the head. For myself, I think we want to make it *much* easier to install 3rd-party tools. I also think we need to make it *much* easier to *update* the modules that come with Python without having to wait for a new Python release (much as it is easy to update packages in a linux distribution).
OK, so is everyone just talking about coming up with our own version of CPAN, or does this also include replacing Distutils? And for those of you not in the Mac world, MacPython already has something called packman that downloads 3rd-party apps and installs them by running their setup.py scripts. -Brett
Brett C. wrote:
Jim Fulton wrote:
M.-A. Lemburg wrote: ...
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I think you hit the nail on the head. For myself, I think we want to make it *much* easier to install 3rd-party tools. I also think we need to make it *much* easier to *update* the modules that come with Python without having to wait for a new Python release (much as it is easy to update packages in a linux distribution).
OK, so is everyone
So far, we're far from everyone. :)
just talking about coming up with our own version of CPAN, or does this also include replacing Distutils?
I see this building on or working with distutils.
And for those of you not in the Mac world, MacPython already has something called packman that downloads 3rd-party apps and installs them by running their setup.py scripts.
Right, I referred to that in my note. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
Jim Fulton wrote:
Brett C. wrote:
Jim Fulton wrote:
M.-A. Lemburg wrote: ...
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I think you hit the nail on the head. For myself, I think we want to make it *much* easier to install 3rd-party tools. I also think we need to make it *much* easier to *update* the modules that come with Python without having to wait for a new Python release (much as it is easy to update packages in a linux distribution).
OK, so is everyone
So far, we're far from everyone. :)
I have a tendency of talking too broadly when it comes to referring to others as a group.
just talking about coming up with our own version of
CPAN, or does this also include replacing Distutils?
I see this building on or working with distutils.
OK.
And for those of you not in the Mac world, MacPython already has something called packman that downloads 3rd-party apps and installs them by running their setup.py scripts.
Right, I referred to that in my note.
Oops. Sorry. Been doing a networking assignment this week and it has thoroughly fried my brain. -Brett
On Thu, Oct 21, 2004 at 10:16:01AM +0200, M.-A. Lemburg wrote:
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I keep casting glances at PEP 206 (http://www.python.org/peps/pep-0206.html), Moshe's "Batteries Included" PEP, and thinking it should be updated. PEP 206 suggests something quite simple: a script that downloads a bunch of listed packages and adds them to the Python core to make a "sumo distribution". The core once included an Extensions/ subdirectory for things that were automatically built as part of Python's build process. Perhaps we could add some extensibility -- e.g. Python's setup.py will automatically check for subdirectories in Extensions/ and run their setup.py scripts. --amk
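The Extensions/ idea sketched here could be as simple as a loop in the top-level setup.py; everything below (directory layout, build command, function name) is an assumption for illustration, not how Python's actual build worked.

```python
import os
import subprocess
import sys

def build_extensions(root="Extensions"):
    """Run 'python setup.py build' in every subdirectory of `root`
    that ships its own setup.py, returning the names of what was built."""
    built = []
    if not os.path.isdir(root):
        return built
    for name in sorted(os.listdir(root)):
        subdir = os.path.join(root, name)
        if os.path.isfile(os.path.join(subdir, "setup.py")):
            # Delegate to the extension's own build script.
            subprocess.check_call([sys.executable, "setup.py", "build"], cwd=subdir)
            built.append(name)
    return built
```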
"A.M. Kuchling" <amk@amk.ca> writes:
On Thu, Oct 21, 2004 at 10:16:01AM +0200, M.-A. Lemburg wrote:
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
I keep casting glances at PEP 206 (http://www.python.org/peps/pep-0206.html), Moshe's "Batteries Included" PEP, and thinking it should be updated. PEP 206 suggests something quite simple: a script that downloads a bunch of listed packages and adds them to the Python core to make a "sumo distribution".
The core once included an Extensions/ subdirectory for things that were automatically built as part of Python's build process. Perhaps we could add some extensibility -- e.g. Python's setup.py will automatically check for subdirectories in Extensions/ and run their setup.py scripts.
It may be a crazy idea, but the more I switch platforms the more I have the impression that getting software via cvs (or svn, maybe) is the easiest thing. Couldn't pypi record the needed info for this? Thomas
On Thu, Oct 21, 2004 at 03:18:08PM +0200, Thomas Heller wrote:
It may be a crazy idea, but the more I switch platforms the more I have the impression that getting software via cvs (or svn, maybe) is the easiest thing.
It wouldn't help Windows users much. Sensible platforms that come with CVS/SVN usually have their own packaging mechanisms.
Couldn't pypi record the needed info for this?
It could be added. I'd like to add support for DOAP (http://usefulinc.com/doap) to PyPI at some point, which does include such information; it would make sense to add things that are in DOAP but not currently in PyPI. --amk
On Thu, 21 Oct 2004 10:16:01 +0200, M.-A. Lemburg <mal@egenix.com> wrote:
Maybe we should have this discussion on more general grounds: do we really want a fat Python distribution or should we focus more on making installation of third-party tools easier ?
Let me toss my .02 on this. A lean distribution has one distinct feature: it not only allows, but _forces_ the programmer to make explicit choices about what extensions to use. Ease of installation from the network is not a big problem IMHO -- I particularly see no problem with downloading packages for my own development machine -- but *packaging for distribution* is much more important to allow for easy deployment. On the other hand, fat distributions tend to force people (consciously or not) to use only the standard library, if only to avoid problems when deploying the software, and even when the standard option is sub-optimal (MS JET is just one of many examples available).

In this sense, I think that there is a market for a lean Python distribution, and a market for a fat distribution. In this scenario, the standard Python distribution should be kept more-or-less as it is -- with some polish, of course, and perhaps including a few extra modules (maybe even PySQLite?). But it would still be fairly small compared with other language distributions and commercial products.

The fat distribution (in my opinion) is a job for a commercial company. Perhaps someone like ActiveState could do it, or some of the commercial IDE makers. This distribution could include a bigger selection of modules and packages, including full client-server database engines, GUI libraries, UI designers, a report library -- the kind of stuff that makes the life of VB or Delphi programmers easier. The base language would still be the same, but the environment would be richer in features. Best for some developers, not so good for others. In this scenario, more than one company could offer their own 'framework', built on Python and using a different selection of libraries.
The main problem here is that the market for such frameworks has shrunk considerably over the past decade, partly due to Microsoft's nearly absolute dominance (in this sense, it was an admirable feat that Sun managed to go so far with Java). In economic terms, it's not as attractive as it was a few years back, which may explain why such an offering has yet to materialize. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com
On Wed, Oct 20, 2004 at 07:05:05PM +0200, M.-A. Lemburg wrote:
Gerhard Haering wrote:
[...] I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
SQLite is a gem and PySQLite works great, but I don't see why we should start adding third-party tools of this size (>38k LOC C code) to the standard Python distribution. [...]
I never had the faintest thought of merging the SQLite source tree into the Python one.
Perhaps you we should consider adding only the Python interface and then ship a DLL with the Windows installer like we do for expat and the Sleepycat DBM ?!
That's what I was proposing. -- Gerhard
On Wed, 20 Oct 2004 18:29:05 +0200, Gerhard Haering <lists@ghaering.de> wrote:
So, what would you like to see? "import sqlite", "import embsql", or "pypi.install('pysqlite')" ?
Warning: I'm still pretty inexperienced in python-dev issues. I'm still figuring out how to behave, and how to become a good member of the community. So what follows is a personal opinion, and I hope not to be breaking any community laws by doing so.

I'm an enthusiastic user of pysqlite, which I have found to solve almost all my development needs without the administrative burden of a 'real' RDBMS. Not only that, but it has also proven more than good enough for actual deployment in several situations. Quite a surprise, in fact. I sincerely believe that a standard RDBMS in the standard Python library would greatly improve Python's acceptance and usability in several common scenarios. I also believe that pysqlite has what it takes to be included as such a standard module.

I also believe that this discussion should be held on the main Python list. There are many active Python developers that don't follow the python-dev list who may have some interest in this topic.
One problem I see is that even the new PySQLite will grow and try to wrap much of the SQLite API that are not directly related to the DB-API. If such a thing is too complicated/big for the standard library, then maybe it would be better to produce a much simpler PySQLite, especially for the Python standard library that leaves all the fancy stuff out. My codename would be "embsql".
This is a big issue. If the sqlite library supports a bigger and richer API _as part of the standard Python library_, then everyone else (Python's end users and developers) will naturally expect that all other rdbms will support the same API. Have you posed this question to the DB-SIG people? -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com
On Wed, 2004-10-20 at 13:08, Carlos Ribeiro wrote:
This is a big issue. If the sqlite library supports a bigger and richer API _as part of the standard Python library_, then everyone else (Python's end users and developers) will naturally expect that all other rdbms will support the same API.
I don't agree. Every one of the FOSS RDBMSs has a different C API, so I wouldn't expect the Python module wrapping the native APIs to be anything but a more-or-less straight exposure of that -- with some liberties taken to Pythonify the APIs where appropriate; e.g. PyBSDDB has done a very good job there. The common API is the DB-API, although it would be nice if there were more commonality amongst the various implementations. I do agree that further discussion probably belongs on other lists. -Barry
On Wed, Oct 20, 2004 at 02:08:32PM -0300, Carlos Ribeiro wrote:
[...] If the sqlite library supports a bigger and richer API _as part of the standard Python library_, then everyone else (Python's end users and developers) will naturally expect that all other rdbms will support the same API.
I don't believe people are so stupid. We should then clearly mark DB-API methods in the documentation, and the rest as nonstandard extensions. If you think it is a real problem, the nonstandard methods can even get a leading underscore. -- Gerhard
Background: I've had a lot of occasions to use SQLite and PySQLite recently and I've been so happy with the results that it is a very serious contender for the default storage in Mailman 3. This would replace BSDDB which the current code base uses. In fact I think the bsddb module support in Python's stdlib makes for a good comparison with Gerhard's proposal. On Wed, 2004-10-20 at 12:29, Gerhard Haering wrote:
- Uses iterator-style SQLite 3.x API: sqlite3_compile, sqlite3_step() etc. This way, it is possible to use prepared statements, and for large resultsets, it requires less memory, because the whole resultset isn't fetched into memory at once any longer.
Cool. BTW, this makes me wonder whether it might be time to work on a DBAPI 3 that takes advantages of some of the more recent developments in Python. Apologies if such an effort is already underway and I haven't seen it.
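In Python terms, the iterator-style API amounts to a cursor that pulls one row per step instead of materializing everything. A rough sketch under that assumption (this is not the actual PySQLite implementation):

```python
class StepCursor:
    """Wraps a step() callable (in the spirit of sqlite3_step) that
    returns one row per call and None at the end of the result set,
    exposing it as a plain Python iterator so large result sets
    stream row by row instead of being fetched all at once."""

    def __init__(self, step):
        self._step = step

    def __iter__(self):
        return self

    def __next__(self):
        row = self._step()
        if row is None:
            raise StopIteration
        return row

# Example with a fake step() over three rows.
_rows = iter([(1,), (2,), (3,)])
result = list(StepCursor(lambda: next(_rows, None)))
```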
- Completely incompatible with the SQLite 0.x/1.x API: I'm free to create a much better API now.
It's both fun and scary to be able to "do it right this time" :)
- "In the face of ambiguity, refuse the temptation to guess." - PySQLite 1.x tries to "guess" which Python type to convert to. It's pretty good at it, because it queries the column type information. This works for, I'd say 90 % of all cases at least. But as soon as you use anything fancy like functions, aggregates or expressions in SQL, the _typeless_ nature of SQLite breaks through and it will tell us nothing about the declared column type (of course, because the data is not coming from a database column).
I'd also like to see something in the middle. For example, I know what the intended types of my columns are, so I'd like to be able to set up a mapping of converters and pass that to PySQLite. I'd get the best of both worlds then -- explicit type conversion (because it always uses my mappings) and automatic type conversion (because I'll get the target type back from PySQLite directly without having to apply conversions myself everywhere).
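The middle ground described here -- a user-supplied mapping of column names to converters, applied on fetch -- could look like the following sketch. Names and call shapes are hypothetical.

```python
def convert_row(row, column_names, converters):
    """Apply per-column converter callables to a row of raw (string)
    values, leaving columns with no registered converter untouched."""
    out = []
    for name, value in zip(column_names, row):
        conv = converters.get(name)
        out.append(conv(value) if conv is not None else value)
    return tuple(out)

# Columns come back as strings from a 'typeless' driver; the user's
# mapping says what they were meant to be.
converters = {"age": int, "score": float}
row = convert_row(("Alice", "30", "9.5"), ("name", "age", "score"), converters)
```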
I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
I'm for it. Again, because we have batteries-included support for BSDDB, I see no reason why we can't also have batteries-included support for SQLite. Both are embedded databases, so if you've got the libraries and headers lying around, it should be a snap to configure and build the module.
Especially those of you that have used PySQLite in the past, do you have any suggestions that would make the rewrite a better candidate for inclusion into Python?
A few. Having used the Python bindings for MySQL also, there are a few things in its interface that I'd like to see in PySQLite (but please correct me if they're there but I missed them!). The converter idea comes from mysql-python. I also liked their use of a 'how' argument to the fetch methods, which allowed me to retrieve the row as a dictionary. Very handy! I'm sure there are other nice features that I've forgotten about.
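The 'how' argument liked here from mysql-python -- choosing tuple vs. dictionary rows at fetch time -- boils down to something like this sketch (the function and parameter names are hypothetical):

```python
def fetch_row(raw_row, column_names, how="tuple"):
    """Return a fetched row either as a tuple (the DB-API default)
    or as a dict keyed by column name."""
    if how == "dict":
        return dict(zip(column_names, raw_row))
    return tuple(raw_row)

as_tuple = fetch_row(("Alice", 30), ("name", "age"))
as_dict = fetch_row(("Alice", 30), ("name", "age"), how="dict")
```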
One problem I see is that even the new PySQLite will grow and try to wrap much of the SQLite API that are not directly related to the DB-API.
That's a /good/ thing, not a problem! :)
If such a thing is too complicated/big for the standard library, then maybe it would be better to produce a much simpler PySQLite, especially for the Python standard library that leaves all the fancy stuff out. My codename would be "embsql".
So, what would you like to see? "import sqlite", "import embsql", or "pypi.install('pysqlite')" ?
Personally, I'd like to see both a DB-API interface and a full-blown wrapping of the SQLite API. From what I can tell, it would be much smaller than the BSDDB wrapper, so its potential size or complication doesn't bother me. i-might-even-be-a-guinea-pig-for-ya-ly y'rs, -Barry
Hello Gerhard,
- Uses iterator-style SQLite 3.x API: sqlite3_compile, sqlite3_step() etc. This way, it is possible to use prepared statements, and for large resultsets, it requires less memory, because the whole resultset isn't fetched into memory at once any longer.
I'm anxiously waiting for the 3.x-based version! [...]
- "In the face of ambiguity, refuse the temptation to guess." - [...] So I decided to change the default behaviour and make PySQLite typeless by default, too. Everything will be returned as a Unicode string (the default might be user-configurable per connection).
I'm wondering if it would be possible to introduce a mechanism allowing one to *explicitly* set column conversion functions at query time. This would avoid having to manually convert rows on every access. [...]
So, what would you like to see? "import sqlite", "import embsql", or "pypi.install('pysqlite')" ?
Even though I'm a big fan of sqlite and pysqlite, my personal feeling is that SQL databases in general are better delivered as add-on modules. -- Gustavo Niemeyer http://niemeyer.net
On Wed, Oct 20, 2004 at 02:22:24PM -0300, Gustavo Niemeyer wrote:
[Me:] So I decided to change the default behaviour and make PySQLite typeless by default, too. Everything will be returned as a Unicode string (the default might be user-configurable per connection).
I'm wondering if it would be possible to introduce a mechanism allowing one to *explicitly* set column conversion functions at query time. This would avoid having to manually convert rows on every access. [...]
In fact, this feature will be in the new API. If the "guess mode" is active, then you will only have to set converters for columns that the "guesser" doesn't get right. For example:

    cu.typecasts = {"mycolumn": int}
    cu.execute("select a, b, c, func(a) as mycolumn from t1")

This will replace the "-- types" hack, which will only return if there's real demand for it, for example if a ZOPE DA needs it. -- Gerhard
participants (21)
-
A.M. Kuchling
-
Barry Warsaw
-
Bill Janssen
-
Bob Ippolito
-
Brett C.
-
Carlos Ribeiro
-
Dennis Allison
-
Fred L. Drake, Jr.
-
Gerhard Haering
-
Gerhard Haering
-
Gregory P. Smith
-
Gustavo Niemeyer
-
Ian Bicking
-
Jim Fulton
-
M.-A. Lemburg
-
Oleg Broytmann
-
Paul Moore
-
Phillip J. Eby
-
Skip Montanaro
-
Stuart Bishop
-
Thomas Heller