relational database?

M.-A. Lemburg mal at lemburg.com
Wed Apr 3 08:05:37 EST 2002


Gerhard Häring wrote:
> 
> * M.-A. Lemburg <mal at lemburg.com> [2002-04-02 19:49 +0200]:
> > SAP DB is fast, full-featured, open-source and very stable.
> 
> Open-source it sure is, but that doesn't mean that mere mortals can
> compile it or even port it to not currently supported platforms like
> FreeBSD.

That's true. Their compile system is rather complicated, due to many
files being generated on-the-fly and various third-party tools
which massage the code into compilable bits -- Python is among
them, so at least this part should be comprehensible ;-)
 
> > Note that Unicode support in mxODBC is available both as emulation
> > (mxODBC does the conversion to e.g.  UTF-8) and using native support
> > (the ODBC driver has to support Unicode).
> 
> Very interesting. This Unicode stuff isn't specified in a new draft of
> the DB-API, is it?

No.
 
> I'm planning to add Unicode (and perhaps non-default 8-bit charset)
> support to pyPgSQL. In my first tries, I found that there are really two
> different problems:
> 
> - getting Unicode strings as parameters of execute into the database,
>   like:
> 
>   cursor.execute("insert into test(col) values('%s')" % \
>         u"someUnicodeString")
> 
>   This doesn't look difficult.

I think that this part is really user-level programming. What's
more interesting is:

cursor.execute('insert into test (col) values (?)', (u"JapaneseText",))

There are three possibilities to choose from:
1. the DB can handle Unicode: pass the data through as-is
2. the DB doesn't handle Unicode, but does use a predefined code page:
   convert the Unicode data to that code page
3. the DB doesn't handle Unicode and doesn't have a predefined code page:
   convert the Unicode data to a connection-specific encoding

Options 1 and 3 are implemented in mxODBC 2.0. Option 2 is currently
not implemented, since it requires fetching the code page data from the
database, which is not always possible.
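
As a rough illustration of the three cases, here is a minimal sketch
in present-day Python (where str is always Unicode). It is not mxODBC
code: the function name prepare_parameter and its keyword arguments
are made up for this example, and a real driver would learn the
database's capabilities from the connection rather than from flags.

```python
def prepare_parameter(value, *, db_handles_unicode,
                      db_code_page=None, connection_encoding="utf-8"):
    """Decide how a bound parameter is handed to the database.

    Hypothetical helper for illustration only; none of these names
    belong to mxODBC or to the DB-API.
    """
    # Only text needs special treatment; numbers, None, etc. pass through.
    if not isinstance(value, str):
        return value
    # Case 1: the database handles Unicode, so pass the data as-is.
    if db_handles_unicode:
        return value
    # Case 2: the database uses a known, predefined code page.
    if db_code_page is not None:
        return value.encode(db_code_page)
    # Case 3: fall back to an encoding chosen per connection.
    return value.encode(connection_encoding)


if __name__ == "__main__":
    print(prepare_parameter("caf\u00e9", db_handles_unicode=True))
    print(prepare_parameter("caf\u00e9", db_handles_unicode=False,
                            db_code_page="latin-1"))
    print(prepare_parameter("caf\u00e9", db_handles_unicode=False))
```

Note that case 2 raises UnicodeEncodeError if the text cannot be
represented in the code page (Japanese text in latin-1, say); whether
to fail or to substitute characters at that point is a policy decision
the sketch leaves open.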
 
> - getting Unicode out of the database: My first idea was to add a flag
>   to the connection constructor that indicates whether to return strings
>   as Python strings or Unicode strings. Any better ideas?

Nope. That's how mxODBC does it too. Connections and cursors
have a .stringformat attribute which can be set to have mxODBC
return 8-bit strings, mixed 8-bit/Unicode (depending on the
information returned by the DB) or all Unicode.
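
The effect of such a flag on fetched values can be sketched as
follows, again in present-day Python with bytes standing in for 8-bit
strings. The mode names and the convert_fetched helper are
assumptions made for this example and are not mxODBC's actual
constants or code; the sketch also assumes the driver hands over
bytes for 8-bit columns and str for Unicode columns.

```python
# Hypothetical mode names, chosen for this example only.
EIGHTBIT = "eightbit"   # always return 8-bit strings (bytes)
MIXED = "mixed"         # return whatever the column type dictates
UNICODE = "unicode"     # always return Unicode strings (str)


def convert_fetched(value, mode, encoding="utf-8"):
    """Convert one fetched column value according to the chosen mode."""
    if mode == EIGHTBIT and isinstance(value, str):
        # Unicode column, but the caller wants 8-bit data.
        return value.encode(encoding)
    if mode == UNICODE and isinstance(value, bytes):
        # 8-bit column, but the caller wants Unicode.
        return value.decode(encoding)
    # MIXED, non-string values, or already in the requested form.
    return value


if __name__ == "__main__":
    row = (b"plain", "caf\u00e9", 7)
    for mode in (EIGHTBIT, MIXED, UNICODE):
        print(mode, tuple(convert_fetched(v, mode) for v in row))
```

The encoding argument plays the same role here as the
connection-specific encoding on the input side, so data written
through one path can be read back through the other.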
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/



