[DB-SIG] Type code mappings: expanding the type objects
Kevin Jacobs
jacobs at penguin.theopalgroup.com
Thu Jan 8 07:07:38 EST 2004
On Thu, 8 Jan 2004, M.-A. Lemburg wrote:
> Kevin Jacobs wrote:
> > On Thu, 8 Jan 2004, Federico Di Gregorio wrote:
> >> 3/ BOOLEAN is under NUMBER (this will probably give some problems to
> >> postgresql and other db using 't' and 'f' but there is really no
> >> simple solution and Python *does* use 1 and 0 for True and False (at
> >> least until 2.3))
> >
> > SQL 92 and 99 are very clear that booleans and integers are not
> > interchangeable. The Python semantics should not be the driving factor
> > here.
>
> The question here is what you want to do with the inheritance
> information. Most (if not all) database modules return booleans/bits
> as 1/0 or True/False, so NUMBER would make sense if you're interested
> in what the application will see.
>
> SQL would make them a subclass of STRING.
I'm in the process of moving to a different city, so I don't have my SQL99
and SQL200x drafts handy, but SQL seems to be moving rapidly in the
direction of having boolean types distinct from both STRING and NUMBER.
For example, the "official" boolean literals are TRUE and FALSE, though various
backends implement varying degrees of backward compatibility with other
representations, like 't' and 'f' for PostgreSQL. In any case, use of BOOLEAN
columns can only increase as more database vendors enhance their
products to be more standards compliant.
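A minimal sketch of what a distinct BOOLEAN type object could look like,
written for a current Python and loosely following the type-object idea in
the DB API. The class name and the type codes ("bool", "int4", and so on)
are assumptions made up for illustration, not any real driver's values.

```python
class DBAPITypeObject:
    """Compares equal to any of the backend type codes it was built from."""

    def __init__(self, *type_codes):
        self.type_codes = frozenset(type_codes)

    def __eq__(self, other):
        if isinstance(other, DBAPITypeObject):
            return self.type_codes == other.type_codes
        return other in self.type_codes

    def __ne__(self, other):
        return not self.__eq__(other)


# Hypothetical type codes; a real driver would use its own.
STRING = DBAPITypeObject("char", "varchar", "text")
NUMBER = DBAPITypeObject("int2", "int4", "numeric")
BOOLEAN = DBAPITypeObject("bool")  # deliberately under neither STRING nor NUMBER


def classify(type_code):
    """Return the name of the first type object matching a column's type code."""
    for name, type_object in (("STRING", STRING), ("NUMBER", NUMBER),
                              ("BOOLEAN", BOOLEAN)):
        if type_code == type_object:
            return name
    return "UNKNOWN"
```

With this arrangement a boolean column matches BOOLEAN only, so an
application has to ask for it explicitly rather than finding it under NUMBER.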
> > I'm somewhat sceptical about the WCHAR versions of STRING types. Python has
> > native support for unicode strings, so why maintain this artificial
> > distinction? The better solution would be to augment the column description
> > format to include character encoding information.
>
> Unicode and strings *are* different in Python and also handled
> differently in databases, so the distinction makes a lot of sense,
> e.g. strings are often subject to an encoding specified by the
> database, while Unicode does not have these deficiencies.
I agree, so long as we add language that stipulates that CHAR and VARCHAR
are always returned as undecoded str types with encoding information
available (unless we go farther and add support for string output encodings
too). Several "unicode-enabled" drivers, if I remember correctly, will
return unicode strings automagically if the default data encoding is not
ASCII.
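As a rough sketch of the stipulation above: the driver hands back the
undecoded value plus the column's encoding, and the application decides
whether to decode. This is written for a current Python, where the undecoded
form is a bytes object; decode_column and the "latin-1" encoding are
assumptions for illustration only.

```python
def decode_column(raw_value, encoding):
    """Decode an undecoded CHAR/VARCHAR value with the column's reported encoding."""
    if raw_value is None:  # SQL NULL stays None
        return None
    return raw_value.decode(encoding)


# What a driver might return for a VARCHAR column stored as Latin-1 (assumed).
raw = "café".encode("latin-1")
text = decode_column(raw, "latin-1")
```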
> Adding fields to .description is problematic. Applications tend
> to use tuple unpacking to access the tuples in that list and
> adding fields would break this.
>
> In general, using tuples for things that may be extended in
> the future is a bad idea. For DB API 3.0 we should probably
> switch to a list of ColumnDescription objects instead.
In the meantime, maybe we should support an extended_description
dictionary? A list seems unnecessarily -- well... -- linear.
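One possible shape for this, as a sketch only: the 7-item description tuples
stay untouched, so existing tuple unpacking keeps working, and the extra
information lives in a dictionary keyed by column name. The keys, the
"encoding" entry, and the sample values are all hypothetical.

```python
# The standard 7-item description entries, left exactly as they are today.
description = [
    ("id", "int4", None, 4, None, None, False),
    ("name", "varchar", None, 40, None, None, True),
]

# Hypothetical extension: extra per-column facts, keyed by column name.
extended_description = {
    "id": {"encoding": None},
    "name": {"encoding": "latin-1"},
}


def column_encodings(description, extended_description):
    """Map each column name to its encoding, or None when nothing is recorded."""
    encodings = {}
    for (name, type_code, display_size, internal_size,
         precision, scale, null_ok) in description:  # 7-item unpacking still works
        extra = extended_description.get(name, {})
        encodings[name] = extra.get("encoding")
    return encodings
```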
-Kevin
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (440) 871-6725 x 19 E-mail: jacobs at theopalgroup.com
Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/