[DB-SIG] Should Binary accept unicode string?
mal at egenix.com
Fri Jan 15 11:15:52 EST 2016
On 15.01.2016 16:52, Mike Bayer wrote:
> On 01/15/2016 09:47 AM, M.-A. Lemburg wrote:
>> On 12.01.2016 13:59, INADA Naoki wrote:
>>> Hi, all.
>>> I found DB-API 2.0 defines Binary() as Binary(string).
>>> What does the string mean?
>>> On Python 2, should Binary accept unicode?
>>> On Python 3, should Binary accept str?
>> The Binary() wrapper is intended to provide extra information
>> for the database module and marks the intent of the user to have
>> the input parameter be bound to the binding parameter as
>> binary rather than text (e.g. VARBINARY rather than VARCHAR).
>> For Python 2, you'd probably use something like Binary=buffer.
>> On Python 3, Binary=bytes or Binary=bytearray seem like a natural
>> choice.
>> The choice of possible input parameters for Binary() is
>> really up to the database module author.
> I still don't understand this philosophy of pep-249. Allowing DBAPIs
> to arbitrarily decide how strict / loose they want to be for
> user-defined data passed to even very well known datatypes has a
> negative impact on portability. It means that code I write for one
> DBAPI will fail on another. Is it your view that databases and DBAPIs
> are so fundamentally different, even for basic things like
> unicodes/bytes, that attempting to provide for portability is hopeless?
> Why even have a pep-249 if I should expect that I have to rewrite my
> whole application when switching DBAPIs anyway?
> Obviously, full portability between DBAPIs and databases is never going
> to be possible. But for easy things where a pro-portability decision is
> clearly very feasible, like, "do / don't accept a unicode object for a
> bytes type", why can't a decision be made?
I think you are misunderstanding the purpose of the Binary() helper:
This was added as a portable way to tell the database interface
to bind data as binary to the parameter, nothing more.
Since some database modules rely on the Python type of the
input parameters to tell whether to bind as binary,
character, numeric, etc., but we did not have the distinction between
binary and text data in Python 2, as we now do in Python 3,
the Binary() wrapper was added to make the distinction clear.
The types Binary() allows as input are not part of the DB-API,
just like we don't make any comments about the allowed input
types for any other parameter type the database interface
supports.
This adds flexibility and makes it possible to create
interface modules which support a great deal more than
just a few standard Python data types.
Back to the choices I mentioned for Binary():
In Python 2, buffer() does allow unicode objects
on input, and what you get as a result corresponds to the binary
representation of the unicode object as used by Python.
In Python 3, bytes() requires you to be more specific: you
need to provide an encoding. The result is an encoded binary
version of the text input.
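To illustrate the Python 3 behaviour, here is a minimal sketch. The
sample text and the choice of UTF-8 are assumptions made only for the
example; any other codec would work the same way.

```python
# Python 3: bytes() refuses a str unless an encoding is supplied.
text = "Grüße"  # sample text, chosen only for illustration

rejected = False
try:
    bytes(text)  # no encoding given
except TypeError:
    rejected = True  # bytes() cannot guess how to encode the text

# With an explicit encoding the call succeeds and yields encoded bytes.
data = bytes(text, "utf-8")

# The same result as calling str.encode() directly.
same_as_encode = data == text.encode("utf-8")
```

Decoding data with the same codec gives back the original text, so
nothing is lost as long as you remember which encoding was used.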
Both are reasonable choices for a Binary() wrapper and
the result is easy to detect as "bind me as binary data"
for the database module.
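As a rough sketch of how a database module could detect this on
Python 3: the function bind_kind() and the labels it returns are
hypothetical and not part of the DB-API or of any particular module.
Binary=bytes is one of the choices mentioned above.

```python
# Hypothetical internals of a database module (illustration only).
Binary = bytes  # assumption: the module picks bytes as its Binary() wrapper


def bind_kind(param):
    """Return a label for how this made-up module would bind a parameter."""
    if param is None:
        return "null"
    if isinstance(param, (bytes, bytearray, memoryview)):
        return "binary"   # e.g. VARBINARY
    if isinstance(param, str):
        return "text"     # e.g. VARCHAR
    if isinstance(param, (int, float)):
        return "numeric"
    raise TypeError("unsupported parameter type: %r" % type(param))


plain = bind_kind("abc")                     # bound as text
wrapped = bind_kind(Binary("abc", "utf-8"))  # bound as binary
```

The same text ends up bound as character data when passed directly
and as binary data when wrapped, which is all the wrapper is meant
to signal.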
It is not uncommon to convert text data to
binary data for storage, esp. when dealing with larger
blobs you just want to manage and not work on, or when
you want to preserve it in exactly the same form you pass
it to the database (without any implicit normalizations,
surrogate conversions, warnings, etc.).
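A small sketch of that use case with the sqlite3 module from the
standard library. The table name, column name and sample text are
made up for the example, and other database modules may well accept
different input types for Binary().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body BLOB)")

# Text in decomposed form: "e" followed by a combining acute accent.
text = "cafe\u0301"
payload = text.encode("utf-8")  # encode explicitly before storing

# Binary() marks the parameter as binary data for the module.
conn.execute("INSERT INTO docs (body) VALUES (?)",
             (sqlite3.Binary(payload),))

(stored,) = conn.execute("SELECT body FROM docs").fetchone()
stored = bytes(stored)  # the BLOB comes back as binary data
conn.close()

round_trip_ok = stored == payload
```

The bytes read back are identical to the bytes written, so decoding
them with the same codec gives the text in exactly the form it was
passed in.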
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611