[Python-porting] Details about the psycopg porting

Tue Jan 25 01:24:11 CET 2011

On Mon, Jan 24, 2011 at 4:21 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Daniele Varrazzo <daniele.varrazzo at ...> writes:
>> >> the data (bytes) from the libpq are passed to file.write() using
>> >> PyObject_CallFunction(func, "s#", buffer, len)
>> >
>> > You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
>> > Instead, use "y#" to write bytes.
>>
>> Yes, the "s#" is a leftover from before the conversion: I just have to
>> decide whether it's better to always emit bytes and break on text
>> files, or to check for the file capability. Because text mode is the
>> default for open() I think the former would be surprising: I'll go for
>> the second option if not overly complex (seems trivial if
>> PyTextIOBase_Type is available in C without the need of importing
>> anything from Python, annoying otherwise).
>
> No, you'll have to import. The actual TextIOBase ABC is declared in Python.
> (see Lib/io.py if you are curious)

Annoying, then :) Will give it a try.
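
A minimal sketch of the file-capability check discussed above, written in
Python rather than C for brevity. The helper name write_chunk and the
encoding parameter are hypothetical and not part of psycopg; the only thing
taken from the thread is the idea of testing the target against
io.TextIOBase and emitting text or bytes accordingly.

```python
import io


def write_chunk(fileobj, data, encoding="utf-8"):
    """Write a bytes chunk to fileobj (hypothetical helper, not psycopg API).

    Text-mode files only accept str, so the chunk is decoded first;
    binary files receive the bytes unchanged.
    """
    if isinstance(fileobj, io.TextIOBase):
        # Assumption: the caller knows the encoding of the incoming data.
        fileobj.write(data.decode(encoding))
    else:
        fileobj.write(data)


binary_target = io.BytesIO()
text_target = io.StringIO()
write_chunk(binary_target, b"1\tfoo\n")
write_chunk(text_target, b"1\tfoo\n")
```

A C extension would have to do the equivalent by importing the io module
and calling PyObject_IsInstance() on the file object, since the ABC is not
exposed as a C type.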

>> >> bytea fields are returned as MemoryView, from which it is easy to get bytes
>> >
>> > Is this because it is easier for you to return a memoryview? Otherwise it
>> > would make more sense to return a bytes object.
>>
>> In Py2 bytea is converted to buffer objects, passing through a "chunk"
>> object implementing the buffer interface. So yes, MemoryView is a more
>> direct port.
>
> Well, does it point to some external memory managed by pgsql itself? Otherwise
> bytes or bytearray would still be a better choice IMO (as in better-known and
> more practical). In 3.x there's no confusion between 8-bit strings and unicode
> strings, so use of an obscure type such as buffer() shouldn't be necessary.

Reviewing the code, the buffer object was probably used initially
because the memory is handled by the libpq. I will have a talk with
some heavy users of the bytea type (I am not one, but people such as the
gnumed developers are) about what would be the best choice for the
library users.

I want to avoid introducing unnecessary changes for Py2 users, so the
buffer should stay unless we decide there are better options and it's
time for an incompatible change. I fear that having a radically
different interface for Py3 would be a problem for people migrating
from Py2.
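
To illustrate the trade-off, here is a small sketch of how a Py3 caller
would get bytes out of a memoryview. The bytearray below is only a
stand-in for memory owned by the libpq, and the variable names are made up
for the example; it does not reflect how psycopg builds the object.

```python
# Stand-in for a buffer whose memory is managed elsewhere (an assumption
# made for the example; in the real case the libpq would own it).
raw = bytearray(b"\x00\x01\xffbinary")

# A memoryview exposes the data without copying it.
view = memoryview(raw)

# Either call makes an independent bytes copy of the viewed data.
copy_a = view.tobytes()
copy_b = bytes(view)

# Slicing a memoryview yields another view, not a copy.
head = view[:2]
```

The extra step is small, but returning bytes directly would spare users
the conversion at the cost of one copy per value.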

Thank you very much.

-- Daniele