[Python-Dev] Decoding incomplete unicode

M.-A. Lemburg mal at egenix.com
Tue Jul 27 23:43:28 CEST 2004

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> I like the idea, but don't think the implementation is
>> the right way to do it. Instead, I'd suggest using a new
>> error handling strategy "break" ( = break processing as
>> soon as errors are found).
> Can you demonstrate this approach in a patch? I think it
> is unimplementable: the codec cannot communicate to the
> error callback that it ran out of data.

The codec can: the callback gets all the necessary information
and can even manipulate the objects being worked on.
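
As a minimal sketch of what the callback gets to see, the
following uses codecs.register_error() from current Python. The
handler name "inspect" and the list `seen` are made up for this
illustration and are not part of any proposal in this thread.

```python
import codecs

seen = []  # hypothetical log of what the callback was told

def inspect_errors(exc):
    # The callback receives the exception object, which carries the
    # input being decoded (exc.object), the failing range
    # (exc.start, exc.end) and a textual reason.
    if isinstance(exc, UnicodeDecodeError):
        seen.append((exc.start, exc.end, exc.reason))
        # Substitute U+FFFD and resume decoding at exc.end.
        return ("\ufffd", exc.end)
    raise exc

codecs.register_error("inspect", inspect_errors)

# First byte of a two-byte UTF-8 sequence: the data is truncated.
data = "\u00e9".encode("utf-8")[:1]
result = data.decode("utf-8", "inspect")
```

Here the handler learns the position of the problem from the
exception alone; whether that is enough to tell truncated input
from invalid input depends on the codec.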

But you have a point: the current implementations of the
various encode/decode functions don't provide interfaces to
report back the number of bytes read at the C level; the codecs
module wrappers add these numbers, assuming that all bytes
were read.

The error callbacks could, however, raise an exception which
includes all the needed information, including any state that
may be needed in order to continue with the coding operation.
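
A rough sketch of such a "break" strategy, written against
current Python, might look like the following. The exception
class IncompleteInput, the helper decode_chunk() and the handler
name "break" are all hypothetical. Note that this simple version
stops at any decoding error, so it cannot by itself distinguish
truncated input from genuinely invalid bytes.

```python
import codecs

class IncompleteInput(Exception):
    """Hypothetical exception carrying the state needed to resume."""
    def __init__(self, consumed, pending):
        super().__init__("decoding stopped after %d bytes" % consumed)
        self.consumed = consumed  # bytes decoded successfully
        self.pending = pending    # undecoded remainder of the input

def break_errors(exc):
    # Stop at the first error and report how far the codec got.
    if isinstance(exc, UnicodeDecodeError):
        raise IncompleteInput(exc.start, exc.object[exc.start:])
    raise exc

codecs.register_error("break", break_errors)

def decode_chunk(data, encoding="utf-8"):
    """Return (decoded text, leftover bytes) for one chunk."""
    try:
        return data.decode(encoding, "break"), b""
    except IncompleteInput as e:
        # The callback cannot see the output produced so far, so
        # the valid prefix is decoded again here.
        return data[:e.consumed].decode(encoding), e.pending

data = "h\u00e9llo".encode("utf-8")
text1, rest = decode_chunk(data[:2])        # cuts the 2-byte char
text2, rest2 = decode_chunk(rest + data[2:])
```

The caller prepends the leftover bytes to the next chunk, which
is the "continue" step the exception's state is meant to enable.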

We may then need to allow additional keyword arguments on the
encode/decode functions in order to preset a start state.
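
For comparison, the incremental decoder interface in current
Python's codecs module exposes a start state through getstate()
and setstate() rather than through keyword arguments. The sketch
below uses that interface; it is not the design suggested above,
only an illustration of presetting state before decoding.

```python
import codecs

make_decoder = codecs.getincrementaldecoder("utf-8")

# First decoder sees a chunk that ends in the middle of a character.
dec = make_decoder()
first = dec.decode(b"h\xc3")
state = dec.getstate()  # opaque state, includes the pending byte

# A fresh decoder is preset with the saved state and continues.
dec2 = make_decoder()
dec2.setstate(state)
rest = dec2.decode(b"\xa9llo", final=True)
```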

Marc-Andre Lemburg
