[Patches] [ python-Patches-1443155 ] Incremental codecs for CJKCodecs

SourceForge.net noreply at sourceforge.net
Sat Mar 18 16:24:28 CET 2006


Patches item #1443155, was opened at 2006-03-04 19:45
Message generated for change (Comment added) made by doerwalter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1443155&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Modules
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Hye-Shik Chang (perky)
Summary: Incremental codecs for CJKCodecs

Initial Comment:
Here's a supplemental patch for SF #1436130 that
implements the CJKCodecs part of the incremental codec
specification. The patch is written against the
interface in Walter's fourth patch on #1436130. Please
test whether it agrees with the design.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-18 16:24

Message:
Logged In: YES 
user_id=89016

What other interpretation of the final parameter can we use
that doesn't make it completely useless? What about the
following: "If final is true the codec must encode/decode
the input completely and must flush all buffers. If this
isn't possible (e.g. because of incomplete byte sequences on
decoding) it must raise an exception (unless prevented by an
error handler)"?
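
To make the proposed reading of final concrete, here is a minimal
sketch using the incremental decoder API as it exists in current
CPython, which may differ from the patch under review. The euc_kr
codec and the sample character are chosen only for illustration.

```python
import codecs

# One Hangul syllable; in EUC-KR it is a two-byte sequence.
data = "한".encode("euc_kr")

# final=False: an incomplete sequence is buffered, not an error.
dec = codecs.getincrementaldecoder("euc_kr")()
first = dec.decode(data[:1])                 # nothing can be emitted yet
second = dec.decode(data[1:], final=True)    # completes the character

# final=True with an incomplete sequence: the decoder must raise ...
strict = codecs.getincrementaldecoder("euc_kr")()
try:
    strict.decode(data[:1], final=True)
    raised = False
except UnicodeDecodeError:
    raised = True

# ... unless an error handler prevents it.
lenient = codecs.getincrementaldecoder("euc_kr")(errors="replace")
replaced = lenient.decode(data[:1], final=True)
```

With final=False the lone lead byte is held back; with final=True it
is reported as an error or handed to the error handler.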

----------------------------------------------------------------------

Comment By: Hye-Shik Chang (perky)
Date: 2006-03-16 13:32

Message:
Logged In: YES 
user_id=55188

1) Because CJKCodecs had an internal stateful framework, I
implemented just an interface using it for IncrementalCodec.
It treats final=True as a simple `flush' message (which
doesn't reset or terminate the codec). The behavior is quite
useful for real-time stream processing such as sockets and
tail log watchers. If we disallow that, such usages may
require its own sequence detectors.

For the "to reset or not" issue, I think we can simply follow
what iconv does.  iconv doesn't reset the internal state for
iconv(ic, NULL, NULL, ..).
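
The flush question can be observed with a stateful encoding. The
following is a rough sketch against the incremental encoder in current
CPython, not against the patch discussed here; iso2022_jp is picked
only because it carries shift state, and the variable names are
illustrative.

```python
import codecs

text = "あ"  # a single hiragana character

enc = codecs.getincrementalencoder("iso2022_jp")()

# Without final, the encoder may stay in a shifted (non-ASCII) state.
head = enc.encode(text)

# An empty call with final=True asks the encoder to flush whatever
# is needed to finish the stream.
tail = enc.encode("", final=True)

# The pieces together match a one-shot encode of the same text.
whole = head + tail

# The encoder object is still usable after the final=True call.
again = enc.encode("a", final=True)
```

Whether the codec counts as "reset" after the flush is exactly the
point under discussion; the sketch only shows that further calls are
accepted.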


2) Aah.  I didn't notice that .errors is a part of public
API.  The current CJKCodecs can't support it easily yet. 
I'll fix it and upload an updated patch soon.  Thank you for
your review!

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-15 13:08

Message:
Logged In: YES 
user_id=89016

The patch doesn't apply cleanly (conflicts in
Lib/test/test_multibytecodec.py and Tools/unicode/Makefile).
Could you update the patch?

I haven't looked at the C code too closely yet.

Two notes: 1) The tests often call incencoder.encode() or
incdecoder.decode() again after the method has been called
with final=True before. I'm not sure that this should be
allowed. If we allow it, it should be documented in what
state the codec is after calling with final=True (probably
it should be back to the initial state (i.e. like calling
reset())). 2) It seems to me that it isn't possible to
change the error handling during the lifetime of a codec.
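
For reference, the two behaviours in question (changing the error
handling on a live codec object, and returning to the initial state
with reset()) look roughly like this with the incremental decoder in
current CPython. This is a sketch for illustration, not a description
of the patch as reviewed; the byte values are arbitrary.

```python
import codecs

dec = codecs.getincrementaldecoder("euc_kr")(errors="strict")

# Note 2: switch the error handler during the decoder's lifetime
# through the public .errors attribute.
dec.errors = "replace"
out = dec.decode(b"\xff", final=True)   # invalid input is replaced

# Note 1: reset() discards any buffered partial input, so the next
# call starts from the initial state.
dec.errors = "strict"
dec.decode("한".encode("euc_kr")[:1])   # lead byte stays buffered
dec.reset()
after_reset = dec.decode(b"abc", final=True)
```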

Anyway, thanks for the quick patch.


----------------------------------------------------------------------


