[Patches] [ python-Patches-1436130 ] Incremental codecs

SourceForge.net noreply at sourceforge.net
Wed Mar 15 13:14:55 CET 2006


Patches item #1436130, was opened at 2006-02-21 20:32
Message generated for change (Comment added) made by lemburg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1436130&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: Python 2.5
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Walter Dörwald (doerwalter)
Summary: Incremental codecs

Initial Comment:
This patch extends the codec machinery to add
incremental codecs: stateful codecs that don't use a
stream API. It adds the following stuff: a class
codecs.CodecInfo (a subclass of tuple), that is used as
the return value of codecs.lookup();
codecs.IncrementalEncoder and codecs.IncrementalDecoder
(the basic interface classes),
codecs.BufferedIncrementalDecoder (a class that can be
used to implement decoders that must handle incomplete
input); codecs.iterencode() and codecs.iterdecode()
(generators that use the incremental codecs for
encoding/decoding an input iterable). On the C level
PyCodec_IncrementalEncoder() and
PyCodec_IncrementalDecoder() are added.
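
For illustration (not part of the patch itself), here is a minimal sketch of how a stateful decoder obtained through codecs.lookup() might be used without any stream object. It assumes a current Python 3 interpreter rather than 2.5, and the sample string and the split position are arbitrary choices.

```python
import codecs

# codecs.lookup() returns a CodecInfo object that carries the
# incremental classes alongside the classic encode/decode functions.
info = codecs.lookup("utf-8")

# A stateful decoder; no stream object is involved.
decoder = info.incrementaldecoder()

data = "Dörwald".encode("utf-8")

# Split the input in the middle of the two-byte sequence for "ö".
first, second = data[:2], data[2:]

# The incomplete trailing byte is kept inside the decoder until
# the rest of the sequence arrives.
part1 = decoder.decode(first)
part2 = decoder.decode(second, final=True)
result = part1 + part2
```

The decoder keeps the incomplete byte between calls, which is the kind of state that BufferedIncrementalDecoder is meant to help codec authors manage.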

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-15 13:14

Message:
Logged In: YES 
user_id=38388

It's only the coding style that looks a bit funny. 

Requiring CodecInfo objects is not a good idea: that way
you'd make it impossible to write codecs that work in both
Python 2.5 and 2.4.
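
A rough sketch of what such a dual-version codec module could look like follows. It is an illustration only: the codec name "example-ascii" is hypothetical, and the entry simply borrows the functions of the built-in ASCII codec so that the example is self-contained.

```python
import codecs

def getregentry():
    """Registry entry for a hypothetical codec module (illustration only)."""
    # Borrow the ASCII codec's functions; index access works whether
    # lookup() returns a plain tuple or a CodecInfo (a tuple subclass).
    base = codecs.lookup("ascii")
    entry = tuple(base[:4])

    codec_info = getattr(codecs, "CodecInfo", None)
    if codec_info is None:
        # Python 2.4 and earlier: only the plain 4-tuple is understood.
        return entry

    # Python 2.5 and later: add the incremental classes and a name.
    return codec_info(
        entry[0], entry[1], entry[2], entry[3],
        incrementalencoder=base.incrementalencoder,
        incrementaldecoder=base.incrementaldecoder,
        name="example-ascii",
    )

entry = getregentry()

# Old-style callers can still unpack the result as a 4-tuple.
encode, decode, reader, writer = entry
```

Because CodecInfo is a subclass of tuple, code written against the 2.4 interface keeps working when it is handed the richer object.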


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-15 12:43

Message:
Logged In: YES 
user_id=89016

Checked in as r43045.

Now what do we do with the funny code in
encodings.search_function()? Of course we could always
*require* the search function to return a CodecInfo object
(but only after the CJK codecs are updated, and even then we
should have some form of backwards compatibility).

----------------------------------------------------------------------

Comment By: Hye-Shik Chang (perky)
Date: 2006-03-15 12:28

Message:
Logged In: YES 
user_id=55188

1449471 isn't related to incremental codecs.  It only contains
a simple patch to the Visual Studio project file.

I think Walter is the right person to review whether 1443155
conforms to his interface design. :-) (Thank you in advance!)


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-15 12:13

Message:
Logged In: YES 
user_id=38388

The patch looks OK, except for some minor glitches such as
this mess :-) ...

+    if not isinstance(entry, codecs.CodecInfo):
+        if not 4 <= len(entry) <= 7:
+             raise CodecRegistryError,\
+                  'module "%s" (%s) failed to register' % \
+                   (mod.__name__, mod.__file__)
+        if not callable(entry[0]) or \
+           not callable(entry[1]) or \
+           (entry[2] is not None and not callable(entry[2])) or \
+           (entry[3] is not None and not callable(entry[3])) or \
+           (len(entry) > 4 and entry[4] is not None and not callable(entry[4])) or \
+           (len(entry) > 5 and entry[5] is not None and not callable(entry[5])):
             raise CodecRegistryError,\
-                  'incompatible codecs in module "%s" (%s)' % \
-                  (mod.__name__, mod.__file__)
+                'incompatible codecs in module "%s" (%s)' % \
+                (mod.__name__, mod.__file__)
+        if len(entry)<7 or entry[6] is None:
+            entry += (None,)*(6-len(entry)) + (mod.__name__.split(".", 1)[1],)
+        entry = codecs.CodecInfo(*entry)

Nevertheless, it can be cleaned up after checkin, so please
go ahead with it.

Regarding the idna.py patch, I think you should create a new
patch item for it and assign it to Martin.

Thanks.

Neal, I don't have time to review the two CJK patches.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2006-03-15 09:01

Message:
Logged In: YES 
user_id=33168

MAL, do you have any more issues with this patch?  Should it
be assigned to Martin?

MAL, Walter, can you review these patches 1443155 1449471
which I think are related?  Should they go in?

The first alpha is coming up soon and I'd like to get these
patches in ASAP.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-03 18:39

Message:
Logged In: YES 
user_id=89016

This fourth version of the patch removes the changes to
Lib/encodings/idna.py (only the addition of the
IncrementalEncoder/IncrementalDecoder and the changed
getregentry() remain). The removed idna.py changes probably
only make sense once this patch is in.

> Is it possible to make IncrementalEncoder/Decoder
> instances iterable per-se (without the need to go
> through the helper functions iterencode/iterdecode) ?

For IncrementalEncoder/Decoder to be iterable it would have 
to have some iterable from which it gets the input. But 
this has the same limitation as the stream API: The user is 
forced to provide the input as a service that the 
encoder/decoder uses, which requires support for a certain 
API. The only change would be that now it's an iterator API 
instead of a stream API.

The incremental codecs invert the call logic: The user no 
longer has to provide a callback service to the codec, but 
calls the codec directly. This gives much more flexibility.
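
To make the inverted call logic concrete, here is a small sketch (assuming a current Python 3 interpreter; the chunk list stands in for whatever source the caller happens to have). The caller drives the loop and pushes each chunk into the encoder, and the encoder never pulls input from a stream or an iterator.

```python
import codecs

# Obtain a stateful encoder from the CodecInfo returned by lookup().
encoder = codecs.lookup("utf-16").incrementalencoder()

# The caller decides where the chunks come from and when to encode them.
chunks = ["ab", "cd"]
pieces = []
for chunk in chunks:
    pieces.append(encoder.encode(chunk))

# Signal the end of input so that any pending state is flushed.
pieces.append(encoder.encode("", final=True))

result = b"".join(pieces)
```

UTF-16 is used here because it shows the state being kept: the byte order mark is written only with the first chunk.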


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-03 00:03

Message:
Logged In: YES 
user_id=38388

Very nice ! 

This is a much better approach than the feed style path you
wanted to take previously.

Minor nits:

Please separate out the non-related changes to the IDNA
codec into a new patch and assign that to Martin for review.

Is it possible to make IncrementalEncoder/Decoder instances
iterable per-se (without the need to go through the helper
functions iterencode/iterdecode) ?

Thanks.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-01 15:47

Message:
Logged In: YES 
user_id=89016

This third version of the patch fixes the bug when the
iterator in iterencode() or iterdecode() is empty and
updates the docstring in encodings/__init__.py.
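
The thread does not describe the exact failure in the earlier version. As a sketch of the expected behaviour only (checked against a current Python 3 interpreter, with UTF-8 chosen arbitrarily), an empty input iterable should simply produce no output:

```python
import codecs

# An empty iterable yields nothing from either helper.
encoded = list(codecs.iterencode([], "utf-8"))
decoded = list(codecs.iterdecode([], "utf-8"))
```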


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-02-28 13:08

Message:
Logged In: YES 
user_id=89016

This second version of the patch enhances
codecs.iterencode() and codecs.iterdecode(), so that
additional keyword arguments are passed through to the
Incremental(De|En)coder constructor.
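
As an illustration of the helpers (assuming a current Python 3 interpreter; the byte chunks are made up for the example), the sketch below decodes an iterable of chunks and passes an error handler through to the decoder via the errors keyword.

```python
import codecs

# Chunks may split a multi-byte sequence; the incremental decoder
# behind iterdecode() carries the partial bytes over to the next chunk.
chunks = [b"caf", b"\xc3", b"\xa9"]
text = "".join(codecs.iterdecode(chunks, "utf-8"))

# The error handler is handed on to the incremental decoder.
bad_chunks = [b"ab\xff", b"cd"]
lossy = "".join(codecs.iterdecode(bad_chunks, "ascii", errors="replace"))
```

With errors="replace", the undecodable byte is replaced by U+FFFD instead of raising UnicodeDecodeError.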

----------------------------------------------------------------------
