[Patches] [ python-Patches-1436130 ] Incremental codecs
SourceForge.net
noreply at sourceforge.net
Wed Mar 15 13:14:55 CET 2006
Patches item #1436130, was opened at 2006-02-21 20:32
Message generated for change (Comment added) made by lemburg
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1436130&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: Python 2.5
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Walter Dörwald (doerwalter)
Summary: Incremental codecs
Initial Comment:
This patch extends the codec machinery to add
incremental codecs: stateful codecs that don't use a
stream API. It adds the following stuff: a class
codecs.CodecInfo (a subclass of tuple), that is used as
the return value of codecs.lookup();
codecs.IncrementalEncoder and codecs.IncrementalDecoder
(the basic interface classes),
codecs.BufferedIncrementalDecoder (a class that can be
used to implement decoders that must handle incomplete
input); codecs.iterencode() and codecs.iterdecode()
(generators that use the incremental codecs for
encoding/decoding an input iterable). On the C level
PyCodec_IncrementalEncoder() and
PyCodec_IncrementalDecoder() are added.
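
For readers who want to see the pieces named above in action, here is a minimal sketch. It assumes a current Python 3 interpreter (where these names still exist in the codecs module) rather than the 2.5 patch itself, and the sample string is only an illustration.

```python
import codecs

# codecs.lookup() returns a CodecInfo, which is a tuple subclass.
info = codecs.lookup("utf-8")

data = "Dörwald".encode("utf-8")
head, tail = data[:2], data[2:]   # the split falls inside the two-byte "ö"

# Incremental decoding: the caller pushes chunks and the decoder keeps state.
decoder = info.incrementaldecoder()
first = decoder.decode(head)                 # the incomplete lead byte is buffered
second = decoder.decode(tail, final=True)    # the rest of "ö" plus the remainder

# The generator helpers drive the same incremental codecs over an iterable.
decoded = "".join(codecs.iterdecode([head, tail], "utf-8"))
encoded = b"".join(codecs.iterencode(["Dör", "wald"], "utf-8"))
```

The decoder returns only the characters it can complete from the bytes seen so far, which is the situation the buffered decoder class is described as handling.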
----------------------------------------------------------------------
>Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-15 13:14
Message:
Logged In: YES
user_id=38388
It's only the coding style that looks a bit funny.
Requiring CodecInfo objects is not a good idea: that way
you'd make it impossible to write codecs that work in both
Python 2.5 and 2.4.
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-15 12:43
Message:
Logged In: YES
user_id=89016
Checked in as r43045.
Now what do we do with the funny code in
encodings.search_function()? Of course we could always
*require* the search function to return a CodecInfo object
(but only after the CJK codecs are updated, and even then we
should have some form of backwards compatibility).
----------------------------------------------------------------------
Comment By: Hye-Shik Chang (perky)
Date: 2006-03-15 12:28
Message:
Logged In: YES
user_id=55188
1449471 isn't related to incremental codecs. It contains a
simple patch to the Visual Studio project file.
I think Walter is the right person to review whether 1443155
conforms to his interface design. :-) (Thank you in advance!)
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-15 12:13
Message:
Logged In: YES
user_id=38388
The patch looks OK, except for some minor glitches such as
this mess :-) ...
+    if not isinstance(entry, codecs.CodecInfo):
+        if not 4 <= len(entry) <= 7:
+            raise CodecRegistryError,\
+                 'module "%s" (%s) failed to register' % \
+                  (mod.__name__, mod.__file__)
+        if not callable(entry[0]) or \
+           not callable(entry[1]) or \
+           (entry[2] is not None and not callable(entry[2])) or \
+           (entry[3] is not None and not callable(entry[3])) or \
+           (len(entry) > 4 and entry[4] is not None and not callable(entry[4])) or \
+           (len(entry) > 5 and entry[5] is not None and not callable(entry[5])):
             raise CodecRegistryError,\
-                  'incompatible codecs in module "%s" (%s)' % \
-                  (mod.__name__, mod.__file__)
+                'incompatible codecs in module "%s" (%s)' % \
+                (mod.__name__, mod.__file__)
+        if len(entry)<7 or entry[6] is None:
+            entry += (None,)*(6-len(entry)) + (mod.__name__.split(".", 1)[1],)
+        entry = codecs.CodecInfo(*entry)
Nevertheless, it can be cleaned up after checkin, so please
go ahead with it.
Regarding the idna.py patch, I think you should create a new
patch item for it and assign it to Martin.
Thanks.
Neal, I don't have time to review the two CJK patches.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2006-03-15 09:01
Message:
Logged In: YES
user_id=33168
MAL, do you have any more issues with this patch? Should it
be assigned to Martin?
MAL, Walter, can you review patches 1443155 and 1449471,
which I think are related? Should they go in?
The first alpha is coming up soon and I'd like to get these
patches in ASAP.
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-03 18:39
Message:
Logged In: YES
user_id=89016
This fourth version of the patch removes the changes to
Lib/encodings/idna.py (only the addition of the
IncrementalEncoder/IncrementalDecoder and the changed
getregentry() remain). This patch to idna.py probably only
makes sense once this patch is in.
> Is it possible to make IncrementalEncoder/Decoder
> instances iterable per-se (without the need to go
> through the helper functions iterencode/iterdecode) ?
For IncrementalEncoder/Decoder to be iterable it would have
to have some iterable from which it gets the input. But
this has the same limitation as the stream API: The user is
forced to provide the input as a service that the
encoder/decoder uses, which requires support for a certain
API. The only change would be that now it's an iterator API
instead of a stream API.
The incremental codecs invert the call logic: The user no
longer has to provide a callback service to the codec, but
calls the codec directly. This gives much more flexibility.
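
The difference in call direction can be sketched as follows, assuming a current Python 3 interpreter; the payload and the one-byte chunking are only illustrations.

```python
import codecs
import io

payload = "grüße".encode("utf-8")

# Stream API: the codec pulls its input from an object that provides read().
reader = codecs.getreader("utf-8")(io.BytesIO(payload))
pulled = reader.read()

# Incremental API: the caller pushes data whenever it has some, in pieces
# of any size, and never has to wrap the source in a file-like object.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = []
for i in range(len(payload)):
    pieces.append(decoder.decode(payload[i:i + 1]))   # one byte at a time
pieces.append(decoder.decode(b"", final=True))        # flush at end of input
pushed = "".join(pieces)
```

In the second half the data could just as well arrive from a socket callback or a parser event, since the decoder only sees whatever bytes the caller hands it.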
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2006-03-03 00:03
Message:
Logged In: YES
user_id=38388
Very nice !
This is a much better approach than the feed style path you
wanted to take previously.
Minor nits:
Please separate out the non-related changes to the IDNA
codec into a new patch and assign that to Martin for review.
Is it possible to make IncrementalEncoder/Decoder instances
iterable per-se (without the need to go through the helper
functions iterencode/iterdecode) ?
Thanks.
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-01 15:47
Message:
Logged In: YES
user_id=89016
This third version of the patch fixes the bug when the
iterator in iterencode() or iterdecode() is empty and
updates the docstring in encodings/__init__.py.
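
The thread does not describe the original failure, so the following only shows the behaviour one would expect for an empty iterator, checked against the codecs module in a current Python 3 interpreter with UTF-8 as the example encoding.

```python
import codecs

# With no input chunks, both helpers should finish without raising
# and without yielding any output for UTF-8.
empty_encoded = list(codecs.iterencode([], "utf-8"))
empty_decoded = list(codecs.iterdecode(iter(()), "utf-8"))
```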
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2006-02-28 13:08
Message:
Logged In: YES
user_id=89016
This second version of the patch enhances
codecs.iterencode() and codecs.iterdecode(), so that
additional keyword arguments are passed through to the
Incremental(De|En)coder constructor.
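
A minimal sketch of that pass-through, assuming a current Python 3 interpreter. The codec name demo_prefix, the PrefixEncoder class and its prefix keyword are all hypothetical and exist only for this illustration.

```python
import codecs


class PrefixEncoder(codecs.IncrementalEncoder):
    """Hypothetical ASCII encoder that emits a prefix before its first output."""

    def __init__(self, errors="strict", prefix=b""):
        super().__init__(errors)
        self.prefix = prefix
        self.started = False

    def encode(self, input, final=False):
        out = input.encode("ascii", self.errors)
        if out and not self.started:
            self.started = True
            out = self.prefix + out
        return out

    def reset(self):
        super().reset()
        self.started = False


def _search(name):
    # Search function for the made-up codec name used in this sketch.
    if name == "demo_prefix":
        return codecs.CodecInfo(
            codecs.ascii_encode,
            codecs.ascii_decode,
            incrementalencoder=PrefixEncoder,
            name="demo_prefix",
        )
    return None


codecs.register(_search)

# The extra keyword argument is forwarded to PrefixEncoder.__init__().
chunks = list(codecs.iterencode(["ab", "cd"], "demo_prefix", prefix=b">> "))
```

Registering a search function changes interpreter-wide state, so a real codec would normally live in its own module rather than be registered inline like this.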
----------------------------------------------------------------------