[I18n-sig] Proposal: Extended error handling for unicode.encode
M.-A. Lemburg
mal@lemburg.com
Mon, 08 Jan 2001 16:52:14 +0100
Martin,
what is the point of these endless discussions about use-cases
(which you seem esp. fond of ;), design vs. API, Walter's proposal
and whether or not the codec design covers more general cases than
just encoding and decoding from and to Unicode ?
These discussions don't get us anywhere.
To summarize:
* the codec design was discussed at length early last year
* the design was chosen after many useful suggestions from people
who know what codecs have to deal with (e.g. Andy, Fredrik
(from the PIL-perspective BTW)) and others
* the design is written down in Misc/unicode.txt
* extending the design is OK, breaking APIs is not
* extending the design by adding parameters is OK, extending
the design by switching on parameter type is not
* I have no problem with extending the design
* Walter's proposal breaks the Unicode C API in intolerable ways;
I agree that the general idea is worth pursuing though and
Walter's proposal has some good ideas in that direction
So where are we heading ?
* I will start to code a new error treatment option 'xml-escape'
which can then also be used as a basis for other escape techniques
which might be of general use (e.g. 'unicode-escape')
* we should start thinking of ways to extend the existing C API
to allow providing a context object to the encoder/decoder. I've
already made a few suggestions in that direction; more are to
come once I find more time to work on this; other suggestions
are, of course, welcome too
* the new error handler extensions will be a post-2.1 feature
* a PEP is needed for the design (most people don't read endless
threads like these to catch up)
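To make the 'xml-escape' idea concrete, here is a minimal Python sketch of what such an error treatment might produce. The function name xml_escape_encode is made up for illustration and is not part of any existing API; it assumes a stateless single-byte encoding such as ASCII or Latin-1 and simply replaces every unencodable character with a numeric character reference:

```python
def xml_escape_encode(text, encoding="ascii"):
    """Encode text, replacing unencodable characters with &#N; references.

    Hypothetical helper: it only illustrates the behaviour an 'xml-escape'
    error treatment could have. It encodes character by character, so it
    is only suitable for stateless encodings (ASCII, Latin-1, ...).
    """
    chunks = []
    for ch in text:
        try:
            chunks.append(ch.encode(encoding))
        except UnicodeEncodeError:
            # The character has no representation in the target encoding:
            # fall back to a decimal XML character reference.
            chunks.append(("&#%d;" % ord(ch)).encode("ascii"))
    return b"".join(chunks)


print(xml_escape_encode("a\u20acb"))  # b'a&#8364;b'
```

A 'unicode-escape' style treatment could reuse the same structure and only change the fallback format string.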
What the PEP should include:
* a proposal for extending the Unicode C API to provide an
extra context object to the encoder/decoder functions (which
are otherwise stateless)
* a hook for StreamWriters/Readers to use as the standard error
handler when 'callback' is given as the error handling option
* the Python APIs .encode() and unicode() should be extended
by a third optional argument: the context object
* all builtin codecs should be extended to handle the new
scheme
* Codec.encode and .decode APIs should allow a context object as
additional optional argument; default should be None
* the changes must be 100% backward compatible, both at C
and at Python level
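As a rough illustration of the Python-level points in this list, the sketch below shows a codec whose encode method takes an optional context argument defaulting to None. Everything in it is an assumption made for the example: the class names AsciiCodecSketch and EscapeContext, the handle() method on the context object, and the way 'callback' consults it. None of this is an agreed design; it only shows that existing two-argument calls keep working unchanged:

```python
class EscapeContext:
    """Hypothetical context object carrying state for an otherwise
    stateless encoder."""

    def __init__(self):
        self.count = 0  # number of characters that needed escaping

    def handle(self, text, pos):
        # Return an ASCII replacement for the unencodable character.
        self.count += 1
        return "&#%d;" % ord(text[pos])


class AsciiCodecSketch:
    """Toy ASCII encoder; only the call signature is the point here."""

    def encode(self, input, errors="strict", context=None):
        out = []
        for pos, ch in enumerate(input):
            if ord(ch) < 128:
                out.append(ch)
            elif errors == "callback" and context is not None:
                # Assumed behaviour: delegate the decision to the context.
                out.append(context.handle(input, pos))
            elif errors == "ignore":
                continue
            elif errors == "replace":
                out.append("?")
            else:
                raise UnicodeEncodeError(
                    "ascii", input, pos, pos + 1,
                    "ordinal not in range(128)")
        # Same return convention as existing codecs: (output, length consumed)
        return "".join(out).encode("ascii"), len(input)


codec = AsciiCodecSketch()
ctx = EscapeContext()
print(codec.encode("abc"))                     # old-style call still works
print(codec.encode("a\xe4", "callback", ctx))  # context object in use
```

Because context is optional and defaults to None, callers that pass only the input and the errors string see no change in behaviour, which is the backward compatibility requirement stated above.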
--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/