[I18n-sig] Test Suite for the Unicode codecs

M.-A. Lemburg mal@lemburg.com
Sat, 01 Apr 2000 20:43:05 +0200

Andy Robinson wrote:
> On Sat, 01 Apr 2000 00:15:53 +0200, you wrote:
> >I would like to add some more testing to the mapping codecs
> >in the Python encodings package. Right now I can only test
> >for round-trips of lower character ordinal ranges and even
> >those tests fail for a couple of encodings.
> >
> >Does anyone have access to some reference test suite for
> >these mappings ? The mapping codec is probably not the
> >cause for these errors. Perhaps the maps themselves
> >aren't of high enough quality or maybe some mappings
> >just cannot provide round-trip safety...
> >
> I can't give specifics off the top of my head, but mappings not giving
> round trips is quite common, especially with corporate character sets.
> We always handled this by framing questions differently and saying
> 'what is the subset of a map that gives a full round-trip, and which
> bits of my data fall outside it', and trying to get some printed code
> chart to show the results; then you can quickly see if the results
> make sense.  If you have that knowledge, you could then build
> assertions into a python-only test suite.

That would be great of course... but how do we get native
script readers for all those code pages ?
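
Even without native readers, the question quoted above ("what is the subset of a map that gives a full round-trip") can at least be answered mechanically. Here is a minimal sketch, assuming a current Python 3 interpreter and a single-byte mapping codec. The name roundtrip_report is a hypothetical helper, not something in the encodings package:

```python
def roundtrip_report(encoding):
    """Classify byte values 0..255 for a single-byte codec.

    Returns three lists of byte values:
      safe      - decode and re-encode to the same byte
      lossy     - decode, but do not re-encode to the same byte
      undefined - the codec refuses to decode them
    """
    safe, lossy, undefined = [], [], []
    for value in range(256):
        raw = bytes([value])
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            undefined.append(value)
            continue
        try:
            back = text.encode(encoding)
        except UnicodeEncodeError:
            # Decoded to a character the encoder cannot map back.
            lossy.append(value)
            continue
        if back == raw:
            safe.append(value)
        else:
            lossy.append(value)
    return safe, lossy, undefined
```

Calling roundtrip_report("cp1252"), for example, returns three lists whose lengths sum to 256. The second and third lists are the byte values worth checking by hand against a printed code chart.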

> For testing, I think the best approach is to compare output to another
> well-known mapping utility.  The most convenient I know of is
> uniconv.exe from http://www.basistech.com/ - not Open Source and
> Windows-only, but it is a straightforward goal for us to write a
> uniconv.py that perfectly mimics its behaviour.

Ok, I've just downloaded it (it's a bit hidden, as a demo of
their C++ Unicode class lib) and will give it a try next week.
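
The comparison against an outside reference could look roughly like the sketch below. It makes no assumption about uniconv's own output format. Instead it assumes the reference is available as text lines of the form "0xXX 0xYYYY # comment", as in the mapping tables published by the Unicode Consortium, and that the codec is a single-byte one. Both function names are hypothetical, and the code targets a current Python 3:

```python
def parse_reference(lines):
    """Parse '0xXX 0xYYYY  # comment' lines into {byte value: code point}."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a mapping: treat as undefined
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def compare_with_reference(encoding, reference):
    """Return (byte, expected, actual) tuples where codec and table differ.

    A value of None means 'undefined' on that side.
    """
    mismatches = []
    for value in range(256):
        try:
            actual = ord(bytes([value]).decode(encoding))
        except UnicodeDecodeError:
            actual = None
        expected = reference.get(value)
        if actual != expected:
            mismatches.append((value, expected, actual))
    return mismatches
```

An empty result means the codec agrees with the reference table for every byte value. Each remaining tuple is a candidate for either a bug in the map or a genuine disagreement between the two sources.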

> I'm in the middle of a 'work crisis' at the moment, and I know I'm not
> really pulling my weight.  Does anyone have a few hours to help out
> with testing?  If so I could outline the kind of test program that
> would help us quickly validate the existing mappings, and help with
> any new ones.
>
> Marc-Andre, do you have any preferences for where a test suite and
> bunch of add-on tools live?  Do you want something which fits into the
> standard distribution, or can we handle it outside?

Hmm, tests for the builtin codecs should live in Lib/test
with the output in Lib/test/output. Tools etc. are probably
best placed somewhere in the Tools/ directory (e.g. the
gencodec.py script lives in Tools/scripts). Perhaps we need
a separate Tools/unicode if there are going to be many different

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/