[Python-Dev] Unicode charmap decoders slow

M.-A. Lemburg mal at egenix.com
Wed Oct 5 22:45:18 CEST 2005


Martin v. Löwis wrote:
> Walter Dörwald wrote:
> 
>>OK, here's a patch that implements this enhancement to 
>>PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939
> 
> Looks nice!

Indeed (except for the choice of the "map this character
to undefined" code point).

Hye-Shik, could you please provide some timeit figures for
the fastmap encoding?
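
For readers who want comparable figures, the following is a minimal timeit sketch. It assumes a current Python 3 interpreter and uses codecs.charmap_decode() from the standard library. It does not reproduce the patch or Hye-Shik's fastmap code; the identity table and the names table_str, table_dict and data are made up for illustration.

```python
import codecs
import timeit

# Hypothetical 256-entry identity table, expressed both as a string
# (position = byte value) and as a dictionary (byte value -> character).
table_str = "".join(chr(i) for i in range(256))
table_dict = {i: chr(i) for i in range(256)}

# Sample input: every byte value, repeated to give the decoder some work.
data = bytes(range(256)) * 100


def decode_with_dict():
    """Decode via a dictionary lookup per byte."""
    return codecs.charmap_decode(data, "strict", table_dict)[0]


def decode_with_str():
    """Decode via a string table indexed by byte value."""
    return codecs.charmap_decode(data, "strict", table_str)[0]


# Both tables describe the same mapping, so the results must agree.
assert decode_with_dict() == decode_with_str()

for name, func in (("dict", decode_with_dict), ("str", decode_with_str)):
    # Take the best of three runs to reduce noise from other processes.
    seconds = min(timeit.repeat(func, number=20, repeat=3))
    print(f"{name:4s} table: {seconds:.4f} s for 20 decodes")
```

The absolute numbers depend on the machine and interpreter version, so only the ratio between the two lines is meaningful.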

>>Creating the decoding_map as a string should probably be done by 
>>gencodec.py directly. This way the first import of the codec would be 
>>faster too.
> 
> 
> Hmm. How would you represent the string in source code? As a Unicode
> literal? With \u escapes, or in a UTF-8 source file? Or as a UTF-8
> string, with an explicit decode call?
> 
> I like the current dictionary style for being readable, as it also
> adds the Unicode character names into comments.

Not only that: it also allows 1-n and 1-0 mappings, which was part
of the reason for using a mapping object (such as a dictionary) as the
basis for the codec.
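
To illustrate the difference, here is a small sketch, assuming a current Python 3 interpreter and codecs.charmap_decode(). The byte values and their targets are invented for the example and do not come from any real codec. A dictionary can map one byte to several characters (1-n) or to none (1-0), whereas a string table holds exactly one code point per position and, in current CPython, uses U+FFFE to mark a position as undefined.

```python
import codecs

# Illustrative dictionary-based decoding map (not taken from a real codec).
decoding_map = {
    0x41: 0x0041,  # 1-1: byte 0x41 -> LATIN CAPITAL LETTER A (given as int)
    0x42: "ss",    # 1-n: one byte expands to two characters
    0x43: "",      # 1-0: the byte is dropped from the output
    0x44: None,    # undefined: decoding this byte is an error
}

text, consumed = codecs.charmap_decode(b"ABCA", "strict", decoding_map)
print(repr(text), consumed)  # -> 'AssA' 4

# An undefined entry raises in strict mode.
try:
    codecs.charmap_decode(b"D", "strict", decoding_map)
    dict_undefined_raises = False
except UnicodeDecodeError:
    dict_undefined_raises = True

# String-based table: index = byte value, exactly one code point per slot.
# U+FFFE marks byte 0x01 as undefined here.
decoding_table = "a\ufffe"

table_text, _ = codecs.charmap_decode(b"\x00", "strict", decoding_table)
try:
    codecs.charmap_decode(b"\x01", "strict", decoding_table)
    table_undefined_raises = False
except UnicodeDecodeError:
    table_undefined_raises = True
```

The string form cannot express the "ss" or "" entries above, which is the flexibility the dictionary form keeps.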

-- 
Marc-Andre Lemburg
eGenix.com


