[Python-Dev] Unicode charmap decoders slow
"Martin v. Löwis"
martin at v.loewis.de
Wed Oct 5 00:08:45 CEST 2005
Walter Dörwald wrote:
>> This array would have to be sparse, of course.
>
>
> For encoding yes, for decoding no.
[...]
> For decoding it should be sufficient to use a unicode string of length
> 256. u"\ufffd" could be used for "maps to undefined". Or the string
> might be shorter and byte values greater than the length of the string
> are treated as "maps to undefined" too.
Right. That's what I meant by "sparse": you somehow need to represent
"no value".
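To make the idea concrete, here is a rough pure-Python sketch of the
string-as-table approach described above. It is only an illustration, not
the actual codec implementation; the names build_table, charmap_decode,
and build_encoding_map are made up for this example, and u"\ufffd" is used
as the "maps to undefined" marker as suggested in the quoted text.

```python
UNDEFINED = "\ufffd"  # assumed sentinel for "maps to undefined"


def build_table(mapping, size=256):
    """Build a decoding table string from a {byte value: character} dict.

    Byte values missing from the mapping get the UNDEFINED sentinel.
    """
    return "".join(mapping.get(i, UNDEFINED) for i in range(size))


def charmap_decode(data, table):
    """Decode a bytes object by indexing each byte into the table string."""
    chars = []
    for pos, byte in enumerate(data):
        # A table shorter than 256 entries leaves the high bytes undefined.
        ch = table[byte] if byte < len(table) else UNDEFINED
        if ch == UNDEFINED:
            raise UnicodeDecodeError(
                "charmap", bytes(data), pos, pos + 1,
                "character maps to <undefined>")
        chars.append(ch)
    return "".join(chars)


def build_encoding_map(table):
    """Invert the table into a dict, which is sparse by construction."""
    return {ch: i for i, ch in enumerate(table) if ch != UNDEFINED}
```

Decoding is a plain index into a dense string, while the encoding
direction ends up as a dict keyed by character, which is where the
sparseness comes in.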
> This might work, although nobody has complained about charmap encoding
> yet. Another option would be to generate a big switch statement in C
> and let the compiler decide about the best data structure.
I would try to avoid generating C code at all costs. Maintaining the
build processes will just be a nightmare.
Regards,
Martin