
Hye-Shik Chang wrote:
If the encoding optimization can be easily done in Walter's approach, the fastmap codec would be too expensive a way to reach that objective, because we would have to maintain not only fastmap but also charmap for backward compatibility.
IMO, whether a new function is added or whether the existing function becomes polymorphic (depending on the type of table being passed) is a minor issue. Clearly, the charmap API needs to stay for backwards compatibility; in terms of code size or maintenance, I would actually prefer separate functions.

One issue apparently is people tweaking the existing dictionaries, with additional entries they think belong there. I don't think we need to preserve compatibility with that approach in 2.5, but I also think that breakage should be obvious: the dictionary should either go away completely at run-time, or be stored under a different name, so that any attempt to modify the dictionary gives an exception instead of having no interesting effect.

I envision a layout of the codec files like this:

    decoding_dict = ...
    decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict)

I think it should be possible to build efficient tables in a single pass over the dictionary, so startup time should be fairly small (given that the dictionaries are currently built incrementally, anyway, due to the way dictionary literals work).

Regards,
Martin
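
For illustration only, here is a minimal sketch of what such a single-pass table builder might look like. It is not an existing API: make_lookup_tables below is a hypothetical local stand-in for the proposed codecs.make_lookup_tables, the toy decoding_dict is invented for the example, and the sketch assumes Python 3 and an 8-bit codec whose dictionary maps byte values to code points (or None for undefined bytes).

```python
import codecs


def make_lookup_tables(decoding_dict):
    """Hypothetical stand-in for the proposed codecs.make_lookup_tables().

    Builds a 256-character decoding string and an encoding dictionary
    in one pass over decoding_dict (byte value -> code point or None).
    """
    # U+FFFE is what the charmap decoder treats as "undefined".
    decoding_chars = ["\ufffe"] * 256
    encoding_map = {}
    for byte, codepoint in decoding_dict.items():
        if codepoint is None:
            continue  # leave the slot marked as undefined
        decoding_chars[byte] = chr(codepoint)
        encoding_map[codepoint] = byte
    return "".join(decoding_chars), encoding_map


# Toy table (an assumption for this example): ASCII identity plus 0x80 -> EURO SIGN.
decoding_dict = {i: i for i in range(128)}
decoding_dict[0x80] = 0x20AC

decoding_map, encoding_map = make_lookup_tables(decoding_dict)

# The existing charmap functions accept the derived tables directly.
decoded, consumed = codecs.charmap_decode(b"A\x80", "strict", decoding_map)
encoded, _ = codecs.charmap_encode(decoded, "strict", encoding_map)
```

Because decoding_map here is an immutable string, an attempt to tweak it in place raises an exception, which is one way the breakage could be made obvious.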