Damien Diederen <dd <at> crosstwine.com> writes:
I couldn't figure out a way to get rid of it short of multi-#including "templates" and playing with the C preprocessor, however, and have the nagging feeling the latter would be frowned upon by the maintainers.
There is a precedent with xmltok.c/xmltok_impl.c, though, so maybe I'm wrong about that. Should I give it a try, and see how "clean" the result can be made?
Keep in mind that json is externally maintained by Bob. The more we rework his code, the less easy it will be to backport other changes from the simplejson library.
I think we should either keep the code duplication (if we want to keep fast paths for both bytes and str objects), or only keep one of the two versions as my patch does.
Provided one of the alternatives is dropped, wouldn't it be better to do the opposite, i.e., have the decoder take bytes as input, and the encoder produce bytes—and layer the str functionality on top of that? I guess the answer depends on how the (most common) lower layers are structured, but it would be nice to allow a straight bytes path to/from the underlying transport.
The straightest path is actually to/from unicode, since JSON data can contain
unicode strings but no byte strings. Also, the json library /has/ to output
unicode when ensure_ascii is False. In 2.x:
>>> json.dumps([u"éléphant"], ensure_ascii=False)
u'["\xe9l\xe9phant"]'
In any case, I don't think it will matter much in terms of speed whether we take one route or the other. UTF-8 encoding/decoding is probably much faster (in characters per second) than JSON encoding/decoding is.
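To illustrate the layering, here is a minimal sketch in Python 3 syntax (where str is the unicode type), assuming the json module keeps a str-only API. The helper names dumps_bytes and loads_bytes are hypothetical and not part of the json module; they only show that a bytes path to/from the transport can be a thin UTF-8 wrapper around the unicode one.

```python
import json


def dumps_bytes(obj):
    """Hypothetical helper: serialize to str, then encode as UTF-8 bytes."""
    # ensure_ascii=False keeps non-ASCII characters unescaped in the str
    # result, so the UTF-8 encoding step does the real byte conversion.
    return json.dumps(obj, ensure_ascii=False).encode("utf-8")


def loads_bytes(data):
    """Hypothetical helper: decode UTF-8 bytes to str, then parse."""
    return json.loads(data.decode("utf-8"))


# The str-based API returns unicode text when ensure_ascii is False.
text = json.dumps(["éléphant"], ensure_ascii=False)

# The bytes layer sits on top of it and round-trips through UTF-8.
payload = dumps_bytes(["éléphant"])
roundtrip = loads_bytes(payload)
```

The wrappers add one encode or decode pass per call, which is the extra cost under discussion; the sketch makes no claim about how large that cost is in practice.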