On 11/04/2009 6:12 PM, Antoine Pitrou wrote:
> Martin v. Löwis <martin <at> v.loewis.de> writes:
>> Not sure whether it would be significantly faster, but yes, Bob wrote an accelerator for parsing out of a byte string to make it really fast; IIRC, he claims that it is faster than pickling.
> Isn't premature optimization the root of all evil?
> Besides, the fact that many values in a typical JSON object will be strings, and must be encoded from/decoded to unicode objects in py3k, suggests that accepting/outputting unicode as default is the laziest (i.e. the best) choice performance-wise.
I don't see it as premature optimization, but rather as trying to ensure the interface/API best suits the actual use cases.
> But you don't have to trust me: look at the quick numbers I've posted. The py3k version (in the str-only incarnation I've proposed) is sometimes actually faster than the trunk version: http://mail.python.org/pipermail/python-dev/2009-April/088498.html
But if all actual use-cases involve moving to and from UTF-8 encoded bytes, I'm not sure that little example is particularly useful. In those use-cases, I'd be surprised if there weren't significant time and space benefits in not asking apps to create an 'intermediate' string object before getting the bytes they need, particularly when the payload may be of significant size.
Assuming the above is all true, I'd see choosing bytes less as a premature optimization and more as a design choice which best supports actual use. So to my mind the only real question is whether the above is true, or whether there are common use-cases which don't involve utf8-off/on-the-wire...
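
To make the 'intermediate string object' concern concrete, here is a minimal sketch of what an app has to do with a str-only json module when the data is headed for (or arriving from) the wire as UTF-8. The helper names dumps_bytes and loads_bytes are hypothetical, invented for illustration only; they are not part of the json module, and this says nothing about how a bytes-native implementation would actually perform.

```python
import json


def dumps_bytes(obj):
    # Hypothetical helper, not part of the json module.
    # Step 1: serialize to an intermediate str object.
    text = json.dumps(obj, ensure_ascii=False)
    # Step 2: encode that str to UTF-8, producing a second full copy as bytes.
    return text.encode("utf-8")


def loads_bytes(data):
    # Hypothetical helper, not part of the json module.
    # Step 1: decode the wire bytes to an intermediate str object.
    text = data.decode("utf-8")
    # Step 2: parse the str.
    return json.loads(text)


payload = {"name": "Löwis", "values": [1, 2, 3]}
wire = dumps_bytes(payload)          # bytes, ready to send
roundtrip = loads_bytes(wire)        # back to Python objects
```

For a large payload, both directions hold the whole document twice (once as str, once as bytes) at the moment of conversion; whether that cost matters in practice is exactly the open question above.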