String performance regression from python 3.2 to 3.3
Neil Hodgson
nhodgson at iinet.net.au
Sat Mar 16 18:00:32 EDT 2013
Steven D'Aprano:
> So while you might save memory by using "UTF-24" instead of UTF-32, it
> would probably be slower because you would have to grab three bytes at a
> time instead of four, and the hardware probably does not directly support
> that.
Low-level string manipulation often deals with blocks larger than
an individual character for speed: generally 32 or 64 bits at a time
using the CPU, or 128 or 256 bits using the vector unit. There may then
be entry/exit code to handle initial alignment to a block boundary and
to deal with a tail smaller than the block size.
For an example of this kind of thing, see find_max_char in
python\Objects\stringlib\find_max_char.h, which can examine a char* 32 or
64 bits at a time.
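
As a rough illustration of the block-at-a-time idea (not the code in find_max_char.h, which is C), here is a minimal Python sketch that tests a byte buffer for non-ASCII bytes 8 bytes per step, then handles the short tail separately. The name has_non_ascii and the mask constant are made up for this example, and Python has no alignment concerns, so the entry code is omitted.

```python
# Hypothetical helper for illustration only; not part of CPython.
ASCII_MASK_64 = 0x8080808080808080  # high bit of each of the 8 bytes in a block


def has_non_ascii(data: bytes) -> bool:
    """Return True if any byte in data is >= 0x80, testing 8 bytes per step."""
    block = 8
    full = len(data) - len(data) % block  # length covered by whole blocks

    # Main loop: treat each 8-byte block as one integer and test it with one AND.
    for i in range(0, full, block):
        word = int.from_bytes(data[i:i + block], "little")
        if word & ASCII_MASK_64:
            return True

    # Exit code: the tail is smaller than a block, so check it byte by byte.
    return any(b & 0x80 for b in data[full:])
```

In C the same loop would be a single aligned load and a single AND per block, which is where the speed comes from; the Python version only shows the structure.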
24-bit is likely to be a win in many circumstances due to decreased
memory traffic. A 12-bit implementation may also be worthwhile, as the
low 0x1000 characters of Unicode contain Latin (with extensions),
Greek, Cyrillic, Arabic, Hebrew, and most Indic scripts.
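
To make the 24-bit layout concrete, here is a small sketch that stores each code point in 3 bytes and indexes into the result. The highest Unicode code point is 0x10FFFF, which needs 21 bits, so 3 bytes are always enough. The names pack24 and char_at are hypothetical, and this says nothing about speed; it only shows the storage format and the 25% size reduction relative to UTF-32.

```python
# Hypothetical helpers for illustration only; no such encoding exists in CPython.

def pack24(text: str) -> bytes:
    """Store each code point of text as 3 little-endian bytes."""
    return b"".join(ord(ch).to_bytes(3, "little") for ch in text)


def char_at(buf: bytes, index: int) -> str:
    """Return the character at position index in a buffer made by pack24."""
    start = 3 * index  # fixed width, so indexing is still O(1)
    return chr(int.from_bytes(buf[start:start + 3], "little"))


text = "a\u03c0\U0001F600"          # ASCII, Greek, and a non-BMP character
packed = pack24(text)
utf32 = text.encode("utf-32-le")    # 4 bytes per character, no BOM
```

Here packed occupies 3 bytes per character against 4 for utf32, and char_at(packed, 2) recovers the non-BMP character.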
Neil