String performance regression from python 3.2 to 3.3
nhodgson at iinet.net.au
Sat Mar 16 23:00:32 CET 2013
> So while you might save memory by using "UTF-24" instead of UTF-32, it
> would probably be slower because you would have to grab three bytes at a
> time instead of four, and the hardware probably does not directly support
Low-level string manipulation often deals with blocks larger than
an individual character for speed: generally 32 or 64 bits at a time
using the CPU, or 128 or 256 bits using the vector unit. There may then
be entry/exit code to handle initial alignment to a block boundary and
to deal with a smaller-than-block-size tail.
For an example of this kind of thing, see find_max_char in
Objects/stringlib/find_max_char.h, which can examine a char* 32 or
64 bits at a time.
24-bit is likely to be a win in many circumstances due to decreased
memory traffic. A 12-bit implementation may also be worthwhile, as the
low 0x1000 characters of Unicode contain Latin (with extensions),
Greek, Cyrillic, Arabic, Hebrew, and most Indic scripts.